This is a discussion on SAN vs Internal Disks within the Pgsql Performance forums, part of the PostgreSQL category.
| ||||
Hi,

We are currently running our DB on a DualCore, Dual Proc 3.Ghz Xeon, 8GB RAM, 4x SAS 146 GB 15K RPM on RAID 5.

The current data size is about 50GB, but we want to purchase the hardware to scale to about 1TB as we think our business will need to support that much soon.

- Currently we have 80% reads and 20% writes.
- Currently with this configuration the database is showing signs of overloading.
- Autovacuum etc. run on this database; VACUUM FULL runs nightly.
- Currently CPU loads are about 20%, memory utilization is full (but this is also due to Linux caching disk blocks) and IO waits are frequent.
- We have a load of about 400 queries per second.

Now we are considering purchasing our own servers and in the process are facing the usual dilemmas. First I'll list out what machine we have decided to use:

2x Quad Xeon 2.4 Ghz (4-way only 2 populated right now)
32 GB RAM
OS Only storage - 2x SCSI 146 GB 15k RPM on RAID-1
(Data Storage mentioned below)

We have already decided to split our database across 3 machines on the basis of disjoint sets of data. So we will be purchasing three of these boxes.

HELP 1: Does something look wrong with the above configuration? I know there will be small differences between Opteron and Xeon. But do you think there is something against going for 2.4Ghz Quad Xeons (Clovertown, I think)?

HELP 2: The main confusion is with regard to data storage. We have the option of going for:

A: IBM N-3700 SAN Box, having 12x FC 300GB disks, partitioned into 3 disks on RAID-4 for WAL/backup, and 9 disks on RAID-DP for data, 2 hot spare. We are also considering a similar solution from EMC - CX310C.

B: Go for internal or DAS based storage. Here for each server we should be able to have: 2x disks on RAID-1 for logs, 6x disks on RAID-10 for tablespace1 and 6x disks on RAID-10 for tablespace2. Or maybe 12x disks on RAID-10 as a single tablespace.

What do I think? Well..
SAN wins on manageability, replication (say to a DR site), backup, etc...
DAS wins on cost

But keeping these aside for a moment, I wanted to discuss, purely on the performance side, which one is the winner? It feels like internal disks will perform better, but I need to understand the rough magnitude of the difference in performance to see if it is worth losing the manageability features.

Also, if we choose to go with DAS, what would be the best tool to do async replication to a DR site, and maybe even, as an extra plus, to a second read-only DB server to distribute select loads?

Regards,
Azad
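One way to put a number on "IO waits are frequent" is to compare two samples of the aggregate `cpu` line in `/proc/stat` and see what share of the elapsed CPU time was spent in iowait. The following is only a minimal sketch, assuming a Linux kernel that reports the usual field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names and the sample counters are made up for illustration and are not taken from the poster's system.

```python
def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two /proc/stat samples."""
    first, second = cpu_times(before), cpu_times(after)
    deltas = [b - a for a, b in zip(first, second)]
    # Sum user..steal only; guest time is already included in user time.
    total = sum(deltas[:8])
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


# Hypothetical samples, in the format of the first line of /proc/stat.
sample_before = "cpu  1000 0 500 8000 500 0 0 0"
sample_after = "cpu  1200 0 600 8500 1200 0 0 0"

print(f"{iowait_percent(sample_before, sample_after):.1f}% iowait")
```

On a real machine the two samples would come from reading `/proc/stat` a few seconds apart. A high iowait share alongside roughly 20% CPU load would point at the disks rather than the processors.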
| |||
On 9/6/07, Harsh Azad <harsh.azad@gmail.com> wrote:
> Hi,
>
> We are currently running our DB on a DualCore, Dual Proc 3.Ghz Xeon, 8GB
> RAM, 4x SAS 146 GB 15K RPM on RAID 5.
>
> The current data size is about 50GB, but we want to purchase the hardware to
> scale to about 1TB as we think our business will need to support that much
> soon.
> - Currently we have a 80% read and 20% write percentages.

For this type of load, you should be running on RAID10, not RAID5. Or, if you must use RAID 5, use more disks and have a battery-backed caching RAID controller known to perform well with RAID5 and large arrays.

> - Currently with this configuration the Database is showing signs of
> over-loading.

On I/O or CPU? If you're running out of CPU, then look to increasing CPU horsepower and tuning PostgreSQL. If I/O, then you need to look into a faster I/O subsystem.

> - Auto-vacuum, etc run on this database, vacuum full runs nightly.

Generally speaking, if you need to run vacuum fulls, you're doing something wrong. Is there a reason you're running vacuum full, or is this just precautionary? Vacuum full can bloat your indexes, so you shouldn't run it regularly. Reindexing might be a better choice if you do need to regularly shrink your db. The better option is to monitor your fsm usage and adjust fsm settings / autovacuum settings as necessary.

> - Currently CPU loads are about 20%, memory utilization is full (but this
> is also due to linux caching disk blocks) and IO waits are frequent.
> - We have a load of about 400 queries per second

What does vmstat et al. say about CPU versus I/O wait?

> Now we are considering to purchase our own servers and in the process are
> facing the usual dilemmas. First I'll list out what machine we have decided
> to use:
> 2x Quad Xeon 2.4 Ghz (4-way only 2 populated right now)
> 32 GB RAM
> OS Only storage - 2x SCSI 146 GB 15k RPM on RAID-1
> (Data Storage mentioned below)
>
> We have already decided to split our database into 3 machines on the basis
> on disjoint sets of data. So we will be purchasing three of these boxes.
>
> HELP 1: Does something look wrong with above configuration, I know there
> will be small differences b/w opteron/xeon. But do you think there is
> something against going for 2.4Ghz Quad Xeons (clovertown i think)?

Looks like good machines, plenty of memory.

> HELP 2: The main confusion is with regards to Data Storage. We have the
> option of going for:
>
> A: IBM N-3700 SAN Box, having 12x FC 300GB disks, Partitioned into 3 disks
> into RAID-4 for WAL/backup, and 9 disks on RAID-DP for data, 2 hot spare. We
> are also considering similar solution from EMC - CX310C.
>
> B: Go for Internal or DAS based storage. Here for each server we should be
> able to have: 2x disks on RAID-1 for logs, 6x disks on RAID-10 for
> tablespace1 and 6x disks on RAID-10 for tablespace2. Or maybe 12x disks on
> RAID-10 single table-space.
>
> What do I think? Well..
> SAN wins on manageability, replication (say to a DR site), backup, etc...
> DAS wins on cost

The problem with SAN is that it's apparently very easy to build a big expensive system that performs poorly. We've seen reports of such here on the lists a few times. I would definitely demand an evaluation period from your supplier to make sure it performs well if you go SAN.

> But for a moment keeping these aside, i wanted to discuss, purely on
> performance side which one is a winner? It feels like internal-disks will
> perform better, but need to understand a rough magnitude of difference in
> performance to see if its worth losing the manageability features.

That really, really, really depends. The quality of the RAID controllers for either setup is very important, as is the driver support, etc... All things being equal, I'd lean towards the local storage.

> Also if we choose to go with DAS, what would be the best tool to do async
> replication to DR site and maybe even as a extra plus a second read-only DB
> server to distribute select loads.

Look at Slony, or PITR with continuous recovery. Of those two, I've only used Slony in production, and I was very happy with its performance, and it was very easy to write a bash script to monitor the replication for failures.

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
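The RAID10-over-RAID5 advice for an 80/20 read/write mix can be illustrated with the common "write penalty" rule of thumb: a small random write costs roughly four disk I/Os on RAID 5 (read data, read parity, write data, write parity) and two on RAID 10 (one per mirror side). The sketch below is a back-of-the-envelope estimate only. The figure of 180 random IOPS per 15K disk is an assumption, not a measurement from this thread, and the model ignores controller cache, full-stripe writes and sequential access.

```python
# Rule-of-thumb disk I/Os needed per small random write, by RAID level.
WRITE_PENALTY = {"raid10": 2, "raid5": 4}


def effective_iops(disks, per_disk_iops, read_fraction, level):
    """Estimate the front-end random IOPS an array can sustain."""
    raw = disks * per_disk_iops
    write_fraction = 1.0 - read_fraction
    # Each read costs one back-end I/O, each write costs WRITE_PENALTY I/Os.
    backend_cost_per_io = read_fraction + write_fraction * WRITE_PENALTY[level]
    return raw / backend_cost_per_io


# The current box: 4 disks, 80% reads. 180 IOPS per disk is an assumed value.
for level in ("raid5", "raid10"):
    estimate = effective_iops(4, 180, 0.8, level)
    print(f"{level}: about {round(estimate)} IOPS")
```

Under these assumptions the same four disks come out at roughly 450 IOPS on RAID 5 versus roughly 600 on RAID 10, and the gap widens as the write share grows. A battery-backed write cache narrows it in practice, which is why the controller matters so much in the advice above.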
| |||
Thanks Mark.

If I replicate a snapshot of the data and log files (basically the entire PG data directory) and I maintain the same version of Postgres on both servers, it should work, right?

I am also thinking that having SAN storage will give me the ability to keep a warm standby DB. By just shutting one server down and starting the other, mounting the same file system, I should be able to bring my DB up when the primary incurs a physical failure.

I'm only considering SAN storage for this feature - has anyone ever used SAN for replication and warm standby on Postgres?

Regards,
Harsh

On 9/6/07, Mark Lewis <mark.lewis@mir3.com> wrote:
> Sounds like a good candidate for Slony replication for backups /
> read-only slaves.
>
> I haven't seen a SAN yet whose DR / replication facilities are on par
> with a good database replication solution. My impression is that those
> facilities are mostly for file servers, mail servers, etc. It would be
> difficult for a SAN to properly replicate a database given the strict
> ordering, size and consistency requirements for the data files. Not
> impossible, but in my limited experience I haven't found one that I
> trust to do it reliably either, vendor boastings to the contrary
> notwithstanding. (Hint: make sure you know exactly what your vendor's
> definition of the term 'snapshot' really means).
>
> So before you invest in a SAN, make sure that you're actually going to
> be able to (and want to) use all the nice management features you're
> paying for. We have some SANs that are basically acting just as
> expensive external RAID arrays because we do the database
> replication/backup in software anyway.
>
> -- Mark Lewis

--
Harsh Azad
=======================
Harsh.Azad@gmail.com
| |||
Thanks Scott, we have now requested IBM/EMC to provide test machines. Interestingly, since you mentioned the importance of RAID controllers and the drivers: we are planning to use CentOS 5 for hosting the DB.

Firstly, I could only find a Postgres 8.1.x RPM for CentOS 5, and could not find any RPM for 8.2.4. Is there any 8.2.4 RPM for CentOS 5?

Secondly, would investing in Red Hat Enterprise Linux give any performance advantage? I know all the SAN boxes are only certified on RHEL and not CentOS. Or, since CentOS is similar to RHEL, would it be fine?

Regards,
Harsh

On 9/6/07, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> The problem with SAN is that it's apparently very easy to build a big
> expensive system that performs poorly. We've seen reports of such
> here on the lists a few times. I would definitely demand an
> evaluation period from your supplier to make sure it performs well if
> you go SAN.
>
> [...]
>
> That really really really depends. The quality of RAID controllers
> for either setup is very important, as is the driver support, etc...
> All things being even, I'd lean towards the local storage.

--
Harsh Azad
=======================
Harsh.Azad@gmail.com
| |||
On Thu, 2007-09-06 at 18:05 +0530, Harsh Azad wrote:
> What do I think? Well..
> SAN wins on manageability, replication (say to a DR site), backup,
> etc...
> DAS wins on cost
>
> But for a moment keeping these aside, i wanted to discuss, purely on
> performance side which one is a winner? It feels like internal-disks
> will perform better, but need to understand a rough magnitude of
> difference in performance to see if its worth losing the
> manageability features.
>
> Also if we choose to go with DAS, what would be the best tool to do
> async replication to DR site and maybe even as a extra plus a second
> read-only DB server to distribute select loads.

Sounds like a good candidate for Slony replication for backups / read-only slaves.

I haven't seen a SAN yet whose DR / replication facilities are on par with a good database replication solution. My impression is that those facilities are mostly for file servers, mail servers, etc. It would be difficult for a SAN to properly replicate a database given the strict ordering, size and consistency requirements for the data files. Not impossible, but in my limited experience I haven't found one that I trust to do it reliably either, vendor boastings to the contrary notwithstanding. (Hint: make sure you know exactly what your vendor's definition of the term 'snapshot' really means).

So before you invest in a SAN, make sure that you're actually going to be able to (and want to) use all the nice management features you're paying for. We have some SANs that are basically acting just as expensive external RAID arrays because we do the database replication/backup in software anyway.

-- Mark Lewis
| |||
Harsh Azad wrote:
> Thanks Scott, we have now requested IBM/EMC to provide test machines.
> Interestingly since you mentioned the importance of Raid controllers and the
> drivers; we are planning to use Cent OS 5 for hosting the DB.
>
> Firstly, I could only find postgres 8.1.x RPM for CentOS 5, could not find
> any RPM for 8.2.4. Is there any 8.2.4 RPM for CentOS 5?

Look under the RHEL section of ftp.postgresql.org

Joshua D. Drake
| |||
On 6-9-2007 14:35 Harsh Azad wrote:
> 2x Quad Xeon 2.4 Ghz (4-way only 2 populated right now)

I don't understand this sentence. You seem to imply you might be able to fit more processors in your system? Currently the only quad-cores you can buy are dual-processor processors, unless you already got a quote for a system that uses the new Intel "Tigerton" processors.

I.e. if they are Clovertowns they are indeed Intel Core-architecture processors, but you won't be able to fit more than 2 in the system, for 8 cores in total. If they are Tigerton, I'm a bit surprised you got a quote for that, although HP seems to offer a system for those. If they are the old dual-core MPs (70xx or 71xx), you don't want those...

> 32 GB RAM
> OS Only storage - 2x SCSI 146 GB 15k RPM on RAID-1
> (Data Storage mentioned below)

I doubt you need 15k-rpm drives for the OS... But that won't matter much in the total cost.

> HELP 1: Does something look wrong with above configuration, I know there
> will be small differences b/w opteron/xeon. But do you think there is
> something against going for 2.4Ghz Quad Xeons (clovertown i think)?

Apart from your implication that you may be able to stick more processors in it: no, not to me. Two quad-core Xeons were even faster than 8 dual-core Opterons in our benchmarks, although that might also indicate limited OS, Postgres or underlying I/O scaling.

Obviously the new AMD Barcelona line of processors (coming next week or so) and the new Intel quad-core DP (Penryn?) and MP (Tigerton) may be interesting to look at. I don't know how soon systems will be available with those processors (HP seems to offer a Tigerton server).

> B: Go for Internal or DAS based storage. Here for each server we should
> be able to have: 2x disks on RAID-1 for logs, 6x disks on RAID-10 for
> tablespace1 and 6x disks on RAID-10 for tablespace2. Or maybe 12x disks
> on RAID-10 single table-space.

You don't necessarily need to use internal disks for DAS, since you can also link an external SAS enclosure either with or without an integrated RAID controller (IBM, Sun, Dell, HP and others have options for that), and those can be expanded either by tying multiple enclosures to each other or by attaching them to a controller in the server. Those may also be usable in a warm-standby scenario and may be quite a bit cheaper than FC hardware.

> But for a moment keeping these aside, i wanted to discuss, purely on
> performance side which one is a winner? It feels like internal-disks
> will perform better, but need to understand a rough magnitude of
> difference in performance to see if its worth losing the manageability
> features.

As said, you don't necessarily need real internal disks, since SAS can be used with external enclosures as well, still being DAS. I have no idea what difference you will or may see between those in terms of performance. It probably largely depends on the RAID controller available; afaik the disks will be mostly the same. And it might depend on your available bandwidth: external SAS offers you a 4-port connection allowing for a 12Gbit connection between a disk enclosure and a controller. While - as I understand it - even expensive SAN controllers only offer dual-ported, 8Gbit connections?

What's more important is probably the number of disks and the amount of RAID cache you can buy in the SAN vs DAS scenario. If you can buy 24 disks when going for DAS vs only 12 with SAN...

But then again, I'm no real storage expert, we only have two Dell MD1000 DAS units at our site.

Best regards and good luck,
Arjen
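The bandwidth comparison above (a 4-port SAS link at 12Gbit versus a dual-ported 8Gbit FC connection) can be turned into rough payload figures. This is a minimal sketch under stated assumptions: the per-lane rates of 3 Gbit/s and 4 Gbit/s are inferred from the totals quoted in the post, both links are assumed to use 8b/10b line encoding (so 80% of the raw bit rate carries data), and protocol overhead beyond that is ignored. The function name is made up for illustration.

```python
def link_mb_per_s(lanes, gbit_per_lane, encoding_efficiency=0.8):
    """Approximate payload bandwidth in MB/s for a multi-lane serial link."""
    raw_mbit = lanes * gbit_per_lane * 1000
    # 8b/10b encoding sends 10 line bits per data byte: 80% efficiency.
    return raw_mbit * encoding_efficiency / 8


sas_wide_port = link_mb_per_s(lanes=4, gbit_per_lane=3)   # 12 Gbit raw
fc_dual_port = link_mb_per_s(lanes=2, gbit_per_lane=4)    # 8 Gbit raw

print(f"4-lane SAS: about {round(sas_wide_port)} MB/s")
print(f"dual-port FC: about {round(fc_dual_port)} MB/s")
```

Under these assumptions the SAS link tops out around 1200 MB/s and the FC pair around 800 MB/s. For a random-I/O database workload on a dozen or two disks, neither link is likely to be the first bottleneck, which fits the point above that disk count and RAID cache probably matter more.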
| |||
On 9/6/07, Harsh Azad <harsh.azad@gmail.com> wrote:
> Thanks Scott, we have now requested IBM/EMC to provide test machines.
> Interestingly since you mentioned the importance of Raid controllers and the
> drivers; we are planning to use Cent OS 5 for hosting the DB.

What RAID controllers have you looked at? It seems the two most popular in terms of performance here have been Areca and 3Ware / Escalade. LSI seems to come in a pretty close third. Adaptec is to be avoided, as are cheap RAID controllers (i.e. Promise etc...). Battery-backed cache is a must, and the bigger the better.

> Firstly, I could only find postgres 8.1.x RPM for CentOS 5, could not find
> any RPM for 8.2.4. Is there any 8.2.4 RPM for CentOS 5?
>
> Secondly, would investing into Redhat enterprise edition give any
> performance advantage? I know all the SAN boxes are only certified on RHEL
> and not CentOS. Or since CentOS is similar to RHEL it would be fine?

For all intents and purposes, CentOS and RHEL are the same OS, so any pgsql rpm for one should pretty much work for the other. At worst, you might have to get a srpm and rebuild it for CentOS / White Box.
| |||
Hi,

How about the Dell Perc 5/i card with 512MB battery-backed cache, or the IBM ServeRAID-8k adapter?

I hope I am sending relevant information here; I am not too well versed in RAID controllers.

Regards,
Harsh

On 9/6/07, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> What RAID controllers have you looked at. Seems the two most popular
> in terms of performance here have been Areca and 3Ware / Escalade.
> LSI seems to come in a pretty close third. Adaptec is to be avoided
> as are cheap RAID controllers (i.e. promise etc...) battery backed
> cache is a must, and the bigger the better.

--
Harsh Azad
=======================
Harsh.Azad@gmail.com
| ||||
On Thu, 2007-09-06 at 22:28 +0530, Harsh Azad wrote:
> Thanks Mark.
>
> If I replicate a snapshot of Data and log files (basically the entire
> PG data directory) and I maintain same version of postgres on both
> servers, it should work right?
>
> I am also thinking that having SAN storage will provide me with
> facility of keeping a warm standby DB. By just shutting one server
> down and starting the other mounting the same File system I should be
> able to bring my DB up when the primary incurs a physical failure.
>
> I'm only considering SAN storage for this feature - has anyone ever
> used SAN for replication and warm standby on Postgres?
>
> Regards,
> Harsh

We used to use a SAN for warm standby of a database, but with Oracle and not PG. It worked kinda sorta, except for occasional crashes due to buggy drivers.

But after going through the exercise, we realized that we hadn't gained anything over just doing master/slave replication between two servers, except that it was more expensive, had a tendency to expose buggy drivers, had a single point of failure in the SAN array, failover took longer and we couldn't use the warm standby server to perform read-only queries. So we reverted and just used the SAN as expensive DAS and set up a separate box for DB replication.

So if that's the only reason you're considering a SAN, then I'd advise you to spend the extra money on more DAS disks.

Maybe I'm jaded by past experiences, but the only real use case I can see to justify a SAN for a database would be something like Oracle RAC, and I'm not aware of any PG equivalent to that.

-- Mark Lewis