Unix Technical Forum

SAN vs Internal Disks

  #1 (permalink)  
Old 04-19-2008, 11:29 AM
Harsh Azad
 
Posts: n/a
Default SAN vs Internal Disks

Hi,

We are currently running our DB on a DualCore, Dual Proc 3.Ghz Xeon, 8GB
RAM, 4x SAS 146 GB 15K RPM on RAID 5.

The current data size is about 50GB, but we want to purchase the hardware to
scale to about 1TB as we think our business will need to support that much
soon.
- Currently we have an 80% read / 20% write mix.
- Currently, with this configuration, the database is showing signs of
overloading.
- Autovacuum etc. run on this database; VACUUM FULL runs nightly.
- Currently CPU load is about 20%, memory utilization is full (but this is
also due to Linux caching disk blocks) and I/O waits are frequent.
- We have a load of about 400 queries per second.

Now we are considering purchasing our own servers and in the process are
facing the usual dilemmas. First I'll list the machine we have decided
to use:
2x Quad Xeon 2.4 Ghz (4-way only 2 populated right now)
32 GB RAM
OS Only storage - 2x SCSI 146 GB 15k RPM on RAID-1
(Data Storage mentioned below)

We have already decided to split our database across 3 machines on the basis
of disjoint sets of data, so we will be purchasing three of these boxes.

HELP 1: Does anything look wrong with the above configuration? I know there
will be small differences between Opteron and Xeon, but do you think there is
anything against going for 2.4 GHz quad-core Xeons (Clovertown, I think)?

HELP 2: The main confusion is with regards to Data Storage. We have the
option of going for:

A: IBM N-3700 SAN box with 12x FC 300GB disks, partitioned as 3 disks
in RAID-4 for WAL/backup and 9 disks in RAID-DP for data, with 2 hot spares. We
are also considering a similar solution from EMC, the CX310C.

B: Go for internal or DAS-based storage. Here for each server we should be
able to have: 2x disks in RAID-1 for logs, 6x disks in RAID-10 for
tablespace1 and 6x disks in RAID-10 for tablespace2. Or maybe 12x disks in
a single RAID-10 tablespace.

What do I think? Well..
SAN wins on manageability, replication (say to a DR site), backup, etc...
DAS wins on cost

But keeping these aside for a moment, I wanted to discuss which one wins
purely on performance. It feels like internal disks will
perform better, but I need to understand the rough magnitude of the difference
in performance to see if it's worth losing the manageability features.

Also, if we choose to go with DAS, what would be the best tool for async
replication to a DR site, and maybe even, as an extra plus, to a second
read-only DB server to distribute select loads?

Regards,
Azad

  #2 (permalink)  
Old 04-19-2008, 11:29 AM
Scott Marlowe
 
Posts: n/a
Default Re: SAN vs Internal Disks

On 9/6/07, Harsh Azad <harsh.azad@gmail.com> wrote:
> Hi,
>
> We are currently running our DB on a DualCore, Dual Proc 3.Ghz Xeon, 8GB
> RAM, 4x SAS 146 GB 15K RPM on RAID 5.
>
> The current data size is about 50GB, but we want to purchase the hardware to
> scale to about 1TB as we think our business will need to support that much
> soon.
> - Currently we have a 80% read and 20% write percentages.


For this type of load, you should be running on RAID 10, not RAID 5. Or, if
you must use RAID 5, use more disks and have a battery-backed caching
RAID controller known to perform well with RAID 5 and large arrays.
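
To put a rough number on the RAID 10 vs RAID 5 advice above, here is a minimal back-of-envelope sketch. It relies on the usual rule-of-thumb write penalties (about 4 back-end I/Os per random write on RAID 5, 2 on RAID 10) and on an assumed figure of roughly 180 random IOPS per 15K disk. Neither number comes from this thread, and controller cache can change the picture considerably, so treat the output as an estimate only.

```python
# Rule-of-thumb back-end disk I/Os per logical random write (assumption,
# ignores controller write-back cache and full-stripe writes).
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def effective_iops(disks, per_disk_iops, read_fraction, level):
    """Estimate host-visible random IOPS for a given read/write mix."""
    penalty = WRITE_PENALTY[level]
    write_fraction = 1.0 - read_fraction
    raw = disks * per_disk_iops
    # Each logical read costs one back-end I/O, each logical write `penalty`.
    return raw / (read_fraction + write_fraction * penalty)


if __name__ == "__main__":
    # 4 disks and the 80/20 mix are from the original post;
    # 180 IOPS per disk is an assumed value for a 15K RPM drive.
    for level in ("raid5", "raid10"):
        est = effective_iops(disks=4, per_disk_iops=180,
                             read_fraction=0.8, level=level)
        print(f"{level}: ~{est:.0f} random IOPS")
```

With these assumed inputs the estimate comes out at roughly 450 IOPS for RAID 5 against roughly 600 for RAID 10 on the same four disks, which is the kind of gap the advice is pointing at.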

> - Currently with this configuration the Database is showing signs of
> over-loading.


On I/O or CPU? If you're running out of CPU, then look to increasing
CPU horsepower and tuning postgresql.
If I/O then you need to look into a faster I/O subsystem.

> - Autovacuum etc. run on this database; VACUUM FULL runs nightly.


Generally speaking, if you need to run VACUUM FULL, you're doing
something wrong. Is there a reason you're running VACUUM FULL, or is
this just precautionary? VACUUM FULL can bloat your indexes, so you
shouldn't run it regularly. Reindexing might be a better choice if
you do need to regularly shrink your db. The better option is to
monitor your FSM (free space map) usage and adjust FSM / autovacuum
settings as necessary.
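
One way to monitor FSM usage is to look at the summary that a database-wide VACUUM VERBOSE prints at the end on the 8.x series and compare the required page slots with the configured limit (max_fsm_pages). The sketch below only parses text you feed it. The exact wording of the summary is quoted from memory and should be checked against your own server's output, and the function names are made up for illustration.

```python
import re


def fsm_slots(vacuum_verbose_tail):
    """Return (required_slots, max_slots) parsed from a VACUUM VERBOSE summary.

    Assumes lines shaped like:
      "34592 page slots are required to track all free space."
      "Current limits are:  204800 page slots, 1000 relations, using 1265 kB."
    """
    required = re.search(r"(\d+) page slots are required", vacuum_verbose_tail)
    limit = re.search(r"Current limits are:\s+(\d+) page slots",
                      vacuum_verbose_tail)
    if not (required and limit):
        raise ValueError("FSM summary not found in the supplied text")
    return int(required.group(1)), int(limit.group(1))


def fsm_too_small(vacuum_verbose_tail):
    """True when more page slots are needed than max_fsm_pages allows."""
    required, limit = fsm_slots(vacuum_verbose_tail)
    return required > limit
```

If fsm_too_small() returns True, free space is going untracked and tables will keep bloating no matter how often plain VACUUM runs, which is usually the real reason people reach for a nightly VACUUM FULL.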

> - Currently CPU loads are about 20%, memory utilization is full (but this
> is also due to linux caching disk blocks) and IO waits are frequent.
> - We have a load of about 400 queries per second


What do vmstat et al. say about CPU versus I/O wait?
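
If you'd rather log this than eyeball vmstat, the same information can be read from the aggregate "cpu" line of /proc/stat on Linux. A minimal sketch follows: the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the documented /proc/stat layout, while the helper names and the 5-second sampling interval are arbitrary choices made here.

```python
import time


def cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into jiffy counters."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    # Sum user..steal only; guest time is already counted inside user time.
    total = sum(deltas[:8])
    return 100.0 * deltas[4] / total if total else 0.0


def read_cpu_line():
    """Return the first line of /proc/stat (Linux only)."""
    with open("/proc/stat") as f:
        return f.readline()


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)  # arbitrary sampling interval
    print(f"iowait: {iowait_percent(first, read_cpu_line()):.1f}%")
```

A consistently high iowait share alongside about 20% CPU load would point at the disks rather than the processors.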

> Now we are considering purchasing our own servers and in the process are
> facing the usual dilemmas. First I'll list the machine we have decided
> to use:
> 2x Quad Xeon 2.4 Ghz (4-way only 2 populated right now)
> 32 GB RAM
> OS Only storage - 2x SCSI 146 GB 15k RPM on RAID-1
> (Data Storage mentioned below)
>
> We have already decided to split our database across 3 machines on the basis
> of disjoint sets of data, so we will be purchasing three of these boxes.
>
> HELP 1: Does anything look wrong with the above configuration? I know there
> will be small differences between Opteron and Xeon, but do you think there is
> anything against going for 2.4 GHz quad-core Xeons (Clovertown, I think)?


Looks like good machines, plenty of memory.

> HELP 2: The main confusion is with regards to Data Storage. We have the
> option of going for:
>
> A: IBM N-3700 SAN box with 12x FC 300GB disks, partitioned as 3 disks
> in RAID-4 for WAL/backup and 9 disks in RAID-DP for data, with 2 hot spares. We
> are also considering a similar solution from EMC, the CX310C.
>
> B: Go for internal or DAS-based storage. Here for each server we should be
> able to have: 2x disks in RAID-1 for logs, 6x disks in RAID-10 for
> tablespace1 and 6x disks in RAID-10 for tablespace2. Or maybe 12x disks in
> a single RAID-10 tablespace.
>
> What do I think? Well..
> SAN wins on manageability, replication (say to a DR site), backup, etc...
> DAS wins on cost


The problem with SAN is that it's apparently very easy to build a big
expensive system that performs poorly. We've seen reports of such
here on the lists a few times. I would definitely demand an
evaluation period from your supplier to make sure it performs well if
you go SAN.

> But keeping these aside for a moment, I wanted to discuss which one wins
> purely on performance. It feels like internal disks will
> perform better, but I need to understand the rough magnitude of the difference
> in performance to see if it's worth losing the manageability features.


That really, really, really depends. The quality of the RAID controllers
for either setup is very important, as is the driver support, etc.
All things being equal, I'd lean towards the local storage.

> Also, if we choose to go with DAS, what would be the best tool for async
> replication to a DR site, and maybe even, as an extra plus, to a second
> read-only DB server to distribute select loads?


Look at Slony, or PITR with continuous recovery. Of those two, I've
only used Slony in production, and I was very happy with its
performance, and it was very easy to write a bash script to monitor
the replication for failures.
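
The monitoring script mentioned above isn't shown in the thread. As a rough idea of what such a check could look like (written in Python here rather than bash), the sketch below flags subscribers whose lag exceeds a threshold. It assumes the rows come from Slony-I's sl_status view (columns st_received, st_lag_num_events, st_lag_time) in the cluster schema. The thresholds and the function name are hypothetical, and the database query itself is left out.

```python
from datetime import timedelta


def lagging_subscribers(rows, max_events=100, max_lag=timedelta(minutes=5)):
    """Return ids of subscriber nodes whose replication lag looks unhealthy.

    rows: iterable of (st_received, st_lag_num_events, st_lag_time) tuples,
    e.g. fetched with something like
      SELECT st_received, st_lag_num_events, st_lag_time
      FROM _mycluster.sl_status;   -- "_mycluster" is a placeholder schema
    st_lag_time is expected as a datetime.timedelta.
    """
    return [node for node, events, lag in rows
            if events > max_events or lag > max_lag]


if __name__ == "__main__":
    # Made-up sample rows, purely to show the call.
    sample = [
        (2, 3, timedelta(seconds=10)),
        (3, 500, timedelta(seconds=20)),
    ]
    behind = lagging_subscribers(sample)
    if behind:
        print("replication lagging on nodes:", behind)
```

Run from cron and wired to mail or a pager, a check like this covers the "monitor the replication for failures" part; sensible thresholds depend entirely on your write volume.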

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

  #3 (permalink)  
Old 04-19-2008, 11:29 AM
Harsh Azad
 
Posts: n/a
Default Re: SAN vs Internal Disks

Thanks Mark.

If I replicate a snapshot of data and log files (basically the entire PG
data directory) and I maintain the same version of Postgres on both servers, it
should work, right?

I am also thinking that having SAN storage will give me the facility of
keeping a warm standby DB. By just shutting one server down and starting the
other with the same file system mounted, I should be able to bring my DB up when
the primary incurs a physical failure.

I'm only considering SAN storage for this feature. Has anyone ever used SAN
for replication and warm standby on Postgres?

Regards,
Harsh

--
Harsh Azad
=======================
Harsh.Azad@gmail.com

  #4 (permalink)  
Old 04-19-2008, 11:29 AM
Harsh Azad
 
Posts: n/a
Default Re: SAN vs Internal Disks

Thanks Scott, we have now asked IBM/EMC to provide test machines.
Interestingly, since you mentioned the importance of RAID controllers and
drivers: we are planning to use CentOS 5 for hosting the DB.

Firstly, I could only find a Postgres 8.1.x RPM for CentOS 5 and could not find
any RPM for 8.2.4. Is there an 8.2.4 RPM for CentOS 5?

Secondly, would investing in Red Hat Enterprise Linux give any
performance advantage? I know all the SAN boxes are only certified on RHEL
and not CentOS. Or, since CentOS is similar to RHEL, would it be fine?

Regards,
Harsh

--
Harsh Azad
=======================
Harsh.Azad@gmail.com

  #5 (permalink)  
Old 04-19-2008, 11:29 AM
Mark Lewis
 
Posts: n/a
Default Re: SAN vs Internal Disks

On Thu, 2007-09-06 at 18:05 +0530, Harsh Azad wrote:
> Also, if we choose to go with DAS, what would be the best tool for
> async replication to a DR site, and maybe even, as an extra plus, to a
> second read-only DB server to distribute select loads?


Sounds like a good candidate for Slony replication for backups /
read-only slaves.

I haven't seen a SAN yet whose DR / replication facilities are on par
with a good database replication solution. My impression is that those
facilities are mostly for file servers, mail servers, etc. It would be
difficult for a SAN to properly replicate a database given the strict
ordering, size and consistency requirements for the data files. Not
impossible, but in my limited experience I haven't found one that I
trust to do it reliably either, vendor boastings to the contrary
notwithstanding. (Hint: make sure you know exactly what your vendor's
definition of the term 'snapshot' really means).

So before you invest in a SAN, make sure that you're actually going to
be able to (and want to) use all the nice management features you're
paying for. We have some SANs that are basically acting as nothing more
than expensive external RAID arrays because we do the database
replication/backup in software anyway.

-- Mark Lewis



  #6 (permalink)  
Old 04-19-2008, 11:29 AM
Joshua D. Drake
 
Posts: n/a
Default Re: SAN vs Internal Disks

Harsh Azad wrote:
> Thanks Scott, we have now asked IBM/EMC to provide test machines.
> Interestingly, since you mentioned the importance of RAID controllers and
> drivers: we are planning to use CentOS 5 for hosting the DB.
>
> Firstly, I could only find postgres 8.1.x RPM for CentOS 5, could not find
> any RPM for 8.2.4. Is there any 8.2.4 RPM for CentOS 5?


Look under the RHEL section of ftp.postgresql.org

Joshua D. Drake

  #7 (permalink)  
Old 04-19-2008, 11:29 AM
Arjen van der Meijden
 
Posts: n/a
Default Re: SAN vs Internal Disks

On 6-9-2007 14:35 Harsh Azad wrote:
> 2x Quad Xeon 2.4 Ghz (4-way only 2 populated right now)


I don't understand this sentence. You seem to imply you might be able to
fit more processors in your system?
Currently the only quad-core Xeons you can buy are dual-processor (DP)
parts, unless you already got a quote for a system that uses the
new Intel "Tigerton" processors.
I.e. if they are Clovertowns they are indeed Intel Core-architecture
processors, but you won't be able to fit more than 2 in the system, for a
maximum of 8 cores.
If they are Tigerton, I'm a bit surprised you got a quote for that,
although HP seems to offer a system for those. If they are the old
dual-core MPs (70xx or 71xx), you don't want those...

> 32 GB RAM
> OS Only storage - 2x SCSI 146 GB 15k RPM on RAID-1
> (Data Storage mentioned below)


I doubt you need 15k-rpm drives for OS... But that won't matter much on
the total cost.

> HELP 1: Does anything look wrong with the above configuration? I know there
> will be small differences between Opteron and Xeon, but do you think there is
> anything against going for 2.4 GHz quad-core Xeons (Clovertown, I think)?


Apart from your implication that you may be able to stick more
processors in it: no, not to me. Two quad-core Xeons were even faster
than 8 dual-core Opterons in our benchmarks, although that might also
indicate limited OS, Postgres or underlying I/O scaling.
Obviously the new AMD Barcelona line of processors (coming next week
or so) and the new Intel quad-core DP (Penryn?) and MP (Tigerton) parts may
be interesting to look at. I don't know how soon systems will be
available with those processors (HP seems to offer a Tigerton server).

> B: Go for internal or DAS-based storage. Here for each server we should
> be able to have: 2x disks in RAID-1 for logs, 6x disks in RAID-10 for
> tablespace1 and 6x disks in RAID-10 for tablespace2. Or maybe 12x disks
> in a single RAID-10 tablespace.


You don't necessarily need to use internal disks for DAS, since you can
also attach an external SAS enclosure, either with or without an integrated
RAID controller (IBM, Sun, Dell, HP and others have options for that),
and those can be expanded, with multiple enclosures either chained to
each other or connected to a controller in the server.
Those may also be usable in a warm-standby scenario and may be quite a
bit cheaper than FC hardware.

> But keeping these aside for a moment, I wanted to discuss which one wins
> purely on performance. It feels like internal disks
> will perform better, but I need to understand the rough magnitude of the
> difference in performance to see if it's worth losing the manageability
> features.


As said, you don't necessarily need real internal disks, since SAS can
be used with external enclosures as well and is still DAS. I have no
idea what difference you will or may see between those in terms of
performance. It probably largely depends on the RAID controller
available; afaik the disks will be mostly the same. It might also depend
on your available bandwidth: external SAS offers you a 4-port connection,
allowing for a 12 Gbit connection between a disk enclosure and a
controller, while, as I understand it, even expensive SAN controllers
only offer dual-ported, 8 Gbit connections?
What's probably more important is the number of disks and the amount of
RAID cache you can buy in the SAN vs DAS scenario. If you can buy 24 disks
when going for DAS vs only 12 with SAN...
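
To make the bandwidth comparison concrete, here is a small sketch of the arithmetic. The 4 x 3 Gbit SAS and 2 x 4 Gbit FC figures follow the numbers above. The conversion of 10 wire bits per payload byte assumes 8b/10b line coding, and the 80 MB/s sequential rate per disk is an assumed value, so the results are ballpark only.

```python
def link_mb_per_s(lanes, gbit_per_lane, wire_bits_per_byte=10):
    """Approximate usable MB/s of a multi-lane serial link.

    wire_bits_per_byte=10 assumes 8b/10b encoding (assumption).
    """
    return lanes * gbit_per_lane * 1000 / wire_bits_per_byte


def sequential_ceiling(disks, mb_per_disk, link_mb):
    """Sequential throughput is capped by the disks or the link, whichever is lower."""
    return min(disks * mb_per_disk, link_mb)


if __name__ == "__main__":
    sas = link_mb_per_s(lanes=4, gbit_per_lane=3)  # external 4-lane SAS
    fc = link_mb_per_s(lanes=2, gbit_per_lane=4)   # dual-ported 4 Gbit FC
    print(f"SAS link ~{sas:.0f} MB/s, FC link ~{fc:.0f} MB/s")
    # 80 MB/s per disk is an assumed sequential rate for a 15K drive.
    print("24 disks on SAS:", sequential_ceiling(24, 80, sas), "MB/s")
    print("12 disks on FC: ", sequential_ceiling(12, 80, fc), "MB/s")
```

Under these assumptions the links work out to about 1200 MB/s and 800 MB/s, and with 12 or more disks the link becomes the ceiling for large sequential scans. For a random-access OLTP load the disk count and controller cache are more likely to matter than link speed.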

But then again, I'm no real storage expert, we only have two Dell MD1000
DAS-units at our site.

Best regards and good luck,

Arjen

  #8 (permalink)  
Old 04-19-2008, 11:29 AM
Scott Marlowe
 
Posts: n/a
Default Re: SAN vs Internal Disks

On 9/6/07, Harsh Azad <harsh.azad@gmail.com> wrote:
> Thanks Scott, we have now requested IBM/EMC to provide test machines.
> Interestingly since you mentioned the importance of Raid controllers and the
> drivers; we are planning to use Cent OS 5 for hosting the DB.


What RAID controllers have you looked at? The two most popular in terms
of performance here seem to have been Areca and 3Ware / Escalade.
LSI seems to come in a pretty close third. Adaptec is to be avoided,
as are cheap RAID controllers (e.g. Promise etc.). Battery-backed
cache is a must, and the bigger the better.

> Firstly, I could only find postgres 8.1.x RPM for CentOS 5, could not find
> any RPM for 8.2.4. Is there any 8.2.4 RPM for CentOS 5?
>
> Secondly, would investing into Redhat enterprise edition give any
> performance advantage? I know all the SAN boxes are only certified on RHEL
> and not CentOS. Or since CentOS is similar to RHEL it would be fine?


For all intents and purposes, CentOS and RHEL are the same OS, so any
pgsql RPM for one should pretty much work for the other. At worst,
you might have to get an SRPM and rebuild it for CentOS / White
Box.

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

  #9 (permalink)  
Old 04-19-2008, 11:29 AM
Harsh Azad
 
Posts: n/a
Default Re: SAN vs Internal Disks

Hi,

How about the Dell PERC 5/i card with 512MB battery-backed cache, or the
IBM ServeRAID-8k adapter?

I hope I am sending relevant information here, I am not too well versed with
RAID controllers.

Regards,
Harsh

On 9/6/07, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>
> On 9/6/07, Harsh Azad <harsh.azad@gmail.com> wrote:
> > Thanks Scott, we have now requested IBM/EMC to provide test machines.
> > Interestingly since you mentioned the importance of Raid controllers and
> > the drivers; we are planning to use Cent OS 5 for hosting the DB.
>
> What RAID controllers have you looked at. Seems the two most popular
> in terms of performance here have been Areca and 3Ware / Escalade.
> LSI seems to come in a pretty close third. Adaptec is to be avoided
> as are cheap RAID controllers (i.e. promise etc...) battery backed
> cache is a must, and the bigger the better.
>
> > Firstly, I could only find postgres 8.1.x RPM for CentOS 5, could not
> > find any RPM for 8.2.4. Is there any 8.2.4 RPM for CentOS 5?
> >
> > Secondly, would investing into Redhat enterprise edition give any
> > performance advantage? I know all the SAN boxes are only certified on
> > RHEL and not CentOS. Or since CentOS is similar to RHEL it would be fine?
>
> for all intents and purposes, CentOS and RHEL are the same OS, so any
> pgsql rpm for one should pretty much work for the other. At the
> worst, you might have to get a srpm and rebuild it for CentOS / White
> Box.
>

--
Harsh Azad
=======================
Harsh.Azad@gmail.com

  #10 (permalink)  
Old 04-19-2008, 11:29 AM
Mark Lewis
 
Posts: n/a
Default Re: SAN vs Internal Disks

On Thu, 2007-09-06 at 22:28 +0530, Harsh Azad wrote:
> Thanks Mark.
>
> If I replicate a snapshot of Data and log files (basically the entire
> PG data directory) and I maintain same version of postgres on both
> servers, it should work right?
>
> I am also thinking that having SAN storage will provide me with
> facility of keeping a warm standby DB. By just shutting one server
> down and starting the other mounting the same File system I should be
> able to bring my DB up when the primary incurs a physical failure.
>
> I'm only considering SAN storage for this feature - has anyone ever
> used SAN for replication and warm standby on Postgres?
>
> Regards,
> Harsh



We used to use a SAN for warm standby of a database, but with Oracle and
not PG. It worked kinda sorta, except for occasional crashes due to
buggy drivers.

But after going through the exercise, we realized that we hadn't gained
anything over just doing master/slave replication between two servers,
except that it was more expensive, had a tendency to expose buggy
drivers, had a single point of failure in the SAN array, failover took
longer, and we couldn't use the warm standby server to perform read-only
queries. So we reverted and just used the SAN as expensive DAS and
set up a separate box for DB replication.

So if that's the only reason you're considering a SAN, then I'd advise
you to spend the extra money on more DAS disks.
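
For the PostgreSQL side of the question, a warm standby does not require shared storage: the primary can ship completed WAL segments to the standby through the `archive_command` setting, which substitutes `%p` with the path of the segment to archive. Below is a minimal sketch of a helper such a command could call. The function name and the archive directory are hypothetical, and this is not a tested production script.

```python
# Sketch of a WAL-archiving helper for log shipping to a warm standby.
# ASSUMPTIONS: archive_dir is a directory the standby can read (for example
# a network mount); the function name and layout are illustrative only.
import shutil
from pathlib import Path


def archive_wal_segment(segment_path: str, archive_dir: str) -> bool:
    """Copy one WAL segment into the archive without overwriting.

    Returns True if the segment was archived, False if a file with the
    same name already exists (an existing archive file must not be replaced).
    """
    src = Path(segment_path)
    dest = Path(archive_dir) / src.name
    if dest.exists():
        return False
    # Copy under a temporary name first so the standby never picks up a
    # partially written segment, then rename into place.
    tmp = dest.with_name(dest.name + ".tmp")
    shutil.copy2(src, tmp)
    tmp.rename(dest)
    return True
```

A wrapper script around this would need to exit with status zero only when the function returns True, since PostgreSQL treats a zero exit status from `archive_command` as confirmation that the segment is safely archived.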

Maybe I'm jaded by past experiences, but the only real use case I can
see to justify a SAN for a database would be something like Oracle RAC,
but I'm not aware of any PG equivalent to that.

-- Mark Lewis

