Unix Technical Forum

Bad iostat numbers

  #21 (permalink)  
Old 04-19-2008, 09:50 AM
Alex Turner
 
Posts: n/a
Default Re: Bad iostat numbers

The problem I see with software raid is the issue of a battery backed unit:
If the computer loses power, then the 'cache' which is held in system
memory, goes away, and fubars your RAID.

Alex

On 12/5/06, Michael Stone <mstone+postgres@mathom.us> wrote:
>
> On Tue, Dec 05, 2006 at 01:21:38AM -0500, Alex Turner wrote:
> >My other and most important point is that I can't find any solid
> >recommendations for a SCSI card that can perform optimally in Linux or
> >*BSD. Off by a factor of 3x is pretty sad IMHO. (and yes, we know the
> >Adaptec cards suck worse, that doesn't bring us to a _good_ card).

>
> This gets back to my point about terminology. As a SCSI HBA the Adaptec
> is decent: I can sustain about 300MB/s off a single channel of the
> 39320A using an external RAID controller. As a RAID controller I can't
> even imagine using the Adaptec; I'm fairly certain they put that
> "functionality" on there just so they could charge more for the card. It
> may be that there's not much market for on-board SCSI RAID controllers;
> between SATA on the low end and SAS & FC on the high end, there isn't a
> whole lotta space left for SCSI. I definitely don't think much
> R&D is going into SCSI controllers any more, compared to other solutions
> like SATA or SAS RAID (the 39320 hasn't changed in at least 3 years,
> IIRC). Anyway, since the Adaptec part is a decent SCSI controller and a
> lousy RAID controller, have you tried just using software RAID?
>
> Mike Stone
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>


  #22 (permalink)  
Old 04-19-2008, 09:50 AM
Craig A. James
 
Posts: n/a
Default Re: Bad iostat numbers

Alex Turner wrote:
> The problem I see with software raid is the issue of a battery backed
> unit: If the computer loses power, then the 'cache' which is held in
> system memory, goes away, and fubars your RAID.


I'm not sure I see the difference. If data are cached, they're not written whether it is software or hardware RAID. I guess if you're writing RAID 1, the N disks could be out of sync, but the system can synchronize them once the array is restored, so that's no different than a single disk or a hardware RAID. If you're writing RAID 5, then the blocks are inherently error detecting/correcting, so you're still OK if a partial write occurs, right?
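
To make the RAID 5 part of that question concrete, here is a toy sketch of XOR parity. It is only an illustration with made-up block contents and a hypothetical helper name, not how any particular RAID driver is implemented. Parity lets an array rebuild a block from a disk that is known to have failed, but by itself it does not say which block is stale if a data block was rewritten and power was lost before the matching parity update:

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Two data blocks and their parity block (toy values).
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)

# A block on a disk known to have failed can be rebuilt from the survivors.
assert xor_blocks(d1, parity) == d0

# Partial write: d0 is rewritten, but power is lost before parity is updated.
d0_new = b"CCCC"
stale_parity = parity

# Rebuilding d1 from the new d0 and the stale parity no longer gives d1.
rebuilt_d1 = xor_blocks(d0_new, stale_parity)
print("d1 recovered correctly:", rebuilt_d1 == d1)
```

Whether a given software or hardware RAID closes this window (for example with a resync after an unclean shutdown) depends on the implementation.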

I'm not familiar with the inner details of software RAID, but the only circumstance I can see where things would get corrupted is if the RAID driver writes a LOT of blocks to one disk of the array before synchronizing the others, but my guess (and it's just a guess) is that the writes to the N disks are tightly coupled.

If I'm wrong about this, I'd like to know, because I'm using software RAID 1 and 1+0, and I'm pretty happy with it.

Craig

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

  #23 (permalink)  
Old 04-19-2008, 09:50 AM
Michael Stone
 
Posts: n/a
Default Re: Bad iostat numbers

On Tue, Dec 05, 2006 at 07:57:43AM -0500, Alex Turner wrote:
>The problem I see with software raid is the issue of a battery backed unit:
>If the computer loses power, then the 'cache' which is held in system
>memory, goes away, and fubars your RAID.


Since the Adaptec doesn't have a BBU, it's a lateral move. Also, this is
less an issue of data integrity than performance; you can get exactly
the same level of integrity, you just have to wait for the data to sync
to disk. If you're read-mostly that's irrelevant.

Mike Stone

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

  #24 (permalink)  
Old 04-19-2008, 09:50 AM
Greg Smith
 
Posts: n/a
Default Re: Bad iostat numbers

On Tue, 5 Dec 2006, Craig A. James wrote:

> I'm not familiar with the inner details of software RAID, but the only
> circumstance I can see where things would get corrupted is if the RAID driver
> writes a LOT of blocks to one disk of the array before synchronizing the
> others...


You're talking about whether the discs in the RAID are kept consistent.
While it's helpful with that, too, that's not the main reason a
battery-backed cache is so helpful. When PostgreSQL writes to the WAL, it
waits until that data has really been placed on the drive before it enters
that update into the database. In a normal situation, that means that you
have to pause until the disk has physically written the blocks out, and
that puts a fairly low upper limit on write performance that's based on
how fast your drives rotate. RAID 0, RAID 1, none of that will speed up
the time it takes to complete a single synchronized WAL write.
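
As a rough worked example of that rotational limit, the sketch below turns a spindle speed into a ceiling on synchronized commits. It assumes a single client, one WAL flush per commit, at most one flush per platter rotation, and no write cache anywhere in the path; the function name is made up for illustration:

```python
def max_sync_commits_per_second(rpm: float) -> float:
    """Upper bound on synced commits/s if each commit waits for one rotation."""
    return rpm / 60.0  # rotations per second


for rpm in (7200, 10000, 15000):
    bound = max_sync_commits_per_second(rpm)
    print(f"{rpm:>5} RPM: at most about {bound:.0f} synced commits/s per client")
```

Real numbers will differ with group commit, multiple concurrent clients, or drives that acknowledge writes from their own cache.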

When your controller has a battery-backed cache, it can immediately tell
Postgres that the WAL write completed successfully, while actually putting
it on the disk later. On my systems, this results in simple writes going
2-4X as fast as they do without a cache. Should there be a PC failure, as
long as power is restored before the battery runs out that transaction
will be preserved.
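
One hedged way to see this effect on your own hardware is to time small synchronized writes with and without the cache enabled. The sketch below is a crude stand-in for a WAL flush, not a PostgreSQL benchmark; the function name, the 8192-byte block size, and the write count are assumptions chosen for illustration:

```python
import os
import tempfile
import time


def average_fsync_seconds(directory: str, writes: int = 50,
                          block_size: int = 8192) -> float:
    """Append small blocks to a scratch file, fsync after each one,
    and return the mean time per write+fsync in seconds."""
    payload = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            os.fsync(fd)  # block until the OS reports the data as flushed
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return elapsed / writes


if __name__ == "__main__":
    seconds = average_fsync_seconds(tempfile.gettempdir())
    print(f"mean write+fsync: {seconds * 1000:.3f} ms")
```

Point `directory` at the volume you want to test; the temp directory used here may sit on a memory-backed filesystem where fsync costs almost nothing. A drive with its own volatile write cache enabled can also report completion early, which would make the numbers look better than they are safe.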

What Alex is rightly pointing out is that a software RAID approach doesn't
have this feature. In fact, in this area performance can be even worse
under SW RAID than what you get from a single disk, because you may have
to wait for multiple discs to spin to the correct position and write data
out before you can consider the transaction complete.
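
A back-of-the-envelope sketch of that last point: if each disk's rotational delay is modeled as independent and uniformly distributed over one rotation (an assumption, along with ignoring seek time and any caching), the expected wait for the slowest of N disks is N/(N+1) of a rotation, versus half a rotation for one disk. The function names are hypothetical:

```python
import random


def expected_wait_ms(rpm: float, disks: int) -> float:
    """Closed form: expected maximum of `disks` uniform rotational delays."""
    rotation_ms = 60_000.0 / rpm
    return rotation_ms * disks / (disks + 1)


def simulated_wait_ms(rpm: float, disks: int, trials: int = 100_000,
                      seed: int = 0) -> float:
    """Monte Carlo check of the closed form above."""
    rng = random.Random(seed)
    rotation_ms = 60_000.0 / rpm
    total = 0.0
    for _ in range(trials):
        # The write is complete only when the slowest disk has finished.
        total += max(rng.uniform(0.0, rotation_ms) for _ in range(disks))
    return total / trials


for n in (1, 2, 4):
    print(f"{n} disk(s) at 7200 RPM: formula {expected_wait_ms(7200, n):.2f} ms, "
          f"simulated {simulated_wait_ms(7200, n):.2f} ms")
```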

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

  #25 (permalink)  
Old 04-19-2008, 09:50 AM
Steve Atkins
 
Posts: n/a
Default Re: Bad iostat numbers


On Dec 5, 2006, at 8:54 PM, Greg Smith wrote:

> On Tue, 5 Dec 2006, Craig A. James wrote:
>
>> I'm not familiar with the inner details of software RAID, but the
>> only circumstance I can see where things would get corrupted is if
>> the RAID driver writes a LOT of blocks to one disk of the array
>> before synchronizing the others...

>
> You're talking about whether the discs in the RAID are kept
> consistent. While it's helpful with that, too, that's not the main
> reason a battery-backed cache is so helpful. When PostgreSQL
> writes to the WAL, it waits until that data has really been placed
> on the drive before it enters that update into the database. In a
> normal situation, that means that you have to pause until the disk
> has physically written the blocks out, and that puts a fairly low
> upper limit on write performance that's based on how fast your
> drives rotate. RAID 0, RAID 1, none of that will speed up the time
> it takes to complete a single synchronized WAL write.
>
> When your controller has a battery-backed cache, it can immediately
> tell Postgres that the WAL write completed successfully, while
> actually putting it on the disk later. On my systems, this results
> in simple writes going 2-4X as fast as they do without a cache.
> Should there be a PC failure, as long as power is restored before
> the battery runs out that transaction will be preserved.
>
> What Alex is rightly pointing out is that a software RAID approach
> doesn't have this feature. In fact, in this area performance can
> be even worse under SW RAID than what you get from a single disk,
> because you may have to wait for multiple discs to spin to the
> correct position and write data out before you can consider the
> transaction complete.


So... the ideal might be a RAID1 controller with BBU for the WAL and
something else, such as software RAID, for the main data array?

Cheers,
Steve


