Unix Technical Forum

Re: Postgresql Performance on an HP DL385 and

  #31 (permalink)  
Old 04-19-2008, 09:16 AM
Tom Lane
 

mark@mark.mielke.cc writes:
> WAL file is never appended - only re-written?


> If so, then I'm wrong, and ext2 is fine. The requirement is that no
> file system structures change as a result of any writes that
> PostgreSQL does. If no file system structures change, then I take
> everything back as uninformed.


That risk certainly exists in the general data directory, but AFAIK
it's not a problem for pg_xlog.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

  #32 (permalink)  
Old 04-19-2008, 09:16 AM
Michael Stone
 

On Tue, Aug 15, 2006 at 02:15:05PM -0500, Jim C. Nasby wrote:
>Now, if
>fsync'ing a file also ensures that all the metadata is written, then
>we're probably fine...


...and it does. Unclean shutdowns cause problems in general because
filesystems operate asynchronously. postgres (and other similar
programs) go to great lengths to make sure that critical operations are
performed synchronously. If the program *doesn't* do that, metadata
journaling isn't a magic wand which will guarantee data integrity--it
won't. If the program *does* do that, all the metadata journaling adds
is the ability to skip fsck and start up faster.
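
As a rough illustration of what "performed synchronously" means in practice, here is a minimal Python sketch. It is not how postgres itself is written; the function name write_durably and the file name are made up for the example. The only point is that the call does not return until fsync() has reported success, so the caller never treats unsynced data as committed.

```python
import os
import tempfile


def write_durably(path: str, payload: bytes) -> None:
    """Write payload to path and return only after fsync() has succeeded.

    Hypothetical helper for illustration; not part of any real library.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "commit.rec")
        write_durably(target, b"commit record\n")
        print(os.path.getsize(target))
```

A program that skips the os.fsync() call gets no integrity guarantee from a journaling filesystem, which is the distinction drawn above.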

Mike Stone

  #33 (permalink)  
Old 04-19-2008, 09:16 AM
Michael Stone
 

On Tue, Aug 15, 2006 at 03:39:51PM -0400, mark@mark.mielke.cc wrote:
>No. This is not true. Updating the file system structure (inodes, indirect
>blocks) touches a separate part of the disk than the actual data. If
>the file system structure is modified, say, to extend a file to allow
>it to contain more data, but the data itself is not written, then upon
>a restore, with a system such as ext2, or ext3 with writeback, or xfs,
>it is possible that the end of the file, even the postgres log file,
>will contain a random block of data from the disk. If this random block
>of data happens to look like a valid xlog block, it may be played back,
>and the database corrupted.


you're conflating a whole lot of different issues here. You're ignoring
the fact that postgres preallocates the xlog segment, you're ignoring
the fact that you can sync a directory entry, you're ignoring the fact
that syncing some metadata (such as atime) doesn't matter (only the
block allocation is important in this case, and the blocks are
pre-allocated).
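
The preallocate-then-overwrite pattern described here can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the file layout, and the 16 MB default are chosen for the example and are not taken from the postgres source. The idea is that the segment is zero-filled and fsync'd once, its directory entry is fsync'd, and later writes stay strictly inside the already-allocated range so no block allocation changes.

```python
import os

# Assumed segment size for illustration only.
SEGMENT_SIZE = 16 * 1024 * 1024


def preallocate_segment(path, size=SEGMENT_SIZE, chunk=1024 * 1024):
    """Create a zero-filled file of `size` bytes and make it durable."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        zeros = bytes(min(chunk, size))
        remaining = size
        while remaining > 0:
            n = os.write(fd, zeros[:min(len(zeros), remaining)])
            remaining -= n
        os.fsync(fd)  # flush the data blocks and the block allocation
    finally:
        os.close(fd)
    # Sync the directory so the new directory entry is durable too.
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def overwrite_in_place(path, offset, record):
    """Write `record` inside the preallocated range; refuse to extend the file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        if offset < 0 or offset + len(record) > os.fstat(fd).st_size:
            raise ValueError("write would fall outside the preallocated range")
        view, pos = memoryview(record), offset
        while view:
            n = os.pwrite(fd, view, pos)
            view, pos = view[n:], pos + n
        os.fsync(fd)
    finally:
        os.close(fd)
```

Because overwrite_in_place never changes the file size, the only metadata that can change afterwards is of the timestamp kind, which does not matter for recovery.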

>This is also wrong. fsck is needed because the file system is broken.


nope, the file system *may* be broken. the dirty flag simply indicates
that the filesystem needs to be checked to find out whether or not it is
broken.

>I don't mean to be offensive, but I won't accept what you say, as it does
>not make sense with my understanding of how file systems work. :-)


<shrug> I'm not getting paid to convince you of anything.

Mike Stone

  #34 (permalink)  
Old 04-19-2008, 09:16 AM
mark@mark.mielke.cc
 

On Tue, Aug 15, 2006 at 04:58:59PM -0400, Michael Stone wrote:
> On Tue, Aug 15, 2006 at 03:39:51PM -0400, mark@mark.mielke.cc wrote:
> >No. This is not true. Updating the file system structure (inodes, indirect
> >blocks) touches a separate part of the disk than the actual data. If
> >the file system structure is modified, say, to extend a file to allow
> >it to contain more data, but the data itself is not written, then upon
> >a restore, with a system such as ext2, or ext3 with writeback, or xfs,
> >it is possible that the end of the file, even the postgres log file,
> >will contain a random block of data from the disk. If this random block
> >of data happens to look like a valid xlog block, it may be played back,
> >and the database corrupted.

> you're conflating a whole lot of different issues here. You're ignoring
> the fact that postgres preallocates the xlog segment, you're ignoring
> the fact that you can sync a directory entry, you're ignoring the fact
> that syncing some metadata (such as atime) doesn't matter (only the
> block allocation is important in this case, and the blocks are
> pre-allocated).


Yes, no, no, no. :-)

I didn't know that the xlog segment only uses pre-allocated space. I
ignore mtime/atime as they don't count as file system structure
changes to me. It's updating a field in place. No change to the structure.

With the pre-allocation knowledge, I agree with you. Not sure how I
missed that in my reviewing of the archives... I did know it
pre-allocated once upon a time... Hmm....

> >This is also wrong. fsck is needed because the file system is broken.

> nope, the file system *may* be broken. the dirty flag simply indicates
> that the filesystem needs to be checked to find out whether or not it is
> broken.


Ah, but if we knew it wasn't broken, then fsck wouldn't be needed, now
would it? So we assume that it is broken. A little bit of a game, but
it is important to me. If I assumed the file system was not broken, I
wouldn't run fsck. I run fsck, because I assume it may be broken. If
broken, it indicates potential corruption.

The difference for me is that if you are correct, that the xlog is
safe, then for a disk that only uses xlog, fsck is not ever necessary,
even after a system crash. If fsck is necessary, then there is potential
for a problem.

With the pre-allocation knowledge, I'm tempted to agree with you that
fsck is not ever necessary for partitions that only hold a properly
pre-allocated xlog.

> >I don't mean to be offensive, but I won't accept what you say, as it does
> >not make sense with my understanding of how file systems work. :-)

> <shrug> I'm not getting paid to convince you of anything.


Just getting you to back up your claim a bit... As I said, no intent
to offend. I learned from it.

Thanks,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com
Neighbourhood Coder
Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/


  #35 (permalink)  
Old 04-19-2008, 09:16 AM
Jim C. Nasby
 

On Tue, Aug 15, 2006 at 05:38:43PM -0400, mark@mark.mielke.cc wrote:
> I didn't know that the xlog segment only uses pre-allocated space. I
> ignore mtime/atime as they don't count as file system structure
> changes to me. It's updating a field in place. No change to the structure.
>
> With the pre-allocation knowledge, I agree with you. Not sure how I
> missed that in my reviewing of the archives... I did know it
> pre-allocated once upon a time... Hmm....


This is only valid if the pre-allocation is also fsync'd *and* fsync
ensures that both the metadata and file data are on disk. Anyone
actually checked that?

BTW, I did see some anecdotal evidence on one of the lists a while ago.
A PostgreSQL DBA had suggested doing a 'pull the power cord' test to the
other DBAs (all of whom were responsible for different RDBMSes,
including a bunch of well-known names). They all thought he was off his
rocker. Not too long after that, an unplanned power outage did occur,
and PostgreSQL was the only RDBMS that recovered every single database
without intervention.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

  #36 (permalink)  
Old 04-19-2008, 09:16 AM
Steinar H. Gunderson
 

On Tue, Aug 15, 2006 at 05:20:25PM -0500, Jim C. Nasby wrote:
> This is only valid if the pre-allocation is also fsync'd *and* fsync
> ensures that both the metadata and file data are on disk. Anyone
> actually checked that?


fsync() does that, yes. fdatasync() (if it exists), OTOH, doesn't sync the
metadata.
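
A small Python sketch of the two calls, for comparison. The helper name sync_fd is invented for the example. One caveat worth stating: as far as POSIX describes it, fdatasync() may skip only metadata that is not needed to read the data back (timestamps, for instance), so a size change from extending a file is normally still flushed. The "if it exists" part shows up here as a hasattr() check, since os.fdatasync is not available on every platform.

```python
import os
import tempfile


def sync_fd(fd: int, data_only: bool = False) -> str:
    """Flush fd to stable storage and return the name of the call used.

    Hypothetical helper: prefers fdatasync() when data_only is requested
    and the platform provides it, otherwise falls back to fsync().
    """
    if data_only and hasattr(os, "fdatasync"):
        os.fdatasync(fd)  # data plus only the metadata needed to read it back
        return "fdatasync"
    os.fsync(fd)  # data plus all of the file's metadata
    return "fsync"


if __name__ == "__main__":
    with tempfile.TemporaryFile() as handle:
        handle.write(b"record")
        handle.flush()  # move Python's buffer into the kernel first
        print(sync_fd(handle.fileno()))
        print(sync_fd(handle.fileno(), data_only=True))
```

For a preallocated file that is only overwritten in place, the two calls should be equivalent for recovery purposes, because nothing but timestamps changes.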

/* Steinar */
--
Homepage: http://www.sesse.net/

  #37 (permalink)  
Old 04-19-2008, 09:16 AM
David Lang
 

On Tue, 15 Aug 2006 mark@mark.mielke.cc wrote:
>>> This is also wrong. fsck is needed because the file system is broken.

>> nope, the file system *may* be broken. the dirty flag simply indicates
>> that the filesystem needs to be checked to find out whether or not it is
>> broken.

>
> Ah, but if we knew it wasn't broken, then fsck wouldn't be needed, now
> would it? So we assume that it is broken. A little bit of a game, but
> it is important to me. If I assumed the file system was not broken, I
> wouldn't run fsck. I run fsck, because I assume it may be broken. If
> broken, it indicates potential corruption.


note that the ext3, reiserfs, jfs, and xfs developers (at least) consider
fsck necessary even for journaling filesystems. they just let you get away
without it being mandatory after an unclean shutdown.

David Lang

  #38 (permalink)  
Old 04-19-2008, 09:16 AM
Tom Lane
 

"Steinar H. Gunderson" <sgunderson@bigfoot.com> writes:
> On Tue, Aug 15, 2006 at 05:20:25PM -0500, Jim C. Nasby wrote:
>> This is only valid if the pre-allocation is also fsync'd *and* fsync
>> ensures that both the metadata and file data are on disk. Anyone
>> actually checked that?


> fsync() does that, yes. fdatasync() (if it exists), OTOH, doesn't sync the
> metadata.


Well, the POSIX spec says that fsync should do that ;-)

My guess is that most/all kernel filesystem layers do indeed try to sync
everything that the spec says they should. The Achilles' heel of the
whole business is disk drives that lie about write completion. The
kernel is just as vulnerable to that as any application ...
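
One crude way to spot a drive that may be lying is to time repeated fsync() calls on the same block, sketched below in Python. The function names and the iteration count are made up for the example, and the rule of thumb is an assumption, not a proof: a single rotating disk with no write cache cannot usually complete much more than one such sync per rotation (about 120 per second at 7200 rpm). A measured rate far above that suggests a cache is acknowledging writes early, although battery-backed controllers and SSDs exceed it legitimately.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory=None, iterations=200):
    """Rewrite one 512-byte block and fsync it repeatedly; return syncs/sec.

    `directory` should live on the disk under test; the default temp
    directory may be a RAM-backed filesystem, which makes the result
    meaningless.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"x" * 512
        start = time.perf_counter()
        for _ in range(iterations):
            os.pwrite(fd, block, 0)
            os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return iterations / elapsed if elapsed > 0 else float("inf")


def max_plausible_rate(rpm):
    """Rule-of-thumb ceiling: one sync per rotation of an uncached disk."""
    return rpm / 60.0


if __name__ == "__main__":
    rate = fsyncs_per_second()
    print(f"{rate:.0f} fsyncs/sec, ceiling for 7200 rpm: {max_plausible_rate(7200):.0f}")
```

A timing test like this only raises suspicion; a pull-the-plug test is the way to confirm that acknowledged writes actually survive.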

regards, tom lane

  #39 (permalink)  
Old 04-19-2008, 09:17 AM
Markus Schaber
 

Hi, Jim,

Jim C. Nasby wrote:

> Well, if the controller is caching with a BBU, I'm not sure that order
> matters anymore, because the controller should be able to re-order at
> will. Theoretically. But this is why having some actual data posted
> somewhere would be great.


Well, actually, the controller should not reorder over write barriers.


Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
