Unix Technical Forum

Filesystem Direct I/O and WAL sync option

#1
04-19-2008, 11:09 AM
Dimitri
Filesystem Direct I/O and WAL sync option

All,

I'm very curious to know whether we can expect, or even guarantee, any
data consistency with WAL sync=OFF when the file system is mounted in
Direct I/O mode (meaning every write() system call issued by PG really
writes to disk before returning)...

So what data consistency can we expect:
- none?
- on a per-checkpoint basis?
- full?...

Thanks a lot for any info!

Rgds,
-Dimitri

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

#2
04-19-2008, 11:09 AM
Heikki Linnakangas
Re: Filesystem Direct I/O and WAL sync option

Dimitri wrote:
> I'm very curious to know if we may expect or guarantee any data
> consistency with WAL sync=OFF but using file system mounted in Direct
> I/O mode (means every write() system call called by PG really writes
> to disk before return)...


You'd have to turn that mode on on the data drives as well to get
consistency, because fsync=off disables checkpoint fsyncs of the data
files as well.
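
To make the point above concrete, here is a minimal Python sketch of why fsync=off affects more than the WAL. It is a simplified model, not PostgreSQL's actual checkpoint code: the function name `checkpoint` and its `fsync_enabled` parameter are hypothetical and only stand in for the fsync setting. With the flag off, the data files receive write() calls but are never explicitly flushed, so their durability depends entirely on what the filesystem does with a plain write().

```python
import os
import tempfile


def checkpoint(dirty_pages, data_dir, fsync_enabled=True):
    """Write dirty pages to their data files and optionally fsync them.

    dirty_pages maps a file name to the bytes appended to that file.
    Returns the list of file names that were explicitly fsync'ed.
    """
    synced = []
    for name, payload in dirty_pages.items():
        path = os.path.join(data_dir, name)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        try:
            os.write(fd, payload)  # may only reach the OS page cache
            if fsync_enabled:
                os.fsync(fd)  # ask the OS to push the data to stable storage
                synced.append(name)
        finally:
            os.close(fd)
    return synced


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        pages = {"table_a": b"x" * 8192, "table_b": b"y" * 8192}
        print("fsync on: ", checkpoint(pages, d, fsync_enabled=True))
        print("fsync off:", checkpoint(pages, d, fsync_enabled=False))
```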

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3
04-19-2008, 11:09 AM
Dimitri
Re: Filesystem Direct I/O and WAL sync option

Yes, the disk drives also have their cache disabled, or (in the case of
higher-end storage) have a battery-protected cache on the controllers -
but is that enough to expect data consistency?... (I was surprised about
the checkpoint sync, but doesn't it always call write() anyway? If so,
it should work without fsync)...

On 7/3/07, Heikki Linnakangas <heikki@enterprisedb.com> wrote:
> Dimitri wrote:
> > I'm very curious to know if we may expect or guarantee any data
> > consistency with WAL sync=OFF but using file system mounted in Direct
> > I/O mode (means every write() system call called by PG really writes
> > to disk before return)...

>
> You'd have to turn that mode on on the data drives as well to get
> consistency, because fsync=off disables checkpoint fsyncs of the data
> files as well.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>


#4
04-19-2008, 11:09 AM
Gregory Stark
Re: Filesystem Direct I/O and WAL sync option


"Dimitri" <dimitrik.fr@gmail.com> writes:

> Yes, disk drives are also having cache disabled or having cache on
> controllers and battery protected (in case of more high-level
> storage) - but is it enough to expect data consistency?... (I was
> surprised about checkpoint sync, but does it always calls write()
> anyway? because in this way it should work without fsync)...


Well if everything is mounted in sync mode then I suppose you have the same
guarantee as if fsync were called after every single write. If that's true
then surely that's at least as good. I'm curious how it performs though.

Actually it seems like in that configuration fsync should be basically
zero-cost. In other words, you should be able to leave fsync=on and get the
same performance (whatever that is) and not have to worry about any risks.
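
One way to check this claim on a given system is to time writes on a file opened with O_SYNC, with and without an extra fsync() after each write. What follows is only a rough sketch in Python: `time_sync_writes` is a hypothetical helper, O_SYNC on an ordinary mount merely approximates a filesystem mounted in sync or direct I/O mode, and the numbers depend entirely on the OS, filesystem and storage. If the two timings come out close, the extra fsync() is indeed nearly free in that setup.

```python
import os
import tempfile
import time


def time_sync_writes(path, extra_fsync, count=50, block=b"\0" * 8192):
    """Return seconds spent writing `count` blocks to a file opened with O_SYNC.

    When extra_fsync is True, fsync() is also called after every write.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)  # O_SYNC: returns after the data is written out
            if extra_fsync:
                os.fsync(fd)  # should have little left to flush
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sync_test.dat")
        without = time_sync_writes(path, extra_fsync=False)
        with_fsync = time_sync_writes(path, extra_fsync=True)
        print(f"O_SYNC only:    {without:.4f} s")
        print(f"O_SYNC + fsync: {with_fsync:.4f} s")
```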

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


#5
04-19-2008, 11:09 AM
Dimitri
Re: Filesystem Direct I/O and WAL sync option

Yes Gregory, that's why I'm asking: I'm jumping from 1800
transactions/sec to 2800 transactions/sec! That is a very significant
performance increase :)

Rgds,
-Dimitri

On 7/4/07, Gregory Stark <stark@enterprisedb.com> wrote:
>
> "Dimitri" <dimitrik.fr@gmail.com> writes:
>
> > Yes, disk drives are also having cache disabled or having cache on
> > controllers and battery protected (in case of more high-level
> > storage) - but is it enough to expect data consistency?... (I was
> > surprised about checkpoint sync, but does it always calls write()
> > anyway? because in this way it should work without fsync)...

>
> Well if everything is mounted in sync mode then I suppose you have the same
> guarantee as if fsync were called after every single write. If that's true
> then surely that's at least as good. I'm curious how it performs though.
>
> Actually it seems like in that configuration fsync should be basically
> zero-cost. In other words, you should be able to leave fsync=on and get the
> same performance (whatever that is) and not have to worry about any risks.
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
>
>


#6
04-19-2008, 11:09 AM
Gregory Stark
Re: Filesystem Direct I/O and WAL sync option


"Dimitri" <dimitrik.fr@gmail.com> writes:

> Yes Gregory, that's why I'm asking, because from 1800 transactions/sec
> I'm jumping to 2800 transactions/sec! and it's more than important
> performance level increase )


Wow. That's kind of suspicious though. Does the new configuration take
advantage of the lack of the filesystem cache by increasing the size of
shared_buffers? Even then I wouldn't expect such a big boost unless you got
very lucky with the size of your working set compared to the two sizes of
shared_buffers.

It seems likely that somehow this change is not providing the same guarantees
as fsync. Perhaps fsync is actually implementing IDE write barriers and the
sync mode is just flushing buffers to the hard drive cache and then returning.

What transaction rate do you get if you just have a single connection
streaming inserts in autocommit mode? What kind of transaction rate do you get
with both sync mode on and fsync=on in Postgres?

And did you say this is with a battery-backed cache? In theory fsync=on/off
shouldn't make much difference at all with a battery-backed cache. Stranger
and stranger.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


#7
04-19-2008, 11:09 AM
Dimitri
Re: Filesystem Direct I/O and WAL sync option

Gregory, thanks for the good questions! :)
They shed more light on my throughput here :)

The running OS is Solaris 9 (the customer is still not ready to upgrade
to Sol10), and I think the main "sync" issue comes from the old UFS
implementation... UFS mounted with the 'forcedirectio' option uses
different "sync" logic and also accepts concurrent writes to the same
file, which gives a higher performance level here. I really did not
expect such a big gain, so I did not think to rerun the same test with
direct I/O on and fsync=on as well. To my big surprise, it also reached
2800 tps, the same as with fsync=off!!! So the initial question is no
longer valid :)

Also, my tests are executed just to validate the server + storage
capabilities, and honestly it's a real pity to see them used under an
old Solaris version :)
But well, at least we know what kind of performance they can expect
currently, and they can think about migration before the end of this
year...

Seeing at least 10,000 random writes/sec on the storage subsystem during
a live database test was very pleasing to the customer and made them
feel comfortable about their production...

Thanks a lot for all your help!

Best regards!
-Dimitri

On 7/4/07, Gregory Stark <stark@enterprisedb.com> wrote:
>
> "Dimitri" <dimitrik.fr@gmail.com> writes:
>
> > Yes Gregory, that's why I'm asking, because from 1800 transactions/sec
> > I'm jumping to 2800 transactions/sec! and it's more than important
> > performance level increase )

>
> wow. That's kind of suspicious though. Does the new configuration take
> advantage of the lack of the filesystem cache by increasing the size of
> shared_buffers? Even then I wouldn't expect such a big boost unless you got
> very lucky with the size of your working set compared to the two sizes of
> shared_buffers.
>
> It seems likely that somehow this change is not providing the same
> guarantees
> as fsync. Perhaps fsync is actually implementing IDE write barriers and the
> sync mode is just flushing buffers to the hard drive cache and then
> returning.
>
> What transaction rate do you get if you just have a single connection
> streaming inserts in autocommit mode? What kind of transaction rate do you
> get
> with both sync mode on and fsync=on in Postgres?
>
> And did you say this with a battery backed cache? In theory fsync=on/off and
> shouldn't make much difference at all with a battery backed cache. Stranger
> and stranger.
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
>
>


#8
04-19-2008, 11:10 AM
Jim C. Nasby
Re: Filesystem Direct I/O and WAL sync option

On Tue, Jul 03, 2007 at 04:06:29PM +0100, Heikki Linnakangas wrote:
> Dimitri wrote:
> >I'm very curious to know if we may expect or guarantee any data
> >consistency with WAL sync=OFF but using file system mounted in Direct
> >I/O mode (means every write() system call called by PG really writes
> >to disk before return)...

>
> You'd have to turn that mode on on the data drives as well to get
> consistency, because fsync=off disables checkpoint fsyncs of the data
> files as well.


BTW, it might be worth trying the different wal_sync_methods. IIRC,
Jonah's seen some good results from open_datasync.
--
Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

#9
04-19-2008, 11:10 AM
Jonah H. Harris
Re: Filesystem Direct I/O and WAL sync option

On 7/9/07, Jim C. Nasby <decibel@decibel.org> wrote:
> BTW, it might be worth trying the different wal_sync_methods. IIRC,
> Jonah's seen some good results from open_datasync.


On Linux, using ext3, reiser, or jfs, I've seen open_sync perform
noticeably better than fsync/fdatasync in most of my tests. But I
haven't done significant testing with direct I/O lately.
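
For readers who want to try such a comparison themselves, below is a rough Python sketch that times a few small synchronous writes using the mechanisms behind the wal_sync_method values discussed here: fsync() and fdatasync() after each write, and files opened with O_SYNC (open_sync) or O_DSYNC (open_datasync). The helper names `_write_blocks` and `compare_sync_methods` are hypothetical, methods not offered by the platform are skipped, and timings taken from Python are only an indication, not a statement about how PostgreSQL itself will behave.

```python
import os
import tempfile
import time

BLOCK = b"\0" * 8192  # 8 kB, the default PostgreSQL WAL block size


def _write_blocks(path, open_flags, flush, count):
    """Write `count` blocks, flushing after each one if `flush` is given."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | open_flags
    fd = os.open(path, flags, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, BLOCK)
            if flush is not None:
                flush(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


def compare_sync_methods(directory, count=50):
    """Return {method name: seconds} for each sync method available here."""
    methods = {"fsync": (0, os.fsync)}
    if hasattr(os, "fdatasync"):
        methods["fdatasync"] = (0, os.fdatasync)
    if hasattr(os, "O_SYNC"):
        methods["open_sync"] = (os.O_SYNC, None)
    if hasattr(os, "O_DSYNC"):
        methods["open_datasync"] = (os.O_DSYNC, None)

    results = {}
    for name, (flags, flush) in methods.items():
        path = os.path.join(directory, name + ".dat")
        results[name] = _write_blocks(path, flags, flush, count)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, seconds in compare_sync_methods(d).items():
            print(f"{name:14s} {seconds:.4f} s")
```

Run it in a directory on the filesystem that will hold the WAL; a temporary directory on a different filesystem says little about the real storage.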

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 3rd Floor | jharris@enterprisedb.com
Iselin, New Jersey 08830 | http://www.enterprisedb.com/

#10
04-19-2008, 11:10 AM
Dimitri
Re: Filesystem Direct I/O and WAL sync option

Yes, I tried all the WAL sync methods, but there was no difference...
However, there was a huge difference when I ran the same tests under
Solaris 10: the 'fdatasync' option gave the best performance level. At
the same time, direct I/O made no difference on Solaris 10 :)

So the main rule is that there is no universal rule :)
Just adapt the system options to your workload...

Direct I/O will generally speed up write operations by avoiding buffer
flushing overhead and by allowing concurrent writes (breaking the POSIX
limitation of a single writer per file at any given time). But at the
same time it may slow down your read operations, and you may need a
64-bit PG version in order to use a big cache and keep the same
performance level on SELECT queries. And again, there are other file
systems, like QFS (for example), which may give you the best of both
worlds: direct writes and buffered reads at the same time! etc. etc. etc.

Rgds,
-Dimitri

On 7/9/07, Jonah H. Harris <jonah.harris@gmail.com> wrote:
> On 7/9/07, Jim C. Nasby <decibel@decibel.org> wrote:
> > BTW, it might be worth trying the different wal_sync_methods. IIRC,
> > Jonah's seen some good results from open_datasync.

>
> On Linux, using ext3, reiser, or jfs, I've seen open_sync perform
> quite better than fsync/fdatasync in most of my tests. But, I haven't
> done significant testing with direct I/O lately.
>
> --
> Jonah H. Harris, Software Architect | phone: 732.331.1324
> EnterpriseDB Corporation | fax: 732.331.1301
> 33 Wood Ave S, 3rd Floor | jharris@enterprisedb.com
> Iselin, New Jersey 08830 | http://www.enterprisedb.com/
>

