Re: Lots of "semop" calls under load (Pgsql Performance forum, PostgreSQL category)
"Albe Laurenz" <laurenz.albe@wien.gv.at> writes:
> On a database (PostgreSQL 8.2.4 on 64-bit Linux 2.6.18 on 8 AMD Opterons)
> that is under high load, I observe the following:
> ...
> - "vmstat" shows that CPU time is divided between "idle" and "iowait",
>   with user and sys time practically zero.
> - "sar" says that the disk with the database is on 100% of its capacity.

It sounds like you've simply saturated the disk's I/O bandwidth.
(I've noticed that Linux isn't all that good about distinguishing "idle"
from "iowait" --- more than likely you're really looking at
100% iowait.)

> Storage is on a SAN box.

What kind of SAN box? You're going to need something pretty beefy to
keep all those CPUs busy.

> What puzzles me is the "strace -tt" output from that backend:

Some low level of contention and consequent semops/context switches
is to be expected. I don't think you need to worry if it's only
100/sec. The sort of "context swap storm" behavior we've seen in
the past is in the tens of thousands of swaps/sec on hardware
much weaker than what you have here --- if you were seeing one of
those I bet you'd be well above 100000 swaps/sec.

> Are the lseek and read operations really that fast although the disk is on 100%?

lseek is (should be) cheap ... it doesn't do any actual I/O. The
read()s you're showing here were probably satisfied from kernel disk
cache. If you look at a larger sample you'll find slower ones,
I think. Another thing to look for is slow writes.

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
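
To put a number on the "only 100/sec" threshold mentioned above, one could count the semop calls in a saved "strace -tt" capture and divide by the time it covers. The following is only a rough sketch: the function name syscall_rate and the SAMPLE lines are made up for illustration (they are not the output from the original post), and it assumes the capture comes from a single process, so each line starts with an HH:MM:SS.microseconds timestamp, and does not cross midnight.

```python
import re
from datetime import datetime

# Start of a "strace -tt" line: wall-clock time, then the syscall name.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6})\s+(\w+)\(")


def syscall_rate(lines, name):
    """Return calls per second of syscall `name` over the span covered by `lines`."""
    stamps = []
    hits = 0
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip signal lines, resumed calls, blank lines, etc.
        stamps.append(datetime.strptime(m.group(1), "%H:%M:%S.%f"))
        if m.group(2) == name:
            hits += 1
    if len(stamps) < 2:
        return 0.0
    # Assumption: the capture does not wrap past midnight.
    span = (stamps[-1] - stamps[0]).total_seconds()
    return hits / span if span > 0 else 0.0


# Hypothetical sample lines, invented for this sketch.
SAMPLE = [
    "10:00:00.000000 semop(1, 0x7ffd0, 1) = 0",
    "10:00:00.250000 lseek(5, 0, SEEK_END) = 8192",
    "10:00:00.500000 read(5, \"\"..., 8192) = 8192",
    "10:00:01.000000 semop(1, 0x7ffd0, 1) = 0",
    "10:00:02.000000 semop(1, 0x7ffd0, 1) = 0",
]

if __name__ == "__main__":
    print(syscall_rate(SAMPLE, "semop"))
```

A result in the low hundreds per backend would match the "nothing to worry about" case described above; the storm behavior would show up as a rate several orders of magnitude higher.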

Tom Lane wrote:
>> On a database (PostgreSQL 8.2.4 on 64-bit Linux 2.6.18 on 8 AMD Opterons)
>> that is under high load, I observe the following: ...
>> - "vmstat" shows that CPU time is divided between "idle" and "iowait",
>>   with user and sys time practically zero.
>> - "sar" says that the disk with the database is on 100% of its capacity.
>
> It sounds like you've simply saturated the disk's I/O bandwidth.
> (I've noticed that Linux isn't all that good about distinguishing "idle"
> from "iowait" --- more than likely you're really looking at
> 100% iowait.)
>
>> Storage is on a SAN box.
>
> What kind of SAN box? You're going to need something pretty beefy to
> keep all those CPUs busy.

HP EVA 8100. Our storage people think that the observed I/O rate is not ok.
They mutter something about kernel disk cache configuration.

>> What puzzles me is the "strace -tt" output from that backend:
>
> I don't think you need to worry [...]

Thanks for explaining the strace output.

I am now more confident that the I/O overload is not the fault of PostgreSQL.
Most execution plans look as good as they can be, so it's probably either
the I/O system or the application that's at fault.

Yours,
Laurenz Albe
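
For the vmstat observation quoted above, the same split between "idle" and "iowait" can be derived from two readings of the aggregate "cpu" line in /proc/stat, which may help when comparing notes with the storage people. This is a minimal sketch, not anything from the thread: the helper names are invented, the two sample lines are made-up numbers, and only the first seven counters (user, nice, system, idle, iowait, irq, softirq) are used.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))


def cpu_shares(before, after):
    """Return the percentage of CPU time per state between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    delta = {k: b[k] - a[k] for k in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {k: 100.0 * v / total for k, v in delta.items()}


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.readline()


# Hypothetical samples, invented for this sketch.
BEFORE = "cpu 100 0 50 1000 500 0 0"
AFTER = "cpu 110 0 55 1400 1085 0 0"

if __name__ == "__main__":
    shares = cpu_shares(BEFORE, AFTER)
    # Per the remark above, idle and iowait are best read together.
    print(shares["idle"] + shares["iowait"])
```

On a live system, calling read_cpu_line() twice a few seconds apart gives the two inputs. If user and system stay near zero while idle plus iowait is close to 100%, that is consistent with backends waiting on the disk rather than burning CPU.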