This is a discussion on EXPLAIN ANALYZE on 8.2 within the Pgsql Performance forums, part of the PostgreSQL category; --> Hi, everybody! Running the same query on pg 8.2 through EXPLAIN ANALYZE takes 4x-10x time as running it without ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| Hi, everybody! Running the same query on pg 8.2 through EXPLAIN ANALYZE takes 4x-10x time as running it without it. Is it ok? Example: testing=> select count(*) from auth_user; count --------- 2575675 (1 row) Time: 1450,829 ms testing=> explain analyze select count(*) from auth_user; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=89814.87..89814.88 rows=1 width=0) (actual time=18460.436..18460.439 rows=1 loops=1) -> Seq Scan on auth_user (cost=0.00..83373.89 rows=2576389 width=0) (actual time=0.424..9871.520 rows=2575675 loops=1) Total runtime: 18460.535 ms (3 rows) Time: 18461,194 ms ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Evgeny Gridasov <eugrid@fpm.kubsu.ru> writes: > Running the same query on pg 8.2 through EXPLAIN ANALYZE takes 4x-10x time as running it without it. If your machine has slow gettimeofday() this is not surprising. 8.2 is no worse (or better) than any prior version. Some quick arithmetic from your results suggests that gettimeofday() is taking about 3.3 microseconds, which is indeed pretty awful. What sort of machine is this exactly? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| Evgeny Gridasov <eugrid@fpm.kubsu.ru> writes: > This is a Linux Debian 3.1 ontop of 2x XEON 3.4 Ghz. > PostgreSQL is 8.2 checked out from CVS REL8_2_STABLE yesterday. > I'm running the same Postgres on another machine, > with Debian Etch and have the same results. Hmph. With 8.2 on Fedora 5 on a 2.8Ghz dual Xeon, I get this: regression=# create table foo as select x from generate_series(1,2500000) x; SELECT regression=# vacuum foo; VACUUM regression=# checkpoint; CHECKPOINT regression=# \timing Timing is on. regression=# select count(*) from foo; count --------- 2500000 (1 row) Time: 666.639 ms regression=# select count(*) from foo; count --------- 2500000 (1 row) Time: 609.514 ms regression=# explain analyze select count(*) from foo; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Aggregate (cost=44764.00..44764.01 rows=1 width=0) (actual time=1344.812..1344.813 rows=1 loops=1) -> Seq Scan on foo (cost=0.00..38514.00 rows=2500000 width=0) (actual time=0.031..748.571 rows=2500000 loops=1) Total runtime: 1344.891 ms (3 rows) Time: 1345.755 ms regression=# explain analyze select count(*) from foo; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Aggregate (cost=44764.00..44764.01 rows=1 width=0) (actual time=1324.846..1324.847 rows=1 loops=1) -> Seq Scan on foo (cost=0.00..38514.00 rows=2500000 width=0) (actual time=0.046..748.582 rows=2500000 loops=1) Total runtime: 1324.902 ms (3 rows) Time: 1325.591 ms regression=# which works out to about 0.14 microsec per gettimeofday call, on a machine that ought to be slower than yours. So I think you've got either a crummy motherboard, or a kernel that doesn't know the best way to read the clock on that hardware. There is some discussion of this in the archives (probably in pgsql-hackers); look back around May or so when we were unsuccessfully trying to hack EXPLAIN to use fewer gettimeofday calls. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Tom, Hello. This is a Linux Debian 3.1 ontop of 2x XEON 3.4 Ghz. PostgreSQL is 8.2 checked out from CVS REL8_2_STABLE yesterday. I'm running the same Postgres on another machine, with Debian Etch and have the same results. On Thu, 14 Dec 2006 11:11:42 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote: > Evgeny Gridasov <eugrid@fpm.kubsu.ru> writes: > > Running the same query on pg 8.2 through EXPLAIN ANALYZE takes 4x-10x time as running it without it. > > If your machine has slow gettimeofday() this is not surprising. 8.2 is > no worse (or better) than any prior version. > > Some quick arithmetic from your results suggests that gettimeofday() is > taking about 3.3 microseconds, which is indeed pretty awful. What sort > of machine is this exactly? > > regards, tom lane > ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| On 12/14/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Evgeny Gridasov <eugrid@fpm.kubsu.ru> writes: > > This is a Linux Debian 3.1 ontop of 2x XEON 3.4 Ghz. > > PostgreSQL is 8.2 checked out from CVS REL8_2_STABLE yesterday. > > I'm running the same Postgres on another machine, > > with Debian Etch and have the same results. > > Hmph. With 8.2 on Fedora 5 on a 2.8Ghz dual Xeon, I get this: > <snip> > regression=# explain analyze select count(*) from foo; > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=44764.00..44764.01 rows=1 width=0) (actual time=1324.846..1324.847 rows=1 loops=1) > -> Seq Scan on foo (cost=0.00..38514.00 rows=2500000 width=0) (actual time=0.046..748.582 rows=2500000 loops=1) > Total runtime: 1324.902 ms > (3 rows) > > Time: 1325.591 ms > regression=# > > which works out to about 0.14 microsec per gettimeofday call, on a > machine that ought to be slower than yours. So I think you've got > either a crummy motherboard, or a kernel that doesn't know the best > way to read the clock on that hardware. There is some discussion > of this in the archives (probably in pgsql-hackers); look back around > May or so when we were unsuccessfully trying to hack EXPLAIN to use > fewer gettimeofday calls. Yow! I notice the same thing on our HP BL25p blades w/2*opteron 270 (four total cores, AMD 8111 or 8131 chipset). 1.25 microsec/call vs my new desktop (Intel Core2 6300) 0.16 microsec/call. I hope this isn't a "crummy mainboard" but I can't seem to affect things by changing clock source (kernel 2.6.16 SLES10). I tried kernel command option clock=XXX where XXX in (cyclone,hpet,pmtmr,tsc,pit), no option was significantly better than the default. Anyone know how this might be improved (short of replacing hardware)? -K ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| Kelly Burkhart wrote: > > I hope this isn't a "crummy mainboard" but I can't seem to affect > things by changing clock source (kernel 2.6.16 SLES10). I tried > kernel command option clock=XXX where XXX in > (cyclone,hpet,pmtmr,tsc,pit), no option was significantly better than > the default. > > Anyone know how this might be improved (short of replacing hardware)? > Updating the BIOS might be worth investigating, and then bugging your Linux distro mailing list/support etc for more help. (What sort of motherboard is it?) Cheers Mark ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| "Kelly Burkhart" <kelly.burkhart@gmail.com> writes: > I hope this isn't a "crummy mainboard" but I can't seem to affect > things by changing clock source (kernel 2.6.16 SLES10). I tried > kernel command option clock=XXX where XXX in > (cyclone,hpet,pmtmr,tsc,pit), no option was significantly better than > the default. I believe that on machines where gettimeofday is really nice and fast, it doesn't require entry to the kernel at all; there's some hack that makes the clock readable from userspace. (Obviously a switch to kernel mode would set you back a lot of the cycles involved here.) So it's not so much the kernel that you need to teach as glibc. How you do that is beyond my expertise, but maybe that will help you google for the right thing ... regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq |
| |||
| On Thu, 2006-12-14 at 19:05 -0500, Tom Lane wrote: > "Kelly Burkhart" <kelly.burkhart@gmail.com> writes: > > I hope this isn't a "crummy mainboard" but I can't seem to affect > > things by changing clock source (kernel 2.6.16 SLES10). I tried > > kernel command option clock=XXX where XXX in > > (cyclone,hpet,pmtmr,tsc,pit), no option was significantly better than > > the default. > > I believe that on machines where gettimeofday is really nice and fast, > it doesn't require entry to the kernel at all; there's some hack that > makes the clock readable from userspace. (Obviously a switch to kernel > mode would set you back a lot of the cycles involved here.) So it's not > so much the kernel that you need to teach as glibc. How you do that is > beyond my expertise, but maybe that will help you google for the right > thing ... Until we work out a better solution we can fix this in two ways: 1. EXPLAIN ANALYZE [ [ WITH | WITHOUT ] TIME STATISTICS ] ... 2. enable_analyze_timer = off | on (default) (USERSET) A performance drop of 4x-10x is simply unacceptable when trying to tune queries where the current untuned time is already too high. Tying down production servers for hours on end when we know for certain all they are doing is calling gettimeofday millions of times is not good. This quickly leads to the view from objective people that PostgreSQL doesn't have a great optimizer, whatever we say in its defence. I don't want to leave this alone, but I don't want to spend a month fixing it either. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Fri, Dec 15, 2006 at 12:20:46PM +0000, Simon Riggs wrote: > > I > > wrote a patch that tried statistical sampling, but the figures were too > > far off for people's liking. > > Well, I like your ideas, so if you have any more... > > Maybe sampling every 10 rows will bring things down to an acceptable > level (after the first N). You tried less than 10 didn't you? Yeah, it reduced the number of calls as the count got larger. It broke somewhere, though I don't quite remember why. > Maybe we can count how many real I/Os were required to perform each > particular row, so we can adjust the time per row based upon I/Os. ISTM > that sampling at too low a rate means we can't spot the effects of cache > and I/O which can often be low frequency but high impact. One idea is to sample at fixed interval. Where the I/O cost is high, there'll be a lot of sampling points. The issue being that the final result are not totally accurate anymore. > > I think the best option is setitimer(), but it's not POSIX so > > platform support is going to be patchy. > > Don't understand that. I thought that was to do with alarms and signals. On my system the manpage say setitimer() conforms to: SVr4, 4.4BSD. I'm unsure how many of the supported platforms fall under that catagorisation. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (GNU/Linux) iD8DBQFFgpV8IB7bNG8LQkwRAkGgAJ4uNkxcDCktVu4SrnBh+g JDMlKibwCfdh3M kL0ipnnf9nsss7T/1uEiW0s= =xcXP -----END PGP SIGNATURE----- |
| ||||
| * Simon Riggs: >> I think the best option is setitimer(), but it's not POSIX so >> platform support is going to be patchy. > > Don't understand that. I thought that was to do with alarms and > signals. You could use it for sampling. Every few milliseconds, you record which code is executing (possibly using a global variable which is set and reset accordingly). But this might lead to tons of interrupted system calls, and not all systems mask them, so this might not be an option for the PostgreSQL code base. On the other hand, if performance in more recent Linux releases (from kernel.org) are acceptable, you should assume that the problem will eventually fix itself. FWIW, I see the 9x overhead on something that is close to 2.6.17 (on AMD64/Opteron), so this could be wishful thinking. 8-( -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Kriegsstraße 100 tel: +49-721-96201-1 D-76133 Karlsruhe fax: +49-721-96201-99 ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |