Unix Technical Forum

strange query behavior

This is a discussion on strange query behavior within the Pgsql Performance forums, part of the PostgreSQL category; --> Hi, I have a query that uses an IN clause and it seems in perform great when there is ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 09:55 AM
Tim Jones
 
Posts: n/a
Default strange query behavior

Hi,

I have a query that uses an IN clause and it seems in perform great
when there is more than two values in it but if there is only one it is
really slow. Also if I change the query to use an = instead of IN in the
case of only one value it is still slow. Possibly I need to reindex this
particular index?

thanks

Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 09:55 AM
Ragnar
 
Posts: n/a
Default Re: strange query behavior

On miš, 2006-12-13 at 13:42 -0500, Tim Jones wrote:

> I have a query that uses an IN clause and it seems in perform great
> when there is more than two values in it but if there is only one it is
> really slow. Also if I change the query to use an = instead of IN in the
> case of only one value it is still slow. Possibly I need to reindex this
> particular index?


can you provide us with an EXPLAIN ANALYZE for these 2 cases?

what version pg is this?

does this happen only for a particular single value, or for any values?

I assume you have ANALYZEd the table in question.

gnari



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 09:55 AM
Tom Lane
 
Posts: n/a
Default Re: strange query behavior

"Tim Jones" <TJones@optio.com> writes:
> I have a query that uses an IN clause and it seems in perform great
> when there is more than two values in it but if there is only one it is
> really slow. Also if I change the query to use an = instead of IN in the
> case of only one value it is still slow.


Please provide EXPLAIN ANALYZE output for both cases.

> Possibly I need to reindex this
> particular index?


More likely you need to ANALYZE the table so that the planner has
up-to-date stats ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 09:55 AM
Tim Jones
 
Posts: n/a
Default Re: strange query behavior

The tables for theses queries are vacuumed and analyzed regularly. I
just did an analyze to be sure and here are the results


explain analyze select * from battery join observationresults on
battery.batteryidentifier = observationresults.batteryidentifier left
outer join observationcomment on
observationresults.observationidentifier =
observationcomment.observationidentifier left outer join batterycomment
on battery.batteryidentifier=batterycomment.batteryid entifier where
battery.batteryidentifier in (1177470, 1177469);

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
---------------
Nested Loop Left Join (cost=5.03..12553.00 rows=13 width=248) (actual
time=0.362..1.345 rows=30 loops=1)
-> Nested Loop Left Join (cost=4.01..12424.13 rows=13 width=208)
(actual time=0.307..0.927 rows=30 loops=1)
-> Nested Loop (cost=4.01..9410.49 rows=13 width=145) (actual
time=0.227..0.416 rows=30 loops=1)
-> Bitmap Heap Scan on battery (cost=4.01..11.64 rows=2
width=69) (actual time=0.135..0.138 rows=2 loops=1)
Recheck Cond: ((batteryidentifier = 1177470) OR
(batteryidentifier = 1177469))
-> BitmapOr (cost=4.01..4.01 rows=2 width=0)
(actual time=0.106..0.106 rows=0 loops=1)
-> Bitmap Index Scan on ix_battery_id
(cost=0.00..2.00 rows=1 width=0) (actual time=0.091..0.091 rows=1
loops=1)
Index Cond: (batteryidentifier =
1177470)
-> Bitmap Index Scan on ix_battery_id
(cost=0.00..2.00 rows=1 width=0) (actual time=0.011..0.011 rows=1
loops=1)
Index Cond: (batteryidentifier =
1177469)
-> Index Scan using ix_obresults_bat on
observationresults (cost=0.00..4682.40 rows=1362 width=76) (actual
time=0.047..0.091 rows=15 loops=2)
Index Cond: ("outer".batteryidentifier =
observationresults.batteryidentifier)
-> Index Scan using ix_obcomment_obid on observationcomment
(cost=0.00..227.73 rows=327 width=63) (actual time=0.013..0.013 rows=0
loops=30)
Index Cond: ("outer".observationidentifier =
observationcomment.observationidentifier)
-> Bitmap Heap Scan on batterycomment (cost=1.02..9.84 rows=6
width=40) (actual time=0.007..0.007 rows=0 loops=30)
Recheck Cond: ("outer".batteryidentifier =
batterycomment.batteryidentifier)
-> Bitmap Index Scan on ix_batcomment (cost=0.00..1.02 rows=6
width=0) (actual time=0.005..0.005 rows=0 loops=30)
Index Cond: ("outer".batteryidentifier =
batterycomment.batteryidentifier)
Total runtime: 1.585 ms


explain analyze select * from battery join observationresults on
battery.batteryidentifier = observationresults.batteryidentifier left
outer join observationcomment on
observationresults.observationidentifier =
observationcomment.observationidentifier left outer join batterycomment
on battery.batteryidentifier=batterycomment.batteryid entifier where
battery.batteryidentifier = 1177470;

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------
Hash Left Join (cost=4733.62..269304.43 rows=1348 width=248) (actual
time=19275.506..19275.568 rows=9 loops=1)
Hash Cond: ("outer".batteryidentifier = "inner".batteryidentifier)
-> Merge Right Join (cost=4723.75..269287.81 rows=1348 width=208)
(actual time=19275.432..19275.473 rows=9 loops=1)
Merge Cond: ("outer".observationidentifier =
"inner".observationidentifier)
-> Index Scan using ix_obcomment_obid on observationcomment
(cost=0.00..245841.14 rows=7484253 width=63) (actual
time=0.094..13403.300 rows=4361601 loops=1)
-> Sort (cost=4723.75..4727.12 rows=1348 width=145) (actual
time=0.270..0.278 rows=9 loops=1)
Sort Key: observationresults.observationidentifier
-> Nested Loop (cost=0.00..4653.67 rows=1348 width=145)
(actual time=0.166..0.215 rows=9 loops=1)
-> Index Scan using ix_battery_id on battery
(cost=0.00..5.81 rows=1 width=69) (actual time=0.079..0.082 rows=1
loops=1)
Index Cond: (batteryidentifier = 1177470)
-> Index Scan using ix_obresults_bat on
observationresults (cost=0.00..4634.38 rows=1348 width=76) (actual
time=0.079..0.102 rows=9 loops=1)
Index Cond: (1177470 = batteryidentifier)
-> Hash (cost=9.85..9.85 rows=6 width=40) (actual time=0.039..0.039
rows=0 loops=1)
-> Bitmap Heap Scan on batterycomment (cost=1.02..9.85 rows=6
width=40) (actual time=0.037..0.037 rows=0 loops=1)
Recheck Cond: (batteryidentifier = 1177470)
-> Bitmap Index Scan on ix_batcomment (cost=0.00..1.02
rows=6 width=0) (actual time=0.032..0.032 rows=0 loops=1)
Index Cond: (batteryidentifier = 1177470)
Total runtime: 19275.838 ms
(18 rows)


Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555
-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, December 13, 2006 2:17 PM
To: Tim Jones
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] strange query behavior

"Tim Jones" <TJones@optio.com> writes:
> I have a query that uses an IN clause and it seems in perform great
> when there is more than two values in it but if there is only one it
> is really slow. Also if I change the query to use an = instead of IN
> in the case of only one value it is still slow.


Please provide EXPLAIN ANALYZE output for both cases.

> Possibly I need to reindex this
> particular index?


More likely you need to ANALYZE the table so that the planner has
up-to-date stats ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 09:55 AM
Tom Lane
 
Posts: n/a
Default Re: strange query behavior

"Tim Jones" <TJones@optio.com> writes:
> The tables for theses queries are vacuumed and analyzed regularly. I
> just did an analyze to be sure and here are the results
> ...


There's something pretty wacko about the choice of plan in the slow case
--- I don't see why it'd not have used the same plan structure as for
the IN case. It's coming up with a cost a lot higher than for the
other, so it certainly knows this isn't a great plan ...

Which PG version is this exactly? Are you running with any nondefault
planner parameters?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 09:55 AM
Tim Jones
 
Posts: n/a
Default Re: strange query behavior

Version 8.1

Here are the planner constraints I believe we changed
effective_cache_size and random_page_cost
BTW this is an AIX 5.2

#-----------------------------------------------------------------------
----
# QUERY TUNING
#-----------------------------------------------------------------------
----

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#effective_cache_size = 10000 # typically 8KB each
effective_cache_size = 400000
random_page_cost = 3.8 # units are one sequential page fetch
# cost
#cpu_tuple_cost = 0.01 # (same)
#cpu_index_tuple_cost = 0.001 # (same)
#cpu_operator_cost = 0.0025 # (same)

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5 # range 1-10
#geqo_pool_size = 0 # selects default based on effort
#geqo_generations = 0 # selects default based on effort
#geqo_selection_bias = 2.0 # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10 # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8 # 1 disables collapsing of explicit
# JOINs


Thanks

Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, December 13, 2006 4:59 PM
To: Tim Jones
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] strange query behavior

"Tim Jones" <TJones@optio.com> writes:
> The tables for theses queries are vacuumed and analyzed regularly. I
> just did an analyze to be sure and here are the results ...


There's something pretty wacko about the choice of plan in the slow case
--- I don't see why it'd not have used the same plan structure as for
the IN case. It's coming up with a cost a lot higher than for the
other, so it certainly knows this isn't a great plan ...

Which PG version is this exactly? Are you running with any nondefault
planner parameters?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 09:56 AM
Tom Lane
 
Posts: n/a
Default Re: strange query behavior

"Tim Jones" <TJones@optio.com> writes:
>> Which PG version is this exactly? Are you running with any nondefault
>> planner parameters?


> Version 8.1


8.1.what?

> Here are the planner constraints I believe we changed
> effective_cache_size and random_page_cost


Those look reasonably harmless.

My best bet at the moment is that you've got a pretty early 8.1.x
release and are hitting one of the planner bugs that we fixed earlier
this year. Not enough info to say for sure though.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-19-2008, 09:56 AM
Tim Jones
 
Posts: n/a
Default Re: strange query behavior

Looks like 8.1.2

Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, December 13, 2006 5:37 PM
To: Tim Jones
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] strange query behavior

"Tim Jones" <TJones@optio.com> writes:
>> Which PG version is this exactly? Are you running with any
>> nondefault planner parameters?


> Version 8.1


8.1.what?

> Here are the planner constraints I believe we changed
> effective_cache_size and random_page_cost


Those look reasonably harmless.

My best bet at the moment is that you've got a pretty early 8.1.x
release and are hitting one of the planner bugs that we fixed earlier
this year. Not enough info to say for sure though.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-19-2008, 09:56 AM
Tim Jones
 
Posts: n/a
Default Re: strange query behavior

That's what I did and got 8.1.2 ... do you want gcc version etc 3.3.2
powerpc aix5.2

Tim Jones
Healthcare Project Manager
Optio Software, Inc.
(770) 576-3555

-----Original Message-----
From: Matthew O'Connor [mailto:matthew@zeut.net]
Sent: Wednesday, December 13, 2006 5:51 PM
To: Tim Jones
Subject: Re: [PERFORM] strange query behavior

From psql perform: select version();
and send us that output.

Tim Jones wrote:
> Looks like 8.1.2
>
> Tim Jones
> Healthcare Project Manager
> Optio Software, Inc.
> (770) 576-3555
>
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Wednesday, December 13, 2006 5:37 PM
> To: Tim Jones
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] strange query behavior
>
> "Tim Jones" <TJones@optio.com> writes:
>>> Which PG version is this exactly? Are you running with any
>>> nondefault planner parameters?

>
>> Version 8.1

>
> 8.1.what?
>
>> Here are the planner constraints I believe we changed
>> effective_cache_size and random_page_cost

>
> Those look reasonably harmless.
>
> My best bet at the moment is that you've got a pretty early 8.1.x
> release and are hitting one of the planner bugs that we fixed earlier
> this year. Not enough info to say for sure though.
>
> regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>




---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-19-2008, 09:56 AM
Tom Lane
 
Posts: n/a
Default Re: strange query behavior

"Tim Jones" <TJones@optio.com> writes:
> [ explain results ]


As best I can see, the problem is with the estimate of the size of the
inner join: for two keys we have

-> Nested Loop (cost=4.01..9410.49 rows=13 width=145) (actual time=0.227..0.416 rows=30 loops=1)
-> Bitmap Heap Scan on battery (cost=4.01..11.64 rows=2 width=69) (actual time=0.135..0.138 rows=2 loops=1)
Recheck Cond: ((batteryidentifier = 1177470) OR (batteryidentifier = 1177469))
-> BitmapOr (cost=4.01..4.01 rows=2 width=0) (actual time=0.106..0.106 rows=0 loops=1)
-> Bitmap Index Scan on ix_battery_id (cost=0.00..2.00 rows=1 width=0) (actual time=0.091..0.091 rows=1 loops=1)
Index Cond: (batteryidentifier = 1177470)
-> Bitmap Index Scan on ix_battery_id (cost=0.00..2.00 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (batteryidentifier = 1177469)
-> Index Scan using ix_obresults_bat on observationresults (cost=0.00..4682.40 rows=1362 width=76) (actual time=0.047..0.091 rows=15 loops=2)
Index Cond: ("outer".batteryidentifier = observationresults.batteryidentifier)

but for one key we have

-> Nested Loop (cost=0.00..4653.67 rows=1348 width=145) (actual time=0.166..0.215 rows=9 loops=1)
-> Index Scan using ix_battery_id on battery (cost=0.00..5.81 rows=1 width=69) (actual time=0.079..0.082 rows=1 loops=1)
Index Cond: (batteryidentifier = 1177470)
-> Index Scan using ix_obresults_bat on observationresults (cost=0.00..4634.38 rows=1348 width=76) (actual time=0.079..0.102 rows=9 loops=1)
Index Cond: (1177470 = batteryidentifier)

The large rowcount estimate makes it back off to a non-nestloop
plan for the outer joins, and in this situation that's a loser.

I'm actually not sure why they're not both too high --- with the
rowcount estimate of 1362 for the inner scan in the first example, you'd
expect about twice that for the join result. But the immediate problem
is that in the case where it knows exactly what batteryidentifier is
being probed for, it's still off by more than a factor of 100 on the
rowcount estimate for observationresults. How many rows in
observationresults, and may we see the pg_stats entry for
observationresults.batteryidentifier?

It's likely that the answer for you will be "raise the statistics target
for observationresults and re-ANALYZE", but I'd like to gather more info
about what's going wrong first.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:31 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com