Unix Technical Forum

Identical Queries

This is a discussion on Identical Queries within the Pgsql Performance forums, part of the PostgreSQL category; --> Question for anyone... I tried posting to the bugs, and they said this is a better question for here. ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 10:18 AM
Rob Schall
 
Posts: n/a
Default Identical Queries

Question for anyone...

I tried posting to the bugs, and they said this is a better question for here.
I have to queries. One runs in about 2 seconds. The other takes upwards
of 2 minutes. I have a temp table that is created with 2 columns. This
table is joined with the larger database of call detail records.
However, these 2 queries are handled very differently.

The queries:
First----

calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
current.destnum=anitmp.ani AND istf=true;
QUERY PLAN
--------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2026113.09 rows=500908 width=108)
-> Seq Scan on anitmp (cost=0.00..33.62 rows=945 width=8)
Filter: (istf = true)
-> Index Scan using i_destnum on current (cost=0.00..2137.36
rows=531 width=108)
Index Cond: (current.destnum = "outer".ani)
(5 rows)

Second----
calldetail=> EXPLAIN SELECT current.* FROM current JOIN anitmp ON
current.orignum=anitmp.ani AND istf=false;
QUERY PLAN
---------------------------------------------------------------------------
Hash Join (cost=35.99..3402035.53 rows=5381529 width=108)
Hash Cond: ("outer".orignum = "inner".ani)
-> Seq Scan on current (cost=0.00..907191.05 rows=10170805 width=108)
-> Hash (cost=33.62..33.62 rows=945 width=8)
-> Seq Scan on anitmp (cost=0.00..33.62 rows=945 width=8)
Filter: (istf = false)
(6 rows)


The tables:
Table "public.current"
Column | Type | Modifiers
----------+-----------------------------+-----------
datetime | timestamp without time zone |
orignum | bigint |
destnum | bigint |
billto | bigint |
cost | numeric(6,4) |
duration | numeric(8,1) |
origcity | character(12) |
destcity | character(12) |
file | character varying(30) |
linenum | integer |
carrier | character(1) |
Indexes:
"i_destnum" btree (destnum)
"i_orignum" btree (orignum)


Table "public.anitmp"
Column | Type | Modifiers
--------+---------+-----------
ani | bigint |
istf | boolean |


I was also asked to post the EXPLAIN ANALYZE for both:

calldetail=> EXPLAIN ANALYZE SELECT current.* FROM anitmp JOIN current ON istf=false AND current.orignum=anitmp.ani;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=35.99..3427123.39 rows=5421215 width=108) (actual time=1994.164..157443.544 rows=157 loops=1)
Hash Cond: ("outer".orignum = "inner".ani)
-> Seq Scan on current (cost=0.00..913881.09 rows=10245809 width=108) (actual time=710.986..137963.320 rows=10893541 loops=1)
-> Hash (cost=33.62..33.62 rows=945 width=8) (actual time=10.948..10.948 rows=0 loops=1)
-> Seq Scan on anitmp (cost=0.00..33.62 rows=945 width=8) (actual time=10.934..10.939 rows=2 loops=1)
Filter: (istf = false)
Total runtime: 157443.900 ms
(7 rows)

calldetail=> EXPLAIN ANALYZE SELECT current.* FROM current JOIN anitmp ON current.destnum=anitmp.ani AND istf=true;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2037526.69 rows=504602 width=108) (actual time=88.752..1050.295 rows=1445 loops=1)
-> Seq Scan on anitmp (cost=0.00..33.62 rows=945 width=8) (actual time=8.189..8.202 rows=2 loops=1)
Filter: (istf = true)
-> Index Scan using i_destnum on current (cost=0.00..2149.40 rows=534 width=108) (actual time=62.365..517.454 rows=722 loops=2)
Index Cond: (current.destnum = "outer".ani)
Total runtime: 1052.862 ms
(6 rows)


Anyone have any ideas for me? I have indexes on each of the necessary
columns.

Rob



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 10:18 AM
Stephan Szabo
 
Posts: n/a
Default Re: Identical Queries

On Thu, 1 Mar 2007, Rob Schall wrote:

> Question for anyone...
>
> I tried posting to the bugs, and they said this is a better question for here.
> I have to queries. One runs in about 2 seconds. The other takes upwards
> of 2 minutes. I have a temp table that is created with 2 columns. This
> table is joined with the larger database of call detail records.
> However, these 2 queries are handled very differently.


How many rows are there in anitmp and how many rows in anitmp have
istf=true and how many have istf=false? If you don't currently analyze the
temp table after adding the rows, you might find that doing an analyze
helps, or at least makes the row estimates better.

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 10:18 AM
Rob Schall
 
Posts: n/a
Default Re: Identical Queries

There are 4 entries (wanted to make the playing field level for this
test). There are 2 with true for istf and 2 with false.

Rob


Stephan Szabo wrote:
> On Thu, 1 Mar 2007, Rob Schall wrote:
>
>
>> Question for anyone...
>>
>> I tried posting to the bugs, and they said this is a better question for here.
>> I have to queries. One runs in about 2 seconds. The other takes upwards
>> of 2 minutes. I have a temp table that is created with 2 columns. This
>> table is joined with the larger database of call detail records.
>> However, these 2 queries are handled very differently.
>>

>
> How many rows are there in anitmp and how many rows in anitmp have
> istf=true and how many have istf=false? If you don't currently analyze the
> temp table after adding the rows, you might find that doing an analyze
> helps, or at least makes the row estimates better.
>



---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 10:18 AM
Chad Wagner
 
Posts: n/a
Default Re: Identical Queries

On 3/1/07, Rob Schall <rschall@callone.net> wrote:
>
> There are 4 entries (wanted to make the playing field level for this
> test). There are 2 with true for istf and 2 with false.
>


Then the difference here has to do with using orignum vs destnum as the join
criteria. There must be more intersections for orignum than destnum, or
your statistics are so far out of whack. It appears to be estimating 5M vs
500K for a result set, and naturally it chose a different plan.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 10:18 AM
Stephan Szabo
 
Posts: n/a
Default Re: Identical Queries

On Thu, 1 Mar 2007, Rob Schall wrote:

> There are 4 entries (wanted to make the playing field level for this
> test). There are 2 with true for istf and 2 with false.


Then analyzing might help, because I think it's estimating many more rows
for both cases, and with 2 rows estimated to be returned the nested loop
should seem a lot more attractive than at 900+.

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 10:18 AM
Craig A. James
 
Posts: n/a
Default Re: Identical Queries

Stephan Szabo wrote:
> I tried posting to the bugs, and they said this is a better question for here.
> I have to queries. One runs in about 2 seconds. The other takes upwards
> of 2 minutes. I have a temp table that is created with 2 columns. This
> table is joined with the larger database of call detail records.
> However, these 2 queries are handled very differently.


Even for a temporary table, you should run ANALYZE on it after you fill it but before you query or join to it. I found out (the hard way) that a temporary table of just 100 rows will generate dramatically different plans before and after ANALYZE.

Craig


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:44 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com