Unix Technical Forum

Query performance

This is a discussion on Query performance within the Pgsql Performance forums, part of the PostgreSQL category; --> Hi, I'm having trouble understanding, why a specific query on a small database is taking so long... I've ported ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 08:50 AM
Antonio Batovanja
 
Posts: n/a
Default Query performance

Hi,

I'm having trouble understanding, why a specific query on a small database
is taking so long...

I've ported our LDAP database to an SQL backend using OpenLDAP's back-sql.
The database I'm using is Postgres 8.0.3.

The query generated by LDAP's back-sql is:

SELECT DISTINCT ldap_entries.id,organization.id,text('organization ') AS
objectClass,ldap_entries.dn AS dn FROM
ldap_entries,organization,ldap_entry_objclasses WHERE
organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND
upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 OR
(ldap_entries.id=ldap_entry_objclasses.entry_id AND
ldap_entry_objclasses.oc_name='organization');

The table ldap_entries has 1539 rows.
The table organization has only one row.
The table ldap_entry_objclasses has 5928 rows.

The above query takes 5 seconds to execute!

Here's the execution plan:
explain analyze SELECT DISTINCT
ldap_entries.id,organization.id,text('organization ') AS
objectClass,ldap_entries.dn AS dn FROM
ldap_entries,organization,ldap_entry_objclasses WHERE
organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND
upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 OR
(ldap_entries.id=ldap_entry_objclasses.entry_id AND
ldap_entry_objclasses.oc_name='organization');

QUERY PLAN
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------
Unique (cost=57482741.28..57482807.98 rows=5336 width=408) (actual
time=96009.993..96107.959 rows=1 loops=1)
-> Sort (cost=57482741.28..57482754.62 rows=5336 width=408) (actual
time=96009.979..96039.864 rows=5928 loops=1)
Sort Key: ldap_entries.id, organization.id, 'organization'::text,
ldap_entries.dn
-> Nested Loop (cost=120.19..57481808.58 rows=5336 width=408)
(actual time=0.100..95744.124 rows=5928 loops=1)
Join Filter: ((("outer".id = "outer".keyval) AND
("outer".oc_map_id = 1) AND (upper(("outer".dn)::text) ~~
'%DC=HUMANOMED,DC=AT'::text)) OR (("outer".id = "inner".entry_id) AND
(("inner".oc_name)::text = 'organization'::text)))
-> Nested Loop (cost=11.98..5590.77 rows=277020 width=416)
(actual time=0.052..72.748 rows=1539 loops=1)
-> Seq Scan on ldap_entries (cost=0.00..38.39
rows=1539 width=412) (actual time=0.016..20.903 rows=1539 loops=1)
-> Materialize (cost=11.98..13.78 rows=180 width=4)
(actual time=0.005..0.010 rows=1 loops=1539)
-> Seq Scan on organization (cost=0.00..11.80
rows=180 width=4) (actual time=0.007..0.014 rows=1 loops=1)
-> Materialize (cost=108.21..167.49 rows=5928 width=118)
(actual time=0.005..29.913 rows=5928 loops=1539)
-> Seq Scan on ldap_entry_objclasses
(cost=0.00..102.28 rows=5928 width=118) (actual time=0.006..31.911
rows=5928 loops=1)
Total runtime: 96109.571 ms
(12 rows)


Now if I rewrite that statement so that I don't have the OR clause any more:
explain analyze SELECT DISTINCT
ldap_entries.id,organization.id,text('organization ') AS
objectClass,ldap_entries.dn AS dn FROM
ldap_entries,organization,ldap_entry_objclasses WHERE
organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND
upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 UNION SELECT
DISTINCT ldap_entries.id,organization.id,text('organization ') AS
objectClass,ldap_entries.dn AS dn FROM
ldap_entries,organization,ldap_entry_objclasses WHERE
(ldap_entries.id=ldap_entry_objclasses.entry_id AND
ldap_entry_objclasses.oc_name='organization');

QUERY PLAN
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
-----------------------
Unique (cost=3637.67..3707.42 rows=5580 width=408) (actual
time=461.258..461.273 rows=1 loops=1)
-> Sort (cost=3637.67..3651.62 rows=5580 width=408) (actual
time=461.246..461.251 rows=1 loops=1)
Sort Key: id, id, objectclass, dn
-> Append (cost=1247.62..2657.12 rows=5580 width=408) (actual
time=361.263..461.220 rows=1 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=1247.62..1323.52
rows=180 width=408) (actual time=361.252..461.065 rows=1 loops=1)
-> Unique (cost=1247.62..1321.72 rows=180 width=408)
(actual time=361.238..461.039 rows=1 loops=1)
-> Sort (cost=1247.62..1262.44 rows=5928
width=408) (actual time=361.226..392.136 rows=5928 loops=1)
Sort Key: ldap_entries.id, organization.id,
'organization'::text, ldap_entries.dn
-> Nested Loop (cost=0.00..196.80
rows=5928 width=408) (actual time=0.091..95.728 rows=5928 loops=1)
-> Nested Loop (cost=0.00..35.24
rows=1 width=408) (actual time=0.067..0.104 rows=1 loops=1)
-> Index Scan using
ldap_entries_oc_idx on ldap_entries (cost=0.00..29.41 rows=1 width=408)
(actual time=0.039..0.050 rows=1 loops=1)
Index Cond: (oc_map_id =
1)
Filter:
(upper((dn)::text) ~~ '%DC=HUMANOMED,DC=AT'::text)
-> Index Scan using
organization_id_key on organization (cost=0.00..5.82 rows=1 width=4)
(actual time=0.010..0.020 rows=1 loops=1)
Index Cond:
(organization.id = "outer".keyval)
-> Seq Scan on ldap_entry_objclasses
(cost=0.00..102.28 rows=5928 width=0) (actual time=0.007..32.119 rows=5928
loops=1)
-> Subquery Scan "*SELECT* 2" (cost=1212.10..1333.60
rows=5400 width=408) (actual time=0.129..0.129 rows=0 loops=1)
-> Unique (cost=1212.10..1279.60 rows=5400 width=408)
(actual time=0.118..0.118 rows=0 loops=1)
-> Sort (cost=1212.10..1225.60 rows=5400
width=408) (actual time=0.108..0.108 rows=0 loops=1)
Sort Key: ldap_entries.id, organization.id,
'organization'::text, ldap_entries.dn
-> Nested Loop (cost=103.42..265.50
rows=5400 width=408) (actual time=0.078..0.078 rows=0 loops=1)
-> Hash Join (cost=91.44..145.52
rows=30 width=404) (actual time=0.068..0.068 rows=0 loops=1)
Hash Cond: ("outer".id =
"inner".entry_id)
-> Seq Scan on ldap_entries
(cost=0.00..38.39 rows=1539 width=404) (never executed)
-> Hash (cost=91.37..91.37
rows=30 width=4) (actual time=0.045..0.045 rows=0 loops=1)
-> Index Scan using xxxx
on ldap_entry_objclasses (cost=0.00..91.37 rows=30 width=4) (actual
time=0.034..0.034 rows=0 loops=1)
Index Cond:
((oc_name)::text = 'organization'::text)
-> Materialize (cost=11.98..13.78
rows=180 width=4) (never executed)
-> Seq Scan on organization
(cost=0.00..11.80 rows=180 width=4) (never executed)
Total runtime: 461.907 ms
(30 rows)


Suddenly the query takes only 0.29 seconds!


Could someone please explain me what's the big difference between these two
queries?
Are there any server-side flags that would improve the performance of the
first query? I wouldn't like to change the code of ldap's back-sql...

Toni
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 08:50 AM
Antonio Batovanja
 
Posts: n/a
Default Re: Query performance

Laurenz Albe wrote:

> Antonio Batovanja wrote:
>> I'm having trouble understanding, why a specific query on a small
>> database is taking so long...
>>

>
> Before I try to understand the execution plans:
>
> Have you run ANALYZE on the tables involved before you ran the query?


Hi,

Just to be on the safe side, I've run ANALYZE now.
Here are the query plans for the two queries:

1) the slooooow query:
EXPLAIN ANALYZE SELECT DISTINCT ldap_entries.id, organization.id,
text('organization') AS objectClass, ldap_entries.dn AS dn FROM
ldap_entries, organization, ldap_entry_objclasses WHERE
organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND
upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 OR
(ldap_entries.id=ldap_entry_objclasses.entry_id AND
ldap_entry_objclasses.oc_name='organization');

QUERY PLAN
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
------
Unique (cost=319470.73..319470.97 rows=19 width=74) (actual
time=96354.150..96452.492 rows=1 loops=1)
-> Sort (cost=319470.73..319470.78 rows=19 width=74) (actual
time=96354.136..96383.958 rows=5928 loops=1)
Sort Key: ldap_entries.id, organization.id, 'organization'::text,
ldap_entries.dn
-> Nested Loop (cost=56.33..319470.33 rows=19 width=74) (actual
time=0.088..96086.902 rows=5928 loops=1)
Join Filter: ((("inner".id = "inner".keyval) AND
("inner".oc_map_id = 1) AND (upper(("inner".dn)::text) ~~
'%DC=HUMANOMED,DC=AT'::text)) OR (("inner".id = "outer".entry_id) AND
(("outer".oc_name)::text = 'organization'::text)))
-> Seq Scan on ldap_entry_objclasses (cost=0.00..102.28
rows=5928 width=20) (actual time=0.012..39.040 rows=5928 loops=1)
-> Materialize (cost=56.33..71.72 rows=1539 width=82)
(actual time=0.005..7.786 rows=1539 loops=5928)
-> Nested Loop (cost=0.00..54.79 rows=1539 width=82)
(actual time=0.033..26.661 rows=1539 loops=1)
-> Seq Scan on organization (cost=0.00..1.01
rows=1 width=4) (actual time=0.006..0.013 rows=1 loops=1)
-> Seq Scan on ldap_entries (cost=0.00..38.39
rows=1539 width=78) (actual time=0.010..9.845 rows=1539 loops=1)
Total runtime: 96453.293 ms
(11 rows)


2) The fast query - just replaced OR with UNION:
EXPLAIN ANALYZE SELECT DISTINCT ldap_entries.id, organization.id,
text('organization') AS objectClass, ldap_entries.dn AS dn FROM
ldap_entries, organization, ldap_entry_objclasses WHERE
organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND
upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1
UNION SELECT DISTINCT ldap_entries.id, organization.id, text('organization')
AS objectClass, ldap_entries.dn AS dn FROM ldap_entries, organization,
ldap_entry_objclasses WHERE (ldap_entries.id=ldap_entry_objclasses.entry_id
AND ldap_entry_objclasses.oc_name='organization');

QUERY PLAN
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
-----------------------
Unique (cost=728.39..728.64 rows=20 width=74) (actual
time=463.587..463.602 rows=1 loops=1)
-> Sort (cost=728.39..728.44 rows=20 width=74) (actual
time=463.577..463.582 rows=1 loops=1)
Sort Key: id, id, objectclass, dn
-> Append (cost=537.13..727.95 rows=20 width=74) (actual
time=364.814..463.549 rows=1 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=537.13..611.24 rows=1
width=74) (actual time=364.804..463.357 rows=1 loops=1)
-> Unique (cost=537.13..611.23 rows=1 width=74)
(actual time=364.790..463.331 rows=1 loops=1)
-> Sort (cost=537.13..551.95 rows=5928
width=74) (actual time=364.778..394.740 rows=5928 loops=1)
Sort Key: ldap_entries.id, organization.id,
'organization'::text, ldap_entries.dn
-> Nested Loop (cost=0.00..165.64
rows=5928 width=74) (actual time=0.097..96.368 rows=5928 loops=1)
-> Nested Loop (cost=0.00..4.08
rows=1 width=74) (actual time=0.073..0.107 rows=1 loops=1)
-> Seq Scan on organization
(cost=0.00..1.01 rows=1 width=4) (actual time=0.013..0.020 rows=1 loops=1)
-> Index Scan using
ldap_entries_key_idx on ldap_entries (cost=0.00..3.06 rows=1 width=74)
(actual time=0.042..0.054 rows=1 loops=1)
Index Cond: ("outer".id =
ldap_entries.keyval)
Filter: ((oc_map_id = 1)
AND (upper((dn)::text) ~~ '%DC=HUMANOMED,DC=AT'::text))
-> Seq Scan on ldap_entry_objclasses
(cost=0.00..102.28 rows=5928 width=0) (actual time=0.006..31.622 rows=5928
loops=1)
-> Subquery Scan "*SELECT* 2" (cost=116.29..116.71 rows=19
width=74) (actual time=0.167..0.167 rows=0 loops=1)
-> Unique (cost=116.29..116.52 rows=19 width=74)
(actual time=0.156..0.156 rows=0 loops=1)
-> Sort (cost=116.29..116.33 rows=19 width=74)
(actual time=0.145..0.145 rows=0 loops=1)
Sort Key: ldap_entries.id, organization.id,
'organization'::text, ldap_entries.dn
-> Nested Loop (cost=63.62..115.88
rows=19 width=74) (actual time=0.117..0.117 rows=0 loops=1)
-> Seq Scan on organization
(cost=0.00..1.01 rows=1 width=4) (actual time=0.013..0.018 rows=1 loops=1)
-> Merge Join (cost=63.62..114.68
rows=19 width=70) (actual time=0.074..0.074 rows=0 loops=1)
Merge Cond: ("outer".id =
"inner".entry_id)
-> Index Scan using
ldap_entries_pkey on ldap_entries (cost=0.00..46.93 rows=1539 width=70)
(actual time=0.019..0.019 rows=1 loops=1)
-> Sort (cost=63.62..63.67
rows=19 width=4) (actual time=0.040..0.040 rows=0 loops=1)
Sort Key:
ldap_entry_objclasses.entry_id
-> Index Scan using xxxx
on ldap_entry_objclasses (cost=0.00..63.22 rows=19 width=4) (actual
time=0.027..0.027 rows=0 loops=1)
Index Cond:
((oc_name)::text = 'organization'::text)
Total runtime: 464.189 ms
(29 rows)



I know the output doesn't look nice, but wrapping at 80 characters is a must
for newsgroups...

Thank you all in advance,
Toni
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 08:50 AM
Steinar H. Gunderson
 
Posts: n/a
Default Re: Query performance

antonio.batovanja@humanomed.co.at wrote:
> The above query takes 5 seconds to execute!
>
> [...]
>
> Total runtime: 96109.571 ms


It sure doesn't look like it...

> Total runtime: 461.907 ms
>
> [...]
>
> Suddenly the query takes only 0.29 seconds!


How are you timing this, really?

/* Steinar */
--
Homepage: http://www.sesse.net/
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 08:50 AM
Antonio Batovanja
 
Posts: n/a
Default Re: Query performance

Steinar H. Gunderson wrote:

> antonio.batovanja@humanomed.co.at wrote:
>> The above query takes 5 seconds to execute!
>>
>> [...]
>>
>> Total runtime: 96109.571 ms

>
> It sure doesn't look like it...
>
>> Total runtime: 461.907 ms
>>
>> [...]
>>
>> Suddenly the query takes only 0.29 seconds!

>
> How are you timing this, really?
>
> /* Steinar */


I'm executing the queries from phpPgAdmin.
The above are for explain analyse. I was referring to the pure query
execution time.
Does anyone have an idea why the OR-query takes so long?
Any server-side tuning possibilities? I wouldn't like to change the code of
ldap's back-sql...

Toni
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 08:52 AM
Erwin Brandstetter
 
Posts: n/a
Default Re: Query performance

Antonio Batovanja wrote:
> Laurenz Albe wrote:
>
>> Antonio Batovanja wrote:
>>> I'm having trouble understanding, why a specific query on a small
>>> database is taking so long...
>>>

>> Before I try to understand the execution plans:
>>
>> Have you run ANALYZE on the tables involved before you ran the query?

>
> Hi,
>
> Just to be on the safe side, I've run ANALYZE now.
> Here are the query plans for the two queries:


I suspect a misunderstanding here. What Laurenz probably meant is to run
analyze on the involved _tables_ so the statistics data is refreshed.
If the query planner runs with outdated statistics, queries may perform
very poorly. Try

vacuum full analyze yourdatabase

To fully vacuum your database and analyze all tables.
(vacuum full is extra, but can't hurt.)

http://www.postgresql.org/docs/8.1/s...ql-vacuum.html
http://www.postgresql.org/docs/8.1/s...l-analyze.html

Regards, Erwin
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 08:52 AM
Erwin Brandstetter
 
Posts: n/a
Default Re: Query performance

Antonio Batovanja wrote:
(...)

> 1) the slooooow query:
> EXPLAIN ANALYZE SELECT DISTINCT ldap_entries.id, organization.id,
> text('organization') AS objectClass, ldap_entries.dn AS dn FROM
> ldap_entries, organization, ldap_entry_objclasses WHERE
> organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND
> upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 OR
> (ldap_entries.id=ldap_entry_objclasses.entry_id AND
> ldap_entry_objclasses.oc_name='organization');



First, presenting your query in any readable form might be helpful if
you want the community to help you. (Hint! Hint!)

SELECT DISTINCT ldap_entries.id, organization.id,
text('organization') AS objectClass, ldap_entries.dn AS dn
FROM ldap_entries, organization, ldap_entry_objclasses
WHERE organization.id=ldap_entries.keyval
AND ldap_entries.oc_map_id=1
AND upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT'
AND 1=1
OR (ldap_entries.id=ldap_entry_objclasses.entry_id
AND ldap_entry_objclasses.oc_name='organization');

Next, you might want to use aliases to make it more readable.

SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn AS dn
FROM ldap_entries AS e, organization AS o, ldap_entry_objclasses AS eo
WHERE o.id=e.keyval
AND e.oc_map_id=1
AND upper(e.dn) LIKE '%DC=HUMANOMED,DC=AT'
AND 1=1
OR (e.id=eo.entry_id
AND eo.oc_name='organization');

There are a couple redundant (nonsensical) items, syntax-wise. Let's
strip these:

SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn
FROM ldap_entries AS e, organization AS o, ldap_entry_objclasses AS eo
WHERE o.id=e.keyval
AND e.oc_map_id=1
AND e.dn ILIKE '%DC=HUMANOMED,DC=AT'
OR e.id=eo.entry_id
AND eo.oc_name='organization';

And finally, I suspect the lexical precedence of AND and OR might be the
issue here.
http://www.postgresql.org/docs/8.1/s...SQL-PRECEDENCE
Maybe that is what you really want (just guessing):

SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn
FROM ldap_entries e
JOIN organization o ON o.id=e.keyval
LEFT JOIN ldap_entry_objclasses eo ON eo.entry_id=e.id
WHERE e.oc_map_id=1
AND e.dn ILIKE '%DC=HUMANOMED,DC=AT'
OR eo.oc_name='organization)';

I didn't take the time to read the rest. My appologies if I guessed wrong.


Regards, Erwin
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 08:53 AM
Christopher Kings-Lynne
 
Posts: n/a
Default Re: Query performance

> I'm executing the queries from phpPgAdmin.
> The above are for explain analyse. I was referring to the pure query
> execution time.
> Does anyone have an idea why the OR-query takes so long?
> Any server-side tuning possibilities? I wouldn't like to change the code of
> ldap's back-sql...


If you're using phpPgAdmin's timings, they could be more off than the
real explain analyze timings. Make sure you're using the figure given
by explain analyze itself.

Chris


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 04:15 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com