Unix Technical Forum

multiple joins + Order by + LIMIT query performance issue

This is a discussion on multiple joins + Order by + LIMIT query performance issue within the Pgsql Performance forums, part of the PostgreSQL category; --> > > If a misestimate of this kind is bugging you enough that you're > willing > to change ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #11 (permalink)  
Old 05-07-2008, 11:19 AM
Antoine Baudoux
 
Posts: n/a
Default Re: multiple joins + Order by + LIMIT query performance issue

>
> If a misestimate of this kind is bugging you enough that you're
> willing
> to change the query, I think you can fix it like this:
>
> select ... from foo order by x limit n;
> =>
> select ... from (select ... from foo order by x) ss limit n;
>
> The subselect will be planned without awareness of the LIMIT, so you
> should get a plan using a sort rather than one that bets on the LIMIT
> being reached quickly.


I tried that, using a subquery. Unfortunately this does not change
anything :

select * from (select * from t_Event event
inner join t_Service service on event.service_id=service.id
inner join t_System system on service.system_id=system.id
inner join t_Interface interface on system.id=interface.system_id
where (interface.network_id=1) order by event.c_date desc ) ss limit 25

"Limit (cost=147.79..5563.93 rows=25 width=3672)"
" -> Subquery Scan ss (cost=147.79..2896263.01 rows=13368
width=3672)"
" -> Nested Loop (cost=147.79..2896129.33 rows=13368
width=958)"
" Join Filter: (service.id = event.service_id)"
" -> Index Scan Backward using event_date_idx on t_event
event (cost=0.00..1160633.69 rows=8569619 width=344)"
" -> Materialize (cost=147.79..147.88 rows=9 width=614)"
" -> Hash Join (cost=16.56..147.79 rows=9
width=614)"
" Hash Cond: (service.system_id = system.id)"
" -> Seq Scan on t_service service
(cost=0.00..109.28 rows=5828 width=40)"
" -> Hash (cost=16.55..16.55 rows=1
width=574)"
" -> Nested Loop (cost=0.00..16.55
rows=1 width=574)"
" -> Index Scan using
interface_network_id_idx on t_interface interface (cost=0.00..8.27
rows=1 width=558)"
" Index Cond: (network_id =
1)"
" -> Index Scan using
t_system_pkey on t_system system (cost=0.00..8.27 rows=1 width=16)"
" Index Cond: (system.id =
interface.system_id)"


The worst thing about all this is that there are ZERO rows to join
with the t_event table. So the planner decide to index-scan 8 millions
row, where there is no hope of finding a match!
This seems a very ,very , very poor decision

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #12 (permalink)  
Old 05-07-2008, 07:21 PM
Matthew Wakeling
 
Posts: n/a
Default Re: multiple joins + Order by + LIMIT query performanceissue

On Tue, 6 May 2008, Tom Lane wrote:
> If a misestimate of this kind is bugging you enough that you're willing
> to change the query, I think you can fix it like this:
>
> select ... from foo order by x limit n;
> =>
> select ... from (select ... from foo order by x) ss limit n;
>
> The subselect will be planned without awareness of the LIMIT, so you
> should get a plan using a sort rather than one that bets on the LIMIT
> being reached quickly.


Surely if that's the case, that in itself is a bug? Apart from being
"useful", I mean.

Matthew

--
"Television is a medium because it is neither rare nor well done."
-- Fred Friendly

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #13 (permalink)  
Old 05-10-2008, 03:06 PM
Antoine Baudoux
 
Posts: n/a
Default Re: multiple joins + Order by + LIMIT query performance issue

Ok, I've tried everything, and the planner keeps choosing index scans
when it shouldnt.

Is there a way to disable index scans?


Antoine

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #14 (permalink)  
Old 05-10-2008, 03:06 PM
Scott Marlowe
 
Posts: n/a
Default Re: multiple joins + Order by + LIMIT query performance issue

On Fri, May 9, 2008 at 1:18 AM, Antoine Baudoux <ab@taktik.be> wrote:
> Ok, I've tried everything, and the planner keeps choosing index scans when
> it shouldnt.
>
> Is there a way to disable index scans?


You can use "set enable_indexscan off;" as the first command I've had
one or two reporting queries in the past that it was a necessity to do
that before running certain queries on very large datasets where a seq
scan would kill performance.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:39 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com