Unix Technical Forum

=?WINDOWS-1252?Q?Query_Optimization_with_Kruskal=92s_Algorit hm?=

This is a discussion on =?WINDOWS-1252?Q?Query_Optimization_with_Kruskal=92s_Algorit hm?= within the Pgsql Performance forums, part of the PostgreSQL category; --> Hello friends, I'm working on optimizing queries using the Kruskal algorithm ( http://ieeexplore.ieee.org/xpls/abs_...umber=4318118) . I did several tests in ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 05-07-2008, 07:21 PM
Tarcizio Bini
 
Posts: n/a
Default =?WINDOWS-1252?Q?Query_Optimization_with_Kruskal=92s_Algorit hm?=

Hello friends,

I'm working on optimizing queries using the Kruskal algorithm (
http://ieeexplore.ieee.org/xpls/abs_...umber=4318118). I did several
tests in the database itself and saw interesting results.
I did 10 executions with each query using unchanged source of Postgres and
then adapted to the algorithm of Kruskal.
The query I used is composed of 12 tables and 11 joins.

Results Postgresql unchanged (ms): (\ timing)

170,690
168,214
182,832
166,172
174,466
167,143
167,287
172,891
170,452
165,665
average=> 170,5812 ms


Results of Postgresql with the Kruskal algorithm (ms): (\ timing)

520,590
13,533
8,410
5,162
5,543
4,999
9,871
4,984
5,010
8,883
average=> 58,6985 ms


As you can see the result, using the Kruskal algorithm, the first query
takes more time to return results. This does not occur when using the
original source of Postgres.
So how is the best method to conduct the tests? I take into consideration
the average of 10 executions or just the first one?
Do you think I must clean the cache after each query? (because the other (9)
executions may have information in memory).

regards, Tarcizio Bini.

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 05-10-2008, 03:06 PM
Alexander Staubo
 
Posts: n/a
Default =?UTF-8?Q?Re:__Query_Optimiza?= =?UTF-8?Q?tion_with_Kruskal=E2=80=99s_Algorithm?=

On 5/7/08, Tarcizio Bini <tarcizioab@c3sl.ufpr.br> wrote:
> I'm working on optimizing queries using the Kruskal algorithm
> (http://ieeexplore.ieee.org/xpls/abs_...umber=4318118).


That paper looks very interesting. I would love to hear what the
PostgreSQL committers think of this algorithm.

Alexander.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 05-11-2008, 09:42 PM
Rauan Maemirov
 
Posts: n/a
Default =?windows-1252?Q?Re=3A_Query_Optimization_with_Kruskal=92s_A lgorithm?=

On May 8, 2:09*am, a...@purefiction.net ("Alexander Staubo") wrote:
> On 5/7/08, Tarcizio Bini <tarcizi...@c3sl.ufpr.br> wrote:
>
> > I'm working on optimizing queries using the Kruskal algorithm
> > (http://ieeexplore.ieee.org/xpls/abs_...umber=4318118).

>
> That paper looks very interesting. I would love to hear what the
> PostgreSQL committers think of this algorithm.
>
> Alexander.
>
> --
> Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org)
> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance


I also would like to hear from them. But seems like the thread is
loosed in tonn of other threads.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 05-11-2008, 09:42 PM
Jonah H. Harris
 
Posts: n/a
Default =?WINDOWS-1252?Q?Re:__Re:_Query_Optimi?= =?WINDOWS-1252?Q?zation_with_Kruskal=92s_Algorithm?=

Repost to -hackers, you're more likely to get a response on this topic.

On Sat, May 10, 2008 at 1:31 PM, Rauan Maemirov <rauan1987@gmail.com> wrote:
> On May 8, 2:09 am, a...@purefiction.net ("Alexander Staubo") wrote:
>> On 5/7/08, Tarcizio Bini <tarcizi...@c3sl.ufpr.br> wrote:
>>
>> > I'm working on optimizing queries using the Kruskal algorithm
>> > (http://ieeexplore.ieee.org/xpls/abs_...umber=4318118).

>>
>> That paper looks very interesting. I would love to hear what the
>> PostgreSQL committers think of this algorithm.
>>
>> Alexander.
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performa...@postgresql.org)
>> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance

>
> I also would like to hear from them. But seems like the thread is
> loosed in tonn of other threads.
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>




--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 05-11-2008, 09:42 PM
Tom Lane
 
Posts: n/a
Default Re: =?WINDOWS-1252?Q?Re:__Re:_Query_Optimi?= =?WINDOWS-1252?Q?zation_with_Kruskal=92s_Algorithm?=

"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> Repost to -hackers, you're more likely to get a response on this topic.


Probably not, unless you cite a more readily available reference.
(I dropped my IEEE membership maybe fifteen years ago ...)

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 05-11-2008, 09:42 PM
Jonah H. Harris
 
Posts: n/a
Default =?WINDOWS-1252?Q?Re:_Re:__Re:_Query_Opti?= =?WINDOWS-1252?Q?mization_with_Kruskal=92s_Algorithm?=

On Sat, May 10, 2008 at 5:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Jonah H. Harris" <jonah.harris@gmail.com> writes:
>> Repost to -hackers, you're more likely to get a response on this topic.

>
> Probably not, unless you cite a more readily available reference.
> (I dropped my IEEE membership maybe fifteen years ago ...)


Yeah, I don't have one either. Similarly, I couldn't find anything
applicable to the PG implementation except references to the paper.
Wikipedia has the algorithm itself
(http://en.wikipedia.org/wiki/Kruskal's_algorithm), but I was more
interested in the actual applicability to PG and any issues they ran
into.

--
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | jonah.harris@enterprisedb.com
Edison, NJ 08837 | http://www.enterprisedb.com/

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 05-11-2008, 09:42 PM
Tom Lane
 
Posts: n/a
Default Re: =?WINDOWS-1252?Q?Re:_Re:__Re:_Query_Opti?= =?WINDOWS-1252?Q?mization_with_Kruskal=92s_Algorithm?=

"Jonah H. Harris" <jonah.harris@gmail.com> writes:
> Wikipedia has the algorithm itself
> (http://en.wikipedia.org/wiki/Kruskal's_algorithm), but I was more
> interested in the actual applicability to PG and any issues they ran
> into.


Hmm ... minimum spanning tree of a graph, eh? Right offhand I'd say
this is a pretty terrible model of the join order planning problem.
The difficulty with trying to represent join order as a weighted
graph is that it assumes the cost to join two relations (ie, the
weight on the arc between them) is independent of what else you have
joined first. Which is clearly utterly wrong for join planning.

Our GEQO optimizer has a similar issue --- it uses a search algorithm
that is designed to solve traveling-salesman, which is almost the same
thing as minimum spanning tree. The saving grace for GEQO is that its
TSP orientation is only driving a heuristic; when it considers a given
overall join order it is at least capable of computing the right cost.
It looks to me like Kruskal's algorithm is entirely dependent on the
assumption that minimizing the sum of some predetermined pairwise costs
gives the correct plan.

In short, I'm sure it's pretty fast compared to either of our current
join planning methods, but I'll bet a lot that it often picks a much
worse plan. Color me unexcited, unless they've found some novel way
of defining the graph representation that avoids this problem.

regards, tom lane

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 05-11-2008, 09:42 PM
Michael Glaesemann
 
Posts: n/a
Default =?WINDOWS-1252?Q?Re:__Re:_Query_Optimization_with_Krusk?==?W INDOWS-1252?Q?al=92s_Algorithm?=


On May 10, 2008, at 1:31 PM, Rauan Maemirov wrote:

> I also would like to hear from them. But seems like the thread is
> loosed in tonn of other threads.


It's also the middle of a commit fest, when a lot of the developers
are focussed on processing the current patches in the queue, rather
than actively exploring new, potential features.

Michael Glaesemann
grzm seespotcode net




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:27 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com