Unix Technical Forum

500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running

This is a discussion on 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running within the Pgsql Performance forums, part of the PostgreSQL category; --> Hi, appreciate if someone can have some pointers for this. PG.8.2.4 1.4G centrino(s/core) 1.5GB ram/5400rpm laptop HD 3 mail ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 11:32 AM
El-Lotso
 
Posts: n/a
Default 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running

Hi,

appreciate if someone can have some pointers for this.

PG.8.2.4 1.4G centrino(s/core) 1.5GB ram/5400rpm laptop HD

3 mail tables which has already been selected "out" into separate tables
(useing create table foo as select * from foo_main where x=y)

These test tables containing only a very small subset of the main data's
table (max 1k to 10k rows vs 1.5mill to 7mill rows in the main table)

table definitions and actual query are attached. (names has been altered
to protect the innocent)

I've played around with some tweaking of the postgres.conf setting per
guidance from jdavis (in irc) w/o much(any) improvement. Also tried
re-writing the queries to NOT use subselects (per depesz in irc also)
also yielded nothing spectacular.

The only thing I noticed was that when the subqueries combine more than
3 tables, then PG will choke. If only at 3 joined tables per subquery,
the results come out fast, even for 6K rows.

but if the subqueries (these subqueries by itself, executes fast and
returns results in 1 to 10secs) were done independently and then placed
into a temp table, and then finally joined together using a query such
as

select a.a,b.b,c.c from a inner join b on (x = x) left outer join c on(x
= y)

then it would also be fast

work_mem = 8MB / 32MB /128MB (32 MB default in my setup)
effective_Cache_size = 128MB/500MB (500 default)
shared_buffers = 200MB
geqo_threshold = 5 (default 12)
geqo_effort = 2 (default 5)
ramdom_page_cose = 8.0 (default 4)
maintenance_work_mem = 64MB
join_collapse_limit = 1/8/15 (8 default)
from_collapse_limit = 1/8/15 (8 default)
enable_nestloop = f (on by default)

based on current performance, even with a small number of rows in the
individual tables (max 20k), I can't even get a result out in 2 hours.
(> 3 tables joined per subquery) which is making me re-think of PG's
useful-ness.



BTW, I also tried 8.2.4 CVS_STABLE Branch


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 11:32 AM
El-Lotso
 
Posts: n/a
Default Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and stillrunning

sorry.. I sent this as I was about to go to bed and the explain analyse
of the query w/ 4 tables joined per subquery came out.

So.. attaching it..

On Wed, 2007-09-12 at 00:57 +0800, El-Lotso wrote:
> Hi,
>
> appreciate if someone can have some pointers for this.
>
> PG.8.2.4 1.4G centrino(s/core) 1.5GB ram/5400rpm laptop HD
>
> 3 mail tables which has already been selected "out" into separate tables
> (useing create table foo as select * from foo_main where x=y)
>
> These test tables containing only a very small subset of the main data's
> table (max 1k to 10k rows vs 1.5mill to 7mill rows in the main table)
>
> table definitions and actual query are attached. (names has been altered
> to protect the innocent)
>
> I've played around with some tweaking of the postgres.conf setting per
> guidance from jdavis (in irc) w/o much(any) improvement. Also tried
> re-writing the queries to NOT use subselects (per depesz in irc also)
> also yielded nothing spectacular.
>
> The only thing I noticed was that when the subqueries combine more than
> 3 tables, then PG will choke. If only at 3 joined tables per subquery,
> the results come out fast, even for 6K rows.
>
> but if the subqueries (these subqueries by itself, executes fast and
> returns results in 1 to 10secs) were done independently and then placed
> into a temp table, and then finally joined together using a query such
> as
>
> select a.a,b.b,c.c from a inner join b on (x = x) left outer join c on(x
> = y)
>
> then it would also be fast
>
> work_mem = 8MB / 32MB /128MB (32 MB default in my setup)
> effective_Cache_size = 128MB/500MB (500 default)
> shared_buffers = 200MB
> geqo_threshold = 5 (default 12)
> geqo_effort = 2 (default 5)
> ramdom_page_cose = 8.0 (default 4)
> maintenance_work_mem = 64MB
> join_collapse_limit = 1/8/15 (8 default)
> from_collapse_limit = 1/8/15 (8 default)
> enable_nestloop = f (on by default)
>
> based on current performance, even with a small number of rows in the
> individual tables (max 20k), I can't even get a result out in 2 hours.
> (> 3 tables joined per subquery) which is making me re-think of PG's
> useful-ness.
>
>
>
> BTW, I also tried 8.2.4 CVS_STABLE Branch



---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 11:32 AM
Tom Lane
 
Posts: n/a
Default Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running

El-Lotso <el.lotso@gmail.com> writes:
> sorry.. I sent this as I was about to go to bed and the explain analyse
> of the query w/ 4 tables joined per subquery came out.


It's those factor-of-1000 misestimates of the join sizes that are
killing you, eg this one:

> -> Hash Join (cost=249.61..512.56 rows=1 width=87) (actual time=15.139..32.858 rows=969 loops=1)
> Hash Cond: (((test_db.ts.id)::text = (test_db.d.id)::text) AND (test_db.ts.start_timestamp = test_db.trd.start_timestamp) AND (test_db.ts.ttype = test_db.trd.ttype))
> -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.135..6.916 rows=3244 loops=1)
> -> Hash (cost=235.00..235.00 rows=835 width=47) (actual time=14.933..14.933 rows=1016 loops=1)


The single-row-result estimate persuades it to use a nestloop at the
next level up, and then when the output is actually 969 rows, that
means 969 executions of the other side of the upper join.

The two input size estimates are reasonably close to reality, so
the problem seems to be in the estimate of selectivity of the
join condition. First off, do you have up-to-date statistics
for all the columns being joined here? It might be that
increasing the statistics targets for those columns would help.

But what I'm a bit worried about is the idea that the join
conditions are correlated or even outright redundant; the
planner will not know that, and will make an unrealistic
estimate of their combined selectivity. If that's the
case, you might need to redesign the table schema to
eliminate the redundancy before you'll get good plans.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 11:32 AM
El-Lotso
 
Posts: n/a
Default Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hoursand still running

On Tue, 2007-09-11 at 14:23 -0400, Tom Lane wrote:
> El-Lotso <el.lotso@gmail.com> writes:
> > sorry.. I sent this as I was about to go to bed and the explain analyse
> > of the query w/ 4 tables joined per subquery came out.

>
> It's those factor-of-1000 misestimates of the join sizes that are
> killing you, eg this one:
>
> > -> Hash Join (cost=249.61..512.56 rows=1 width=87) (actual time=15.139..32.858 rows=969 loops=1)
> > Hash Cond: (((test_db.ts.id)::text = (test_db.d.id)::text) AND (test_db.ts.start_timestamp = test_db.trd.start_timestamp) AND (test_db.ts.ttype = test_db.trd.ttype))
> > -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.135..6.916 rows=3244 loops=1)
> > -> Hash (cost=235.00..235.00 rows=835 width=47) (actual time=14.933..14.933 rows=1016 loops=1)

>
> The single-row-result estimate persuades it to use a nestloop at the
> next level up, and then when the output is actually 969 rows, that
> means 969 executions of the other side of the upper join.


Yep.. that's consistent with the larger results output. more rows = more
loops

>
> The two input size estimates are reasonably close to reality, so
> the problem seems to be in the estimate of selectivity of the
> join condition. First off, do you have up-to-date statistics
> for all the columns being joined here? It might be that
> increasing the statistics targets for those columns would help.


I've already upped the stats level to 1000, reindex, vacuum, analysed
etc but nothing has basically changed. The issue here is mainly because
for each id, there is between 2 to 8 hid.

eg:
table d
seq : 1234567 / code : CED89

table trh
seq : 123456
hid : 0/1/2/3/4/5/6/7

and the prob is also compounded by the different ttypes available which
causes the use of the subqueries.

end of the day.. this data output is desired

ID HID
===========
1234567 |0
1234567 |1
1234567 |2
1234567 |3
1234567 |4
1234567 |5
1234567 |6
1234567 |7

the d table has the unique id whereas the other tables has all the
subsets. Like a family tree.. Starts at 2, (mom/pop) then to children +
children's grandchildren (pair1) children's grandchildren(pair2)

d to trh is a one to many relationship

> But what I'm a bit worried about is the idea that the join
> conditions are correlated or even outright redundant; the
> planner will not know that, and will make an unrealistic
> estimate of their combined selectivity. If that's the
> case, you might need to redesign the table schema to
> eliminate the redundancy before you'll get good plans.


I'm not I understand (actually, i don't) the above comment. I've already
made then from subqueries to actual joins (collapse it) and still no
dice.

btw, this same schema runs fine on SQL server. (which I'm pulling data
from and pumping into PG)

I'm downgrading to 8.1.9 to see if it helps too.

appreciate any pointers at all.


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 11:32 AM
El-Lotso
 
Posts: n/a
Default Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hoursand still running

On Wed, 2007-09-12 at 10:15 +0800, El-Lotso wrote:
> I'm downgrading to 8.1.9 to see if it helps too.\


Nope : Doesn't help at all.. the number of rows at the nested loop and
hash joins are still 1 to 500 ratio. This plan is slightly different in
that PG is choosing seq_scans

Nested Loop Left Join (cost=2604.28..4135.15 rows=1 width=59) (actual time=249.973..15778.157 rows=528 loops=1)
Join Filter: ((("inner".id)::text = ("outer".id)::text) AND ("inner".hid = "outer".hid) AND ("inner".seq_time = "outer".seq_time) AND ("inner".seq_date = "outer".seq_date))
-> Nested Loop Left Join (cost=1400.08..2766.23 rows=1 width=67) (actual time=168.375..8002.573 rows=528 loops=1)
Join Filter: ((("inner".id)::text = ("outer".id)::text) AND ("inner".hid = "outer".hid) AND ("inner".seq_time = "outer".seq_time) AND ("inner".seq_date = "outer".seq_date))
-> Hash Join (cost=127.25..1328.68 rows=1 width=59) (actual time=74.195..84.855 rows=528 loops=1)
Hash Cond: ((("outer".id)::text = ("inner".id)::text) AND ("outer".ttype = "inner".ttype) AND ("outer".start_timestamp = "inner".start_timestamp))
-> Seq Scan on trh (cost=0.00..1060.18 rows=9416 width=36) (actual time=0.022..53.830 rows=9416 loops=1)
Filter: ((ttype = 35) OR (ttype = 75) OR (ttype = 703) OR (ttype = 740) OR (ttype = 764))
-> Hash (cost=125.53..125.53 rows=230 width=63) (actual time=12.487..12.487 rows=192 loops=1)
-> Hash Join (cost=18.69..125.53 rows=230 width=63) (actual time=11.043..12.007 rows=192 loops=1)
Hash Cond: (("outer".id)::text = ("inner".id)::text)
-> Seq Scan on ts (cost=0.00..87.36 rows=3436 width=40) (actual time=0.003..5.436 rows=3436 loops=1)
-> Hash (cost=18.57..18.57 rows=48 width=23) (actual time=0.876..0.876 rows=48 loops=1)
-> Seq Scan on d (cost=0.00..18.57 rows=48 width=23) (actual time=0.019..0.771 rows=48 loops=1)
Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA75'::text))
-> Hash Join (cost=1272.83..1437.52 rows=1 width=61) (actual time=11.784..14.216 rows=504 loops=528)
Hash Cond: ((("outer".id)::text = ("inner".id)::text) AND ("outer".ttype = "inner".ttype) AND ("outer".start_timestamp = "inner".start_timestamp))
-> Seq Scan on ts (cost=0.00..87.36 rows=3436 width=40) (actual time=0.003..5.744 rows=3436 loops=528)
-> Hash (cost=1268.29..1268.29 rows=606 width=59) (actual time=82.783..82.783 rows=504 loops=1)
-> Hash Join (cost=18.69..1268.29 rows=606 width=59) (actual time=76.454..81.515 rows=504 loops=1)
Hash Cond: (("outer".id)::text = ("inner".id)::text)
-> Seq Scan on trh (cost=0.00..1198.22 rows=9064 width=36) (actual time=0.051..66.555 rows=9064 loops=1)
Filter: ((ttype = 69) OR (ttype = 178) OR (ttype = 198) OR (ttype = 704) OR (ttype = 757) OR (ttype = 741) OR (ttype = 765))
-> Hash (cost=18.57..18.57 rows=48 width=23) (actual time=0.863..0.863 rows=48 loops=1)
-> Seq Scan on d (cost=0.00..18.57 rows=48 width=23) (actual time=0.019..0.761 rows=48 loops=1)
Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA75'::text))
-> Hash Join (cost=1204.20..1368.89 rows=1 width=61) (actual time=11.498..13.941 rows=504 loops=528)
Hash Cond: ((("outer".id)::text = ("inner".id)::text) AND ("outer".ttype = "inner".ttype) AND ("outer".start_timestamp = "inner".start_timestamp))
-> Seq Scan on ts (cost=0.00..87.36 rows=3436 width=40) (actual time=0.003..5.593 rows=3436 loops=528)
-> Hash (cost=1199.62..1199.62 rows=610 width=59) (actual time=70.186..70.186 rows=504 loops=1)
-> Hash Join (cost=18.69..1199.62 rows=610 width=59) (actual time=64.270..68.886 rows=504 loops=1)
Hash Cond: (("outer".id)::text = ("inner".id)::text)
-> Seq Scan on trh (cost=0.00..1129.20 rows=9128 width=36) (actual time=0.020..54.050 rows=9128 loops=1)
Filter: ((ttype = 177) OR (ttype = 197) OR (ttype = 705) OR (ttype = 742) OR (ttype = 758) OR (ttype = 766))
-> Hash (cost=18.57..18.57 rows=48 width=23) (actual time=1.100..1.100 rows=48 loops=1)
-> Seq Scan on d (cost=0.00..18.57 rows=48 width=23) (actual time=0.019..0.994 rows=48 loops=1)
Filter: ((record_update_date_time >= '2007-08-20 00:00:00'::timestamp without time zone) AND (record_update_date_time <= '2007-09-08 00:00:00'::timestamp without time zone) AND ((code)::text = 'HUA75'::text))
Total runtime: 15779.769 ms

Am I screwed? Is a schema redesign really a necessity? This would be a
real pain given the rewrite of _all_ the queries and can't maintain
compatibility in the front-end app between sql server and PG.


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 11:32 AM
El-Lotso
 
Posts: n/a
Default Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hoursand still running

On Wed, 2007-09-12 at 10:15 +0800, El-Lotso wrote:
> On Tue, 2007-09-11 at 14:23 -0400, Tom Lane wrote:
> > El-Lotso <el.lotso@gmail.com> writes:
> > > sorry.. I sent this as I was about to go to bed and the explain analyse
> > > of the query w/ 4 tables joined per subquery came out.

> >
> > It's those factor-of-1000 misestimates of the join sizes that are
> > killing you, eg this one:
> >
> > > -> Hash Join (cost=249.61..512.56 rows=1 width=87) (actual time=15.139..32.858 rows=969 loops=1)
> > > Hash Cond: (((test_db.ts.id)::text = (test_db.d.id)::text) AND (test_db.ts.start_timestamp = test_db.trd.start_timestamp) AND (test_db.ts.ttype = test_db.trd.ttype))
> > > -> Seq Scan on ts (cost=0.00..226.44 rows=3244 width=40) (actual time=0.135..6.916 rows=3244 loops=1)
> > > -> Hash (cost=235.00..235.00 rows=835 width=47) (actual time=14.933..14.933 rows=1016 loops=1)

> >
> > The single-row-result estimate persuades it to use a nestloop at the
> > next level up, and then when the output is actually 969 rows, that
> > means 969 executions of the other side of the upper join.

>
> Yep.. that's consistent with the larger results output. more rows = more
> loops



I'm on the verge of giving up... the schema seems simple and yet there's
so much issues with it. Perhaps it's the layout of the data, I don't
know. But based on the ordering/normalisation of the data and the one to
many relationship of some tables, this is giving the planner a headache
(and me a bulge on the head from knockin it against the wall)

I've tried multiple variations, subqueries, not use subqueries, not join
the table, (but to include it as a subquery - which gets re-written to a
join anyway) exists/not exists to no avail.

PG is fast, yes even w/ all the nested loops for up to 48K of results,
(within 4 minutes) but as soon as I put it into a inner join/left
join/multiple temporary(memory) tables it will choke.

select
a.a,b.b,c.c from
(select
x,y,z
from zz)a
inner join b
on a.a = b.a
left join (select
x,a,z
from xx)
then it will choke.

I'm really at my wits end here.


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 11:32 AM
=?ISO-8859-1?Q?Nis_J=F8rgensen?=
 
Posts: n/a
Default Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running

El-Lotso skrev:

> I'm on the verge of giving up... the schema seems simple and yet there's
> so much issues with it. Perhaps it's the layout of the data, I don't
> know. But based on the ordering/normalisation of the data and the one to
> many relationship of some tables, this is giving the planner a headache
> (and me a bulge on the head from knockin it against the wall)


I think you should look more at the db design, and less on rewriting the
query. Here are some observations:

- Your table structure is quite hard to understand (possibly because you
have changed the names) - if you want help on improving it, you will
need to explain the data to us, and possibly post some sample data.
- You seem to be lacking constraints on the tables. My guess is that
(id,ttype,start_timestamp) is unique in both trh and ts - but I cannot
tell (and neither can the query planner). Foreign key constraints might
help as well. These would also help others to understand your data, and
suggest reformulations of your queries.
- Another guess is that the ttype sets (177,197,705,742,758,766),
(69,178,198,704,757,741,765) are actually indicating some other property
a common "type" of record, and that only one of each will be present for
an id,start_timestamp combination. This may be related to the repeating
fields issue - if a certain ttype indicates that we are interested in a
certain pber_x field (and possibly that the others are empty).
- You have what looks like repeating fields - pber_x, fval_x, index_x -
in your tables. Fixing this might not improve your query, but might be a
good idea for other reasons.
- seq_date and seq_time seems like they may be redundant - are they
different casts of the same data?

All speculation. Hope it helps

Nis


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-19-2008, 11:32 AM
Tom Lane
 
Posts: n/a
Default Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours and still running

El-Lotso <el.lotso@gmail.com> writes:
> I'm really at my wits end here.


Try to merge the multiple join keys into one, somehow. I'm not sure why
the planner is overestimating the selectivity of the combined join
conditions, but that's basically where your problem is coming from.

A truly brute-force solution would be "set enable_nestloop = off"
but this is likely to screw performance for other queries.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-19-2008, 11:33 AM
El-Lotso
 
Posts: n/a
Default Re: Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hoursand still running

On Wed, 2007-09-12 at 10:41 -0400, Tom Lane wrote:
> El-Lotso <el.lotso@gmail.com> writes:
> > I'm really at my wits end here.

>
> Try to merge the multiple join keys into one, somehow. I'm not sure why
> the planner is overestimating the selectivity of the combined join
> conditions, but that's basically where your problem is coming from.


I've tried merging them together.. what previously was

INNER JOIN TS
ON TS.ID = TRH.ID AND
TS.TTYPE = TRH.TTYPE AND
TS.START_TIMESTAMP = TRH.START_TIMESTAMP

has become
inner join TS
on ts.id_ttype_startstamp = trh.id_ttype_startstamp

where id_ttype_startstamp = (id || '-'||ttype || '-' || start_timestamp)

It's working somewhat better but everything is not as rosy as it should
as the planner is still over/under estimating the # of rows.

FROM org :
Nested Loop Left Join (cost=10612.48..24857.20 rows=1 width=61) (actual
time=1177.626..462856.007 rows=750 loops=1)

TO merge joined conditions :
Hash Join (cost=41823.94..45889.49 rows=6101 width=61) (actual
time=3019.609..3037.692 rows=750 loops=1)
Hash Cond: (trd.trd_join_key = ts.ts_join_key)

Merged Join using the Main table : 3 - 5 million rows
Hash Left Join (cost=80846.38..121112.36 rows=25 width=244) (actual
time=5088.437..5457.269 rows=750 loops=1)

Note that it still doesn't really help that much, the estimated rows is
still way off the actual number of rows. On one of the querys there the
hid field has a subset of 8 values, it's even worst. And it seems like
the merge condition doesn't help at all.


I'm still trying to merge more join conditions to see if it helps.




> A truly brute-force solution would be "set enable_nestloop = off"
> but this is likely to screw performance for other queries.


I've also tried this... It's not helping much actually.
As mentioned previously, this is a one to many relationship and because
of that, somehow PG just doesn't take it into account.

I'm still not having much luck here. (playing with a subset of the main
table's data _does_ show some promise, but when querying on the main
table w/ 3 million data, everything grinds to a halt)





---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-19-2008, 11:33 AM
Ow Mun Heng
 
Posts: n/a
Default Re: 500rows = 1min/2.5k rows=20min/6K rows 2 hours andstill running

On Wed, 2007-09-12 at 15:14 +0200, Nis Jørgensen wrote:
> El-Lotso skrev:
>
> > I'm on the verge of giving up... the schema seems simple and yet there's
> > so much issues with it. Perhaps it's the layout of the data, I don't
> > know. But based on the ordering/normalisation of the data and the one to
> > many relationship of some tables, this is giving the planner a headache
> > (and me a bulge on the head from knockin it against the wall)

>
> I think you should look more at the db design, and less on rewriting the
> query. Here are some observations:


I can't help much with the design per-se. So..

>
> - Your table structure is quite hard to understand (possibly because you
> have changed the names) - if you want help on improving it, you will
> need to explain the data to us, and possibly post some sample data.


If anyone is willing, I can send some sample data to you off-list.

on the trh table, hid is a subset of data for a particular id.

eg:
PARENT : CHILD 1
PARENT : CHILD 2
PARENT : CHILD 3
PARENT : CHILD 4

uniqueid = merged fields from id / index1 / index2 / start_timestamp(IN EPOCH)
/ phase_id / ttype which is unique on each table (but not across ALL the tables)


> - You seem to be lacking constraints on the tables. My guess is that
> (id,ttype,start_timestamp) is unique in both trh and ts - but I cannot
> tell (and neither can the query planner). Foreign key constraints might
> help as well. These would also help others to understand your data, and
> suggest reformulations of your queries.


AFAICT, there are no foreign constraints in the original DB design. (and
I'm not even sure how to begin the FK design based on this org design)

the unique_id is as above.
TRH/TRD uniqueid = merged fields from id / index1 / index2 /
start_timestamp(IN EPOCH) / phase_id / ttype

TS uniqueid = merged fields from id / start_timestamp(IN EPOCH) / ttype

Problem with this is that the fields in which they are unique is
different across the different tables, so the unique_id is only unique
for that table alone and acts as a primary key so that no dupes exists
in that one table.



> - Another guess is that the ttype sets (177,197,705,742,758,766),
> (69,178,198,704,757,741,765) are actually indicating some other property
> a common "type" of record, and that only one of each will be present for
> an id,start_timestamp combination. This may be related to the repeatingd
> fields issue - if a certain ttype indicates that we are interested in a
> certain pber_x field (and possibly that the others are empty).


yes..

eg:
id | hid |ttype | start_timestamp | pber_2 | pber 3 |pber_4
PARENT | 0 |764 | 2007-07-01 00:00 | 4000 | null | null
PARENT | 0 |765 | 2007-07-01 00:00 | null | 9000 | null
PARENT | 0 |766 | 2007-07-01 00:00 | null | null | 7999
PARENT | 1 |764 | 2007-07-01 00:00 | 4550 | null | null
PARENT | 1 |765 | 2007-07-01 00:00 | null | 9220 | null
PARENT | 1 |766 | 2007-07-01 00:00 | null | null | 6669


the subqueries are just to take out the fields with the value and leave
the nulls so that we end-up with

id |hid| start_timestamp |pber_2 | pber 3 | pber_4
PARENT | 0 | 2007-07-01 00:00 | 4000 | 9000 | 7999
PARENT | 1 | 2007-07-01 00:00 | 4550 | 9220 | 6669

which is basically just joining a table by itself, but there is a caveat
whereby pber_3 and pber_4 is/can only be joined together based on the
seq_date/seq_time in the ts table hence the query..

JOIN1.id = join2.id
and join1.seq_date = join2.seq_date
etc..

but the problem is confounded by the fact that there is numerous hid
values for head id

> - You have what looks like repeating fields - pber_x, fval_x, index_x -
> in your tables. Fixing this might not improve your query, but might be a
> good idea for other reasons.


it's being looked at by some other team to collapse this to something
like this

ttype | pber
764 | 500
765 | 600
766 | 700

so that there are lesser # of columns and no null fields. But the query
will remain the same

> - seq_date and seq_time seems like they may be redundant - are they
> different casts of the same data?


No. They're used to join together the pber_2/3/4 fields as one may
happen between a few hours to days between each other, but each will be
uniquely identified by the seq_date/time

eg :

id | pber_2 | seq_date | seq time
PARENT | 400 | 2007-07-01 00:00:00 | 1980-01-01 20:00:00
PARENT | 410 | 2007-07-10 00:00:00 | 1980-01-01 22:00:00

id | pber_3 | seq_date | seq time
PARENT | 900 | 2007-07-01 00:00:00 | 1980-01-01 20:00:00
PARENT | 100 | 2007-07-10 00:00:00 | 1980-01-01 22:00:00

id | pber_4 | seq_date | seq time
PARENT | 10000 | 2007-07-01 00:00:00 | 1980-01-01 20:00:00
PARENT | 999 | 2007-07-10 00:00:00 | 1980-01-01 22:00:00


so, the correct value for the fields when joined together will be of the
form

id |start_timestamp |seq_date | seq_time |pber_2 | pber 3 | pber_4
PARENT |2007-07-01 00:00 |2007-07-01 00:00:00 | 1980-01-01 20:00:00| 400 | 900 | 10000
PARENT |2007-07-01 00:00 |2007-07-10 00:00:00 | 1980-01-01 22:00:00| 410 | 100 | 999

(repeating for each hid subset value)



> All speculation. Hope it helps



anything would help.. I'm more or less willing to try anything to make
things faster else this project is going to the toilet.

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:02 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com