This is a discussion on count * performance issue within the Pgsql Performance forums, part of the PostgreSQL category; --> count(*) tooks much time... but with the where clause we can make this to use indexing,... what where clause ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| count(*) tooks much time... but with the where clause we can make this to use indexing,... what where clause we can use?? Am using postgres 7.4 in Debian OS with 1 GB RAM, am having a table with nearly 50 lakh records, it has more than 15 columns, i want to count how many records are there, it is taking nearly 17 seconds to do that... i know that to get a approximate count we can use SELECT reltuples FROM pg_class where relname = TABLENAME; but this give approximate count, and i require exact count... |
| |||
| will you please tell, what is autovacuuming... and wat it ll do... is there any good article in this.... On Thu, Mar 6, 2008 at 11:56 AM, Shoaib Mir <shoaibmir@gmail.com> wrote: > On Thu, Mar 6, 2008 at 5:19 PM, sathiya psql <sathiya.psql@gmail.com> > wrote: > > > buy every time i need to put ANALYZE... > > this takes the same time as count(*) takes, what is the use ?? > > > > > > > Dont you have autovacuuming running in the background which is taking care > of the analyze as well? > > If not then hmm turn it on and doing manual analyze then shouldnt I guess > take much time! > > But yes, I will say if its possible go with the trigger option as that > might be more helpful and a very fast way to do that. > > > -- > Shoaib Mir > Fujitsu Australia Software Technology > shoaibm[@]fast.fujitsu.com.au > |
| |||
| On Thu, Mar 6, 2008 at 5:31 PM, sathiya psql <sathiya.psql@gmail.com> wrote: > will you please tell, what is autovacuuming... and wat it ll do... is > there any good article in this.... > > > Read this --> http://www.postgresql.org/docs/8.3/i...tml#AUTOVACUUM -- Shoaib Mir Fujitsu Australia Software Technology shoaibm[@]fast.fujitsu.com.au |
| |||
| am Thu, dem 06.03.2008, um 1:26:46 -0500 mailte Mark Mielke folgendes: > > > There aren't a general solution. If you realy need the exact count of > tuples than you can play with a TRIGGER and increase/decrease the > tuple-count for this table in an extra table. > > > Of course, this means accepting the cost of obtaining update locks on the count > table. > > The original poster should understand that they can either get a fast estimated > count, or they can get a slow accurate count (either slow in terms of select > using count(*) or slow in terms of updates using triggers and locking). > > Other systems have their own issues. An index scan may be faster than a table > scan for databases that can accurately determine counts using only the index, No. The current index-implementation contains no information about the row-visibility within the current transaction. You need to scan the whole data-table to obtain if the current row are visible within the current transaction. > but it's still a relatively slow operation, and people don't normally need an > accurate count for records in the range of 100,000+? :-) right. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://mail.postgresql.org/mj/mj_www...ql-performance |
| |||
| is there any way to explicitly force the postgres to use index scan On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer < andreas.kretschmer@schollglas.com> wrote: > am Thu, dem 06.03.2008, um 1:26:46 -0500 mailte Mark Mielke folgendes: > > > > > > There aren't a general solution. If you realy need the exact > count of > > tuples than you can play with a TRIGGER and increase/decrease > the > > tuple-count for this table in an extra table. > > > > > > Of course, this means accepting the cost of obtaining update locks on > the count > > table. > > > > The original poster should understand that they can either get a fast > estimated > > count, or they can get a slow accurate count (either slow in terms of > select > > using count(*) or slow in terms of updates using triggers and locking). > > > > Other systems have their own issues. An index scan may be faster than a > table > > scan for databases that can accurately determine counts using only the > index, > > No. The current index-implementation contains no information about the > row-visibility within the current transaction. You need to scan the > whole data-table to obtain if the current row are visible within the > current transaction. > > > > but it's still a relatively slow operation, and people don't normally > need an > > accurate count for records in the range of 100,000+? :-) > > right. > > > Andreas > -- > Andreas Kretschmer > Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) > GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > > http://mail.postgresql.org/mj/mj_www...ql-performance > |
| |||
| ---------- Forwarded message ---------- From: sathiya psql <sathiya.psql@gmail.com> Date: Thu, Mar 6, 2008 at 12:17 PM Subject: Re: [PERFORM] count * performance issue To: "A. Kretschmer" <andreas.kretschmer@schollglas.com> Cc: psql-performance@postgresql.org TRIGGER i can use if i want the count of the whole table, but i require for some of the rows with WHERE condition.... so how to do that ??? On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer < andreas.kretschmer@schollglas.com> wrote: > am Thu, dem 06.03.2008, um 1:26:46 -0500 mailte Mark Mielke folgendes: > > > > > > There aren't a general solution. If you realy need the exact > count of > > tuples than you can play with a TRIGGER and increase/decrease > the > > tuple-count for this table in an extra table. > > > > > > Of course, this means accepting the cost of obtaining update locks on > the count > > table. > > > > The original poster should understand that they can either get a fast > estimated > > count, or they can get a slow accurate count (either slow in terms of > select > > using count(*) or slow in terms of updates using triggers and locking). > > > > Other systems have their own issues. An index scan may be faster than a > table > > scan for databases that can accurately determine counts using only the > index, > > No. The current index-implementation contains no information about the > row-visibility within the current transaction. You need to scan the > whole data-table to obtain if the current row are visible within the > current transaction. > > > > but it's still a relatively slow operation, and people don't normally > need an > > accurate count for records in the range of 100,000+? :-) > > right. > > > Andreas > -- > Andreas Kretschmer > Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) > GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > > http://mail.postgresql.org/mj/mj_www...ql-performance > |
| |||
| am Thu, dem 06.03.2008, um 12:13:17 +0530 mailte sathiya psql folgendes: > is there any way to explicitly force the postgres to use index scan Not realy, PG use a cost-based optimizer and use an INDEX if it make sense. > > On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer < > andreas.kretschmer@schollglas.com> wrote: please, no silly top-posting with the complete quote below. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://mail.postgresql.org/mj/mj_www...ql-performance |
| |||
| am Thu, dem 06.03.2008, um 12:17:55 +0530 mailte sathiya psql folgendes: > TRIGGER i can use if i want the count of the whole table, but i require for > some of the rows with WHERE condition.... > > so how to do that ??? Okay, in this case a TRIGGER are a bad idea. You can use an INDEX on this row. Can you show us the output for a EXPLAIN ANALYSE SELECT count(*) from <your_table> WHERE <your_row> = ... ? Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://mail.postgresql.org/mj/mj_www...ql-performance |
| |||
| QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=205756.95..205756.95 rows=1 width=0) (actual time= 114675.042..114675.042 rows=1 loops=1) -> Seq Scan on call_log (cost=0.00..193224.16 rows=5013112 width=0) (actual time=11.754..91429.594 rows=5061619 loops=1) Filter: (call_id > 0) Total runtime: 114699.797 ms (4 rows) it is now taking 114 seconds, i think because of load in my system.... any way will you explain., what is this COST, actual time and other stuffs.... On Thu, Mar 6, 2008 at 12:27 PM, A. Kretschmer < andreas.kretschmer@schollglas.com> wrote: > am Thu, dem 06.03.2008, um 12:17:55 +0530 mailte sathiya psql folgendes: > > TRIGGER i can use if i want the count of the whole table, but i require > for > > some of the rows with WHERE condition.... > > > > so how to do that ??? > > Okay, in this case a TRIGGER are a bad idea. You can use an INDEX on > this row. Can you show us the output for a EXPLAIN ANALYSE SELECT > count(*) from <your_table> WHERE <your_row> = ... ? > > > Andreas > -- > Andreas Kretschmer > Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) > GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > > http://mail.postgresql.org/mj/mj_www...ql-performance > |
| ||||
| am Thu, dem 06.03.2008, um 12:36:48 +0530 mailte sathiya psql folgendes: > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------ > Aggregate (cost=205756.95..205756.95 rows=1 width=0) (actual time= > 114675.042..114675.042 rows=1 loops=1) > -> Seq Scan on call_log (cost=0.00..193224.16 rows=5013112 width=0) > (actual time=11.754..91429.594 rows=5061619 loops=1) > Filter: (call_id > 0) > Total runtime: 114699.797 ms > (4 rows) 'call_id > 0' are your where-condition? An INDEX can't help, all rows with call_id > 0 are in the result, and i guess, that's all records in the table. > > > it is now taking 114 seconds, i think because of load in my system.... any way > will you explain., what is this COST, actual time and other stuffs.... 08:16 < akretschmer> ??explain 08:16 < rtfm_please> For information about explain 08:16 < rtfm_please> see http://explain-analyze.info 08:16 < rtfm_please> or http://www.depesz.com/index.php/2007...plain-analyze/ 08:16 < rtfm_please> or http://www.postgresql.org/docs/curre...l-explain.html and http://redivi.com/~bob/oscon2005_pgs...ain_Public.pdf Read this to learn more about explain. Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://mail.postgresql.org/mj/mj_www...ql-performance |
| Thread Tools | |
| Display Modes | |
|
|