Unix Technical Forum

count * performance issue

This is a discussion on count * performance issue within the Pgsql Performance forums, part of the PostgreSQL category; --> count(*) tooks much time... but with the where clause we can make this to use indexing,... what where clause ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 11:45 AM
sathiya psql
 
Posts: n/a
Default count * performance issue

count(*) tooks much time...

but with the where clause we can make this to use indexing,... what where
clause we can use??

Am using postgres 7.4 in Debian OS with 1 GB RAM,

am having a table with nearly 50 lakh records,

it has more than 15 columns, i want to count how many records are there, it
is taking nearly 17 seconds to do that...

i know that to get a approximate count we can use
SELECT reltuples FROM pg_class where relname = TABLENAME;

but this give approximate count, and i require exact count...

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 11:45 AM
sathiya psql
 
Posts: n/a
Default Re: count * performance issue

will you please tell, what is autovacuuming... and wat it ll do... is there
any good article in this....

On Thu, Mar 6, 2008 at 11:56 AM, Shoaib Mir <shoaibmir@gmail.com> wrote:

> On Thu, Mar 6, 2008 at 5:19 PM, sathiya psql <sathiya.psql@gmail.com>
> wrote:
>
> > buy every time i need to put ANALYZE...
> > this takes the same time as count(*) takes, what is the use ??
> >
> >
> >

> Dont you have autovacuuming running in the background which is taking care
> of the analyze as well?
>
> If not then hmm turn it on and doing manual analyze then shouldnt I guess
> take much time!
>
> But yes, I will say if its possible go with the trigger option as that
> might be more helpful and a very fast way to do that.
>
>
> --
> Shoaib Mir
> Fujitsu Australia Software Technology
> shoaibm[@]fast.fujitsu.com.au
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 11:45 AM
Shoaib Mir
 
Posts: n/a
Default Re: count * performance issue

On Thu, Mar 6, 2008 at 5:31 PM, sathiya psql <sathiya.psql@gmail.com> wrote:

> will you please tell, what is autovacuuming... and wat it ll do... is
> there any good article in this....
>
>
>

Read this -->
http://www.postgresql.org/docs/8.3/i...tml#AUTOVACUUM

--
Shoaib Mir
Fujitsu Australia Software Technology
shoaibm[@]fast.fujitsu.com.au

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 11:45 AM
A. Kretschmer
 
Posts: n/a
Default Re: count * performance issue

am Thu, dem 06.03.2008, um 1:26:46 -0500 mailte Mark Mielke folgendes:
>
>
> There aren't a general solution. If you realy need the exact count of
> tuples than you can play with a TRIGGER and increase/decrease the
> tuple-count for this table in an extra table.
>
>
> Of course, this means accepting the cost of obtaining update locks on the count
> table.
>
> The original poster should understand that they can either get a fast estimated
> count, or they can get a slow accurate count (either slow in terms of select
> using count(*) or slow in terms of updates using triggers and locking).
>
> Other systems have their own issues. An index scan may be faster than a table
> scan for databases that can accurately determine counts using only the index,


No. The current index-implementation contains no information about the
row-visibility within the current transaction. You need to scan the
whole data-table to obtain if the current row are visible within the
current transaction.


> but it's still a relatively slow operation, and people don't normally need an
> accurate count for records in the range of 100,000+? :-)


right.


Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_www...ql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 11:45 AM
sathiya psql
 
Posts: n/a
Default Re: count * performance issue

is there any way to explicitly force the postgres to use index scan

On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer <
andreas.kretschmer@schollglas.com> wrote:

> am Thu, dem 06.03.2008, um 1:26:46 -0500 mailte Mark Mielke folgendes:
> >
> >
> > There aren't a general solution. If you realy need the exact

> count of
> > tuples than you can play with a TRIGGER and increase/decrease

> the
> > tuple-count for this table in an extra table.
> >
> >
> > Of course, this means accepting the cost of obtaining update locks on

> the count
> > table.
> >
> > The original poster should understand that they can either get a fast

> estimated
> > count, or they can get a slow accurate count (either slow in terms of

> select
> > using count(*) or slow in terms of updates using triggers and locking).
> >
> > Other systems have their own issues. An index scan may be faster than a

> table
> > scan for databases that can accurately determine counts using only the

> index,
>
> No. The current index-implementation contains no information about the
> row-visibility within the current transaction. You need to scan the
> whole data-table to obtain if the current row are visible within the
> current transaction.
>
>
> > but it's still a relatively slow operation, and people don't normally

> need an
> > accurate count for records in the range of 100,000+? :-)

>
> right.
>
>
> Andreas
> --
> Andreas Kretschmer
> Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
> GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
>
> http://mail.postgresql.org/mj/mj_www...ql-performance
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 11:45 AM
sathiya psql
 
Posts: n/a
Default Fwd: count * performance issue

---------- Forwarded message ----------
From: sathiya psql <sathiya.psql@gmail.com>
Date: Thu, Mar 6, 2008 at 12:17 PM
Subject: Re: [PERFORM] count * performance issue
To: "A. Kretschmer" <andreas.kretschmer@schollglas.com>
Cc: psql-performance@postgresql.org


TRIGGER i can use if i want the count of the whole table, but i require for
some of the rows with WHERE condition....

so how to do that ???


On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer <
andreas.kretschmer@schollglas.com> wrote:

> am Thu, dem 06.03.2008, um 1:26:46 -0500 mailte Mark Mielke folgendes:
> >
> >
> > There aren't a general solution. If you realy need the exact

> count of
> > tuples than you can play with a TRIGGER and increase/decrease

> the
> > tuple-count for this table in an extra table.
> >
> >
> > Of course, this means accepting the cost of obtaining update locks on

> the count
> > table.
> >
> > The original poster should understand that they can either get a fast

> estimated
> > count, or they can get a slow accurate count (either slow in terms of

> select
> > using count(*) or slow in terms of updates using triggers and locking).
> >
> > Other systems have their own issues. An index scan may be faster than a

> table
> > scan for databases that can accurately determine counts using only the

> index,
>
> No. The current index-implementation contains no information about the
> row-visibility within the current transaction. You need to scan the
> whole data-table to obtain if the current row are visible within the
> current transaction.
>
>
> > but it's still a relatively slow operation, and people don't normally

> need an
> > accurate count for records in the range of 100,000+? :-)

>
> right.
>
>
> Andreas
> --
> Andreas Kretschmer
> Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
> GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
>
> http://mail.postgresql.org/mj/mj_www...ql-performance
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 11:45 AM
A. Kretschmer
 
Posts: n/a
Default Re: count * performance issue

am Thu, dem 06.03.2008, um 12:13:17 +0530 mailte sathiya psql folgendes:
> is there any way to explicitly force the postgres to use index scan


Not realy, PG use a cost-based optimizer and use an INDEX if it make
sense.


>
> On Thu, Mar 6, 2008 at 12:06 PM, A. Kretschmer <
> andreas.kretschmer@schollglas.com> wrote:


please, no silly top-posting with the complete quote below.


Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_www...ql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-19-2008, 11:45 AM
A. Kretschmer
 
Posts: n/a
Default Re: count * performance issue

am Thu, dem 06.03.2008, um 12:17:55 +0530 mailte sathiya psql folgendes:
> TRIGGER i can use if i want the count of the whole table, but i require for
> some of the rows with WHERE condition....
>
> so how to do that ???


Okay, in this case a TRIGGER are a bad idea. You can use an INDEX on
this row. Can you show us the output for a EXPLAIN ANALYSE SELECT
count(*) from <your_table> WHERE <your_row> = ... ?


Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_www...ql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-19-2008, 11:45 AM
sathiya psql
 
Posts: n/a
Default Re: count * performance issue

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=205756.95..205756.95 rows=1 width=0) (actual time=
114675.042..114675.042 rows=1 loops=1)
-> Seq Scan on call_log (cost=0.00..193224.16 rows=5013112 width=0)
(actual time=11.754..91429.594 rows=5061619 loops=1)
Filter: (call_id > 0)
Total runtime: 114699.797 ms
(4 rows)


it is now taking 114 seconds, i think because of load in my system.... any
way will you explain., what is this COST, actual time and other stuffs....

On Thu, Mar 6, 2008 at 12:27 PM, A. Kretschmer <
andreas.kretschmer@schollglas.com> wrote:

> am Thu, dem 06.03.2008, um 12:17:55 +0530 mailte sathiya psql folgendes:
> > TRIGGER i can use if i want the count of the whole table, but i require

> for
> > some of the rows with WHERE condition....
> >
> > so how to do that ???

>
> Okay, in this case a TRIGGER are a bad idea. You can use an INDEX on
> this row. Can you show us the output for a EXPLAIN ANALYSE SELECT
> count(*) from <your_table> WHERE <your_row> = ... ?
>
>
> Andreas
> --
> Andreas Kretschmer
> Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
> GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
>
> http://mail.postgresql.org/mj/mj_www...ql-performance
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-19-2008, 11:45 AM
A. Kretschmer
 
Posts: n/a
Default Re: count * performance issue

am Thu, dem 06.03.2008, um 12:36:48 +0530 mailte sathiya psql folgendes:
>
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=205756.95..205756.95 rows=1 width=0) (actual time=
> 114675.042..114675.042 rows=1 loops=1)
> -> Seq Scan on call_log (cost=0.00..193224.16 rows=5013112 width=0)
> (actual time=11.754..91429.594 rows=5061619 loops=1)
> Filter: (call_id > 0)
> Total runtime: 114699.797 ms
> (4 rows)


'call_id > 0' are your where-condition? An INDEX can't help, all rows
with call_id > 0 are in the result, and i guess, that's all records in
the table.


>
>
> it is now taking 114 seconds, i think because of load in my system.... any way
> will you explain., what is this COST, actual time and other stuffs....



08:16 < akretschmer> ??explain
08:16 < rtfm_please> For information about explain
08:16 < rtfm_please> see http://explain-analyze.info
08:16 < rtfm_please> or http://www.depesz.com/index.php/2007...plain-analyze/
08:16 < rtfm_please> or http://www.postgresql.org/docs/curre...l-explain.html

and

http://redivi.com/~bob/oscon2005_pgs...ain_Public.pdf


Read this to learn more about explain.


Andreas
--
Andreas Kretschmer
Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://mail.postgresql.org/mj/mj_www...ql-performance

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 07:12 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com