Unix Technical Forum

[Again] Postgres performance problem

This is a discussion on [Again] Postgres performance problem within the Pgsql Performance forums, part of the PostgreSQL category; --> On 9/12/07, Frank Schoep <frank@ffnn.nl> wrote: > On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote: > > ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read

Reply

 

LinkBack Thread Tools Display Modes
  #11 (permalink)  
Old 04-19-2008, 11:32 AM
Scott Marlowe
 
Posts: n/a
Default Re: [Again] Postgres performance problem

On 9/12/07, Frank Schoep <frank@ffnn.nl> wrote:
> On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote:
> > On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote:
> >> …
> >> Aren't you mixing up REINDEX and CLUSTER?

> >
> > …
> > Either one does what a vacuum full did / does, but generally does
> > it better.

>
> On topic of REINDEX / VACUUM FULL versus a CLUSTER / VACUUM ANALYZE
> I'd like to ask if CLUSTER is safe to run on a table that is in
> active use.
>
> After updating my maintenance scripts from a VACUUM FULL (add me to
> the list) to CLUSTER (which improves performance a lot) I noticed I
> was getting "could not open relation …" errors in the log while the
> scripts ran so I reverted the change. This was on 8.1.9.
>
> Am I hitting a corner case or is it generally not a good idea to
> CLUSTER tables which are being queried? I haven't had problems with
> the REINDEX / VACUUM FULL combination while CLUSTER / VACUUM ANALYZE
> resulted in errors on the first run.
>
> Can the "could not open relation…" error bring down the whole
> database server? I'm really interested in using CLUSTER regularly as
> it speeds up my system by a factor of two because of more efficient I/O.


No, it won't bring it down. Basically the query lost the relation is
was operating against because it disappeared when the cluster command
runs.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #12 (permalink)  
Old 04-19-2008, 11:32 AM
Alvaro Herrera
 
Posts: n/a
Default Re: [Again] Postgres performance problem

Scott Marlowe escribió:

> > Aren't you mixing up REINDEX and CLUSTER?

>
> I don't think so. reindex (which runs on tables and indexes, so the
> name is a bit confusing, I admit) basically was originally a "repair"
> operation that rewrote the whole relation and wasn't completely
> transaction safe (way back, 7.2 days or so I think). Due to the
> issues with vacuum full bloating indexes, and being slowly replaced by
> regular vacuum, reindex received some attention to make it transaction
> / crash safe and has kind of take the place of vacuum full in terms of
> "how to fix bloated objects".


Hmm, REINDEX does not rewrite tables. If there are dead tuples, they
will still be there after REINDEX.


> cluster, otoh, rewrites the table into index order.


.... excluding dead tuples, and then rewrites all the indexes.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #13 (permalink)  
Old 04-19-2008, 11:33 AM
Erik Jones
 
Posts: n/a
Default Re: [Again] Postgres performance problem


On Sep 12, 2007, at 2:19 PM, Frank Schoep wrote:

> On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote:
>> On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote:
>>> …
>>> Aren't you mixing up REINDEX and CLUSTER?

>>
>> …
>> Either one does what a vacuum full did / does, but generally does
>> it better.

>
> On topic of REINDEX / VACUUM FULL versus a CLUSTER / VACUUM ANALYZE
> I'd like to ask if CLUSTER is safe to run on a table that is in
> active use.
>
> After updating my maintenance scripts from a VACUUM FULL (add me to
> the list) to CLUSTER (which improves performance a lot) I noticed I
> was getting "could not open relation …" errors in the log while the
> scripts ran so I reverted the change. This was on 8.1.9.


You'd probably see the same behavior on 8.2.x. CLUSTER is not
transactionally safe so you don't want to run CLUSTER on tables that
are actively being used. I believe that's been fixed for 8.3.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #14 (permalink)  
Old 04-19-2008, 11:33 AM
Greg Smith
 
Posts: n/a
Default Re: [Again] Postgres performance problem

On Wed, 12 Sep 2007, Scott Marlowe wrote:

> I'm getting more and more motivated to rewrite the vacuum docs. I think
> a rewrite from the ground up might be best... I keep seeing people
> doing vacuum full on this list and I'm thinking it's as much because of
> the way the docs represent vacuum full as anything.


I agree you shouldn't start thinking in terms of how to fix the existing
documentation. I'd suggest instead writing a tutorial leading someone
through what they need to know about their tables first and then going
into how vacuum works based on that data.

As an example, people throw around terms like "index bloat" and "dead
tuples" when talking about vacuuming. The tutorial I'd like to see
somebody write would start by explaining those terms and showing how to
measure them--preferably with a good and bad example to contrast. The way
these terms are thrown around right now, I don't expect newcomers to
understand either the documentation or the advice people are giving them;
I think it's shooting over their heads and what's needed are some
walkthroughs. Another example I'd like to see thrown in there is what it
looks like when you don't have enough FSM slots.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #15 (permalink)  
Old 04-19-2008, 11:33 AM
El-Lotso
 
Posts: n/a
Default Re: [Again] Postgres performance problem

On Thu, 2007-09-13 at 01:58 -0400, Greg Smith wrote:
> On Wed, 12 Sep 2007, Scott Marlowe wrote:
>
> > I'm getting more and more motivated to rewrite the vacuum docs. I think
> > a rewrite from the ground up might be best... I keep seeing people
> > doing vacuum full on this list and I'm thinking it's as much because of
> > the way the docs represent vacuum full as anything.

>
> I agree you shouldn't start thinking in terms of how to fix the existing
> documentation. I'd suggest instead writing a tutorial leading someone
> through what they need to know about their tables first and then going
> into how vacuum works based on that data.


I'm new to PG and it's true that I am confused.
As it stands this is a newbie's understanding of the various terms.

cluster -> rewrites a table according to index order so that IO is
ordered/sequential
reindex -> basically, rewrites the indexes adding new records/fixes up
old deleted records
vacuum -> does cleaning
vacuum analyse -> clean and update statistics (i run this mostly)
autovacuum - does vacuum analyse automatically per default setup or some
or cost based parameter

vacuum full -> I also do this frequently (test DB only) as a means to
retrieve back used spaces due to MVCC. (trying lots of different methods
of query/add new index/make concatenated join/unique keys and then
deleting them if it's not useful)


>
> As an example, people throw around terms like "index bloat" and "dead
> tuples" when talking about vacuuming.


I honestly have only the vaguest idea what these 2 mean. (i only grasped
recently that tuples = records/rows)

> The tutorial I'd like to see
> somebody write would start by explaining those terms and showing how to
> measure them--preferably with a good and bad example to contrast. The way
> these terms are thrown around right now, I don't expect newcomers to
> understand either the documentation or the advice people are giving them;
> I think it's shooting over their heads and what's needed are some
> walkthroughs. Another example I'd like to see thrown in there is what it
> looks like when you don't have enough FSM slots.



actually, an additional item I would like is to understand explain
analyse. The current docs written by tom only shows explain and not
explain analyse and I'm getting confuse as to the rows=xxx vs actual
rows=yyy where on some of my queries can be very far apart 1 vs 500x
ratio on some problematic query[1]. And googling doesn't give much doc
on the explain. (the only other useful doc I've seen is a presentation
given from oscon 2003)

[1](See my other post)


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #16 (permalink)  
Old 04-19-2008, 11:33 AM
Scott Marlowe
 
Posts: n/a
Default Re: [Again] Postgres performance problem

On 9/13/07, Greg Smith <gsmith@gregsmith.com> wrote:
> On Wed, 12 Sep 2007, Scott Marlowe wrote:
>
> > I'm getting more and more motivated to rewrite the vacuum docs. I think
> > a rewrite from the ground up might be best... I keep seeing people
> > doing vacuum full on this list and I'm thinking it's as much because of
> > the way the docs represent vacuum full as anything.

>
> I agree you shouldn't start thinking in terms of how to fix the existing
> documentation. I'd suggest instead writing a tutorial leading someone
> through what they need to know about their tables first and then going
> into how vacuum works based on that data.


I think both things are needed actually. The current docs were
started back when pg 7.2 roamed the land, and they've been updated a
bit at a time. The technical definitions of vacuum, vacuum full,
analyze etc all show a bit too much history from back in the day, and
are confusing. so, I think that 1: vacuum and analyze should have
their own sections. analyze used to be a subcommand of vacuum but it
no longer is, but the docs still pretty much tie them together. 2:
The definition for vacuum full needs to include a caveat that vacuum
full should be considered more of a recovery operation than a way to
simply get back some space on your hard drives.

Which leads me to thinking that we then need a simple tutorial on
vacuuming to include the free space map, vacuum, vacuum analyze,
vacuum full, and the autovacuum daemon. We can throw analyze in there
somewhere too, I just don't want it to seem like it's still married to
vacuum.

> As an example, people throw around terms like "index bloat" and "dead
> tuples" when talking about vacuuming. The tutorial I'd like to see
> somebody write would start by explaining those terms and showing how to
> measure them--preferably with a good and bad example to contrast.


I agree. I might rearrange it a bit but that's the way I'm looking at it too.

> The way
> these terms are thrown around right now, I don't expect newcomers to
> understand either the documentation or the advice people are giving them;
> I think it's shooting over their heads and what's needed are some
> walkthroughs. Another example I'd like to see thrown in there is what it
> looks like when you don't have enough FSM slots.


OK. Got something to start with. I'm thinking I might work on a
vacuum tutorial first, then the tech docs...

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #17 (permalink)  
Old 04-19-2008, 11:33 AM
Gavin M. Roy
 
Posts: n/a
Default Re: [Again] Postgres performance problem

How many backends do you have at any given time? Have you tried using
something like pgBouncer to lower backend usage? How about your IO
situation? Have you run something like sysstat to see what iowait is
at?

On 9/11/07, Ruben Rubio <ruben@rentalia.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> Hi,
>
> I having the same problem I told here a few weeks before. Database is
> using too much resources again.
>
> I do a vacumm full each day, but seems it is not working. I am preparing
> an update to postgres 8.2.4 (actually I am using at 8.1.3, and tests for
> update will need several days)
>
> Last time I had this problem i solved it stopping website, restarting
> database, vacuumm it, run again website. But I guess this is going to
> happen again.
>
> I would like to detect and solve the problem. Any ideas to detect it?
>
> Thanks in advance,
>
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFG5jbLIo1XmbAXRboRArcpAJ0YvoCT6KWv2fafVAtapu 6nwFmKoACcD0uA
> zFTx9Wq+2NSxijIf/R8E5f8=
> =u0k5
> -----END PGP SIGNATURE-----
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>
>
>


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #18 (permalink)  
Old 04-19-2008, 11:33 AM
Erik Jones
 
Posts: n/a
Default Re: [Again] Postgres performance problem

On Sep 13, 2007, at 12:58 AM, Greg Smith wrote:

> On Wed, 12 Sep 2007, Scott Marlowe wrote:
>
>> I'm getting more and more motivated to rewrite the vacuum docs. I
>> think a rewrite from the ground up might be best... I keep seeing
>> people doing vacuum full on this list and I'm thinking it's as
>> much because of the way the docs represent vacuum full as anything.

>
> I agree you shouldn't start thinking in terms of how to fix the
> existing documentation. I'd suggest instead writing a tutorial
> leading someone through what they need to know about their tables
> first and then going into how vacuum works based on that data.
>
> As an example, people throw around terms like "index bloat" and
> "dead tuples" when talking about vacuuming. The tutorial I'd like
> to see somebody write would start by explaining those terms and
> showing how to measure them--preferably with a good and bad example
> to contrast. The way these terms are thrown around right now, I
> don't expect newcomers to understand either the documentation or
> the advice people are giving them; I think it's shooting over their
> heads and what's needed are some walkthroughs. Another example I'd
> like to see thrown in there is what it looks like when you don't
> have enough FSM slots.


Isn't that the point of the documentation? I mean, if the existing,
official manual has been demonstrated (through countless mailing list
help requests) to not sufficiently explain a given topic, shouldn't
it be revised? One thing that might help is a hyperlinked glossary
so that people reading through the documentation can go straight to
the postgres definition of dead tuple, index bloat, etc.


Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #19 (permalink)  
Old 04-19-2008, 11:33 AM
Scott Marlowe
 
Posts: n/a
Default Re: [Again] Postgres performance problem

On 9/13/07, Erik Jones <erik@myemma.com> wrote:
> On Sep 13, 2007, at 12:58 AM, Greg Smith wrote:
>
> > On Wed, 12 Sep 2007, Scott Marlowe wrote:
> >
> >> I'm getting more and more motivated to rewrite the vacuum docs. I
> >> think a rewrite from the ground up might be best... I keep seeing
> >> people doing vacuum full on this list and I'm thinking it's as
> >> much because of the way the docs represent vacuum full as anything.

> >
> > I agree you shouldn't start thinking in terms of how to fix the
> > existing documentation. I'd suggest instead writing a tutorial
> > leading someone through what they need to know about their tables
> > first and then going into how vacuum works based on that data.
> >
> > As an example, people throw around terms like "index bloat" and
> > "dead tuples" when talking about vacuuming. The tutorial I'd like
> > to see somebody write would start by explaining those terms and
> > showing how to measure them--preferably with a good and bad example
> > to contrast. The way these terms are thrown around right now, I
> > don't expect newcomers to understand either the documentation or
> > the advice people are giving them; I think it's shooting over their
> > heads and what's needed are some walkthroughs. Another example I'd
> > like to see thrown in there is what it looks like when you don't
> > have enough FSM slots.

>
> Isn't that the point of the documentation? I mean, if the existing,
> official manual has been demonstrated (through countless mailing list
> help requests) to not sufficiently explain a given topic, shouldn't
> it be revised? One thing that might help is a hyperlinked glossary
> so that people reading through the documentation can go straight to
> the postgres definition of dead tuple, index bloat, etc.


Yes and no. The official docs are more of a technical specification.
Short, simple and to the point so that if you know mostly what you're
doing you don't have to wade through a long tutorial to find the
answer. I find MySQL's documentation frustrating as hell because I
can never find just the one thing I wanna look for. Because it's all
written as a tutorial. I.e. I have to pay the "stupid tax" when I
read their docs.

What I want to do is two fold. 1: fix the technical docs so they have
better explanations of each of the topics, without turning them into
huge tutorials. 2: Write a vacuuming tutorial that will be useful
should someone be new to postgresql and need to set up their system.
I think the tutorial should be broken into at least two sections, a
quick start guide and an ongoing maintenance and tuning section.

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #20 (permalink)  
Old 04-19-2008, 11:33 AM
Greg Smith
 
Posts: n/a
Default Re: [Again] Postgres performance problem

On Thu, 13 Sep 2007, Scott Marlowe wrote:

> I think both things are needed actually. The current docs were
> started back when pg 7.2 roamed the land, and they've been updated a
> bit at a time...


No argument here that ultimately the documentation needs to be updated as
well. I was just suggesting what I've been thinking about as the path of
least resistance to move in that direction. Updating the documentation is
harder to do because of the build process involved. It's easier to write
something new that addresses the deficiencies, get that right, and then
merge it into the documentation when it's stable. After the main new
content is done, then it's easier to sweep back through the existing
material and clean things up.

> Which leads me to thinking that we then need a simple tutorial on
> vacuuming to include the free space map, vacuum, vacuum analyze,
> vacuum full, and the autovacuum daemon.


Right, that's the sort of thing that's missing right now, and I think that
would be more useful to newbies than correcting the documentation that's
already there.

Also: if you don't have a public working area to assemble this document
at, I've set a precedent of sorts that it's OK to put working material
like this onto the PG developer's wiki at http://developer.postgresql.org/
as long as your stated intention is ultimately to move it off of there
once it's complete. In addition to providing a nice set of tools for
working the text (presuming you're comfortable with Wiki syntax) that will
get you a pool of reviewers/contributors who can make changes directly
rather than you needing to do all the work yourself.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:47 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com