This is a discussion on [Again] Postgres performance problem within the Pgsql Performance forums, part of the PostgreSQL category; --> On 9/12/07, Frank Schoep <frank@ffnn.nl> wrote: > On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote: > > ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| On 9/12/07, Frank Schoep <frank@ffnn.nl> wrote: > On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote: > > On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote: > >> … > >> Aren't you mixing up REINDEX and CLUSTER? > > > > … > > Either one does what a vacuum full did / does, but generally does > > it better. > > On topic of REINDEX / VACUUM FULL versus a CLUSTER / VACUUM ANALYZE > I'd like to ask if CLUSTER is safe to run on a table that is in > active use. > > After updating my maintenance scripts from a VACUUM FULL (add me to > the list) to CLUSTER (which improves performance a lot) I noticed I > was getting "could not open relation …" errors in the log while the > scripts ran so I reverted the change. This was on 8.1.9. > > Am I hitting a corner case or is it generally not a good idea to > CLUSTER tables which are being queried? I haven't had problems with > the REINDEX / VACUUM FULL combination while CLUSTER / VACUUM ANALYZE > resulted in errors on the first run. > > Can the "could not open relation…" error bring down the whole > database server? I'm really interested in using CLUSTER regularly as > it speeds up my system by a factor of two because of more efficient I/O. No, it won't bring it down. Basically the query lost the relation is was operating against because it disappeared when the cluster command runs. ---------------------------(end of broadcast)--------------------------- TIP 4: Have you searched our list archives? http://archives.postgresql.org |
| |||
| Scott Marlowe escribió: > > Aren't you mixing up REINDEX and CLUSTER? > > I don't think so. reindex (which runs on tables and indexes, so the > name is a bit confusing, I admit) basically was originally a "repair" > operation that rewrote the whole relation and wasn't completely > transaction safe (way back, 7.2 days or so I think). Due to the > issues with vacuum full bloating indexes, and being slowly replaced by > regular vacuum, reindex received some attention to make it transaction > / crash safe and has kind of take the place of vacuum full in terms of > "how to fix bloated objects". Hmm, REINDEX does not rewrite tables. If there are dead tuples, they will still be there after REINDEX. > cluster, otoh, rewrites the table into index order. .... excluding dead tuples, and then rewrites all the indexes. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Sep 12, 2007, at 2:19 PM, Frank Schoep wrote: > On Sep 12, 2007, at 9:07 PM, Scott Marlowe wrote: >> On 9/12/07, Mikko Partio <mpartio@gmail.com> wrote: >>> … >>> Aren't you mixing up REINDEX and CLUSTER? >> >> … >> Either one does what a vacuum full did / does, but generally does >> it better. > > On topic of REINDEX / VACUUM FULL versus a CLUSTER / VACUUM ANALYZE > I'd like to ask if CLUSTER is safe to run on a table that is in > active use. > > After updating my maintenance scripts from a VACUUM FULL (add me to > the list) to CLUSTER (which improves performance a lot) I noticed I > was getting "could not open relation …" errors in the log while the > scripts ran so I reverted the change. This was on 8.1.9. You'd probably see the same behavior on 8.2.x. CLUSTER is not transactionally safe so you don't want to run CLUSTER on tables that are actively being used. I believe that's been fixed for 8.3. Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster |
| |||
| On Wed, 12 Sep 2007, Scott Marlowe wrote: > I'm getting more and more motivated to rewrite the vacuum docs. I think > a rewrite from the ground up might be best... I keep seeing people > doing vacuum full on this list and I'm thinking it's as much because of > the way the docs represent vacuum full as anything. I agree you shouldn't start thinking in terms of how to fix the existing documentation. I'd suggest instead writing a tutorial leading someone through what they need to know about their tables first and then going into how vacuum works based on that data. As an example, people throw around terms like "index bloat" and "dead tuples" when talking about vacuuming. The tutorial I'd like to see somebody write would start by explaining those terms and showing how to measure them--preferably with a good and bad example to contrast. The way these terms are thrown around right now, I don't expect newcomers to understand either the documentation or the advice people are giving them; I think it's shooting over their heads and what's needed are some walkthroughs. Another example I'd like to see thrown in there is what it looks like when you don't have enough FSM slots. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| On Thu, 2007-09-13 at 01:58 -0400, Greg Smith wrote: > On Wed, 12 Sep 2007, Scott Marlowe wrote: > > > I'm getting more and more motivated to rewrite the vacuum docs. I think > > a rewrite from the ground up might be best... I keep seeing people > > doing vacuum full on this list and I'm thinking it's as much because of > > the way the docs represent vacuum full as anything. > > I agree you shouldn't start thinking in terms of how to fix the existing > documentation. I'd suggest instead writing a tutorial leading someone > through what they need to know about their tables first and then going > into how vacuum works based on that data. I'm new to PG and it's true that I am confused. As it stands this is a newbie's understanding of the various terms. cluster -> rewrites a table according to index order so that IO is ordered/sequential reindex -> basically, rewrites the indexes adding new records/fixes up old deleted records vacuum -> does cleaning vacuum analyse -> clean and update statistics (i run this mostly) autovacuum - does vacuum analyse automatically per default setup or some or cost based parameter vacuum full -> I also do this frequently (test DB only) as a means to retrieve back used spaces due to MVCC. (trying lots of different methods of query/add new index/make concatenated join/unique keys and then deleting them if it's not useful) > > As an example, people throw around terms like "index bloat" and "dead > tuples" when talking about vacuuming. I honestly have only the vaguest idea what these 2 mean. (i only grasped recently that tuples = records/rows) > The tutorial I'd like to see > somebody write would start by explaining those terms and showing how to > measure them--preferably with a good and bad example to contrast. The way > these terms are thrown around right now, I don't expect newcomers to > understand either the documentation or the advice people are giving them; > I think it's shooting over their heads and what's needed are some > walkthroughs. Another example I'd like to see thrown in there is what it > looks like when you don't have enough FSM slots. actually, an additional item I would like is to understand explain analyse. The current docs written by tom only shows explain and not explain analyse and I'm getting confuse as to the rows=xxx vs actual rows=yyy where on some of my queries can be very far apart 1 vs 500x ratio on some problematic query[1]. And googling doesn't give much doc on the explain. (the only other useful doc I've seen is a presentation given from oscon 2003) [1](See my other post) ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On 9/13/07, Greg Smith <gsmith@gregsmith.com> wrote: > On Wed, 12 Sep 2007, Scott Marlowe wrote: > > > I'm getting more and more motivated to rewrite the vacuum docs. I think > > a rewrite from the ground up might be best... I keep seeing people > > doing vacuum full on this list and I'm thinking it's as much because of > > the way the docs represent vacuum full as anything. > > I agree you shouldn't start thinking in terms of how to fix the existing > documentation. I'd suggest instead writing a tutorial leading someone > through what they need to know about their tables first and then going > into how vacuum works based on that data. I think both things are needed actually. The current docs were started back when pg 7.2 roamed the land, and they've been updated a bit at a time. The technical definitions of vacuum, vacuum full, analyze etc all show a bit too much history from back in the day, and are confusing. so, I think that 1: vacuum and analyze should have their own sections. analyze used to be a subcommand of vacuum but it no longer is, but the docs still pretty much tie them together. 2: The definition for vacuum full needs to include a caveat that vacuum full should be considered more of a recovery operation than a way to simply get back some space on your hard drives. Which leads me to thinking that we then need a simple tutorial on vacuuming to include the free space map, vacuum, vacuum analyze, vacuum full, and the autovacuum daemon. We can throw analyze in there somewhere too, I just don't want it to seem like it's still married to vacuum. > As an example, people throw around terms like "index bloat" and "dead > tuples" when talking about vacuuming. The tutorial I'd like to see > somebody write would start by explaining those terms and showing how to > measure them--preferably with a good and bad example to contrast. I agree. I might rearrange it a bit but that's the way I'm looking at it too. > The way > these terms are thrown around right now, I don't expect newcomers to > understand either the documentation or the advice people are giving them; > I think it's shooting over their heads and what's needed are some > walkthroughs. Another example I'd like to see thrown in there is what it > looks like when you don't have enough FSM slots. OK. Got something to start with. I'm thinking I might work on a vacuum tutorial first, then the tech docs... ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| How many backends do you have at any given time? Have you tried using something like pgBouncer to lower backend usage? How about your IO situation? Have you run something like sysstat to see what iowait is at? On 9/11/07, Ruben Rubio <ruben@rentalia.com> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > > Hi, > > I having the same problem I told here a few weeks before. Database is > using too much resources again. > > I do a vacumm full each day, but seems it is not working. I am preparing > an update to postgres 8.2.4 (actually I am using at 8.1.3, and tests for > update will need several days) > > Last time I had this problem i solved it stopping website, restarting > database, vacuumm it, run again website. But I guess this is going to > happen again. > > I would like to detect and solve the problem. Any ideas to detect it? > > Thanks in advance, > > > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.6 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org > > iD8DBQFG5jbLIo1XmbAXRboRArcpAJ0YvoCT6KWv2fafVAtapu 6nwFmKoACcD0uA > zFTx9Wq+2NSxijIf/R8E5f8= > =u0k5 > -----END PGP SIGNATURE----- > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > > > ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Sep 13, 2007, at 12:58 AM, Greg Smith wrote: > On Wed, 12 Sep 2007, Scott Marlowe wrote: > >> I'm getting more and more motivated to rewrite the vacuum docs. I >> think a rewrite from the ground up might be best... I keep seeing >> people doing vacuum full on this list and I'm thinking it's as >> much because of the way the docs represent vacuum full as anything. > > I agree you shouldn't start thinking in terms of how to fix the > existing documentation. I'd suggest instead writing a tutorial > leading someone through what they need to know about their tables > first and then going into how vacuum works based on that data. > > As an example, people throw around terms like "index bloat" and > "dead tuples" when talking about vacuuming. The tutorial I'd like > to see somebody write would start by explaining those terms and > showing how to measure them--preferably with a good and bad example > to contrast. The way these terms are thrown around right now, I > don't expect newcomers to understand either the documentation or > the advice people are giving them; I think it's shooting over their > heads and what's needed are some walkthroughs. Another example I'd > like to see thrown in there is what it looks like when you don't > have enough FSM slots. Isn't that the point of the documentation? I mean, if the existing, official manual has been demonstrated (through countless mailing list help requests) to not sufficiently explain a given topic, shouldn't it be revised? One thing that might help is a hyperlinked glossary so that people reading through the documentation can go straight to the postgres definition of dead tuple, index bloat, etc. Erik Jones Software Developer | Emma® erik@myemma.com 800.595.4401 or 615.292.5888 615.292.0777 (fax) Emma helps organizations everywhere communicate & market in style. Visit us online at http://www.myemma.com ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| On 9/13/07, Erik Jones <erik@myemma.com> wrote: > On Sep 13, 2007, at 12:58 AM, Greg Smith wrote: > > > On Wed, 12 Sep 2007, Scott Marlowe wrote: > > > >> I'm getting more and more motivated to rewrite the vacuum docs. I > >> think a rewrite from the ground up might be best... I keep seeing > >> people doing vacuum full on this list and I'm thinking it's as > >> much because of the way the docs represent vacuum full as anything. > > > > I agree you shouldn't start thinking in terms of how to fix the > > existing documentation. I'd suggest instead writing a tutorial > > leading someone through what they need to know about their tables > > first and then going into how vacuum works based on that data. > > > > As an example, people throw around terms like "index bloat" and > > "dead tuples" when talking about vacuuming. The tutorial I'd like > > to see somebody write would start by explaining those terms and > > showing how to measure them--preferably with a good and bad example > > to contrast. The way these terms are thrown around right now, I > > don't expect newcomers to understand either the documentation or > > the advice people are giving them; I think it's shooting over their > > heads and what's needed are some walkthroughs. Another example I'd > > like to see thrown in there is what it looks like when you don't > > have enough FSM slots. > > Isn't that the point of the documentation? I mean, if the existing, > official manual has been demonstrated (through countless mailing list > help requests) to not sufficiently explain a given topic, shouldn't > it be revised? One thing that might help is a hyperlinked glossary > so that people reading through the documentation can go straight to > the postgres definition of dead tuple, index bloat, etc. Yes and no. The official docs are more of a technical specification. Short, simple and to the point so that if you know mostly what you're doing you don't have to wade through a long tutorial to find the answer. I find MySQL's documentation frustrating as hell because I can never find just the one thing I wanna look for. Because it's all written as a tutorial. I.e. I have to pay the "stupid tax" when I read their docs. What I want to do is two fold. 1: fix the technical docs so they have better explanations of each of the topics, without turning them into huge tutorials. 2: Write a vacuuming tutorial that will be useful should someone be new to postgresql and need to set up their system. I think the tutorial should be broken into at least two sections, a quick start guide and an ongoing maintenance and tuning section. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| ||||
| On Thu, 13 Sep 2007, Scott Marlowe wrote: > I think both things are needed actually. The current docs were > started back when pg 7.2 roamed the land, and they've been updated a > bit at a time... No argument here that ultimately the documentation needs to be updated as well. I was just suggesting what I've been thinking about as the path of least resistance to move in that direction. Updating the documentation is harder to do because of the build process involved. It's easier to write something new that addresses the deficiencies, get that right, and then merge it into the documentation when it's stable. After the main new content is done, then it's easier to sweep back through the existing material and clean things up. > Which leads me to thinking that we then need a simple tutorial on > vacuuming to include the free space map, vacuum, vacuum analyze, > vacuum full, and the autovacuum daemon. Right, that's the sort of thing that's missing right now, and I think that would be more useful to newbies than correcting the documentation that's already there. Also: if you don't have a public working area to assemble this document at, I've set a precedent of sorts that it's OK to put working material like this onto the PG developer's wiki at http://developer.postgresql.org/ as long as your stated intention is ultimately to move it off of there once it's complete. In addition to providing a nice set of tools for working the text (presuming you're comfortable with Wiki syntax) that will get you a pool of reviewers/contributors who can make changes directly rather than you needing to do all the work yourself. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |