This is a discussion on Feature freeze date for 8.1 within the pgsql Hackers forums, part of the PostgreSQL category; --> "Matthew T. O'Connor" <matthew@zeut.net> writes: > What to people think about having an optional "maintenance window" so > that ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| "Matthew T. O'Connor" <matthew@zeut.net> writes: > What to people think about having an optional "maintenance window" so > that autovac only takes action during an approved time. But perhaps > just using the vacuum delay settings will be enough. I'm not sure autovac should go completely catatonic during the day; what if someone does an unusual mass deletion, or something? But it does seem pretty reasonable to have a notion of a maintenance window where it should be more active than it is at other times. Maybe what you want is two complete sets of autovac parameters. Definitely at least two sets of the vacuum-delay values. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| On Fri, 29 Apr 2005, Tom Lane wrote: > "Matthew T. O'Connor" <matthew@zeut.net> writes: >> What to people think about having an optional "maintenance window" so >> that autovac only takes action during an approved time. But perhaps >> just using the vacuum delay settings will be enough. > > I'm not sure autovac should go completely catatonic during the day; > what if someone does an unusual mass deletion, or something? But > it does seem pretty reasonable to have a notion of a maintenance > window where it should be more active than it is at other times. > > Maybe what you want is two complete sets of autovac parameters. > Definitely at least two sets of the vacuum-delay values. With the introduction of the stats collector, is there not some way of extending it so that autovac has more information to work off of? For instance, in my environment, we have clients in every timezone hitting the database ... our Japanese clients will be busy at a totally different time of day then our East Coast/NA clients, so a 'maintenance window' is near impossible to state ... I know one person was talking about being able to target only those that pages that have changes, instead of the whole table ... but some sort of "load monitoring" that checks # of active connections and tries to find 'lulls'? Basically, everything right now is being keyed to updates to the tables themselves, but isn't looking at what the system itself is doing ... if I'm doing a massive import of data into a table, the last time I want is a VACUUM to cut in and slow down the loading ... ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664 ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| |||
| Martha Stewart called it a Good Thing when scrappy@postgresql.org ("Marc G. Fournier") wrote: > I know one person was talking about being able to target only those > that pages that have changes, instead of the whole table ... but some > sort of "load monitoring" that checks # of active connections and > tries to find 'lulls'? I have some "log table purging" processes I'd like to put in place; it would be really slick to be able to get some statistics from the system as to how busy the DB has been in the last little while. The nice, adaptive algorithm: - Loop forever - Once a minute, evaluate how busy things seem, giving some metric X -> If X is "high" then purge 10 elderly tuples from table log_table -> If X is "moderate" then purge 100 elderly tuples from table log_table -> If X is "low" then purge 1000 elderly tuples from table log_table The trouble is in measuring some form of "X." Some reasonable approximations might include: - How much disk I/O was recorded in the last 60 seconds? - How many application transactions (e.g. - invoices or such) were issued in the last 60 seconds (monitoring a sequence could be good enough). -- output = reverse("gro.mca" "@" "enworbbc") http://linuxfinances.info/info/slony.html ?OM ERROR |
| |||
| I think what you're suggesting is that vacuum settings (most likely delay) take into consideration the load on the database, which I think is a great idea. One possibility is if vacuum tracks how many blocks it's read/written, it can see how many blocks the database has done overall; subtract the two and you know how much other disk IO is going on in the system. You can then use that number to decide how long you'll sleep before the next vacuum cycle. On Fri, Apr 29, 2005 at 01:34:56PM -0300, Marc G. Fournier wrote: > On Fri, 29 Apr 2005, Tom Lane wrote: > > >"Matthew T. O'Connor" <matthew@zeut.net> writes: > >>What to people think about having an optional "maintenance window" so > >>that autovac only takes action during an approved time. But perhaps > >>just using the vacuum delay settings will be enough. > > > >I'm not sure autovac should go completely catatonic during the day; > >what if someone does an unusual mass deletion, or something? But > >it does seem pretty reasonable to have a notion of a maintenance > >window where it should be more active than it is at other times. > > > >Maybe what you want is two complete sets of autovac parameters. > >Definitely at least two sets of the vacuum-delay values. > > With the introduction of the stats collector, is there not some way of > extending it so that autovac has more information to work off of? For > instance, in my environment, we have clients in every timezone hitting the > database ... our Japanese clients will be busy at a totally different time > of day then our East Coast/NA clients, so a 'maintenance window' is near > impossible to state ... > > I know one person was talking about being able to target only those that > pages that have changes, instead of the whole table ... but some sort of > "load monitoring" that checks # of active connections and tries to find > 'lulls'? > > Basically, everything right now is being keyed to updates to the tables > themselves, but isn't looking at what the system itself is doing ... if > I'm doing a massive import of data into a table, the last time I want is a > VACUUM to cut in and slow down the loading ... > > ---- > Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) > Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664 > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?" ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| On Fri, 29 Apr 2005, Christopher Browne wrote: > Martha Stewart called it a Good Thing when scrappy@postgresql.org ("Marc G. Fournier") wrote: >> I know one person was talking about being able to target only those >> that pages that have changes, instead of the whole table ... but some >> sort of "load monitoring" that checks # of active connections and >> tries to find 'lulls'? > > I have some "log table purging" processes I'd like to put in place; it > would be really slick to be able to get some statistics from the > system as to how busy the DB has been in the last little while. > > The nice, adaptive algorithm: > > - Loop forever > > - Once a minute, evaluate how busy things seem, giving some metric X > > -> If X is "high" then purge 10 elderly tuples from table log_table > -> If X is "moderate" then purge 100 elderly tuples from table > log_table > -> If X is "low" then purge 1000 elderly tuples from table > log_table > > The trouble is in measuring some form of "X." > > Some reasonable approximations might include: > - How much disk I/O was recorded in the last 60 seconds? > - How many application transactions (e.g. - invoices or such) were > issued in the last 60 seconds (monitoring a sequence could be > good enough). Some way of doing a 'partial vacuum' would be nice ... where a VACUUM could stop after it processed those '10 elderly tuples' and on the next pass, resume from that point instead of starting from the beginning again .... ---- Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664 ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Marc G. Fournier wrote: > On Fri, 29 Apr 2005, Christopher Browne wrote: > >> Some reasonable approximations might include: >> - How much disk I/O was recorded in the last 60 seconds? >> - How many application transactions (e.g. - invoices or such) were >> issued in the last 60 seconds (monitoring a sequence could be >> good enough). > > > Some way of doing a 'partial vacuum' would be nice ... where a VACUUM > could stop after it processed those '10 elderly tuples' and on the > next pass, resume from that point instead of starting from the > beginning again ... That is sorta what the vacuum delay settings accomplish. ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) |
| |||
| The world rejoiced as matthew@zeut.net ("Matthew T. O'Connor") wrote: > Marc G. Fournier wrote: > >> On Fri, 29 Apr 2005, Christopher Browne wrote: >> >>> Some reasonable approximations might include: >>> - How much disk I/O was recorded in the last 60 seconds? >>> - How many application transactions (e.g. - invoices or such) were >>> issued in the last 60 seconds (monitoring a sequence could be >>> good enough). >> >> >> Some way of doing a 'partial vacuum' would be nice ... where a >> VACUUM could stop after it processed those '10 elderly tuples' and >> on the next pass, resume from that point instead of starting from >> the beginning again ... > > That is sorta what the vacuum delay settings accomplish. What they do is orthogonal to that. "Vacuum delay" prevents vacuum I/O from taking over the I/O bus. Unfortunately, if you have a table with a very large number of _live_ tuples, there is no way to skip over those and only concentrate on the dead ones. In that scenario "vacuum delay" leads to the vacuum on the table running for a Very, Very Long Time, because it sits there delaying a lot as it walks thru pages it never modifies. The one good news is that, for any pages where no tuples are touched, the indices are also left untouched. -- wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','gmail.com'). http://linuxdatabases.info/info/slony.html "The Board views the endemic use of PowerPoint briefing slides instead of technical papers as an illustration of the problematic methods of technical communication at NASA." -- Official report on the Columbia shuttle disaster. |
| |||
| Hi, > What to people think about having an optional "maintenance window" so > that autovac only takes action during an approved time. This sounds like a realy good idea to me! Sander. ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster |
| |||
| We have talked about performance and some new features before freeze of 8.1. Like ; · Bitmap indexes · Autovacuum · GIS features · Object-Oriented features · PITR · Table Partition But there is a feature that is too important for a database. It is availability.Now PostgreSQL doesn't have high availability.We must discuss it here. Imagine a database that has a lots of features that others don’t have. I tested the PostgreSQL for that feature, i couldn't find it enough. Here : Process A start to update / insert some rows in a table and then the connection of process A is lost to PostgreSQL before it sends commit or rollback. Other processes want to update the same rows or SELECT …..FOR UPDATE for the same rows.Now these processes are providing SELECT WAITING… or CANCEL QUERY if statement_timeout was set. Imagine these processes is getting grower. What will we do now ? Restarting backend or finding process A and kill it ? Now, do you think that the PostgreSQL database is a high available database ? A feature must be added to solve that problem or PostgreSQL databases would never get a good place among huge databases. Best Regards Adnan DURSUN ASRIN Bilişim Ltd. Ankara /TURKEY ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org |
| ||||
| On Sun, May 01, 2005 at 03:09:37PM +0300, adnandursun@asrinbilisim.com.tr wrote: > Process A start to update / insert some rows in a table > and then the connection of process A is lost to PostgreSQL > before it sends commit or rollback. Other processes want to > update the same rows or SELECT …..FOR UPDATE for the same > rows.Now these processes are providing SELECT WAITING… or > CANCEL QUERY if statement_timeout was set. Imagine these > processes is getting grower. What will we do now ? > Restarting backend or finding process A and kill it ? Well, if process A loses the connection to the client, then the transaction will be rolled back and other processes will be able to continue. Another thing to keep in mind is that if process A is inserting a tuple, other processes will not see it because it isn't committed. So MVCC rules protect them from blocking. (Unless there is a unique restriction and some other process wants to insert the same value to it.) Now, we do have an "availability" problem in 8.0 and earlier, which is that you could block trying to check a foreign key that other process is also checking. I am happy to say that it doesn't happen anymore so that's one less barrier. -- Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) "In fact, the basic problem with Perl 5's subroutines is that they're not crufty enough, so the cruft leaks out into user-defined code instead, by the Conservation of Cruft Principle." (Larry Wall, Apocalypse 6) ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |