This is a discussion on Determining server load from client within the Pgsql Performance forums, part of the PostgreSQL category; --> I've found that it would be helpful to be able to tell how busy my dedicated PG server is ...
| |||||||
| FAQ | Members List | Calendar | Search | Today's Posts | Mark Forums Read |
| ||||
| I've found that it would be helpful to be able to tell how busy my dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before pounding it with some OLAP-type queries. Specifically, I have a multi-threaded client program that needs to run several thousand sequential queries. I broke it into threads to take advantage of the multi-core architecture of the server hardware. It would be very nice if I could check the load of the server at certain intervals to throttle the number of concurrent queries and mitigate load problems when other processes might be already inducing a significant load. I have seen some other nice back-end things exposed through PG functions ( e.g. database size on disk ) and wondered if there was anything applicable to this. Even if it can't return the load average proper, is there anything else in the pg_* tables that might give me a clue how "busy" the server is for a period of time? I've thought about allowing an ssh login without a keyphrase to log in and capture it. But, the client process is running as an apache user. Giving the apache user a shell login to the DB box does not seem like a smart idea for obvious security reasons... So far, that's all I can come up with, other than a dedicated socket server daemon on the DB machine to do it. Any creative ideas are welcomed Thanks -Dan ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Dan Harris wrote: > I've found that it would be helpful to be able to tell how busy my > dedicated PG server is ... > > I have seen some other nice back-end things exposed through PG functions > ( e.g. database size on disk ) and wondered if there was anything > applicable to this. I'd write a simple pg-perl function to do this. You can access operating-system calls to find out the system's load. But notice that you need "Untrusted Perl" to do this, so you can only do it on a system where you trust every application that connects to your database. Something like this: create or replace function get_stats() returns text as ' open(STAT, "</proc/stat"); my @stats = <STAT>; close STAT; return join("", @stats); ' language plperlu; See http://www.postgresql.org/docs/8.1/i...l-trusted.html Craig ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| (forgot to send to list) Dan Harris wrote: > architecture of the server hardware. It would be very nice if I could > check the load of the server at certain intervals to throttle the > number of concurrent queries and mitigate load problems when other > processes might be already inducing a significant load. > > I have seen some other nice back-end things exposed through PG > functions ( e.g. database size on disk ) and wondered if there was > anything applicable to this. Even if it can't return the load average > proper, is there anything else in the pg_* tables that might give me a > clue how "busy" the server is for a period of time? I have installed munin (http://munin.projects.linpro.no/) on a few systems. This lets you look at graphs of system resources/load etc. I have also added python scripts which do sample queries to let me know if performance/index size is changing dramatically. I have attached an example script. Hope that helps, Joe ------------------------------------------------------------------------ #! /usr/bin/python import psycopg import sys def fixName(name): return name[:19] if len(sys.argv) > 1 and sys.argv[1] == "config": print """graph_title Postgresql Index Sizes graph_vlabel Mb""" con = psycopg.connect("host=xxx user=xxx dbname=xxx password=xxx") cur = con.cursor() cur.execute("select relname, relpages from pg_class where relowner > 10 and relkind='i' and relpages > 256 order by reltuples desc;") results = cur.fetchall() for name, pages in results: print "%s.label %s" % (fixName(name), name) else: con = psycopg.connect("host=xxx user=xxx dbname=xxx password=xxx") cur = con.cursor() cur.execute("select relname, relpages from pg_class where relowner > 10 and relkind='i' and relpages > 256 order by reltuples desc;") results = cur.fetchall() for name, pages in results: print "%s.value %.2f" % (name[:19], pages*8.0/1024.0) ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Dan Harris wrote: > I've found that it would be helpful to be able to tell how busy my > dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before > pounding it with some OLAP-type queries. ...snip Thank you all for your great ideas! I'm going to try the perl function as that seems like a very elegant way of doing it. -Dan ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend |
| |||
| Dan Use the following plperlu function create or replace function LoadAVG() returns record as $$ use Sys::Statistics::Linux::LoadAVG; my $lxs = new Sys::Statistics::Linux::LoadAVG; my $stats = $lxs->get; return $stats; $$ language plperlu; select * from LoadAVg() as (avg_1 float,avg_5 float,avg_15 float); The Sys::Statistics::Linux has all kind of info (from the /proc) file system. Jim -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto Sent: Tuesday, March 20, 2007 8:48 PM To: PostgreSQL Performance Subject: [PERFORM] Determining server load from client I've found that it would be helpful to be able to tell how busy my dedicated PG server is ( Linux 2.6 kernel, v8.0.3 currently ) before pounding it with some OLAP-type queries. Specifically, I have a multi-threaded client program that needs to run several thousand sequential queries. I broke it into threads to take advantage of the multi-core architecture of the server hardware. It would be very nice if I could check the load of the server at certain intervals to throttle the number of concurrent queries and mitigate load problems when other processes might be already inducing a significant load. I have seen some other nice back-end things exposed through PG functions ( e.g. database size on disk ) and wondered if there was anything applicable to this. Even if it can't return the load average proper, is there anything else in the pg_* tables that might give me a clue how "busy" the server is for a period of time? I've thought about allowing an ssh login without a keyphrase to log in and capture it. But, the client process is running as an apache user. Giving the apache user a shell login to the DB box does not seem like a smart idea for obvious security reasons... So far, that's all I can come up with, other than a dedicated socket server daemon on the DB machine to do it. Any creative ideas are welcomed Thanks -Dan ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match |
| |||
| Joe Healy wrote: > (forgot to send to list) > Dan Harris wrote: >> architecture of the server hardware. It would be very nice if I could >> check the load of the server at certain intervals to throttle the >> number of concurrent queries and mitigate load problems when other >> processes might be already inducing a significant load. > I have installed munin (http://munin.projects.linpro.no/) on a few > systems. This lets you look at graphs of system resources/load etc. I > have also added python scripts which do sample queries to let me know if > performance/index size is changing dramatically. I have attached an > example script. For general monitoring of a handful of servers, I've been impressed with munin. It's very simple to get it running and write your own plugins. -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings |
| |||
| I have my postgres munin monitoring script at http://oppetid.no/~tobixen/pg_activity.munin.txt (had to suffix it with ..txt to make the local apache happy). I would like to see what others have done as well. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly |
| |||
| Tobias Brox wrote: > I have my postgres munin monitoring script at > http://oppetid.no/~tobixen/pg_activity.munin.txt (had to suffix it with > .txt to make the local apache happy). > > I would like to see what others have done as well. Well, I use Perl rather than shell, but that's just me. The main difference is that although I downloaded a couple of simple pg-monitoring scripts from the web, I've concentrated on monitoring the application(s) instead. Things like: - number of news items posted - searches run - logins, logouts The main limitation with it for me is the fixed 5-min time interval. It provides a slight irritation that I've got hourly/daily cron jobs that are being monitored continually. -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate |
| |||
| On Mar 21, 2007, at 5:13 AM, Tobias Brox wrote: > I have my postgres munin monitoring script at > http://oppetid.no/~tobixen/pg_activity.munin.txt (had to suffix it > with > .txt to make the local apache happy). > > I would like to see what others have done as well. I use cacti (http://cacti.net) which does the same thing that munin does but in php instead. Here's what I use to db stats to it (again, php): You basically call the script with the database name and the stat you want. I have the active_queries stat set up as a gauge in cacti and the others as counters: if(!isset($argv[1])) { echo "DB name argument required!\n"; exit (); } $stats = array('xact_commit', 'xact_rollback', 'blks_read', 'blks_hit', 'active_queries'); if(!isset($argv[2]) || !in_array($argv[2], $stats)) { echo "Invalid stat arg!: {$argv[2]}"; exit(); } require_once('DB.php'); $db_name = $argv[1]; if(DB::isError($db = DB::connect("pgsql://user@host:5432/$db_name"))) { exit(); } if($argv[2] == 'active_queries') { $actives_sql = "SELECT COUNT(*) FROM pg_stat_activity WHERE current_query NOT ILIKE '<idle>' AND now() - query_start > '1 second';"; if(DB::isError($db_stat = $db->getOne($actives_sql))) { exit(); } echo "$db_stat\n"; exit(); } $db_stat_sql = "SELECT {$argv[2]} FROM pg_stat_database WHERE datname='$db_name';"; if(DB::isError($db_stat = $db->getOne($db_stat_sql))) { exit(); } echo "$db_stat\n"; erik jones <erik@myemma.com> software developer 615-296-0838 emma(r) |
| ||||
| On Mar 21, 2007, at 4:13 PM, Tobias Brox wrote: > [Erik Jones - Wed at 09:31:48AM -0500] >> I use cacti (http://cacti.net) which does the same thing that munin >> does but in php instead. Here's what I use to db stats to it (again, >> php): > > I haven't tried cacti, but our sysadm has done a little bit of > research > and concluded "cacti is better". Maybe some day we'll move over. > > Munin is generating all the graphs statically every fifth minute, > while > cacti generates them on demand as far as I've understood. The munin > approach is pretty bloat, since one usually would watch the graphs > much > more seldom than what they are generated (at least, we do). That's > not > really an argument since CPU is cheap nowadays - but a real > argument is > that the munin approach is less flexible. One would like to adjust > the > graph (like, min/max values for both axis) while watching quite some > times. Well, by "default", Cacti polls all of the data sources you've set up every five minutes as well as that's how the docs instruct you to set up the cron job for the poller. However, with a little understanding of how the rrdtool rras work, you could definitely poll more often and simply edit the existing rras and datasources to expect that or create new ones. And, yes, the graph customization is pretty cool although for the most part the just map what's available from the rrdtool graph functionality. If you do decide to set up Cacti I suggest you go straight to the faq section of the manual and read the part about going from a simple script to a graph. The main manual is almost entirely centered on the built-in networking (e.g. snmp) data sources and, as such, doesn't do much for explaining how to set up other data sources. erik jones <erik@myemma.com> software developer 615-296-0838 emma(r) |