Re: Performance increase with elevator=deadline (pgsql-performance mailing list, PostgreSQL)
| ||||
On Fri, 11 Apr 2008, Jeff wrote:
> Using 4 of these with a dataset of about 30GB across a few files (Machine has
> 8GB mem) I went from around 100 io/sec to 330 changing to noop. Quite an
> improvement. If you have a decent controller CFQ is not what you want. I
> tried deadline as well and it was a touch slower. The controller is a 3ware
> 9550sx with 4 disks in a raid10.

I ran Greg's fadvise test program a while back on a 12-disc array. The three schedulers (deadline, noop, anticipatory) all performed pretty-much the same, with the fourth (cfq, the default) being consistently slower.

> it also seems changing elevators on the fly works fine (echo schedulername >
> /sys/block/.../queue/scheduler I admit I sat there flipping back and forth
> going "disk go fast.. disk go slow.. disk go fast... "

Oh Homer Simpson, your legacy lives on.

Matthew

--
I suppose some of you have done a Continuous Maths course. Yes? Continuous
Maths? <menacing stares from audience> Whoah, it was like that, was it!
-- Computer Science Lecturer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
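When flipping elevators back and forth as Jeff describes, it helps to confirm which one is actually active. The following is a minimal Python sketch, assuming the usual sysfs convention in which the queue/scheduler file lists the available schedulers on one line and marks the active one in square brackets. The function name and the sample string are illustrative and do not come from the thread.

```python
from typing import List, Optional, Tuple


def parse_schedulers(sysfs_text: str) -> Tuple[List[str], Optional[str]]:
    """Split the contents of a queue/scheduler file into (available, active).

    Assumes the active scheduler is the one wrapped in square brackets.
    """
    available = []
    active = None
    for name in sysfs_text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # strip the brackets around the active entry
            active = name
        available.append(name)
    return available, active


# Hypothetical file contents with cfq selected; not copied from a real machine.
sample = "noop anticipatory deadline [cfq]\n"
available, active = parse_schedulers(sample)
print("available:", available)
print("active:", active)
```

On a real system the text would be read from the scheduler file of the device being tuned, once before and once after the echo, to verify that the switch took effect.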
| |||
Matthew wrote:
> On Fri, 11 Apr 2008, Jeff wrote:
>> Using 4 of these with a dataset of about 30GB across a few files
>> (Machine has 8GB mem) I went from around 100 io/sec to 330 changing to
>> noop. Quite an improvement. If you have a decent controller CFQ is
>> not what you want. I tried deadline as well and it was a touch
>> slower. The controller is a 3ware 9550sx with 4 disks in a raid10.
>
> I ran Greg's fadvise test program a while back on a 12-disc array. The
> three schedulers (deadline, noop, anticipatory) all performed
> pretty-much the same, with the fourth (cfq, the default) being
> consistently slower.

I use CFQ on some of my servers, despite the fact that it's often slower in total throughput terms, because it delivers much more predictable I/O latencies that help prevent light I/O processes being starved by heavy I/O processes. In particular, a Linux terminal server used at work has taken a lot of I/O tuning before it delivers even faintly acceptable I/O latencies under any sort of load.

Bounded I/O latency at the expense of throughput is not what you usually want on a DB server, where throughput is king, so I'm not at all surprised that CFQ performs poorly for PostgreSQL. I've done no testing on that myself, though, because with my DB size and the nature of my queries most of them are CPU bound anyway.

Speaking of I/O performance with PostgreSQL, has anybody here done any testing to compare results with LVM to results with the same filesystem on a conventionally partitioned or raw volume? I'd probably use LVM even at a performance cost because of its admin benefits, but I'd be curious if there is any known cost for use with Pg. I've never been able to measure one with other workloads.

--
Craig Ringer
| |||
On Sat, 12 Apr 2008, Craig Ringer wrote:
> Speaking of I/O performance with PostgreSQL, has anybody here done any
> testing to compare results with LVM to results with the same filesystem on a
> conventionally partitioned or raw volume?

There was some chatter on this topic last year; a quick search finds http://archives.postgresql.org/pgsql...6/msg00005.php which is a fair statement of the situation. I don't recall any specific benchmarks.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
| |||
I had the opportunity to do more testing on another new server to see whether the kernel's I/O scheduling makes any difference.

Conclusion: On a battery-backed RAID 10 system, the kernel's I/O scheduling algorithm has no effect. This makes sense, since a battery-backed cache will supersede any I/O rescheduling that the kernel tries to do.

Hardware:
  Dell 2950
  8 CPU (Intel 2GHz Xeon)
  8 GB memory
  Dell Perc 6i with battery-backed cache
  RAID 10 of 8x 146GB SAS 10K 2.5" disks

Software:
  Linux 2.6.24, 64-bit
  XFS file system
  Postgres 8.3.0
    max_connections = 1000
    shared_buffers = 2000MB
    work_mem = 256MB
    max_fsm_pages = 1000000
    max_fsm_relations = 5000
    synchronous_commit = off
    wal_buffers = 256kB
    checkpoint_segments = 30
    effective_cache_size = 4GB

Each test was run 5 times:
  drop database test
  create database test
  pgbench -i -s 20 -U test
  pgbench -c 10 -t 50000 -v -U test

The I/O scheduler was changed on-the-fly using (for example) "echo cfq >/sys/block/sda/queue/scheduler".

Autovacuum was turned off during the test.

Here are the results. The numbers are those reported as "tps = xxxx (including connections establishing)" (which were almost identical to the "excluding..." tps number).

I/O Sched       AVG  Test1  Test2  Test3  Test4  Test5
------------  -----  -----  -----  -----  -----  -----
cfq            3355   3646   3207   3132   3204   3584
noop           3163   2901   3190   3293   3124   3308
deadline       3547   3923   3722   3351   3484   3254
anticipatory   3384   3453   3916   2944   3451   3156

As you can see, the averages are very close -- closer than the "noise" between runs. As far as I can tell, there is no significant advantage, or even any significant difference, between the various I/O scheduler algorithms.

(It also reinforces what the pgbench man page says: Short runs aren't useful. Even these two-minute runs have a lot of variability. Before I turned off AutoVacuum, the variability was more like 50% between runs.)

Craig
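The claim that the averages sit closer together than the run-to-run noise can be checked directly from the posted table. This is a small Python sketch that uses only the tps figures above. The variable and function names are illustrative, and "spread" here simply means best run minus worst run for one scheduler.

```python
from statistics import mean

# tps per run, copied from the results table above.
runs = {
    "cfq": [3646, 3207, 3132, 3204, 3584],
    "noop": [2901, 3190, 3293, 3124, 3308],
    "deadline": [3923, 3722, 3351, 3484, 3254],
    "anticipatory": [3453, 3916, 2944, 3451, 3156],
}


def summarize(results):
    """Return {scheduler: (mean tps, max tps - min tps)}."""
    return {name: (mean(tps), max(tps) - min(tps)) for name, tps in results.items()}


summary = summarize(runs)
averages = [avg for avg, _ in summary.values()]
gap_between_averages = max(averages) - min(averages)

for name, (avg, spread) in summary.items():
    print(f"{name:<13} avg={avg:7.1f}  spread={spread}")
print(f"gap between best and worst average: {gap_between_averages:.1f}")
```

With these numbers, every scheduler's own spread is larger than the gap between the best and worst averages, which is the sense in which the differences are lost in the noise.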
| |||
On Mon, May 5, 2008 at 5:33 PM, Craig James <craig_james@emolecules.com> wrote:
>
> (It also reinforces what the pgbench man page says: Short runs aren't
> useful. Even these two-minute runs have a lot of variability. Before I
> turned off AutoVacuum, the variability was more like 50% between runs.)

I'd suggest a couple of things for more realistic tests. Run the tests much longer, say 30 minutes to an hour. Crank up your scaling factor until your test db is larger than memory. Turn on autovacuum, maybe raising the cost / delay factors so it doesn't affect performance too negatively. And lastly, tune the bgwriter so that checkpoints are short and don't interfere too much.

My guess is if you let it run for a while, you'll get a much more reliable number.
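To turn "run for 30 minutes" into a pgbench -t value, the arithmetic can be sketched as below. This is only a rough estimate in Python: the 3400 tps figure is an approximation of the averages posted earlier in the thread, the function names are hypothetical, and throughput will drop once the database no longer fits in memory, so the real run would take longer than planned.

```python
import math


def run_seconds(transactions_per_client: int, clients: int, tps: float) -> float:
    """Approximate wall-clock length of a pgbench run at a steady tps."""
    return transactions_per_client * clients / tps


def transactions_per_client(tps: float, minutes: float, clients: int) -> int:
    """Per-client transaction count needed to run for roughly `minutes`."""
    return math.ceil(tps * minutes * 60 / clients)


# The original test: -c 10 -t 50000 at roughly 3400 tps.
print(f"original run: about {run_seconds(50000, 10, 3400) / 60:.1f} minutes")

# A 30-minute run at the same assumed throughput and client count.
print("per-client transactions for 30 minutes:", transactions_per_client(3400, 30, 10))
```

The first figure agrees with the "two-minute runs" in the quoted text; the second shows how far -t has to be raised to reach the suggested duration at that rate.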
| |||
On Mon, 5 May 2008, Craig James wrote:

> pgbench -i -s 20 -U test

That's way too low to expect you'll see a difference in I/O schedulers. A scale of 20 is giving you a 320MB database; you can fit the whole thing in RAM and almost all of it on your controller cache. What's there to schedule? You're just moving between buffers that are generally large enough to hold most of what they need.

> pgbench -c 10 -t 50000 -v -U test

This is OK, because when you increase the size you're not going to be pushing 3500 TPS anymore and this test will take quite a while.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
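The sizing argument above can be turned into a quick estimate. The sketch below is in Python and assumes about 16MB of database per unit of pgbench scale, which is inferred from the statement that a scale of 20 gives 320MB; the real size varies with version and fill factor. The constant and function names are hypothetical.

```python
import math

# Assumption: inferred from "a scale of 20 is giving you a 320MB database".
MB_PER_SCALE_UNIT = 16


def estimated_db_mb(scale: int) -> int:
    """Rough pgbench database size in MB for a given -s value."""
    return scale * MB_PER_SCALE_UNIT


def min_scale_for(ram_gb: float, multiple: float = 1.0) -> int:
    """Smallest -s whose estimated size reaches `multiple` times RAM."""
    target_mb = ram_gb * 1024 * multiple
    return math.ceil(target_mb / MB_PER_SCALE_UNIT)


print("scale 20 ->", estimated_db_mb(20), "MB")
print("scale to match 8 GB of RAM:", min_scale_for(8))
print("scale for twice 8 GB of RAM:", min_scale_for(8, multiple=2))
```

On the 8GB test machine this puts the break-even point at a scale of roughly 500, so a scale of 20 is more than an order of magnitude too small to push reads out to the disks.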
| |||
On May 5, 2008, at 7:33 PM, Craig James wrote:

> I had the opportunity to do more testing on another new server to
> see whether the kernel's I/O scheduling makes any difference.
> Conclusion: On a battery-backed RAID 10 system, the kernel's I/O
> scheduling algorithm has no effect. This makes sense, since a
> battery-backed cache will supersede any I/O rescheduling that the
> kernel tries to do.
>

This goes against my real-world experience here.

> pgbench -i -s 20 -U test
> pgbench -c 10 -t 50000 -v -U test
>

You should use a sample size of 2x RAM to get a more realistic number, or try out my pgiosim tool on pgfoundry, which "sort of" simulates an index scan. I posted numbers from that a month or two ago here.

--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/
| |||
Greg Smith wrote:
> On Mon, 5 May 2008, Craig James wrote:
>
>> pgbench -i -s 20 -U test
>
> That's way too low to expect you'll see a difference in I/O schedulers.
> A scale of 20 is giving you a 320MB database, you can fit the whole
> thing in RAM and almost all of it on your controller cache. What's
> there to schedule? You're just moving between buffers that are
> generally large enough to hold most of what they need.

Test repeated with:
  autovacuum enabled
  database destroyed and recreated between runs
  pgbench -i -s 600 ...
  pgbench -c 10 -t 50000 -n ...

I/O Sched       AVG  Test1  Test2
------------  -----  -----  -----
cfq             705    695    715
noop            758    769    747
deadline        741    705    775
anticipatory    494    477    511

I only did two runs of each, which took about 24 minutes. Like the first round of tests, the "noise" in the measurements (about 10%) exceeds the difference between scheduler-algorithm performance, except that "anticipatory" seems to be measurably slower.

So it still looks like cfq, noop and deadline are more or less equivalent when used with a battery-backed RAID.

Craig
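The two observations in this round (noise of about 10%, and anticipatory standing apart) can be reproduced from the table. This is a hedged Python sketch that uses only the posted Test1 and Test2 figures; the names are illustrative, and averages are recomputed from the two runs, so deadline comes out as 740 rather than the posted 741.

```python
from statistics import mean

# tps for the two runs of each scheduler, copied from the table above.
runs = {
    "cfq": [695, 715],
    "noop": [769, 747],
    "deadline": [705, 775],
    "anticipatory": [477, 511],
}


def relative_spread(tps):
    """(best run - worst run) as a fraction of the mean."""
    return (max(tps) - min(tps)) / mean(tps)


spreads = {name: relative_spread(tps) for name, tps in runs.items()}
largest_spread = max(spreads.values())

# How far anticipatory falls below the mean of the other three averages.
others = mean(mean(tps) for name, tps in runs.items() if name != "anticipatory")
anticipatory_deficit = 1 - mean(runs["anticipatory"]) / others

for name, spread in spreads.items():
    print(f"{name:<13} spread={spread:.1%}")
print(f"largest within-scheduler spread: {largest_spread:.1%}")
print(f"anticipatory below the others by: {anticipatory_deficit:.1%}")
```

The largest within-scheduler spread comes out just under 10%, while anticipatory trails the other three by roughly a third, well outside that noise. With only two runs per scheduler this is indicative rather than statistically solid.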
| |||
On Tue, 6 May 2008, Craig James wrote:

> I only did two runs of each, which took about 24 minutes. Like the first
> round of tests, the "noise" in the measurements (about 10%) exceeds the
> difference between scheduler-algorithm performance, except that
> "anticipatory" seems to be measurably slower.

Those are much better results. Any test that says anticipatory is anything other than useless for database system use with a good controller I presume is broken, so that's how I know you're in the right ballpark now but weren't before.

In order to actually get some useful data out of the noise that is pgbench, you need a lot more measurements of longer runs. For perspective, the last time I did something in this area, in order to get enough data to get a clear picture I ran tests for 12 hours. I'm hoping to repeat that soon with some more common hardware that gives useful results I can give out.

> So it still looks like cfq, noop and deadline are more or less equivalent
> when used with a battery-backed RAID.

I think it's fair to say they're within 10% of one another on raw throughput. The thing you're not measuring here is worst-case latency, and that's where there might be a more interesting difference. Most tests I've seen suggest deadline is the best in that regard, cfq the worst, and where noop fits in depends on the underlying controller.

pgbench produces log files with latency measurements if you pass it "-l". Here's a snippet of shell that runs pgbench then looks at the resulting latency results for the worst 5 numbers:

  pgbench ... -l &
  p=$!
  wait $p
  mv pgbench_log.${p} pgbench.log
  echo Worst latency results:
  cat pgbench.log | cut -f 3 -d " " | sort -n | tail -n 5

However, that may not give you much useful info either--in most cases checkpoint issues kind of swamp the worst-case behavior in PostgreSQL, and to quantify I/O schedulers you need to look at more complicated statistics on latency.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
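As one possible step toward the "more complicated statistics" mentioned above, here is a minimal Python sketch that reads the same third field as the shell pipeline and reports both the worst values and a nearest-rank percentile. It assumes, as the cut command does, that the third space-separated field of each log line is the transaction latency. The function names and the sample lines are made up for illustration and are not real pgbench output.

```python
import math


def parse_latencies(log_lines):
    """Extract the third space-separated field of each line as an integer."""
    latencies = []
    for line in log_lines:
        fields = line.split()
        if len(fields) >= 3:
            latencies.append(int(fields[2]))
    return latencies


def worst(latencies, count=5):
    """The `count` largest values in ascending order, like sort -n | tail."""
    return sorted(latencies)[-count:]


def percentile(latencies, pct):
    """Nearest-rank percentile; expects a non-empty list."""
    ordered = sorted(latencies)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic log lines: client id, transaction number, latency.
sample_log = [
    "0 1 1520",
    "1 1 980",
    "0 2 30250",
    "1 2 1105",
    "0 3 2210",
    "1 3 45100",
]

latencies = parse_latencies(sample_log)
print("worst 5:", worst(latencies))
print("median:", percentile(latencies, 50))
print("90th percentile:", percentile(latencies, 90))
```

Comparing a high percentile across schedulers, rather than only the single worst value, is less sensitive to one checkpoint stall dominating the result.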
| ||||
Greg Smith wrote:
> On Tue, 6 May 2008, Craig James wrote:
>
>> I only did two runs of each, which took about 24 minutes. Like the
>> first round of tests, the "noise" in the measurements (about 10%)
>> exceeds the difference between scheduler-algorithm performance, except
>> that "anticipatory" seems to be measurably slower.
>
> Those are much better results. Any test that says anticipatory is
> anything other than useless for database system use with a good
> controller I presume is broken, so that's how I know you're in the right
> ballpark now but weren't before.
>
> In order to actually get some useful data out of the noise that is
> pgbench, you need a lot more measurements of longer runs. As
> perspective, the last time I did something in this area, in order to get
> enough data to get a clear picture I ran tests for 12 hours. I'm hoping
> to repeat that soon with some more common hardware that gives useful
> results I can give out.

This data is good enough for what I'm doing. There were reports from non-RAID users that the I/O scheduling could make as much as a 4x difference in performance (which makes sense for non-RAID), but these tests show me that three of the four I/O schedulers are within 10% of each other. Since this matches my intuition of how battery-backed RAID will work, I'm satisfied. If our servers get overloaded to the point where 10% matters, then I need a much more dramatic solution, like faster machines or more machines.

Craig