Unix Technical Forum

Re: Performance increase with elevator=deadline

#1
04-19-2008, 11:46 AM
Matthew
Re: Performance increase with elevator=deadline

On Fri, 11 Apr 2008, Jeff wrote:
> Using 4 of these with a dataset of about 30GB across a few files (Machine has
> 8GB mem) I went from around 100 io/sec to 330 changing to noop. Quite an
> improvement. If you have a decent controller CFQ is not what you want. I
> tried deadline as well and it was a touch slower. The controller is a 3ware
> 9550sx with 4 disks in a raid10.


I ran Greg's fadvise test program a while back on a 12-disc array. The
three schedulers (deadline, noop, anticipatory) all performed pretty-much
the same, with the fourth (cfq, the default) being consistently slower.

> it also seems changing elevators on the fly works fine (echo schedulername >
> /sys/block/.../queue/scheduler). I admit I sat there flipping back and forth
> going "disk go fast.. disk go slow.. disk go fast... "


Oh Homer Simpson, your legacy lives on.

Matthew

--
I suppose some of you have done a Continuous Maths course. Yes? Continuous
Maths? <menacing stares from audience> Whoah, it was like that, was it!
-- Computer Science Lecturer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

#2
04-19-2008, 11:46 AM
Craig Ringer
Re: Performance increase with elevator=deadline

Matthew wrote:
> On Fri, 11 Apr 2008, Jeff wrote:
>> Using 4 of these with a dataset of about 30GB across a few files
>> (Machine has 8GB mem) I went from around 100 io/sec to 330 changing to
>> noop. Quite an improvement. If you have a decent controller CFQ is
>> not what you want. I tried deadline as well and it was a touch
>> slower. The controller is a 3ware 9550sx with 4 disks in a raid10.

>
> I ran Greg's fadvise test program a while back on a 12-disc array. The
> three schedulers (deadline, noop, anticipatory) all performed
> pretty-much the same, with the fourth (cfq, the default) being
> consistently slower.


I use CFQ on some of my servers, despite the fact that it's often slower
in total throughput terms, because it delivers much more predictable I/O
latencies that help prevent light I/O processes being starved by heavy
I/O processes. In particular, a Linux terminal server used at work has
taken a lot of I/O tuning before it delivers even faintly acceptable I/O
latencies under any sort of load.

Bounded I/O latency at the expense of throughput is not what you usually
want on a DB server, where throughput is king, so I'm not at all
surprised that CFQ performs poorly for PostgreSQL. I've done no testing
on that myself, though, because with my DB size and the nature of my
queries most of them are CPU bound anyway.

Speaking of I/O performance with PostgreSQL, has anybody here done any
testing to compare results with LVM to results with the same filesystem
on a conventionally partitioned or raw volume? I'd probably use LVM even
at a performance cost because of its admin benefits, but I'd be curious
if there is any known cost for use with Pg. I've never been able to
measure one with other workloads.

--
Craig Ringer

#3
04-19-2008, 11:46 AM
Greg Smith
Re: Performance increase with elevator=deadline

On Sat, 12 Apr 2008, Craig Ringer wrote:

> Speaking of I/O performance with PostgreSQL, has anybody here done any
> testing to compare results with LVM to results with the same filesystem on a
> conventionally partitioned or raw volume?


There was some chatter on this topic last year; a quick search finds

http://archives.postgresql.org/pgsql...6/msg00005.php

which is a fair statement of the situation. I don't recall any specific
benchmarks.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#4
05-07-2008, 11:18 AM
Craig James
RAID 10 Benchmark with different I/O schedulers (was: Performance increase with elevator=deadline)

I had the opportunity to do more testing on another new server to see whether the kernel's I/O scheduling makes any difference. Conclusion: On a battery-backed RAID 10 system, the kernel's I/O scheduling algorithm has no effect. This makes sense, since a battery-backed cache will supersede any I/O rescheduling that the kernel tries to do.

Hardware:
Dell 2950
8 CPU (Intel 2GHz Xeon)
8 GB memory
Dell Perc 6i with battery-backed cache
RAID 10 of 8x 146GB SAS 10K 2.5" disks

Software:
Linux 2.6.24, 64-bit
XFS file system
Postgres 8.3.0
max_connections = 1000
shared_buffers = 2000MB
work_mem = 256MB
max_fsm_pages = 1000000
max_fsm_relations = 5000
synchronous_commit = off
wal_buffers = 256kB
checkpoint_segments = 30
effective_cache_size = 4GB

Each test was run 5 times:
drop database test
create database test
pgbench -i -s 20 -U test
pgbench -c 10 -t 50000 -v -U test

The I/O scheduler was changed on-the-fly using (for example) "echo cfq >/sys/block/sda/queue/scheduler".
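
A minimal sketch of that on-the-fly switch, assuming a device named sda and a kernel that lists the available schedulers in /sys/block/<dev>/queue/scheduler with the active one in square brackets. The helper names active_scheduler and set_scheduler are made up for illustration, and writing to the sysfs file needs root.

```shell
#!/bin/sh
# Print the active scheduler from a line such as "noop anticipatory deadline [cfq]".
# The kernel marks the scheduler currently in use with square brackets.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Switch a block device to the named scheduler (hypothetical helper; needs root).
set_scheduler() {
    dev=$1
    sched=$2
    echo "$sched" > "/sys/block/$dev/queue/scheduler"
}

# Example: report what sda currently uses, if the sysfs file is readable.
f=/sys/block/sda/queue/scheduler
if [ -r "$f" ]; then
    echo "sda: $(active_scheduler "$(cat "$f")")"
fi
```

Calling `set_scheduler sda deadline` as root would then be the equivalent of the echo command quoted above.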

Autovacuum was turned off during the test.

Here are the results. The numbers are those reported as "tps = xxxx (including connections establishing)" (which were almost identical to the "excluding..." tps number).

I/O Sched      AVG  Test1  Test2  Test3  Test4  Test5
------------  ----  -----  -----  -----  -----  -----
cfq           3355   3646   3207   3132   3204   3584
noop          3163   2901   3190   3293   3124   3308
deadline      3547   3923   3722   3351   3484   3254
anticipatory  3384   3453   3916   2944   3451   3156

As you can see, the averages are very close -- closer than the "noise" between runs. As far as I can tell, there is no significant advantage, or even any significant difference, between the various I/O scheduler algorithms.
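
To put a number on that "noise", here is a rough sketch that recomputes the mean and the min-to-max spread for each scheduler from the five runs in the table above. The summarize function name and the output format are illustrative only.

```shell
#!/bin/sh
# Summarise per-scheduler pgbench runs: mean, min, max and the min-to-max spread.
# Input lines: scheduler name followed by one tps value per run.
summarize() {
    awk '{
        sum = 0; min = $2 + 0; max = $2 + 0
        for (i = 2; i <= NF; i++) {
            sum += $i
            if ($i + 0 < min) min = $i + 0
            if ($i + 0 > max) max = $i + 0
        }
        printf "%s mean=%.0f min=%d max=%d spread=%d\n", $1, sum / (NF - 1), min, max, max - min
    }'
}

# The five runs per scheduler from the table above.
summarize <<'EOF'
cfq 3646 3207 3132 3204 3584
noop 2901 3190 3293 3124 3308
deadline 3923 3722 3351 3484 3254
anticipatory 3453 3916 2944 3451 3156
EOF
```

The spread within each scheduler (407 to 972 tps) is larger than the 384 tps gap between the highest and lowest mean.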

(It also reinforces what the pgbench man page says: Short runs aren't useful. Even these two-minute runs have a lot of variability. Before I turned off AutoVacuum, the variability was more like 50% between runs.)

Craig

#5
05-07-2008, 11:18 AM
Scott Marlowe
Re: RAID 10 Benchmark with different I/O schedulers (was: Performance increase with elevator=deadline)

On Mon, May 5, 2008 at 5:33 PM, Craig James <craig_james@emolecules.com> wrote:
>
> (It also reinforces what the pgbench man page says: Short runs aren't
> useful. Even these two-minute runs have a lot of variability. Before I
> turned off AutoVacuum, the variability was more like 50% between runs.)


I'd suggest a couple of things for more realistic tests. Run the tests
much longer, say 30 minutes to an hour. Crank up your scaling factor
until your test db is larger than memory. Turn on autovacuum, maybe
raising the cost / delay factors so it doesn't affect performance too
negatively. And lastly, tune the bgwriter so that checkpoints are
short and don't interfere too much.

My guess is if you let it run for a while, you'll get a much more
reliable number.

#6
05-07-2008, 11:18 AM
Greg Smith
Re: RAID 10 Benchmark with different I/O schedulers (was: Performance increase with elevator=deadline)

On Mon, 5 May 2008, Craig James wrote:

> pgbench -i -s 20 -U test


That's way too low to expect you'll see a difference in I/O schedulers.
A scale of 20 is giving you a 320MB database, you can fit the whole thing
in RAM and almost all of it on your controller cache. What's there to
schedule? You're just moving between buffers that are generally large
enough to hold most of what they need.
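
The sizing arithmetic can be sketched roughly as follows, assuming about 16MB of data per unit of pgbench scale. That ratio is only what the 20 -> 320MB figure above implies; the real size varies with PostgreSQL version and settings, and the function names are made up for illustration.

```shell
#!/bin/sh
# Approximate MB per unit of pgbench scale, taken from the 20 -> 320MB figure above.
MB_PER_SCALE=16

# Estimated database size in MB for a given scale factor.
db_size_mb() {
    echo $(( $1 * MB_PER_SCALE ))
}

# Smallest scale whose estimated size reaches the given number of MB (rounds up).
scale_for_mb() {
    echo $(( ($1 + MB_PER_SCALE - 1) / MB_PER_SCALE ))
}

db_size_mb 20          # prints 320
scale_for_mb 8192      # scale needed to match 8GB of RAM: prints 512
scale_for_mb 16384     # scale needed for twice that: prints 1024
```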

> pgbench -c 10 -t 50000 -v -U test


This is OK, because when you increase the size you're not going to be
pushing 3500 TPS anymore and this test will take quite a while.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#7
05-07-2008, 11:18 AM
Jeff
Re: RAID 10 Benchmark with different I/O schedulers (was: Performance increase with elevator=deadline)


On May 5, 2008, at 7:33 PM, Craig James wrote:

> I had the opportunity to do more testing on another new server to
> see whether the kernel's I/O scheduling makes any difference.
> Conclusion: On a battery-backed RAID 10 system, the kernel's I/O
> scheduling algorithm has no effect. This makes sense, since a
> battery-backed cache will supersede any I/O rescheduling that the
> kernel tries to do.
>


This goes against my real-world experience here.

> pgbench -i -s 20 -U test
> pgbench -c 10 -t 50000 -v -U test
>


You should use a data set about 2x the size of RAM to get a more realistic
number, or try out my pgiosim tool on pgfoundry which "sort of" simulates an
index scan. I posted numbers from that a month or two ago here.


--
Jeff Trout <jeff@jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




#8
05-07-2008, 11:19 AM
Craig James
Re: RAID 10 Benchmark with different I/O schedulers

Greg Smith wrote:
> On Mon, 5 May 2008, Craig James wrote:
>
>> pgbench -i -s 20 -U test

>
> That's way too low to expect you'll see a difference in I/O schedulers.
> A scale of 20 is giving you a 320MB database, you can fit the whole
> thing in RAM and almost all of it on your controller cache. What's
> there to schedule? You're just moving between buffers that are
> generally large enough to hold most of what they need.


Test repeated with:
autovacuum enabled
database destroyed and recreated between runs
pgbench -i -s 600 ...
pgbench -c 10 -t 50000 -n ...

I/O Sched      AVG  Test1  Test2
------------  ----  -----  -----
cfq            705    695    715
noop           758    769    747
deadline       741    705    775
anticipatory   494    477    511

I only did two runs of each, which took about 24 minutes. Like the first round of tests, the "noise" in the measurements (about 10%) exceeds the difference between scheduler-algorithm performance, except that "anticipatory" seems to be measurably slower.

So it still looks like cfq, noop and deadline are more or less equivalent when used with a battery-backed RAID.
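
As a rough check on that, the sketch below takes the AVG column from the table above and prints how far each scheduler falls below the best one, in percent. The below_best function name is illustrative only.

```shell
#!/bin/sh
# For each "name tps" line, print how far it falls below the best tps, in percent.
below_best() {
    awk '{ name[NR] = $1; tps[NR] = $2 + 0; if (tps[NR] > best) best = tps[NR] }
         END { for (i = 1; i <= NR; i++)
                   printf "%s %.1f%%\n", name[i], 100 * (best - tps[i]) / best }'
}

# AVG column from the table above.
below_best <<'EOF'
cfq 705
noop 758
deadline 741
anticipatory 494
EOF
```

Read this way, cfq and deadline land within about 7% of noop, while anticipatory is roughly 35% lower.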

Craig

#9
05-07-2008, 11:19 AM
Greg Smith
Re: RAID 10 Benchmark with different I/O schedulers

On Tue, 6 May 2008, Craig James wrote:

> I only did two runs of each, which took about 24 minutes. Like the first
> round of tests, the "noise" in the measurements (about 10%) exceeds the
> difference between scheduler-algorithm performance, except that
> "anticipatory" seems to be measurably slower.


Those are much better results. Any test that says anticipatory is
anything other than useless for database system use with a good controller
I presume is broken, so that's how I know you're in the right ballpark now
but weren't before.

In order to actually get some useful data out of the noise that is
pgbench, you need a lot more measurements of longer runs. As perspective,
the last time I did something in this area, in order to get enough data to
get a clear picture I ran tests for 12 hours. I'm hoping to repeat that
soon with some more common hardware that gives useful results I can give
out.

> So it still looks like cfq, noop and deadline are more or less equivalent
> when used with a battery-backed RAID.


I think it's fair to say they're within 10% of one another on raw
throughput. The thing you're not measuring here is worst-case latency,
and that's where there might be a more interesting difference. Most tests
I've seen suggest deadline is the best in that regard, cfq the worst, and
where noop fits in depends on the underlying controller.

pgbench produces log files with latency measurements if you pass it "-l".
Here's a snippet of shell that runs pgbench then looks at the resulting
latency results for the worst 5 numbers:

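# Run pgbench in the background with per-transaction logging (-l)
pgbench ... -l &
p=$!
wait "$p"
# The log file is named after the pgbench process ID
mv "pgbench_log.${p}" pgbench.log
echo "Worst latency results:"
# Field 3 is the per-transaction latency; show the 5 largest values
cut -f 3 -d " " pgbench.log | sort -n | tail -n 5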

However, that may not give you much useful info either--in most cases
checkpoint issues kind of swamp the worst-case behavior in PostgreSQL,
and to quantify I/O schedulers you need to look at more complicated
statistics on latency.
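
One step in that direction, as a sketch only: nearest-rank percentiles over the same log, assuming latency is field 3 as in the snippet above. The latency_percentiles function name and the choice of p50/p90/p99 are illustrative, not a recommendation.

```shell
#!/bin/sh
# Print nearest-rank latency percentiles from a pgbench -l log.
# Assumes latency is field 3, as in the snippet above.
latency_percentiles() {
    cut -f 3 -d " " "$1" | sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit
            n = split("50 90 99", p, " ")
            for (i = 1; i <= n; i++) {
                rank = int((p[i] * NR + 99) / 100)   # ceil(p * NR / 100)
                printf "p%d %d\n", p[i], v[rank]
            }
            printf "max %d\n", v[NR]
        }'
}

# Usage: latency_percentiles pgbench.log
if [ -r pgbench.log ]; then
    latency_percentiles pgbench.log
fi
```

Comparing p99 and max across schedulers says more about worst-case behavior than the five largest values alone, though checkpoint spikes will still show up in both.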

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#10
05-07-2008, 11:19 AM
Craig James
Re: RAID 10 Benchmark with different I/O schedulers

Greg Smith wrote:
> On Tue, 6 May 2008, Craig James wrote:
>
>> I only did two runs of each, which took about 24 minutes. Like the
>> first round of tests, the "noise" in the measurements (about 10%)
>> exceeds the difference between scheduler-algorithm performance, except
>> that "anticipatory" seems to be measurably slower.

>
> Those are much better results. Any test that says anticipatory is
> anything other than useless for database system use with a good
> controller I presume is broken, so that's how I know you're in the right
> ballpark now but weren't before.
>
> In order to actually get some useful data out of the noise that is
> pgbench, you need a lot more measurements of longer runs. As
> perspective, the last time I did something in this area, in order to get
> enough data to get a clear picture I ran tests for 12 hours. I'm hoping
> to repeat that soon with some more common hardware that gives useful
> results I can give out.


This data is good enough for what I'm doing. There were reports from non-RAID users that the I/O scheduling could make as much as a 4x difference in performance (which makes sense for non-RAID), but these tests show me that three of the four I/O schedulers are within 10% of each other. Since this matches my intuition of how battery-backed RAID will work, I'm satisfied. If our servers get overloaded to the point where 10% matters, then I need a much more dramatic solution, like faster machines or more machines.

Craig

