Unix Technical Forum

strange performance regression between 7.4 and 8.1

This is a discussion on strange performance regression between 7.4 and 8.1 within the Pgsql Performance forums, part of the PostgreSQL category; --> On Fri, 2007-03-02 at 10:03, Alex Deucher wrote: > On 3/2/07, Ron <rjpeace@earthlink.net> wrote: > > At 10:16 AM ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #31 (permalink)  
Old 04-19-2008, 10:19 AM
Scott Marlowe
 
Posts: n/a
Default Re: strange performance regression between 7.4 and 8.1

On Fri, 2007-03-02 at 10:03, Alex Deucher wrote:
> On 3/2/07, Ron <rjpeace@earthlink.net> wrote:
> > At 10:16 AM 3/2/2007, Alex Deucher wrote:


> > d= you went from local HD IO to a SAN
> > (many differences hidden in that one line... ...and is the physical
> > layout of tables and things like pg_xlog sane on the SAN?)
> >
> >
> > ...and you did this by just pulling over the old DB onto the new HW?
> >

>
> We rebuild the DB from scratch on the new server. Same table
> structure though. We reloaded from the source material directly.


I would REALLY recommend testing this machine out with a simple software
RAID-1 pair of SCSI or SATA drives just to eliminate or confirm the SAN
as the root problem.


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #32 (permalink)  
Old 04-19-2008, 10:19 AM
Ron
 
Posts: n/a
Default Re: strange performance regression between 7.4 and8.1

At 11:03 AM 3/2/2007, Alex Deucher wrote:
>On 3/2/07, Ron <rjpeace@earthlink.net> wrote:
>
>>May I suggest that it is possible that your schema, queries, etc were
>>all optimized for pg 7.x running on the old HW?
>>(explain analyze shows the old system taking ~1/10 the time per row
>>as well as estimating the number of rows more accurately)
>>
>>RAM is =cheap=. Much cheaper than the cost of a detective hunt
>>followed by rework to queries, schema, etc.
>>Fitting the entire DB into RAM is guaranteed to help unless this is
>>an OLTP like application where HD IO is required to be synchronous..
>>If you can fit the entire DB comfortably into RAM, do it and buy
>>yourself the time to figure out the rest of the story w/o impacting
>>on production performance.

>
>Perhaps so. I just don't want to spend $1000 on ram and have it only
>marginally improve performance if at all. The old DB works, so we can
>keep using that until we sort this out.
>
>Alex

1= $1000 worth of RAM is very likely less than the $ worth of, say,
10 hours of your time to your company. Perhaps much less.
(Your =worth=, not your pay or even your fully loaded cost. This
number tends to be >= 4x what you are paid unless the organization
you are working for is in imminent financial danger.)
You've already put more considerably more than 10 hours of your time
into this...

2= If the DB goes from not fitting completely into RAM to being
completely RAM resident, you are almost 100% guaranteed a big
performance boost.
The exception is an OLTP like app where DB writes can't be done
a-synchronously (doing financial transactions, real time control systems, etc).
Data mines should never have this issue.

3= Whether adding enough RAM to make the DB RAM resident (and
re-configuring conf, etc, appropriately) solves the problem or not,
you will have gotten a serious lead as to what's wrong.

....and I still think looking closely at the actual physical layout of
the tables in the SAN is likely to be worth it.

Cheers,
Ron Peacetree



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #33 (permalink)  
Old 04-19-2008, 10:19 AM
Anton Rommerskirchen
 
Posts: n/a
Default Re: strange performance regression between 7.4 and 8.1

Am Donnerstag 01 März 2007 21:44 schrieb Alex Deucher:
> Hello,
>
> I have noticed a strange performance regression and I'm at a loss as
> to what's happening. We have a fairly large database (~16 GB). The
> original postgres 7.4 was running on a sun v880 with 4 CPUs and 8 GB
> of ram running Solaris on local scsi discs. The new server is a sun
> Opteron box with 4 cores, 8 GB of ram running postgres 8.1.4 on Linux
> (AMD64) on a 4 Gbps FC SAN volume. When we created the new database
> it was created from scratch rather than copying over the old one,
> however the table structure is almost identical (UTF8 on the new one
> vs. C on the old). The problem is queries are ~10x slower on the new
> hardware. I read several places that the SAN might be to blame, but
> testing with bonnie and dd indicates that the SAN is actually almost
> twice as fast as the scsi discs in the old sun server. I've tried
> adjusting just about every option in the postgres config file, but
> performance remains the same. Any ideas?
>



1. Do you use NUMA ctl for locking the db on one node ?

2. do you use bios to interleave memeory ?

3. do you expand cache over mor than one numa node ?

> Thanks,
>
> Alex
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq


--

ATRSoft GmbH
Rosellstrasse 9
D 50354 Hürth
Deutschland
Tel .: +49(0)2233 691324

Geschäftsführer Anton Rommerskirchen

Köln HRB 44927
STNR 224/5701 - 1010

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #34 (permalink)  
Old 04-19-2008, 10:19 AM
Alvaro Herrera
 
Posts: n/a
Default Re: strange performance regression between 7.4 and 8.1

Florian Weimer escribió:

> Locale settings make a huge difference for sorting and LIKE queries.
> We usually use the C locale and SQL_ASCII encoding, mostly for
> performance reasons. (Proper UTF-8 can be enforced through
> constraints if necessary.)


Hmm, you are aware of varchar_pattern_ops and related opclasses, right?
That helps for LIKE queries in non-C locales (though you do have to keep
almost-duplicate indexes).

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #35 (permalink)  
Old 04-19-2008, 10:19 AM
Alex Deucher
 
Posts: n/a
Default Re: strange performance regression between 7.4 and 8.1

On 3/2/07, Ron <rjpeace@earthlink.net> wrote:
> At 11:03 AM 3/2/2007, Alex Deucher wrote:
> >On 3/2/07, Ron <rjpeace@earthlink.net> wrote:
> >
> >>May I suggest that it is possible that your schema, queries, etc were
> >>all optimized for pg 7.x running on the old HW?
> >>(explain analyze shows the old system taking ~1/10 the time per row
> >>as well as estimating the number of rows more accurately)
> >>
> >>RAM is =cheap=. Much cheaper than the cost of a detective hunt
> >>followed by rework to queries, schema, etc.
> >>Fitting the entire DB into RAM is guaranteed to help unless this is
> >>an OLTP like application where HD IO is required to be synchronous..
> >>If you can fit the entire DB comfortably into RAM, do it and buy
> >>yourself the time to figure out the rest of the story w/o impacting
> >>on production performance.

> >
> >Perhaps so. I just don't want to spend $1000 on ram and have it only
> >marginally improve performance if at all. The old DB works, so we can
> >keep using that until we sort this out.
> >
> >Alex

> 1= $1000 worth of RAM is very likely less than the $ worth of, say,
> 10 hours of your time to your company. Perhaps much less.
> (Your =worth=, not your pay or even your fully loaded cost. This
> number tends to be >= 4x what you are paid unless the organization
> you are working for is in imminent financial danger.)
> You've already put more considerably more than 10 hours of your time
> into this...
>
> 2= If the DB goes from not fitting completely into RAM to being
> completely RAM resident, you are almost 100% guaranteed a big
> performance boost.
> The exception is an OLTP like app where DB writes can't be done
> a-synchronously (doing financial transactions, real time control systems, etc).
> Data mines should never have this issue.
>
> 3= Whether adding enough RAM to make the DB RAM resident (and
> re-configuring conf, etc, appropriately) solves the problem or not,
> you will have gotten a serious lead as to what's wrong.
>
> ...and I still think looking closely at the actual physical layout of
> the tables in the SAN is likely to be worth it.
>


How would I go about doing that?

Thanks,

Alex

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #36 (permalink)  
Old 04-19-2008, 10:19 AM
Ron
 
Posts: n/a
Default Re: strange performance regression between 7.4 and8.1

At 02:43 PM 3/2/2007, Alex Deucher wrote:
>On 3/2/07, Ron <rjpeace@earthlink.net> wrote:
>>
>>...and I still think looking closely at the actual physical layout of
>>the tables in the SAN is likely to be worth it.

>
>How would I go about doing that?
>
>Alex


Hard for me to give specific advice when I don't know what SAN
product we are talking about nor what kind of HDs are in it nor how
those HDs are presently configured...

I quote you in an earlier post:
"The RAID groups on the SAN were set up for maximum capacity rather
than for performance. Using it for the databases just came up recently."

That implies to me that the SAN is more or less set up as a huge 105
HD (assuming this number is correct? We all know how "assume" is
spelled...) JBOD or RAID 5 (or 6, or 5*, or 6*) set.

=IF= that is true, tables are not being given dedicated RAID
groups. That implies that traditional lore like having pg_xlog on
dedicated spindles is being ignored.
Nor is the more general Best Practice of putting the most heavily
used tables onto dedicated spindles being followed.

In addition, the most space efficient RAID levels: 5* or 6*, are not
the best performing one (RAID 10 striping your mirrors)

In short, configuring a SAN for maximum capacity is exactly the wrong
thing to do if one is planning to use it in the best way to support
DB performance.

I assume (there's that word again...) that there is someone in your
organization who understands how the SAN is configured and administered.
You need to talk to them about these issues.

Cheers,
Ron



---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #37 (permalink)  
Old 04-19-2008, 10:19 AM
Alex Deucher
 
Posts: n/a
Default Re: strange performance regression between 7.4 and 8.1

On 3/2/07, Ron <rjpeace@earthlink.net> wrote:
> At 02:43 PM 3/2/2007, Alex Deucher wrote:
> >On 3/2/07, Ron <rjpeace@earthlink.net> wrote:
> >>
> >>...and I still think looking closely at the actual physical layout of
> >>the tables in the SAN is likely to be worth it.

> >
> >How would I go about doing that?
> >
> >Alex

>
> Hard for me to give specific advice when I don't know what SAN
> product we are talking about nor what kind of HDs are in it nor how
> those HDs are presently configured...
>
> I quote you in an earlier post:
> "The RAID groups on the SAN were set up for maximum capacity rather
> than for performance. Using it for the databases just came up recently."
>
> That implies to me that the SAN is more or less set up as a huge 105
> HD (assuming this number is correct? We all know how "assume" is
> spelled...) JBOD or RAID 5 (or 6, or 5*, or 6*) set.
>
> =IF= that is true, tables are not being given dedicated RAID
> groups. That implies that traditional lore like having pg_xlog on
> dedicated spindles is being ignored.
> Nor is the more general Best Practice of putting the most heavily
> used tables onto dedicated spindles being followed.
>
> In addition, the most space efficient RAID levels: 5* or 6*, are not
> the best performing one (RAID 10 striping your mirrors)
>
> In short, configuring a SAN for maximum capacity is exactly the wrong
> thing to do if one is planning to use it in the best way to support
> DB performance.
>
> I assume (there's that word again...) that there is someone in your
> organization who understands how the SAN is configured and administered.
> You need to talk to them about these issues.
>


Ah OK. I see what you are saying; thank you for clarifying. Yes,
the SAN is configured for maximum capacity; it has large RAID 5
groups. As I said earlier, we never intended to run a DB on the SAN,
it just happened to come up, hence the configuration.

Alex

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #38 (permalink)  
Old 04-19-2008, 10:19 AM
Guido Neitzer
 
Posts: n/a
Default Re: strange performance regression between 7.4 and 8.1

On 02.03.2007, at 14:20, Alex Deucher wrote:

> Ah OK. I see what you are saying; thank you for clarifying. Yes,
> the SAN is configured for maximum capacity; it has large RAID 5
> groups. As I said earlier, we never intended to run a DB on the SAN,
> it just happened to come up, hence the configuration.


So why not dumping the stuff ones, importing into a PG configured to
use local discs (Or even ONE local disc, you might have the 16GB you
gave as a size for the db on the local machine, right?) and testing
whether the problem is with PG connecting to the SAN. So you have one
factor less to consider after all your changes.

Maybe it's just that something in the chain from PG to the actual HD
spindles kills your random access performance for getting the actual
rows.

cug

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #39 (permalink)  
Old 04-19-2008, 10:19 AM
Jeff Frost
 
Posts: n/a
Default Re: strange performance regression between 7.4 and 8.1

On Fri, 2 Mar 2007, Guido Neitzer wrote:

> On 02.03.2007, at 14:20, Alex Deucher wrote:
>
>> Ah OK. I see what you are saying; thank you for clarifying. Yes,
>> the SAN is configured for maximum capacity; it has large RAID 5
>> groups. As I said earlier, we never intended to run a DB on the SAN,
>> it just happened to come up, hence the configuration.

>
> So why not dumping the stuff ones, importing into a PG configured to use
> local discs (Or even ONE local disc, you might have the 16GB you gave as a
> size for the db on the local machine, right?) and testing whether the problem
> is with PG connecting to the SAN. So you have one factor less to consider
> after all your changes.
>
> Maybe it's just that something in the chain from PG to the actual HD spindles
> kills your random access performance for getting the actual rows.


I am actually starting to think that the SAN may be introducing some amount of
latency that is enough to kill your random IO which is what all of the queries
in question are doing - look up in index - fetch row from table.

If you have the time, it would be totally worth it to test with a local disk
and see how that affects the speed.

I would think that even with RAID5, a SAN with that many spindles would be
quite fast in raw throughput, but perhaps it's just seek latency that's
killing you.

When you run the bonnie tests again, take note of what the seeks/sec is
compared with the old disk. Also, you should run bonnie with the -b switch to
see if that causes significant slowdown of the writes...maybe minor synced
write activity to pg_xlog is bogging the entire system down. Is the system
spending most of its time in IO wait?

Also, another item of note might be the actual on disk DB size..I wonder if it
has changed significantly going from SQL_ASCII to UTF8.

In 8.1 you can do this:

SELECT datname,
pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database;

In 7.4, you'll need to install the dbsize contrib module to get the same info.

--
Jeff Frost, Owner <jeff@frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #40 (permalink)  
Old 04-19-2008, 10:19 AM
Alex Deucher
 
Posts: n/a
Default Re: strange performance regression between 7.4 and 8.1

On 3/2/07, Jeff Frost <jeff@frostconsultingllc.com> wrote:
> On Fri, 2 Mar 2007, Guido Neitzer wrote:
>
> > On 02.03.2007, at 14:20, Alex Deucher wrote:
> >
> >> Ah OK. I see what you are saying; thank you for clarifying. Yes,
> >> the SAN is configured for maximum capacity; it has large RAID 5
> >> groups. As I said earlier, we never intended to run a DB on the SAN,
> >> it just happened to come up, hence the configuration.

> >
> > So why not dumping the stuff ones, importing into a PG configured to use
> > local discs (Or even ONE local disc, you might have the 16GB you gave as a
> > size for the db on the local machine, right?) and testing whether the problem
> > is with PG connecting to the SAN. So you have one factor less to consider
> > after all your changes.
> >
> > Maybe it's just that something in the chain from PG to the actual HD spindles
> > kills your random access performance for getting the actual rows.

>
> I am actually starting to think that the SAN may be introducing some amount of
> latency that is enough to kill your random IO which is what all of the queries
> in question are doing - look up in index - fetch row from table.
>
> If you have the time, it would be totally worth it to test with a local disk
> and see how that affects the speed.
>
> I would think that even with RAID5, a SAN with that many spindles would be
> quite fast in raw throughput, but perhaps it's just seek latency that's
> killing you.
>
> When you run the bonnie tests again, take note of what the seeks/sec is
> compared with the old disk. Also, you should run bonnie with the -b switch to
> see if that causes significant slowdown of the writes...maybe minor synced
> write activity to pg_xlog is bogging the entire system down. Is the system
> spending most of its time in IO wait?
>
> Also, another item of note might be the actual on disk DB size..I wonder if it
> has changed significantly going from SQL_ASCII to UTF8.
>
> In 8.1 you can do this:
>
> SELECT datname,
> pg_size_pretty(pg_database_size(datname)) AS size
> FROM pg_database;
>
> In 7.4, you'll need to install the dbsize contrib module to get the same info.
>


I'm beginning the think the same thing. I'm planning to try the tests
above next week. I'll let you know what I find out.

Thanks!

Alex

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:02 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com