Unix Technical Forum

New to PostgreSQL, performance considerations

This is a discussion on New to PostgreSQL, performance considerations within the Pgsql Performance forums, part of the PostgreSQL category; --> Hi yall, although I've worked with databases for more than 7 years now, I'm petty new to PostgreSQL. I ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 09:52 AM
Daniel van Ham Colchete
 
Posts: n/a
Default New to PostgreSQL, performance considerations

Hi yall,

although I've worked with databases for more than 7 years now, I'm
petty new to PostgreSQL.

I have an application using SQLite3 as an embedded SQL solution
because it's simple and it can handle the load that *most* of my
clients have.

Because of that '*most*' part, because of the client/server way and
because of the license, I'm think about start using PostgreSQL.

My app uses only three tables: one has low read and really high write
rates, a second has high read and low write and the third one is
equally high on both.

I need a db that can handle something like 500 operations/sec
continuously. It's something like 250 writes/sec and 250 reads/sec. My
databases uses indexes.

Each table would have to handle 5 million rows/day. So I'm thinking
about creating different tables (clusters?) to different days to make
queries return faster. Am I right or there is no problem in having a
150 million (one month) rows on a table?

All my data is e-mail traffic: user's quarentine, inbond traffic,
outbond traffic, sender, recipients, subjects, attachments, etc...

What do you people say, is it possible with PostgreSQL? What kind of
hardware would I need to handle that kind of traffic?

On a first test, at a badly tunned AMD Athlon XP 1800+ (ergh!) I could
do 1400 writes/sec locally after I disabled fsync. We have UPSs, in
the last year we only had 1 power failure.

Thank you all for your tips.

Best regards,
Daniel Colchete

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 09:52 AM
Shane Ambler
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations

Daniel van Ham Colchete wrote:
> although I've worked with databases for more than 7 years now, I'm
> petty new to PostgreSQL.


Same here.

> I need a db that can handle something like 500 operations/sec
> continuously. It's something like 250 writes/sec and 250 reads/sec. My
> databases uses indexes.


Taken from an email to the admin list about a week ago -

Stats about the system:
Postgres 8.1.4
db size: 200+ GB
Inheritance is used extremely heavily, so in figuring out what could
cause a create to hang, it may be of interest to know that there are:
101,745 tables
314,821 indexes
1,569 views
The last averages taken on the number of writes per hour on this
database: ~3 million (this stat is a few weeks old)

Machine info:
OS: Solaris 10
Sunfire X4100 XL
2x AMD Opteron Model 275 dual core procs
8GB of ram


> Each table would have to handle 5 million rows/day. So I'm thinking
> about creating different tables (clusters?) to different days to make
> queries return faster. Am I right or there is no problem in having a
> 150 million (one month) rows on a table?


Sounds to me that a month might be on the large size for real fast
response times - I would think of seperating weekly rather than daily.

Start with
http://www.postgresql.org/docs/8.2/i...l-inherit.html
then the next chapter explains using that to partition data into
different tables dependant on specified criteria.

You may be interested in tsearch2 which is in the contrib dir and adds
full text indexing.


--

Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 09:52 AM
Daniel van Ham Colchete
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations

Hi Gene,

at my postgresql.conf, the only non-comented lines are:
fsync = off
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
max_connections = 100
shared_buffers = 5000
temp_buffers = 1000
work_mem = 4096

The only two values I changed are shared_buffers and work_mem.

*** BUT ***
I'm using Gentoo Linux, so all my libraries (including glibc that is
very important to PostgreSQL), and all my softwares are compiled with
good CFLAG options to my processor ("-O2 march=athlon-xp (...)"). My
Linux is not an Intel-AMD binary compatible turtle like
Fedora/RedHat/SUSE/... It's really important to have your GLIBC
compiled for your processor. It is essencial for performance.

I can't imagine anyone buying a $1k-dollar quad-core XEON and using an
i585 compatible distro that doesn't even know what the fudge is
SSE/SSE2/vectorized instructions.

Best regards,
Daniel Colchete

On 12/10/06, Gene <genekhart@gmail.com> wrote:
> I have a similar type application, I'm partitioning using constraint
> exclusion so queries only have to look at a few tables. I've found that
> there is some overhead to using partitioning so you should test to see how
> many partitions you want to create. Could I check out you postgresql.conf
> parameters to compare? thanks
>
>
> Gene Hart


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 09:52 AM
Daniel van Ham Colchete
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations

Shane,

really good answer man!!! It really helped me!

I can't wait to have time to test this partitions thing. This will
really solve a lot of my problems (and another one that I was
delaying).

Instead of making partitions over days/weeks I can tune it to the
seconds (timestamp) and try to find the best partition size for
performance. That partition thing is really good.

That server you sent me handles about 15 times more writes than I would have to.

Thank you very much for your answer.

Best regards,
Daniel Colchete

On 12/10/06, Shane Ambler <pgsql@007marketing.com> wrote:
> Daniel van Ham Colchete wrote:
> > although I've worked with databases for more than 7 years now, I'm
> > petty new to PostgreSQL.

>
> Same here.
>
> > I need a db that can handle something like 500 operations/sec
> > continuously. It's something like 250 writes/sec and 250 reads/sec. My
> > databases uses indexes.

>
> Taken from an email to the admin list about a week ago -
>
> Stats about the system:
> Postgres 8.1.4
> db size: 200+ GB
> Inheritance is used extremely heavily, so in figuring out what could
> cause a create to hang, it may be of interest to know that there are:
> 101,745 tables
> 314,821 indexes
> 1,569 views
> The last averages taken on the number of writes per hour on this
> database: ~3 million (this stat is a few weeks old)
>
> Machine info:
> OS: Solaris 10
> Sunfire X4100 XL
> 2x AMD Opteron Model 275 dual core procs
> 8GB of ram
>
>
> > Each table would have to handle 5 million rows/day. So I'm thinking
> > about creating different tables (clusters?) to different days to make
> > queries return faster. Am I right or there is no problem in having a
> > 150 million (one month) rows on a table?

>
> Sounds to me that a month might be on the large size for real fast
> response times - I would think of seperating weekly rather than daily.
>
> Start with
> http://www.postgresql.org/docs/8.2/i...l-inherit.html
> then the next chapter explains using that to partition data into
> different tables dependant on specified criteria.
>
> You may be interested in tsearch2 which is in the contrib dir and adds
> full text indexing.
>
>
> --
>
> Shane Ambler
> pgSQL@007Marketing.com
>
> Get Sheeky @ http://Sheeky.Biz
>


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 09:52 AM
Daniel van Ham Colchete
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations

Gene,

at Postgres's docs they say that the "constraint checks are relatively
expensive". From what you're saying, it's really worth studying the
matter deply first.

I never understood what's the matter between the ASCII/ISO-8859-1/UTF8
charsets to a database. They're all simple C strings that doesn't have
the zero-byte in the midlle (like UTF16 would) and that doesn't
require any different processing unless you are doing case insensitive
search (them you would have a problem).

ASCII chars are also correct UTF8 chars as well. The first 127 Unicode
chars are the same as the ASCII chars. So you would not have any
problems changing your table from ASCII to UTF8. My software uses
UTF16 and UTF8 at some of it's internals and I only notice performance
problems with UTF16 (because of the zero-byte thing, the processing I
make is diferent). So, I imagine that you wouldn't have any
performance issues changing from ASCII to UTF8 if necessary.

Nowadays everything is turning to Unicode (thank god). I wouldn't
start anything with any other charset. I would only be asking for a
rewrite in a near future.

Best,
Daniel

On 12/10/06, Gene <genekhart@gmail.com> wrote:
> I'm using gentoo as well, I'm having performance issues as the number of
> partitions is increasing I imagine do due to overhead managing them and
> figuring out where to put each insert/update. I'm switching to weekly
> partitions instead of daily. I believe in PG8.2 constraint exclusion works
> with updates/deletes also so I'm eager to upgrade. I get about 1 million
> records per day in two tables each, each record updated about 4 times within
> 30 minutes.
>
> Do you think using UTF8 vs US-ASCII hurts performance signficantly, some of
> my smaller tables require unicode, and I don't think you can have some
> tables be unicode and some be ASCII.


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 09:52 AM
Gene
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations

I'm using gentoo as well, I'm having performance issues as the number of
partitions is increasing I imagine do due to overhead managing them and
figuring out where to put each insert/update. I'm switching to weekly
partitions instead of daily. I believe in PG8.2 constraint exclusion works
with updates/deletes also so I'm eager to upgrade. I get about 1 million
records per day in two tables each, each record updated about 4 times within
30 minutes.

Do you think using UTF8 vs US-ASCII hurts performance signficantly, some of
my smaller tables require unicode, and I don't think you can have some
tables be unicode and some be ASCII.

On 12/10/06, Daniel van Ham Colchete <daniel.colchete@gmail.com> wrote:
>
> Hi Gene,
>
> at my postgresql.conf, the only non-comented lines are:
> fsync = off
> lc_messages = 'C'
> lc_monetary = 'C'
> lc_numeric = 'C'
> lc_time = 'C'
> max_connections = 100
> shared_buffers = 5000
> temp_buffers = 1000
> work_mem = 4096
>
> The only two values I changed are shared_buffers and work_mem.
>
> *** BUT ***
> I'm using Gentoo Linux, so all my libraries (including glibc that is
> very important to PostgreSQL), and all my softwares are compiled with
> good CFLAG options to my processor ("-O2 march=athlon-xp (...)"). My
> Linux is not an Intel-AMD binary compatible turtle like
> Fedora/RedHat/SUSE/... It's really important to have your GLIBC
> compiled for your processor. It is essencial for performance.
>
> I can't imagine anyone buying a $1k-dollar quad-core XEON and using an
> i585 compatible distro that doesn't even know what the fudge is
> SSE/SSE2/vectorized instructions.
>
> Best regards,
> Daniel Colchete
>
> On 12/10/06, Gene <genekhart@gmail.com> wrote:
> > I have a similar type application, I'm partitioning using constraint
> > exclusion so queries only have to look at a few tables. I've found that
> > there is some overhead to using partitioning so you should test to see

> how
> > many partitions you want to create. Could I check out you

> postgresql.conf
> > parameters to compare? thanks
> >
> >
> > Gene Hart

>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>




--
Gene Hart
cell: 443-604-2679

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 09:52 AM
Michael Glaesemann
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations


On Dec 11, 2006, at 10:47 , Daniel van Ham Colchete wrote:

> I never understood what's the matter between the ASCII/ISO-8859-1/UTF8
> charsets to a database.


If what you mean by ASCII is SQL_ASCII, then there is at least one
significant difference between UTF8 (the PostgreSQL encoding) and
SQL_ASCII. AIUI, SQL_ASCII does no checking at all with respect to
what bytes are going in, while UTF8 does make sure to the best of its
ability that the bytes represent valid UTF-8 characters, throwing an
error if an invalid byte sequence is detected.

There's more information regarding this here:
http://www.postgresql.org/docs/8.2/interactive/
multibyte.html#MULTIBYTE-CHARSET-SUPPORTED

Michael Glaesemann
grzm seespotcode net



---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-19-2008, 09:52 AM
Michael Stone
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations

On Sun, Dec 10, 2006 at 11:02:44PM -0200, Daniel van Ham Colchete wrote:
>I'm using Gentoo Linux, so all my libraries (including glibc that is
>very important to PostgreSQL), and all my softwares are compiled with
>good CFLAG options to my processor ("-O2 march=athlon-xp (...)"). My
>Linux is not an Intel-AMD binary compatible turtle like
>Fedora/RedHat/SUSE/... It's really important to have your GLIBC
>compiled for your processor. It is essencial for performance.


Please, point to the benchmarks that demonstrate this for a postgres
application.

Mike Stone

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-19-2008, 09:52 AM
Dave Cramer
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations

Hi Daniel
On 10-Dec-06, at 8:02 PM, Daniel van Ham Colchete wrote:

> Hi Gene,
>
> at my postgresql.conf, the only non-comented lines are:
> fsync = off

This can, and will result in lost data.
> lc_messages = 'C'
> lc_monetary = 'C'
> lc_numeric = 'C'
> lc_time = 'C'


How much memory does this machine have and what version of postgresql
are you using?
> max_connections = 100
> shared_buffers = 5000
> temp_buffers = 1000
> work_mem = 4096
>
> The only two values I changed are shared_buffers and work_mem.


Dave
>
> *** BUT ***
> I'm using Gentoo Linux, so all my libraries (including glibc that is
> very important to PostgreSQL), and all my softwares are compiled with
> good CFLAG options to my processor ("-O2 march=athlon-xp (...)"). My
> Linux is not an Intel-AMD binary compatible turtle like
> Fedora/RedHat/SUSE/... It's really important to have your GLIBC
> compiled for your processor. It is essencial for performance.
>
> I can't imagine anyone buying a $1k-dollar quad-core XEON and using an
> i585 compatible distro that doesn't even know what the fudge is
> SSE/SSE2/vectorized instructions.
>
> Best regards,
> Daniel Colchete
>
> On 12/10/06, Gene <genekhart@gmail.com> wrote:
>> I have a similar type application, I'm partitioning using constraint
>> exclusion so queries only have to look at a few tables. I've found
>> that
>> there is some overhead to using partitioning so you should test to
>> see how
>> many partitions you want to create. Could I check out you
>> postgresql.conf
>> parameters to compare? thanks
>>
>>
>> Gene Hart

>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-19-2008, 09:52 AM
Alexander Staubo
 
Posts: n/a
Default Re: New to PostgreSQL, performance considerations

On Dec 11, 2006, at 02:47 , Daniel van Ham Colchete wrote:

> I never understood what's the matter between the ASCII/ISO-8859-1/UTF8
> charsets to a database. They're all simple C strings that doesn't have
> the zero-byte in the midlle (like UTF16 would) and that doesn't
> require any different processing unless you are doing case insensitive
> search (them you would have a problem).


That's not the whole story. UTF-8 and other variable-width encodings
don't provide a 1:1 mapping of logical characters to single bytes; in
particular, combination characters opens the possibility of multiple
different byte sequences mapping to the same code point; therefore,
string comparison in such encodings generally cannot be done at the
byte level (unless, of course, you first acertain that the strings
involved are all normalized to an unambiguous subset of your encoding).

PostgreSQL's use of strings is not limited to string comparison.
Substring extraction, concatenation, regular expression matching, up/
downcasing, tokenization and so on are all part of PostgreSQL's small
library of text manipulation functions, and all deal with logical
characters, meaning they must be Unicode-aware.

Alexander.

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:36 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com