Unix Technical Forum

general PG network slowness (possible cure) (repost)

This is a discussion on general PG network slowness (possible cure) (repost) within the Pgsql Performance forums, part of the PostgreSQL category; --> I set up pg to replace a plain gdbm database for my application. But even running to the same ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 10:51 AM
Peter T. Breuer
 
Posts: n/a
Default general PG network slowness (possible cure) (repost)


I set up pg to replace a plain gdbm database for my application. But
even running to the same machine, via a unix socket

* the pg database ran 100 times slower

Across the net it was

* about 500 to 1000 times slower than local gdbm

with no cpu use to speak of.

I'd heard that networked databases are slow. I might have left it at
that if curiosity hadn't led me to write a network server for gdbm
databases, and talk to _that_ just to get a comparison.

Lo and behold and smack me with a corncob if it wasn't _slower_ than pg.

On a whim I mapped the network bandwidth per packet size with the NPtcp
suite, and got surprising answers .. at 1500B, naturally, the bandwidth
was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little
local net. At 100B the bandwidth available was only 25Kb/s. At 10B,
you might as well use tin cans and taut string instead.

I also mapped the network flows using ntop, and yes, the average packet
size for both gdbm and pg in one direction was only about 100B or
so. That's it! Clearly there are a lot of short queries going out and
the answers were none too big either ( I had a LIMIT 1 in all my PG
queries).

About 75% of traffic was in the 64-128B range while my application was
running, with the peak bandwidth in that range being about 75-125Kb/s
(and I do mean bits, not bytes).

Soooo ... I took a look at my implementation of remote gdbm, and did
a very little work to aggregate outgoing transmissions together into
lumps. Three lines added in two places. At the level of the protocol
where I could tell how long the immediate conversation segment would be,
I "corked" the tcp socket before starting the segment and "uncorked" it
after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in
linux).

Surprise, ... I got a speed up of hundreds of times. The same application
that crawled under my original rgdbm implementation and under PG now
maxed out the network bandwidth at close to a full 10Mb/s and 1200
pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz
server.

So

* Is that what is holding up postgres over the net too? Lots of tiny
packets?

And if so

* can one fix it the way I fixed it for remote gdbm?

The speedup was hundreds of times. Can someone point me at the relevant
bits of pg code? A quick look seems to say that fe-*.c is
interesting. I need to find where the actual read and write on the
conn->sock is done.

Very illuminating gnuplot outputs available on request.

Peter


---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 10:51 AM
Richard Huxton
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

Peter T. Breuer wrote:
> I set up pg to replace a plain gdbm database for my application. But
> even running to the same machine, via a unix socket
>
> * the pg database ran 100 times slower


For what operations? Bulk reads? 19-way joins?

> Across the net it was
>
> * about 500 to 1000 times slower than local gdbm
>
> with no cpu use to speak of.


Disk-intensive or memory intensive?

> I'd heard that networked databases are slow. I might have left it at
> that if curiosity hadn't led me to write a network server for gdbm
> databases, and talk to _that_ just to get a comparison.
>
> Lo and behold and smack me with a corncob if it wasn't _slower_ than pg.
>
> On a whim I mapped the network bandwidth per packet size with the NPtcp
> suite, and got surprising answers .. at 1500B, naturally, the bandwidth
> was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little
> local net. At 100B the bandwidth available was only 25Kb/s. At 10B,
> you might as well use tin cans and taut string instead.


This sounds like you're testing a single connection. You would expect
"dead time" to dominate in that scenario. What happens when you have 50
simultaneous connections? Or do you think it's just packet overhead?

> I also mapped the network flows using ntop, and yes, the average packet
> size for both gdbm and pg in one direction was only about 100B or
> so. That's it! Clearly there are a lot of short queries going out and
> the answers were none too big either ( I had a LIMIT 1 in all my PG
> queries).


I'm not sure that 100B query-results are usually the bottleneck.
Why would you have LIMIT 1 on all your queries?

> About 75% of traffic was in the 64-128B range while my application was
> running, with the peak bandwidth in that range being about 75-125Kb/s
> (and I do mean bits, not bytes).


None of this sounds like typical database traffic to me. Yes, there are
lots of small result-sets, but there are also typically larger (several
kilobytes) to much larger (10s-100s KB).

> Soooo ... I took a look at my implementation of remote gdbm, and did
> a very little work to aggregate outgoing transmissions together into
> lumps. Three lines added in two places. At the level of the protocol
> where I could tell how long the immediate conversation segment would be,
> I "corked" the tcp socket before starting the segment and "uncorked" it
> after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in
> linux).


I'm a bit puzzled, because I'd have thought the standard Nagle algorithm
would manage this gracefully enough for short-query cases. There's no
way (that I know of) for a backend to handle more than one query at a time.

> Surprise, ... I got a speed up of hundreds of times. The same application
> that crawled under my original rgdbm implementation and under PG now
> maxed out the network bandwidth at close to a full 10Mb/s and 1200
> pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz
> server.
>
> So
>
> * Is that what is holding up postgres over the net too? Lots of tiny
> packets?


I'm not sure your setup is typical, interesting though the figures are.
Google a bit for pg_bench perhaps and see if you can reproduce the
effect with a more typical load. I'd be interested in being proved wrong.

> And if so
>
> * can one fix it the way I fixed it for remote gdbm?
>
> The speedup was hundreds of times. Can someone point me at the relevant
> bits of pg code? A quick look seems to say that fe-*.c is
> interesting. I need to find where the actual read and write on the
> conn->sock is done.


You'll want to look in backend/libpq and interfaces/libpq I think
(although I'm not a developer).

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 10:51 AM
Steinar H. Gunderson
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

On Fri, May 25, 2007 at 10:50:58AM +0200, Peter T. Breuer wrote:
> I set up pg to replace a plain gdbm database for my application.


Postgres and gdbm are completely different. You want to rethink your queries
so each does more work, instead of running a zillion of them over the network.

/* Steinar */
--
Homepage: http://www.sesse.net/

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 10:51 AM
Peter T. Breuer
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

"Also sprach Richard Huxton:"
[Charset ISO-8859-1 unsupported, filtering to ASCII...]
> Peter T. Breuer wrote:
> > I set up pg to replace a plain gdbm database for my application. But
> > even running to the same machine, via a unix socket
> >
> > * the pg database ran 100 times slower

>
> For what operations? Bulk reads? 19-way joins?


The only operations being done are simple "find the row with this key",
or "update the row with this key". That's all. The queries are not an
issue (though why the PG thread choose to max out cpu when it gets the
chance to do so through a unix socket, I don't know).

> > Across the net it was
> >
> > * about 500 to 1000 times slower than local gdbm
> >
> > with no cpu use to speak of.

>
> Disk-intensive or memory intensive?


There is no disk as such... it's running on a ramdisk at the server
end. But assuming you mean i/o, i/o was completely stalled. Everything
was idle, all waiting on the net.

> > On a whim I mapped the network bandwidth per packet size with the NPtcp
> > suite, and got surprising answers .. at 1500B, naturally, the bandwidth
> > was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little
> > local net. At 100B the bandwidth available was only 25Kb/s. At 10B,
> > you might as well use tin cans and taut string instead.

>
> This sounds like you're testing a single connection. You would expect
> "dead time" to dominate in that scenario. What happens when you have 50


Indeed, it is single, because that's my application. I don't have
50 simultaneous connections. The use of the database is as a permanent
storage area for the results of previous analyses (static analysis of
the linux kernel codes) from a single client.

Multiple threads accessing at the same time might help keep the network
drivers busier, which would help. They would always see their buffers
filling at an even rate and be able to send out groups of packets at
once.

> simultaneous connections? Or do you think it's just packet overhead?


It's not quite overhead in the sense of the logical layer. It's a
physical layer thing. I replied in another mail on this thread, but in
summary, tcp behaves badly with small packets on ethernet, even on a
dedicated line (as this was). One needs to keep it on a tight rein.

> > I also mapped the network flows using ntop, and yes, the average packet
> > size for both gdbm and pg in one direction was only about 100B or
> > so. That's it! Clearly there are a lot of short queries going out and
> > the answers were none too big either ( I had a LIMIT 1 in all my PG
> > queries).

>
> I'm not sure that 100B query-results are usually the bottleneck.
> Why would you have LIMIT 1 on all your queries?


Because there is always only one answer to the query, according to the
logic. So I can always tell the database manager to stop looking after
one, which will always help it.

> > About 75% of traffic was in the 64-128B range while my application was
> > running, with the peak bandwidth in that range being about 75-125Kb/s
> > (and I do mean bits, not bytes).

>
> None of this sounds like typical database traffic to me. Yes, there are
> lots of small result-sets, but there are also typically larger (several
> kilobytes) to much larger (10s-100s KB).


There's none here.

> > Soooo ... I took a look at my implementation of remote gdbm, and did
> > a very little work to aggregate outgoing transmissions together into
> > lumps. Three lines added in two places. At the level of the protocol
> > where I could tell how long the immediate conversation segment would be,
> > I "corked" the tcp socket before starting the segment and "uncorked" it
> > after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in
> > linux).

>
> I'm a bit puzzled, because I'd have thought the standard Nagle algorithm
> would manage this gracefully enough for short-query cases. There's no


On the contrary, Nagle is also often wrong here because it will delay
sending in order to accumulate more data into buffers when only a little
has arrived, then give up when no more data arrives to be sent out, then
send out the (short) packet anyway, late. There's no other traffic
apart from my (single thread) application.

What we want is to direct the sending exactly,n this situation saying
when to not send, and when to send. Disable Nagle for a start, use
async read (noblock), and sync write, with sends from the socket blocked
from initiation of a message until the whole message is ready to be sent
out. Sending the message piecemeal just hurts too.

> way (that I know of) for a backend to handle more than one query at a time.


That's not the scenario.

> > Surprise, ... I got a speed up of hundreds of times. The same application
> > that crawled under my original rgdbm implementation and under PG now
> > maxed out the network bandwidth at close to a full 10Mb/s and 1200
> > pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz
> > server.
> >
> > So
> >
> > * Is that what is holding up postgres over the net too? Lots of tiny
> > packets?

>
> I'm not sure your setup is typical, interesting though the figures are.
> Google a bit for pg_bench perhaps and see if you can reproduce the
> effect with a more typical load. I'd be interested in being proved wrong.


But the load is typical HERE. The application works well against gdbm
and I was hoping to see speedup from using a _real_ full-fledged DB
instead.

Well, at least it's very helpful for debugging.

> > And if so
> >
> > * can one fix it the way I fixed it for remote gdbm?
> >
> > The speedup was hundreds of times. Can someone point me at the relevant
> > bits of pg code? A quick look seems to say that fe-*.c is
> > interesting. I need to find where the actual read and write on the
> > conn->sock is done.

>
> You'll want to look in backend/libpq and interfaces/libpq I think
> (although I'm not a developer).


I'll look around there. Specific directions are greatly
appreciated.

Thanks.

Peter

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 10:51 AM
Richard Huxton
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

Peter T. Breuer wrote:
>
> The only operations being done are simple "find the row with this key",
> or "update the row with this key". That's all. The queries are not an
> issue (though why the PG thread choose to max out cpu when it gets the
> chance to do so through a unix socket, I don't know).


> There is no disk as such... it's running on a ramdisk at the server
> end. But assuming you mean i/o, i/o was completely stalled. Everything
> was idle, all waiting on the net.


> Indeed, it is single, because that's my application. I don't have
> 50 simultaneous connections. The use of the database is as a permanent
> storage area for the results of previous analyses (static analysis of
> the linux kernel codes) from a single client.


>> I'm not sure your setup is typical, interesting though the figures are.
>> Google a bit for pg_bench perhaps and see if you can reproduce the
>> effect with a more typical load. I'd be interested in being proved wrong.

>
> But the load is typical HERE. The application works well against gdbm
> and I was hoping to see speedup from using a _real_ full-fledged DB
> instead.


I'm not sure you really want a full RDBMS. If you only have a single
connection and are making basic key-lookup queries then 90% of
PostgreSQL's code is just getting in your way. Sounds to me like gdbm
(or one of its alternatives) is a good match for you. Failing that,
sqlite is probably the next lowest-overhead solution.

Of course, if you want to have multiple clients interacting and
performing complex 19-way joins on gigabyte-sized tables with full-text
indexing and full transaction control then you *do* want a RDBMS.

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 10:51 AM
Peter T. Breuer
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

"Also sprach Richard Huxton:"
> I'm not sure you really want a full RDBMS. If you only have a single
> connection and are making basic key-lookup queries then 90% of
> PostgreSQL's code is just getting in your way. Sounds to me like gdbm


Yep - I could happily tell it not to try and compile a special lookup
scheme each time, for example! (how that?). I could presumably also
help it by preloading the commands I will run and sending over the
params only with a "do a no. 17 now!".

> (or one of its alternatives) is a good match for you. Failing that,
> sqlite is probably the next lowest-overhead solution.


Not a bad idea. but PG _will_ be useful when folk come to analyse the
result of the analyses being done. What is slow is getting the data
into the database now via simple store, fetch and update.

> Of course, if you want to have multiple clients interacting and
> performing complex 19-way joins on gigabyte-sized tables with full-text


Well, the dbs are in the tens of MB from a single run over a single
file (i.e analysis of a single 30KLOC source). The complete analysis
space is something like 4000 times that, for 4300 C files in the linux
kernel source. And then there is all the linux kernel versions. Then
there is godzilla and apache source ..

> indexing and full transaction control then you *do* want a RDBMS.


We want one anyway. The problem is filling the data and the simple
fetch and update queries on it.

I really think it would be worthwhile getting some developer to tell me
where the network send is done in PG.

Peter

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 10:51 AM
Tom Lane
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

"Peter T. Breuer" <ptb@inv.it.uc3m.es> writes:
> Soooo ... I took a look at my implementation of remote gdbm, and did
> a very little work to aggregate outgoing transmissions together into
> lumps.


We do that already --- for a simple query/response such as you are
describing, each query cycle will involve one physical client->server
message followed by one physical server->client message. The only way
to aggregate more is for the application code to merge queries together.

Migrating a dbm-style application to a SQL database is often a real
pain, precisely because the application is designed to a mindset of
"fetch one record, manipulate it, update it", where "fetch" and "update"
are assumed to be too stupid to do any of the work for you. The way
to get high performance with a SQL engine is to push as much of the work
as you can to the database side, and let the engine process multiple
records per query; and that can easily mean rewriting the app from the
ground up :-(

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 04-19-2008, 10:51 AM
Richard Huxton
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

Peter T. Breuer wrote:
> "Also sprach Richard Huxton:"
>> I'm not sure you really want a full RDBMS. If you only have a single
>> connection and are making basic key-lookup queries then 90% of
>> PostgreSQL's code is just getting in your way. Sounds to me like gdbm

>
> Yep - I could happily tell it not to try and compile a special lookup
> scheme each time, for example! (how that?). I could presumably also
> help it by preloading the commands I will run and sending over the
> params only with a "do a no. 17 now!".


PREPARE/EXECUTE (or the equivalent libpq functions).
Also - if you can have multiple connections to the DB you should be able
to have several queries running at once.

>> (or one of its alternatives) is a good match for you. Failing that,
>> sqlite is probably the next lowest-overhead solution.

>
> Not a bad idea. but PG _will_ be useful when folk come to analyse the
> result of the analyses being done. What is slow is getting the data
> into the database now via simple store, fetch and update.


I'd have an hourly/daily bulk-load running from the simple system into
PG. If you have to search all the data from your app that's not
practical of course.

>> Of course, if you want to have multiple clients interacting and
>> performing complex 19-way joins on gigabyte-sized tables with full-text

>
> Well, the dbs are in the tens of MB from a single run over a single
> file (i.e analysis of a single 30KLOC source). The complete analysis
> space is something like 4000 times that, for 4300 C files in the linux
> kernel source. And then there is all the linux kernel versions. Then
> there is godzilla and apache source ..


If you're doing some sort of token analysis on source-code you probably
want to look into how tsearch2 / trigram / Gist+GIN indexes work. It
might be that you're doing work in your app that the DB can handle for you.

>> indexing and full transaction control then you *do* want a RDBMS.

>
> We want one anyway. The problem is filling the data and the simple
> fetch and update queries on it.


OK

> I really think it would be worthwhile getting some developer to tell me
> where the network send is done in PG.


--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 04-19-2008, 10:51 AM
Alvaro Herrera
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

Peter T. Breuer escribió:

> I really think it would be worthwhile getting some developer to tell me
> where the network send is done in PG.


See src/backend/libpq/pqcomm.c (particularly internal_flush()).

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 04-19-2008, 10:51 AM
Peter T. Breuer
 
Posts: n/a
Default Re: general PG network slowness (possible cure) (repost)

"Also sprach Alvaro Herrera:"
> > I really think it would be worthwhile getting some developer to tell me
> > where the network send is done in PG.

>
> See src/backend/libpq/pqcomm.c (particularly internal_flush()).


Yes. Thanks. That looks like it. It calls secure_write continually
until the buffer is empty.

Secure_write is located ibe-secure.c, but I'm not using ssl, so the
call reduces to just

n = send(port->sock, ptr, len, 0);

And definitely all those could be grouped if there are several to do.
But under normal circumstances the send will be pushing against a
lttle resistance (the copy to the driver/protocol stack buffer is faster
than the physical network send, by a ratio of GB/s to MB/s, or 1000 to
1), and thus all these sends will probably complete as a single unit
once they have been started.

It's worth a try. I thought first this may be too low level, but it
looks as though internal_flush is only triggered when some other buffer
is full, or deliberately, so it may be useful to block until it fires.

I'll try it.


Peter

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:13 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com