Unix Technical Forum

Very long SQL strings

This is a discussion on Very long SQL strings within the Pgsql Performance forums, part of the PostgreSQL category; --> I can't seem to find a definitive answer to this. It looks like Postgres does not enforce a limit ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-19-2008, 11:04 AM
Steven Flatt
 
Posts: n/a
Default Very long SQL strings

I can't seem to find a definitive answer to this.

It looks like Postgres does not enforce a limit on the length of an SQL
string. Great. However is there some point at which a query string becomes
ridiculously too long and affects performance? Here's my particular case:
consider an INSERT statement where you're using the new multi-row VALUES
clause or SELECT ... UNION ALL to group together tuples. Is it always
better to group as many together as possible?

For example, on a toy table with two columns, I noticed about a 20% increase
when bulking together 1000 tuples in one INSERT statement as opposed to
doing 1000 individual INSERTS. Would this be the same for 10000? 100000?
Does it depend on the width of the tuples or the data types?

Are there any values A and B such that grouping together A tuples and B
tuples separately and running two statements, will be faster than grouping
A+B tuples in one statement?

Steve

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-19-2008, 11:05 AM
Heikki Linnakangas
 
Posts: n/a
Default Re: Very long SQL strings

Steven Flatt wrote:
> It looks like Postgres does not enforce a limit on the length of an SQL
> string. Great. However is there some point at which a query string
> becomes
> ridiculously too long and affects performance? Here's my particular case:
> consider an INSERT statement where you're using the new multi-row VALUES
> clause or SELECT ... UNION ALL to group together tuples. Is it always
> better to group as many together as possible?


I'm sure you'll reach a point of diminishing returns, and eventually a
ceiling where you run out of memory etc, but I don't know what the limit
would be.

The most efficient way to do bulk inserts is to stream the data with COPY.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-19-2008, 11:05 AM
Andreas Kretschmer
 
Posts: n/a
Default Re: Very long SQL strings

Steven Flatt <steven.flatt@gmail.com> schrieb:
> For example, on a toy table with two columns, I noticed about a 20% increase
> when bulking together 1000 tuples in one INSERT statement as opposed to doing
> 1000 individual INSERTS. Would this be the same for 10000? 100000? Does it
> depend on the width of the tuples or the data types?


I guess you can obtain the same if you pack all INSERTs into one
transaction.

And, faster than INSERT: COPY.


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknow)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-19-2008, 11:05 AM
Tom Lane
 
Posts: n/a
Default Re: Very long SQL strings

"Steven Flatt" <steven.flatt@gmail.com> writes:
> It looks like Postgres does not enforce a limit on the length of an SQL
> string. Great. However is there some point at which a query string becomes
> ridiculously too long and affects performance?


Yes, but it'll depend a whole lot on context; I'd suggest
experimentation if you want to derive a number for your particular
situation. For starters, whether you are on 32- or 64-bit hardware
is hugely relevant.

FYI, when we developed multi-row-VALUES quite a bit of thought was
put into maintaining performance with lots of rows, and IIRC we saw
reasonable performance up into the tens of thousands of rows (depending
on how wide the rows are). Other ways of making a query long, such as
lots of WHERE clauses, might send performance into the tank a lot
quicker.

So the short answer is it all depends.

regards, tom lane

PS: for the record, there is a hard limit at 1GB of query text, owing
to restrictions built into palloc. But I think you'd hit other
memory limits or performance bottlenecks before that one.

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 04-19-2008, 11:05 AM
Chris Browne
 
Posts: n/a
Default Re: Very long SQL strings

tgl@sss.pgh.pa.us (Tom Lane) writes:
> PS: for the record, there is a hard limit at 1GB of query text, owing
> to restrictions built into palloc. But I think you'd hit other
> memory limits or performance bottlenecks before that one.


It would be much funnier to set a hard limit of 640K of query text.
The reasoning should be obvious :-).

I once ran into the situation where Slony-I generated a query that
made the parser blow out (some sort of memory problem / running out of
stack space somewhere thing); it was just short of 640K long, and so
we figured that evidently it was wrong to conclude that "640K ought to
be enough for anybody."

Neil Conway was an observer; he was speculating that, with some
(possibly nontrivial) change to the parser, we should have been able
to cope with it.

The query consisted mostly of a NOT IN clause where the list had some
atrocious number of entries in it (all integers).

(Aside: I wound up writing a "query compressor" (now in 1.2) which
would read that list and, if it was at all large, try to squeeze any
sets of consecutive integers into sets of "NOT BETWEEN" clauses.
Usually, the lists, of XIDs, were more or less consecutive, and
frequently, in the cases where the query got to MBs in size, there
would be sets of hundreds or even thousands of consecutive integers
such that we'd be left with a tiny query after this...)
--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://linuxfinances.info/info/linux.html
As of next Monday, MACLISP will no longer support list structure.
Please downgrade your programs.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 04-19-2008, 11:05 AM
Steven Flatt
 
Posts: n/a
Default Re: Very long SQL strings

Thanks everyone for your responses. I don't think it's realistic to change
our application infrastructure to use COPY from a stream at this point.
It's good to know that multi-row-VALUES is good up into the thousands of
rows (depending on various things, of course). That's a good enough answer
for what I was looking for and we can revisit this if performance does start
to hurt.

On 6/21/07, Andreas Kretschmer <akretschmer@spamfence.net> wrote:
>
> I guess you can obtain the same if you pack all INSERTs into one
> transaction.



Well the 20% gain I referred to was when all individual INSERTs were within
one transaction. When each INSERT does its own commit, it's significantly
slower.

Steve

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 04-19-2008, 11:05 AM
Tom Lane
 
Posts: n/a
Default Re: Very long SQL strings

Chris Browne <cbbrowne@acm.org> writes:
> I once ran into the situation where Slony-I generated a query that
> made the parser blow out (some sort of memory problem / running out of
> stack space somewhere thing); it was just short of 640K long, and so
> we figured that evidently it was wrong to conclude that "640K ought to
> be enough for anybody."


> Neil Conway was an observer; he was speculating that, with some
> (possibly nontrivial) change to the parser, we should have been able
> to cope with it.


> The query consisted mostly of a NOT IN clause where the list had some
> atrocious number of entries in it (all integers).


FWIW, we do seem to have improved that as of 8.2. Assuming your entries
were 6-or-so-digit integers, that would have been on the order of 80K
entries, and we can manage it --- not amazingly fast, but it doesn't
blow out the stack anymore.

> (Aside: I wound up writing a "query compressor" (now in 1.2) which
> would read that list and, if it was at all large, try to squeeze any
> sets of consecutive integers into sets of "NOT BETWEEN" clauses.
> Usually, the lists, of XIDs, were more or less consecutive, and
> frequently, in the cases where the query got to MBs in size, there
> would be sets of hundreds or even thousands of consecutive integers
> such that we'd be left with a tiny query after this...)


Probably still a win.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 06:17 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com