Unix Technical Forum

Insertion to temp table deteriorating over time

This is a discussion on Insertion to temp table deteriorating over time within the Pgsql Performance forums, part of the PostgreSQL category; --> Our application is such that there is a great deal of activity at the beginning of the hour and ...


Go Back   Unix Technical Forum > Database Server Software > PostgreSQL > Pgsql Performance

FAQ Members List Calendar Search Today's Posts Mark Forums Read

Reply

 

LinkBack Thread Tools Display Modes
  #11 (permalink)  
Old 04-19-2008, 09:57 AM
Steven Flatt
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

Our application is such that there is a great deal of activity at the
beginning of the hour and minimal activity near the end of the hour. Those
3 vacuums were done at (approximately) 30 minutes past, 40 minutes past, and
50 minutes past the hour, during low activity. Vacuums of pg_class look
like they're being done on the hour. So it's not surprising that the first
vacuum found a lot of dead rows while the latter two found very few.

In fact, I just did another vacuum (about 30 minutes past the hour again)
and got:

INFO: "pg_class": found 5490 removable, 3263 nonremovable row versions in
171 pages
DETAIL: 0 dead row versions cannot be removed yet.

.... and clearly a vacuum was done under an hour ago.

The truncate and re-fill process is done once per hour, at the end of the
high-load cycle, so I doubt that's even a big contributor to the number of
removable rows in pg_class.

For this particular setup, we expect high load for 10-15 minutes at the
beginning of the hour, which is the case when a new connection is
initialized. After a day or so (as is happening right now), the high-load
period spills into the second half of the hour. Within 3-4 days, we start
spilling into the next hour and, as you can imagine, everything gets behind
and we spiral down from there. For now, our workaround is to manually kill
the connection every few days, but I would like a better solution than
setting up a cron job to do this!

Thanks again,
Steve


On 12/15/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Hm, look at the numbers of rows removed:
>
> > INFO: "pg_class": found 5680 removable, 3263 nonremovable row versions

> in
> > 625 pages
> > DETAIL: 0 dead row versions cannot be removed yet.

>
> > INFO: "pg_class": found 24 removable, 3263 nonremovable row versions in

> 625
> > pages
> > DETAIL: 0 dead row versions cannot be removed yet.

>
> > INFO: "pg_class": found 0 removable, 3263 nonremovable row versions in

> 625
> > pages
> > DETAIL: 0 dead row versions cannot be removed yet.

>
> The lack of unremovable dead rows is good, but why were there so many
> dead rows the first time? You didn't say what the cycle time is on your
> truncate-and-refill process, but the last two suggest that the average
> rate of accumulation of dead pg_class rows is only a couple per minute,
> in which case it's been a lot longer than an hour since the previous
> VACUUM of pg_class. I'm back to suspecting that you don't vacuum
> pg_class regularly. You mentioned having an hourly cron job to fire off
> vacuums ... are you sure it's run as a database superuser?
>
> regards, tom lane
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #12 (permalink)  
Old 04-19-2008, 09:57 AM
Tom Lane
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

"Steven Flatt" <steven.flatt@gmail.com> writes:
> Our application is such that there is a great deal of activity at the
> beginning of the hour and minimal activity near the end of the hour.


OK ...

> The truncate and re-fill process is done once per hour, at the end of the
> high-load cycle, so I doubt that's even a big contributor to the number of
> removable rows in pg_class.


Oh, then where *are* the removable rows coming from? At this point I
think that the truncate/refill thing is not the culprit, or at any rate
is only one part of a problematic usage pattern that we don't see all of
yet.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #13 (permalink)  
Old 04-19-2008, 09:57 AM
Steven Flatt
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

Good question, and I agree with your point.

Are the removable rows in pg_class even an issue? So what if 5000-6000 dead
tuples are generated every hour then vacuumed? Performance continues to
steadily decline over a few days time. Memory usage does not appear to be
bloating. Open file handles remain fairly fixed. Is there anything else I
can monitor (perhaps something to do with the odbc connection) that I could
potentially correlate with the degrading performance?

Steve


On 12/15/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Oh, then where *are* the removable rows coming from? At this point I
> think that the truncate/refill thing is not the culprit, or at any rate
> is only one part of a problematic usage pattern that we don't see all of
> yet.
>
> regards, tom lane
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #14 (permalink)  
Old 04-19-2008, 09:57 AM
Tom Lane
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

"Steven Flatt" <steven.flatt@gmail.com> writes:
> Are the removable rows in pg_class even an issue? So what if 5000-6000 dead
> tuples are generated every hour then vacuumed? Performance continues to
> steadily decline over a few days time. Memory usage does not appear to be
> bloating. Open file handles remain fairly fixed. Is there anything else I
> can monitor (perhaps something to do with the odbc connection) that I could
> potentially correlate with the degrading performance?


At this point I think the most productive thing for you to do is to try
to set up a self-contained test case that reproduces the slowdown. That
would allow you to poke at it without disturbing your production system,
and would let other people look at it too. From what you've said, I'd
try a simple little program that inserts some data into a temp table,
truncates the table, and repeats, as fast as it can, using the same SQL
commands as your real code and similar but dummy data. It shouldn't
take long to observe the slowdown if it occurs. If you can't reproduce
it in isolation then we'll know that some other part of your application
environment is contributing to the problem; if you can, I'd be happy to
look at the test case with gprof or oprofile and find out exactly what's
going on.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #15 (permalink)  
Old 04-19-2008, 09:57 AM
Steven Flatt
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

I've been trying to reproduce the problem for days now . I've done pretty
much exactly what you describe below, but I can't reproduce the problem on
any of our lab machines. Something is indeed special in this environment.

Thanks for all your help,

Steve


On 12/15/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Steven Flatt" <steven.flatt@gmail.com> writes:
> > Are the removable rows in pg_class even an issue? So what if 5000-6000

> dead
> > tuples are generated every hour then vacuumed? Performance continues to
> > steadily decline over a few days time. Memory usage does not appear to

> be
> > bloating. Open file handles remain fairly fixed. Is there anything

> else I
> > can monitor (perhaps something to do with the odbc connection) that I

> could
> > potentially correlate with the degrading performance?

>
> At this point I think the most productive thing for you to do is to try
> to set up a self-contained test case that reproduces the slowdown. That
> would allow you to poke at it without disturbing your production system,
> and would let other people look at it too. From what you've said, I'd
> try a simple little program that inserts some data into a temp table,
> truncates the table, and repeats, as fast as it can, using the same SQL
> commands as your real code and similar but dummy data. It shouldn't
> take long to observe the slowdown if it occurs. If you can't reproduce
> it in isolation then we'll know that some other part of your application
> environment is contributing to the problem; if you can, I'd be happy to
> look at the test case with gprof or oprofile and find out exactly what's
> going on.
>
> regards, tom lane
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #16 (permalink)  
Old 04-19-2008, 09:58 AM
Tom Lane
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

"Steven Flatt" <steven.flatt@gmail.com> writes:
> I can fix this problem by using CREATE TEMP TABLE ... LIKE instead of CREATE
> TEMP TABLE ... AS.


That seems ... um ... bizarre. Now are you able to put together a
self-contained test case? Seems like we could have two independent bugs
here: first, why (and how) is the temp table different, and second how
does that result in the observed performance problem.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #17 (permalink)  
Old 04-19-2008, 09:58 AM
Steven Flatt
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

I have an update on this.

The reason I couldn't reproduce this problem was because of the way I was
creating the temp table in my tests. I was using:

CREATE TEMP TABLE tmp (LIKE perm);

This did not observe performance degradation over time.

However, the way our application was creating this table (something I should
have observed sooner, no doubt) is:

CREATE TEMP TABLE tmp AS SELECT <column-list> FROM perm LIMIT 0;

This, on its own however, is not enough to reproduce the problem. Next
imagine that perm is actually a view, which is defined as a UNION ALL SELECT
from several other views, and those views are also defined as UNION ALL
SELECTs from a bunch of permanent tables. All views have insert rules
redirecting rows according to some criteria. The whole structure is pretty
convoluted.

I can fix this problem by using CREATE TEMP TABLE ... LIKE instead of CREATE
TEMP TABLE ... AS.

I'm still curious about the root cause of this problem. From the docs, I
see that CREATE TABLE AS evaluates the query just once to create the table,
but based on what I'm seeing, I'm wondering whether this isn't truly the
case. Are there any known issues with CREATE TABLE AS when the table you're
creating is temporary and you're selecting from a view?

Steve


On 12/15/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "Steven Flatt" <steven.flatt@gmail.com> writes:
> > I've been trying to reproduce the problem for days now . I've done

> pretty
> > much exactly what you describe below, but I can't reproduce the problem

> on
> > any of our lab machines. Something is indeed special in this

> environment.
>
> Yuck. You could try strace'ing the problem backend and see if anything
> is visibly different between fast and slow operation. I don't suppose
> you have oprofile on that machine, but if you did it'd be even better.
>
> regards, tom lane
>


Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #18 (permalink)  
Old 04-19-2008, 09:58 AM
Steven Flatt
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

Please ignore my post from earlier today. As strange as it sounds, changing
"CREATE TEMP TABLE ... AS" to "CREATE TEMP TABLE ... LIKE" appeared to fix
my performance problem because things errored out so quickly (and silently
in my test program). After checking the pgsql logs, it became clear to me
that you can't use LIKE on a view. Duh.

Moving forward, I have also discovered that our temp table did in fact have
a varchar column (no specified limit on varchar). With this in mind, I
could easily reproduce the problem on a temp table with one column. So...

Issue #1:

(I'm assuming there's a reasonable explanation for this.) If I create a
temp table with a single varchar column (or text column), do 100 inserts to
that table, copy to a permanent table, truncate the temp table and repeat,
the time required for the 100 inserts grows almost linearly. Maybe the data
is treated as large objects.

Note that if I change the column type to varchar(SOME_LIMIT), integer,
timestamptz, interval, etc., performance does not degrade. Also note that
if I do not use a temp table (but do use a varchar column), inserts are
slower (as expected) but do not degrade over time. So this seems to be
specific to temp tables with varchar/text column(s).

Issue #2:

As I said earlier, the temp table is created via:

CREATE TEMP TABLE tmp AS SELECT <column-list> FROM perm LIMIT 0;

where perm is a view defined as follows:

View definition:
SELECT <column-list>
FROM view2
JOIN tbl USING (col1, col2)
WHERE <some-conditions>
UNION ALL
SELECT <column-list>
FROM view3
JOIN tbl USING (col1, col2)
WHERE <some-conditions>;

Now the varchar columns that end up in the perm view come from the tbl
table, but in tbl, they are defined as varchar(40). Somehow the 40 limit is
lost when constructing the view. After a little more testing, I found that
this problem only occurs when you are creating a view (i.e. CREATE TABLE ...
AS does not observe this problem) and also that the UNION ALL clause must be
present to observe this problem.

This looks like a bug. I know this is Postgres 7.4.6 and I haven't been
able to verify with a later version of Postgres, but does this look familiar
to anyone?

Steve

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #19 (permalink)  
Old 04-19-2008, 09:58 AM
Tom Lane
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

"Steven Flatt" <steven.flatt@gmail.com> writes:
> Issue #1:


> (I'm assuming there's a reasonable explanation for this.) If I create a
> temp table with a single varchar column (or text column), do 100 inserts to
> that table, copy to a permanent table, truncate the temp table and repeat,
> the time required for the 100 inserts grows almost linearly.


I still can't reproduce this. Using 7.4 branch tip, I did

create temp table foo(f1 varchar);
create table nottemp(f1 varchar);
\timing
insert into foo select stringu1 from tenk1 limit 100; insert into nottemp select * from foo; truncate foo;
insert into foo select stringu1 from tenk1 limit 100; insert into nottemp select * from foo; truncate foo;
.... repeat several thousand times ...

and couldn't see any consistent growth in the reported times. So either
it's been fixed since 7.4.6 (but I don't see anything related-looking in
the CVS logs), or you haven't provided all the details.

> Now the varchar columns that end up in the perm view come from the tbl
> table, but in tbl, they are defined as varchar(40). Somehow the 40 limit is
> lost when constructing the view.


Yeah, this is a known issue with UNIONs not preserving the length info
--- which is not entirely unreasonable: what will you do with varchar(40)
union varchar(50)? There's a hack in place as of 8.2 to keep the
length if all the union arms have the same length.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #20 (permalink)  
Old 04-19-2008, 09:58 AM
Steven Flatt
 
Posts: n/a
Default Re: Insertion to temp table deteriorating over time

On 12/19/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I still can't reproduce this. Using 7.4 branch tip, I did
>
> create temp table foo(f1 varchar);
> create table nottemp(f1 varchar);
> \timing
> insert into foo select stringu1 from tenk1 limit 100; insert into nottemp
> select * from foo; truncate foo;
> insert into foo select stringu1 from tenk1 limit 100; insert into nottemp
> select * from foo; truncate foo;
> ... repeat several thousand times ...



I can't reproduce the problem that way either (or when using a server-side
PLpgSQL function to do similar). It looks like you have to go through an
ODBC connection, with the looping done on the client side. Each individual
insert to the temp table needs to be sent over the connection and this is
what degrades over time. I can reproduce on 7.4.6 and 8.1.4. I have a
small C program to do this which I can send you offline if you're
interested.


> Now the varchar columns that end up in the perm view come from the tbl
> > table, but in tbl, they are defined as varchar(40). Somehow the 40

> limit is
> > lost when constructing the view.

>
> Yeah, this is a known issue with UNIONs not preserving the length info
> --- which is not entirely unreasonable: what will you do with varchar(40)
> union varchar(50)? There's a hack in place as of 8.2 to keep the
> length if all the union arms have the same length.



I guess it comes down to what your philosophy is on this. You might just
disallow unions when the data types do not match (varchar(40) !=
varchar(50)). But it might come down to what's best for your application.
I tend to think that when the unioned types do match, the type should be
preserved in the inheriting view (as done by the "hack" in 8.2).

Thanks again for all your help. Steve

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 05:14 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com