Unix Technical Forum

Efficient delete from table with compound keys?

This is a discussion on Efficient delete from table with compound keys? within the Ingres forums, part of the Database Server Software category; --> Using II 9.0.4 (hp2.us5/105) What is the most efficient way to do the following? TableA is btree with col1, ...


Go Back   Unix Technical Forum > Database Server Software > Ingres

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 04-20-2008, 10:58 PM
Dennis Roesler
 
Posts: n/a
Default Efficient delete from table with compound keys?

Using II 9.0.4 (hp2.us5/105)

What is the most efficient way to do the following? TableA is btree
with col1, col2, col3 being the compound key and has about 12.5M rows of
data. TableA is also compressed and stored across 3 locations if it
makes a difference.

delete from tableA where col1+col2+col3 in
(select col1+col2+col3 from tableB)

Thanks

Dennis
d underscore roesler at agilent dot com
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 04-20-2008, 10:58 PM
Karl & Betty Schendel
 
Posts: n/a
Default Re: [Info-Ingres] Efficient delete from table with compound keys?

At 11:12 AM -0400 7/25/07, Dennis Roesler wrote:
>Using II 9.0.4 (hp2.us5/105)
>
>What is the most efficient way to do the following? TableA is btree
>with col1, col2, col3 being the compound key and has about 12.5M rows of
>data. TableA is also compressed and stored across 3 locations if it
>makes a difference.
>
>delete from tableA where col1+col2+col3 in
> (select col1+col2+col3 from tableB)
>


It really depends on how big tableB is.

For small tableB (say, up to a few thousand rows), you'll
want to use EXISTS so that Ingres can use a K-join:

delete from tableA where exists
(select * from tableB where tableB.col1 = tableA.col1
and tableB.col2 = tableA.col2 and tableB.col3 = tableB.col3);

(at least, I assume that is what you really meant
with your original query.)

If tableB is into the hundreds-of-thousands of rows,
you can do it the same way as long as you don't run into
transaction log problems. You might have to break
the delete up into a few chunks, perhaps using session
temps to hold a few chunks of tableB keys.

If tableB is a significant fraction of tableA, you
might be better off to copy the surviving rows into
a session temp, truncate tableA, stuff the survivors
back in, and remodify to btree. I don't know where
the breakeven point for this would kick in (and it
is somewhat installation dependent), but I'm guessing
that deleting somewhere around 1/3 to 1/2 of the table
is the breakeven point; probably less if there are
a bunch of secondary indexes on tableA.

dgtt session.foo as select * from tableA
where not exists (same subselect as above);
modify tableA to truncated;
insert into tableA select * from session.foo;
modify tableA to btree ....
commit;

With the current Ingres, this will still log rows doing
the insert. That's unavoidable if you are journaling all
this; if you can get away with a journaling hole for
tableA, you can avoid the tx log by dropping tableA and
doing a create-table-as-select instead of a truncate/insert.
(Or, you could copy the session temp into a file
and do a copy into the truncated tableA! might be
slower though.)

Karl
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 04-20-2008, 10:58 PM
Karl & Betty Schendel
 
Posts: n/a
Default Re: [Info-Ingres] Efficient delete from table with compound keys?

At 11:12 AM -0400 7/25/07, Dennis Roesler wrote:
>Using II 9.0.4 (hp2.us5/105)
>
>What is the most efficient way to do the following? TableA is btree
>with col1, col2, col3 being the compound key and has about 12.5M rows of
>data. TableA is also compressed and stored across 3 locations if it
>makes a difference.
>
>delete from tableA where col1+col2+col3 in
> (select col1+col2+col3 from tableB)
>


It really depends on how big tableB is.

For small tableB (say, up to a few thousand rows), you'll
want to use EXISTS so that Ingres can use a K-join:

delete from tableA where exists
(select * from tableB where tableB.col1 = tableA.col1
and tableB.col2 = tableA.col2 and tableB.col3 = tableB.col3);

(at least, I assume that is what you really meant
with your original query.)

If tableB is into the hundreds-of-thousands of rows,
you can do it the same way as long as you don't run into
transaction log problems. You might have to break
the delete up into a few chunks, perhaps using session
temps to hold a few chunks of tableB keys.

If tableB is a significant fraction of tableA, you
might be better off to copy the surviving rows into
a session temp, truncate tableA, stuff the survivors
back in, and remodify to btree. I don't know where
the breakeven point for this would kick in (and it
is somewhat installation dependent), but I'm guessing
that deleting somewhere around 1/3 to 1/2 of the table
is the breakeven point; probably less if there are
a bunch of secondary indexes on tableA.

dgtt session.foo as select * from tableA
where not exists (same subselect as above);
modify tableA to truncated;
insert into tableA select * from session.foo;
modify tableA to btree ....
commit;

With the current Ingres, this will still log rows doing
the insert. That's unavoidable if you are journaling all
this; if you can get away with a journaling hole for
tableA, you can avoid the tx log by dropping tableA and
doing a create-table-as-select instead of a truncate/insert.
(Or, you could copy the session temp into a file
and do a copy into the truncated tableA! might be
slower though.)

Karl
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 04-20-2008, 10:58 PM
Dennis Roesler
 
Posts: n/a
Default Re: [Info-Ingres] Efficient delete from table with compound keys?

Karl & Betty Schendel wrote:
> At 11:12 AM -0400 7/25/07, Dennis Roesler wrote:
>> Using II 9.0.4 (hp2.us5/105)
>>
>> What is the most efficient way to do the following? TableA is btree
>> with col1, col2, col3 being the compound key and has about 12.5M rows of
>> data. TableA is also compressed and stored across 3 locations if it
>> makes a difference.
>>
>> delete from tableA where col1+col2+col3 in
>> (select col1+col2+col3 from tableB)
>>

>
> It really depends on how big tableB is.
>
> For small tableB (say, up to a few thousand rows), you'll
> want to use EXISTS so that Ingres can use a K-join:


tableB could vary from a few hundred to a ~200k-300k rows.
>
> delete from tableA where exists
> (select * from tableB where tableB.col1 = tableA.col1
> and tableB.col2 = tableA.col2 and tableB.col3 = tableB.col3);
>
> (at least, I assume that is what you really meant
> with your original query.)


yes

>
> If tableB is into the hundreds-of-thousands of rows,
> you can do it the same way as long as you don't run into
> transaction log problems. You might have to break
> the delete up into a few chunks, perhaps using session
> temps to hold a few chunks of tableB keys.


I think I will need to do this, because I have bumped into log full
problems.

>
> If tableB is a significant fraction of tableA, you
> might be better off to copy the surviving rows into
> a session temp, truncate tableA, stuff the survivors
> back in, and remodify to btree. I don't know where
> the breakeven point for this would kick in (and it
> is somewhat installation dependent), but I'm guessing
> that deleting somewhere around 1/3 to 1/2 of the table
> is the breakeven point; probably less if there are
> a bunch of secondary indexes on tableA.
>
> dgtt session.foo as select * from tableA
> where not exists (same subselect as above);
> modify tableA to truncated;
> insert into tableA select * from session.foo;
> modify tableA to btree ....
> commit;
>
> With the current Ingres, this will still log rows doing
> the insert. That's unavoidable if you are journaling all
> this; if you can get away with a journaling hole for
> tableA, you can avoid the tx log by dropping tableA and
> doing a create-table-as-select instead of a truncate/insert.
> (Or, you could copy the session temp into a file
> and do a copy into the truncated tableA! might be
> slower though.)


Thanks Karl. Gives me some things to think about.

Dennis
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 09:23 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2009, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com