Unix Technical Forum

Best way to delete million records

This is a discussion on Best way to delete million records within the Oracle Database forums, part of the Database Server Software category; --> Hi all, I have to delete about 43 millions rows from a table, but the following conditions exist: - ...


Go Back   Unix Technical Forum > Database Server Software > Oracle Database

FAQ Members List Calendar Search Today's Posts Mark Forums Read
  #1 (permalink)  
Old 06-30-2008, 10:03 AM
fefe78
 
Posts: n/a
Default Best way to delete million records

Hi all,

I have to delete about 43 millions rows from a table, but the following
conditions exist:

- The table is not partitioned and unfortunately I can't modify the
structure
- The delete procedure must be done without stopping service, so I can't
use CTAS, moving records and renaming table.
- I can't use truncate because the records affected are selected in a
range of dates

Waiting your suggestions and probable problems for this procedure.

Thanks for help.

Bye
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #2 (permalink)  
Old 06-30-2008, 10:03 AM
Robert Klemme
 
Posts: n/a
Default Re: Best way to delete million records

On Jun 30, 10:00*am, fefe78 <pe...@puppurnazzi.com> wrote:
> I have to delete about 43 millions rows from a table, but the following
> conditions exist:
>
> - The table is not partitioned and unfortunately I can't modify the
> structure
> - The delete procedure must be done without stopping service, so I can't
> use CTAS, moving records and renaming table.
> - I can't use truncate because the records affected are selected in a
> range of dates
>
> Waiting your suggestions and probable problems for this procedure.


If you do not have proper indexes in place not matter what you need to
at least once run through the whole table. If this is a task you face
regularly I'd probably bite the bullet and create appropriate indexes
or - even better - partitions. If this is just done on few occasions
it might be better to just issue the DELETE and wait. Watch out for
undo / redo sizes though.

Kind regards

robert
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 07-01-2008, 08:08 AM
fefe78
 
Posts: n/a
Default Re: Best way to delete million records

The table is indexed, and this is a one time procedure.
After this will be scheduled a job to daily delete old records.
In some forums I red that drop indexes and ricreate them after deletion,
could be a good solution.

Robert Klemme ha scritto:
> On Jun 30, 10:00 am, fefe78 <pe...@puppurnazzi.com> wrote:
>> I have to delete about 43 millions rows from a table, but the following
>> conditions exist:
>>
>> - The table is not partitioned and unfortunately I can't modify the
>> structure
>> - The delete procedure must be done without stopping service, so I can't
>> use CTAS, moving records and renaming table.
>> - I can't use truncate because the records affected are selected in a
>> range of dates
>>
>> Waiting your suggestions and probable problems for this procedure.

>
> If you do not have proper indexes in place not matter what you need to
> at least once run through the whole table. If this is a task you face
> regularly I'd probably bite the bullet and create appropriate indexes
> or - even better - partitions. If this is just done on few occasions
> it might be better to just issue the DELETE and wait. Watch out for
> undo / redo sizes though.
>
> Kind regards
>
> robert

Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 07-01-2008, 08:08 AM
xhoster@gmail.com
 
Posts: n/a
Default Re: Best way to delete million records

fefe78 <peppo@puppurnazzi.com> wrote:
> The table is indexed, and this is a one time procedure.


What is it indexed on?

> After this will be scheduled a job to daily delete old records.


Given this, I doubt that all 43 million rows need to be deleted in
a single atomic transaction. If that is the case, breaking it into smaller
transactions could prevent causing havoc with the undo.


> In some forums I red that drop indexes and ricreate them after deletion,
> could be a good solution.


If the indexes are important, then dropping them will effectively "stop
service", which you say is not allowed.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #5 (permalink)  
Old 07-01-2008, 08:08 AM
joel garry
 
Posts: n/a
Default Re: Best way to delete million records

On Jun 30, 1:00*am, fefe78 <pe...@puppurnazzi.com> wrote:
> Hi all,
>
> I have to delete about 43 millions rows from a table, but the following
> conditions exist:
>
> - The table is not partitioned and unfortunately I can't modify the
> structure
> - The delete procedure must be done without stopping service, so I can't
> use CTAS, moving records and renaming table.
> - I can't use truncate because the records affected are selected in a
> range of dates
>
> Waiting your suggestions and probable problems for this procedure.
>
> Thanks for help.
>
> Bye


Xho's answer is the one that immediately comes to mind, how long do
you have to do this, and how many rows are you not deleting?

Sometimes when you calculate all the costs and show all the possible
ways to do it, 7/24 uptime becomes a nice-to-have rather than the
absolute directive.

Another possibility may be to split table in two, replacing it with a
view, and slowly move the rows to the archive table and truncate.
This may have some bad performance implications, depending on how the
table is accessed.

You may very well need to rebuild/coalesce indices or move stuff
around in the table, if you are deleting a large proportion of the
rows.

Your options are version-dependent.

jg
--
@home.com is bogus.
"If the wings are traveling faster than the fuselage, it's probably a
helicopter - and therefore, unsafe."
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 07-01-2008, 08:08 AM
Mladen Gogala
 
Posts: n/a
Default Re: Best way to delete million records

On Mon, 30 Jun 2008 10:00:51 +0200, fefe78 wrote:

> Hi all,
>
> I have to delete about 43 millions rows from a table, but the following
> conditions exist:
>
> - The table is not partitioned and unfortunately I can't modify the
> structure
> - The delete procedure must be done without stopping service, so I can't
> use CTAS, moving records and renaming table. - I can't use truncate
> because the records affected are selected in a range of dates
>
> Waiting your suggestions and probable problems for this procedure.
>
> Thanks for help.
>
> Bye


You can also write a PL/SQL script that would delete million by
million records and commit it every time. That will prevent you
from running out of the UNDO space but will probably run for a week,
in the background. To delete 43 million records is a major intervention
and if you want to do it in a non-invasive fashion, background job running
a PL/SQL delete script is, in my opinion, the best way. Of course, you
should also be checking space on your archive log destination. Something
like this is more then likely to create a major log switching storm.
Expect a checkpoint every few seconds, with an occasional "checkpoint not
complete" message. Users are more then likely to notice that kind of
activity. It it is possible, you can launch several delete scripts, one
deleting Mondays, one Tuesdays and so on. If your machine capacity is
good enough, the machine will survive. To survive a thing like that, you
should have at least 6 CPUs.
Also, if there is a standby, it needs to be carefully monitored, too. This
might flood your 1GB LAN card.



--
http://mgogala.freehostia.com
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 08-18-2008, 05:53 AM
paa.listas@gmail.com
 
Posts: n/a
Default Re: Best way to delete million records

On 30 jun, 05:00, fefe78 <pe...@puppurnazzi.com> wrote:
> Hi all,
>
> I have to delete about 43 millions rows from a table, but the following
> conditions exist:
>
> - The table is not partitioned and unfortunately I can't modify the
> structure
> - The delete procedure must be done without stopping service, so I can't
> use CTAS, moving records and renaming table.
> - I can't use truncate because the records affected are selected in a
> range of dates
>
> Waiting your suggestions and probable problems for this procedure.
>
> Thanks for help.
>
> Bye


Hello,

i have a similar problem and use someting like this:
-----------------------------
DECLARE

-- 1 month
ITERACIONES CONSTANT NUMBER := 30;

-- 2 years. your oldest record timestamp
OLDEST_DATE_ CONSTANT NUMBER := 730;
oldest_date DATE;

KEEP_DATE CONSTANT NUMBER := 60;

STEP_DATE CONSTANT NUMBER := 1;
loop_date DATE;

BEGIN
dbms_output.put_line('.- Iniciando a las: ' || SYSDATE );
-- Inicializamos
-- set server outout on format wrapped;
SELECT SYSDATE - OLDEST_DATE_ INTO oldest_date FROM DUAL;
SELECT oldest_date INTO loop_date FROM DUAL;

FOR x IN 1..ITERACIONES LOOP
dbms_output.put_line('Iniciando procedimiento nro: ' || x);
dbms_output.put_line('borrando registros entre Fechas ' ||
oldest_date || ' y ' || loop_date );

-- DELETE...;
DELETE FROM iwaylogs.cm_leases
WHERE iwaylogs.cm_leases.fecha BETWEEN oldest_date AND
loop_date;
-- COMMIT;
COMMIT;

-- adelanto la fecha
oldest_date := loop_date;
-- adelanto el loop
loop_date := loop_date + STEP_DATE;

-- Si ya llegue a borrar hasta donde quiero salgo
IF ( loop_date >= (SYSDATE - KEEP_DATE) ) THEN
dbms_output.put_line('Llegamos a los 60 dias,
saliendo');
EXIT;
END IF;
dbms_output.put_line('Procedimiento listo.');
-- Esperamos XX segundos para 'descansar' la DB
dbms_lock.sleep(90);
END LOOP;

dbms_output.put_line('Finalizado' );

END;

------------------------------------------------------
)
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #8 (permalink)  
Old 08-18-2008, 05:53 AM
Shakespeare
 
Posts: n/a
Default Re: Best way to delete million records


<paa.listas@gmail.com> schreef in bericht
news:cf74d9a0-64b6-4d92-a415-d6823fcb527b@l42g2000hsc.googlegroups.com...
> On 30 jun, 05:00, fefe78 <pe...@puppurnazzi.com> wrote:
>> Hi all,
>>
>> I have to delete about 43 millions rows from a table, but the following
>> conditions exist:
>>
>> - The table is not partitioned and unfortunately I can't modify the
>> structure
>> - The delete procedure must be done without stopping service, so I can't
>> use CTAS, moving records and renaming table.
>> - I can't use truncate because the records affected are selected in a
>> range of dates
>>
>> Waiting your suggestions and probable problems for this procedure.
>>
>> Thanks for help.
>>
>> Bye

>
> Hello,
>
> i have a similar problem and use someting like this:
> -----------------------------
> DECLARE
>
> -- 1 month
> ITERACIONES CONSTANT NUMBER := 30;
>
> -- 2 years. your oldest record timestamp
> OLDEST_DATE_ CONSTANT NUMBER := 730;
> oldest_date DATE;
>
> KEEP_DATE CONSTANT NUMBER := 60;
>
> STEP_DATE CONSTANT NUMBER := 1;
> loop_date DATE;
>
> BEGIN
> dbms_output.put_line('.- Iniciando a las: ' || SYSDATE );
> -- Inicializamos
> -- set server outout on format wrapped;
> SELECT SYSDATE - OLDEST_DATE_ INTO oldest_date FROM DUAL;
> SELECT oldest_date INTO loop_date FROM DUAL;
>
> FOR x IN 1..ITERACIONES LOOP
> dbms_output.put_line('Iniciando procedimiento nro: ' || x);
> dbms_output.put_line('borrando registros entre Fechas ' ||
> oldest_date || ' y ' || loop_date );
>
> -- DELETE...;
> DELETE FROM iwaylogs.cm_leases
> WHERE iwaylogs.cm_leases.fecha BETWEEN oldest_date AND
> loop_date;
> -- COMMIT;
> COMMIT;
>
> -- adelanto la fecha
> oldest_date := loop_date;
> -- adelanto el loop
> loop_date := loop_date + STEP_DATE;
>
> -- Si ya llegue a borrar hasta donde quiero salgo
> IF ( loop_date >= (SYSDATE - KEEP_DATE) ) THEN
> dbms_output.put_line('Llegamos a los 60 dias,
> saliendo');
> EXIT;
> END IF;
> dbms_output.put_line('Procedimiento listo.');
> -- Esperamos XX segundos para 'descansar' la DB
> dbms_lock.sleep(90);
> END LOOP;
>
> dbms_output.put_line('Finalizado' );
>
> END;
>
> ------------------------------------------------------
> )


Why don't you just do a delete like
where fecha between oldest_date and oldest_date+30 ? It saves you a loop and
you do it all in one transaction.... and you don't have to 'descansar' -
sleep - your DB.
(I noted you'll have to keep track of KEEP_DATE as well, so you'll have to
do a MIN( ) in your comparison somewhere)
Furthermore, your loop sleeps 90 seconds every loop, which makes your
program run at least in 30*90 seconds = 2700 seconds is 45 minutes... even
if there is nothing to delete in the first 29 loops!!! Which will be the
case if you run your script once a day. If you run it several times a day,
it is even doing nothing at all for 45 minutes.
(delete nothing, wait 90 seconds, delete nothing, wait 90 seconds and so
on....)

Shakespeare



Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On
Forum Jump


All times are GMT. The time now is 07:02 AM.


Powered by vBulletin® Version 3.6.5
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.2.0
www.UnixAdminTalk.com