Ignoramus24559 <ignoramus24559@NOSPAM.24559.invalid> wrote:
> I am in big trouble with MySQL 5.
>
> I have a huge table, about 25 GIGABYTES, that is a cache for hard to
> compute items. Each row has two timestamps, last updated and last
> retrieved timestamp, some metadata, key and a BLOB.
>
> I would like to clean that table based on timestamps. I used to clean
> it periodically, but doing a big DELETE...WHERE query locks it up for
> HOURS and that means downtime for my website, lost money, pissed off
> tutors, etc.
First: this design does not seem to be very clever. You have 25GB
worth of "hard to compute" binary stuff? Why do you put it in the
database anyway? Wouldn't it be better to keep that in plain old
files? On the web server(s)? That should scale much better. Plus,
filesystems are quite good at dealing with binary data. Better than
most database engines.
Next: you use a MyISAM table for that. Not good. With variable length
records and heavy fluctuation you will experience tablespace
fragmentation very soon. You would be better off with an InnoDB table.
I can also imagine a set of MyISAM tables, bundled by the MERGE engine.
If you partition your data by timestamp, cleaning out old entries would
become as simple as CREATE fresh_table, ALTER merge_table, DROP
oldest_table.
> So, I thought, I would write a process that would do
>
> SELECT id FROM bincache WHERE ... LIMIT 100
>
> and that would run quickly to give me 100 IDs, that would then DELETE
> at my leisure without locking up the table.
Locking is a MyISAM only problem.
Also you can do DELETE ... LIMIT. No need to break that in two steps.
However, deleting from a big table is expensive:
1. the rows to be deleted must be found
2. rows must be deleted from tablespace (this may cause tablespace
reorganisation)
3. indexes must be updated
If you delete big part of a table (say: 80% of the records) you will be
better off with the following approach:
1. create a new table with the same layout
2. populate the new table with INSERT .... SELECT with all records
you plan to keep
3. rename tables (this is an atomic operation)
4. drop the old table
> Well, I just learned that LIMIT 100 is applied after ALL select work
> is done,
This clearly depends on your query. If MySQL does not need to
materialize the result set (i.e. for sorting), it stops just after
it processed the requested number of rows.
XL
--
Axel Schwenke, Senior Software Developer, MySQL AB
Online User Manual:
http://dev.mysql.com/doc/refman/5.0/en/
MySQL User Forums:
http://forums.mysql.com/