Question:
I have a delete query which is potentially going to delete tens of thousands of rows.
Currently it’s taking quite a long time in the more extreme cases (over 100,000 rows to delete) using this:
DELETE FROM tablename WHERE id IN (?, ?, ?...)
However, I was wondering whether it might be better to create a transaction and use PHP to loop through each of the deletes.
Something like…
START TRANSACTION;
DELETE FROM tablename WHERE id=?;
DELETE FROM tablename WHERE id=?;
DELETE FROM tablename WHERE id=?;
.....
COMMIT;
I hope this makes sense… I’m not the best at using the correct terminology.
Thanks
Answer:
Where did you get the list of ids? If you got them from another table, then do a multi-table DELETE. That would be best, except that it still suffers from issue #1 below.
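For example, if the ids live in some other table (call it ids_to_delete; that name is just for illustration, not from the question), MySQL’s multi-table DELETE syntax looks roughly like:

DELETE tablename
FROM tablename
JOIN ids_to_delete ON tablename.id = ids_to_delete.id;

This removes from tablename every row whose id appears in ids_to_delete, without ever shipping the id list through PHP.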
1. A single DELETE with 100K rows will take a long time because that is an awfully large number of rows to save for a possible ROLLBACK.
2. Deleting one row at a time is awfully slow if each delete is a separate transaction (autocommit=ON).
3. Deleting individual rows inside one huge transaction has the same ROLLBACK issue, plus the overhead of issuing separate commands (extra parsing, round trips, etc.).
Here’s another thought: Use PHP, but gather 1000 ids at a time. That is:
SET autocommit=1;
DELETE FROM tablename WHERE id IN (... 1000 ids ...);
DELETE FROM tablename WHERE id IN (... 1000 ids ...);
...
If convenient, sort the 100K ids before starting; that will help the “locality” a bit. Breaking the ids up like this softens the negative issues mentioned above.
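For illustration, here is a rough PHP sketch of that batching idea, assuming a PDO connection in $pdo and the ids in an array $ids (those variable names are mine, not from the question):

sort($ids);  // sorting first improves locality of the deletes
foreach (array_chunk($ids, 1000) as $chunk) {
    // build a "?,?,?..." placeholder list matching this chunk's size
    $placeholders = implode(',', array_fill(0, count($chunk), '?'));
    $stmt = $pdo->prepare("DELETE FROM tablename WHERE id IN ($placeholders)");
    $stmt->execute($chunk);  // with autocommit=1, each batch commits on its own
}

Each iteration is its own small transaction, so no single statement has 100K rows’ worth of ROLLBACK baggage.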
Not quite the same: Blog on Big Deletes, but it might give you some more insight and ideas. (It discusses, among other things, the impact on Replication, and a benefit of using PARTITION.)