Question :
At my work, I am going to be moving millions of records to a different server, and removing them from the source server. My DBA has set up a backup scenario where change backups are taken every 5 minutes. He had this warning for me regarding deleting the records:
…it is going to create a gigantic transaction log that will probably
cause the 5 minute transaction log backups to fail…
Are there any ways to get around this?
Answer :
We don’t know exactly why your transaction log backups would fail, but we can discuss solutions. Your DBA is probably concerned about transaction log bloat: the transaction log growing so large that it fills the drive where the log file is stored.
First, and easiest solution: If you’re moving all the data out of the table entirely, simply truncate the table. This is minimally logged.
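If the table really is being emptied entirely, that first option is a one-liner; a sketch (the table name is a placeholder, not from the question):

```sql
-- TRUNCATE deallocates the table's pages and logs only the
-- deallocations, so it writes far less to the transaction log
-- than a row-by-row DELETE would.
TRUNCATE TABLE dbo.OldDBTable;
```

Note the usual caveats: TRUNCATE fails if any foreign key references the table, and it resets the identity seed, so it only fits the "move everything out" scenario.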
Second: Split your DELETE statement into batches. Find a way to divide your data, say into batches of 5,000 rows.
DELETE FROM OldDBTable
WHERE ID IN
(SELECT TOP 5000 ID
FROM OldDBTable
WHERE [Statement to define deleted data here])
You could probably increase that batch size if you want. Then just set a
WAITFOR DELAY '0:05:00'
to wait 5 minutes for your next batch.
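Putting the batch and the delay together, the whole thing can run as a loop that keeps deleting until nothing is left; a sketch, reusing the hypothetical OldDBTable and the placeholder predicate from above:

```sql
-- Batched delete loop: removes rows in chunks so each transaction
-- stays small and the 5-minute log backups can truncate the log
-- between batches.
DECLARE @BatchSize int = 5000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM OldDBTable
    WHERE [Statement to define deleted data here];

    IF @@ROWCOUNT < @BatchSize
        BREAK;  -- last (partial) batch deleted; we're done

    -- Pause so the next scheduled log backup runs before the next batch
    WAITFOR DELAY '0:05:00';
END
```

DELETE TOP (n) avoids the self-referencing IN subquery; either form works, since which 5,000 rows go first doesn’t matter here.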
Taking a transaction log backup has almost no effect on ongoing activity and no dependency on it, so the log backup itself won’t fail just because a large delete is running.
I think your DBA is more worried about the transaction log growing while you delete millions of records.
If that is the case, discuss it with him and try moving and deleting the records in small batches instead of doing it all in a single transaction.
You can go through this link:
https://sqlperformance.com/2013/03/io-subsystem/chunk-deletes
But please confirm first with your DBA.
I recently performed a similar task with minimal overhead. I coordinated with my senior DBA to:
- Identify and store the records you want to keep:
select * into foo_table from table where <keep_condition> = 'true'
- Disable or drop any relevant constraints (making sure you can recreate them, of course)
- Truncate the target tables
- Re-insert from foo_table (watching out for computed/identity columns)
- Clean up
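The keep-and-truncate steps above might look like this sketch; foo_table and <keep_condition> come from the answer, while dbo.target_table and the column names are hypothetical stand-ins:

```sql
-- 1. Store the rows to keep (SELECT INTO creates foo_table)
SELECT *
INTO foo_table
FROM dbo.target_table
WHERE keep_condition = 'true';

-- 2. Script out and drop any foreign keys referencing dbo.target_table
--    (TRUNCATE will fail while they exist)

-- 3. Truncate: minimally logged, resets any identity seed
TRUNCATE TABLE dbo.target_table;

-- 4. Re-insert, preserving identity values if the table has an
--    identity column; an explicit column list is required here
SET IDENTITY_INSERT dbo.target_table ON;
INSERT INTO dbo.target_table (id, col1, col2)
SELECT id, col1, col2 FROM foo_table;
SET IDENTITY_INSERT dbo.target_table OFF;

-- 5. Recreate the dropped constraints, then clean up
DROP TABLE foo_table;
```

Computed columns must be left out of the INSERT column list entirely, which is another reason to avoid INSERT ... SELECT * in step 4.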
Disclaimer: Obviously it’s best to do this with plenty of rollback ability and during a downtime window. But if you can get the downtime to do it, it’s not a headache at all.