I have a large table (600 million rows, 60GB data) that I need to migrate to another table. This is for a few reasons:
- The current table is a heap and I want to convert it to having a clustered index
- I have a column which is time(7) that only really needs to be time(4)
- The table has previous deleted columns that are still reserving space from what I can see (they are just marked as is_dropped in the metadata.
The best way I could come up with doing all this in the fewest operations (i.e. not having to rebuild the table multiple times) was to create a new table with the column definitions I want and copy the data across into the new table while deleting the old data, followed by a name switch. I’ve split this into batches in a loop to prevent the log file growing continously, however this doesn’t seem to be working and after about 70million rows, the log file has grown to around 110GB and there is no space left on the server…
The database is using simple recovery and I have a checkpoint after each batch, which I thought was meant to stop the log file growing! Am I missing something, or is there some other reason my log file continues to grow? This is testing on a dev server I’m doing, so there are no other operations currently running on the table.
My query is:
WHILE 1 = 1 BEGIN DELETE TOP(1000000) FROM RunningData with (tablock) OUTPUT deleted.[Data], deleted.[Date], deleted.Time, deleted.Category, deleted.SystemID INTO [dbo].[TmpRunningData] ( Data, Date, [Time], Category, SystemID ) IF @@ROWCOUNT = 0 BREAK CHECKPOINT RAISERROR ('Moving to next batch', 0, 1) WITH NOWAIT waitfor delay '00:00:01' END
I really want to be able to sort this table out, but I’m struggling to find a way of processing this table due to the size! Any suggestions on how I can stop the log growing, or another way to do it would be very appreciated, thanks! In production the database can go down for a weekend, so there should be plenty of time to do it.
First step is to ask SQL Server why the log grows. You should find the answer in the log_reuse_wait_desc column in the sys.databases catalog view. Being in simple mode, it shouldn’t be log backup. Then take it from there.
And don’t do it all in one transaction, since SQL Server can’t truncate past the oldest open transaction.