I have SQL Server 2008 and a single database with full recovery model. Usually, the queue length is less than 1. But sometimes it grows up to several thousands (3000!!) for a few seconds. At this time many of write queries ends up with timeout error.
Using Resource Monitor, I found that at this moment sqlserver.exe writes a large amount of data to the main database file (MDF). Thoughh usually it writes to transaction log (LDF).
Using SQL Server Profiler, I found that heavy queries are not running at that moment.
I think, that it is some kind of SQL server’s background operation, but I wonder what kind?
Database also has READ_COMMITED_SNAPSHOT ON and the mirroring (synchronous mode) enabled. Can this fact be the cause of my issue?
I found that writing to log (not to data file) is default behavior of Full recovery mode. And log can only be copied to data file by Backup Transaction Log operation. Still don’t understand why SQL server copying log every ten minutes…
Sounds like you have:
- A very crappy IO subsystem.
- A checkpoint occurring.
- Crappy IO subsystem + checkpoint = high disk queue.
I found that writing to log (not to data file) is default behavior of
Full recovery mode. And log can only be copied to data file by Backup
Transaction Log operation.
No, not really. Your information or understanding is at fault.
Regardless of recovery model, SQL Server uses write-ahead-logging (WAL). An update/delete/insert results in a change to the in-memory data page + a record of the change written to the transaction log. The data page modification is not immediately flushed to the data file. To all intents and purposes you can consider the log write to disk to be immediate (beneath the covers the log is flushed every 60K or on a
What you are seeing when the IO spikes is those data page modifications being flushed to disk as a result of a checkpoint operation.
A checkpoint writes the current in-memory modified pages (known as
dirty pages) and transaction log information from memory to disk and,
also, records information about the transaction log. [Source]
Relevant past questions that may help you understand the mechanics:
The massive number of writes that you are seeing is the cHECKPOINT process. This is when the SQL Server takes the data which has been modified from the buffer pool in memory and writes those pages to the disk. When you write data into SQL Server the data isn’t written to the data file right away. It only happens later on when the CHECKPOINT happens.
If this is only happening every 10 minutes that’s because someone has changed the default recovery time from the default which should be 0 to 10. If you change it back to 0 it should be running checkpoint every minute which will make SQL Server run CHECKPOINT more often which means there will be smaller batches of data to be written.
Also I’d look at upgrading your storage subsystem to a faster platform.
As mentioned, the issue here is writing to the log file. All update/delete/insert ops require sync writes to the log file. IOW, you can only do these ops as fast as your disk system can write.
1: Confirm that the log file is on a distinct spindle set
2: Improve the IOPS of this spindle set (e.g. RAID1 is better than RAID5, RAID10 with four or more spindles is better than either of those, SSD arrays are better than all of those if you understand the reliability implications of an SSDs in a SQL environment)