We are debating whether to use the SORT_IN_TEMPDB option for our DW tables. My understanding is that there are more writes when using this option, although they are more sequential. We have a SAN (which has has been notoriously slow at times), so in our case we want to limit the number of writes as much as possible. I believe tempdb is on a separate LUN (set of disks).
We have plenty of disk space in our data file and on our tempdb file. In this case, would we benefit from using SORT_IN_TEMPDB?
One thing that struck me was this comment on this Answer
When rebuilding an index you would need twice the space of the index +
20% for the sorting. So in general to rebuild every index in your db
you only need 120% of your biggest index in your DB. If you use
SORT_IN_TEMPDB, you only win 20%, you still need an aditional 100% in
your data file. Further more, using sort in tempdb increases your IO
load drastically, since instead of Writing the index one time to the
datafile, you now write it one time to the tempdb and then write it to
the data file. So that is not always ideal.
We definitely don’t want to increase our IO load with our slow/possibly misconfigured SAN.
What would be the best way to test this? By simply rebuilding the table with and without the option and log the times?
Edit: We have 8 tempdb files, each 15GB. We do have TF 1117/1118 flags set and IFI is enabled. We currently do a mixture of rebuilding with the sort_in_tempdb option and without it.
SQL Server 2012 Enterprise
SORT_IN_TEMPDB means that SQL server will use
tempdb to allocate the temporary space as opposed to allocating space in the user database whose index is being rebuild. This means you will need less free space in your user database during an index rebuild operation and more free space in tempdb.
It gives you better advantage when tempdb is on a different set of disks (LUNs) from the user database.
From SORT_IN_TEMPDB Option – BOL :
If the SORT_IN_TEMPDB option is set to ON and tempdb is on a separate set of disks from the destination filegroup, during the first phase, the reads of the data pages occur on a different disk from the writes to the sort work area in tempdb. This means the disk reads of the data keys generally continue more serially across the disk, and the writes to the tempdb disk also are generally serial, as do the writes to build the final index. Even if other users are using the database and accessing separate disk addresses, the overall pattern of reads and writes are more efficient when SORT_IN_TEMPDB is specified than when it is not.
Make sure you read the disk space requirements when SORT_IN_TEMPDB is ON.
slow/possibly misconfigured SAN
You know the pain point. Why dont you work with your SAN admin to fix it ? Misconfigured and or slow SAN will cause all sort of problems like slowness.
Some important points to note :
- Use MAXDOP=1 when doing an online index rebuild – else it will cause increase fragmentation.
- Intelligently reorg/rebuild your indexs based on the fragmentation level as well as page fragments.
- Read SQLFool’s experience with Creating a 60 GB Index
- Make sure you have configured tempdb properly – TF 1117 (GROW ALL FILES IN A FILEGROUP EQUALLY) and 1118 (FULL EXTENTS ONLY) and have more tempdb datafiles.
What would be the best way to test this?
Yes, you have to test it by analyzing the waitstats when you rebuild the index with and without
SORT_IN_TEMPDB. Measure run time as well and when doing in PROD, make sure you do it during a maintenance window or less server activity. Also check your read/write data and log latency.
I am not sure you have Instant file initialization, but it will benefit when restoring, during autogrowth of data files and when creating a new database (just mentioning for completeness).