Question :
If I have a table that has many indexes, and I run a statement that inserts rows into the table, will SQL Server insert rows one-at-a-time, or will the rows be inserted using parallelism?
Answer :
I run a statement that inserts rows into the table, will SQL Server insert rows one-at-a-time, or use parallelism?
Before SQL Server 2016, an individual T-SQL INSERT
statement always used an execution plan that inserts rows and maintains associated nonclustered indexes using a single logical processor. This answers one aspect of your question.
The question of inserting rows ‘one at a time‘ requires a bit of explanation. Fundamentally, data modifications in an INSERT...SELECT
execution plan always operate as one or more row-at-a-time streams that execute in serial order.
That said, the query optimizer often has a choice between two execution plan strategies:
- Insert rows to the base table and nonclustered indexes using a single plan operator; or
- Use a separate plan operator for each nonclustered index, clustered index, or heap.
The first option means that each row is inserted to the base table, and all nonclustered indexes are updated in a serial sequence, before moving on to the next row. This is known as a narrow or per-row, insert strategy.
The second option means all rows are inserted to the base table (though still one at a time) before doing the same thing for each nonclustered index, again in a serial sequence. This is known as a wide or per-index insert strategy.
The optimizer makes a cost-based choice between narrow and wide strategies for each nonclustered index, so it is quite common to see an execution plan that maintains some nonclustered indexes in the same operator as the base table, while other indexes are maintained using separate operators.
The wide/per-index strategy also allows potential optimizations such as sorting the set of rows in key order for each index in order to promote a sequential access pattern.
You can find more details about this and execution plan examples in my blog article, Optimizing T-SQL queries that change data.
There is a feedback suggestion to allow parallel maintenance of nonclustered indexes. It was received favourably on Connect (and later migrated to the new platform), but the idea has not been implemented yet.
While it is limited to a single processor for data modifications, INSERT...SELECT
can use parallelism in the part of the plan that identifies and creates the data to insert. From SQL Server 2016, INSERT...SELECT
can perform parallel inserts to a heap.
For best insert performance, check that your query meets the criteria for minimally-logged operations, possibly requiring trace flag 610. Details can be found in the Data Loading Performance Guide.
You might also want to investigate using other SQL Server facilities that do allow parallel inserts (with some caveats) using multiple processes:
- SSIS (SQL Server Integration Services)
T_SQL BULK INSERT
(from a file)- The
bcp
utility
SELECT INTO
does allow parallelism (Server 2014+) but this requires that the table does not exist before the process begins, and any nonclustered indexes would have to be added afterward.
If you are using Enterprise Edition, partitioned tables offer additional possibilities to optimize bulk insertions. Again, see the Data Loading Performance Guide for details.
Related articles written by me: