SQL Server not using nonclustered index

Posted on

Question :

I have table which contains about 470mln of rows. I would like to select data based on date. I have two indices created on this table. One is clustered one and other is non clustered on date column (date is stored as INT). I have simple select statement:

select *
from big_table
where [date] BETWEEN 20200820 AND 20200828

The problem is that query plan uses clustered index scan instead of nonclustered seek + keylookup. Plan as follows:
non using clustered index plan

Estimates generated in query plan are fine, statistics are up to date. This date range should deliver about 5mln of rows.
When I provide index hint then this select completes in couple of seconds – without hint it takes couple of minutes to finish.

This is SQL Server 2019 and generaly speaking I noticed that db prefers clustered index scans that using non clustered + keylookups even on bigger tables.

I would rather not using hint because:

  • sometimes I select wider ranges where clustered index scan should be desirable
  • table is used in view and I cannot supply index hint to the view

Is there any explanation why db is not using NC index it this case?

Links to query plans:

Answer :

It looks like SQL Server is not using that index by default because:

  • it’s a filtered index, and
  • your query is parameterized

You can see this warning in the execution plan XML:

<UnmatchedIndexes>
  <Parameterization>
    <Object Database="Database1" Schema="Schema1" Table="Object1" Index="Index1" />
  </Parameterization>
</UnmatchedIndexes>
<Warnings UnmatchedIndexes="1" />

SQL Server doesn’t know what the parameter values are (because they’re in variables), so it can’t safely use the filtered index.

One solution is to use index hints (as you mentioned, this isn’t ideal).

Another way to work around that is to use dynamic SQL, as described by Jeremiah Peschka here:

Filtered Indexes and Dynamic SQL

I don’t know how the filtered index is…filtered. You might be able to get away with embedding the literal on only one of the two values, to limit plan cache bloat.

Is there any explanation why db is not using NC index it this case?

The estimated cost of that plan is lower. The clustered index scan uses more sequential IO and the nonclustered index scan + bookmark lookup uses more random IO. So which one is actually faster may depend on your hardware.

Look at the query wait stats. For the clustered index scan it’s

        <WaitStats>
          <Wait WaitType="PAGEIOLATCH_SH" WaitTimeMs="3188040" WaitCount="31753"/>
          <Wait WaitType="CXPACKET" WaitTimeMs="566095" WaitCount="6329619"/>
          <Wait WaitType="SOS_SCHEDULER_YIELD" WaitTimeMs="21354" WaitCount="29774"/>
          <Wait WaitType="MEMORY_ALLOCATION_EXT" WaitTimeMs="11994" WaitCount="8679127"/>
          <Wait WaitType="SLEEP_BPOOL_STEAL" WaitTimeMs="7435" WaitCount="439"/>
          <Wait WaitType="LATCH_EX" WaitTimeMs="206" WaitCount="35"/>
          <Wait WaitType="SESSION_WAIT_STATS_CHILDREN" WaitTimeMs="8" WaitCount="6"/>
          <Wait WaitType="ASYNC_NETWORK_IO" WaitTimeMs="5" WaitCount="2"/>
        </WaitStats>
        <QueryTimeStats ElapsedTime="247180" CpuTime="232769"/>

For the non-clustered index it’s

        <WaitStats>
          <Wait WaitType="CXPACKET" WaitTimeMs="451425" WaitCount="4834017"/>
          <Wait WaitType="PAGEIOLATCH_SH" WaitTimeMs="43202" WaitCount="41863"/>
          <Wait WaitType="SOS_SCHEDULER_YIELD" WaitTimeMs="11453" WaitCount="11288"/>
          <Wait WaitType="MEMORY_ALLOCATION_EXT" WaitTimeMs="2823" WaitCount="4051831"/>
          <Wait WaitType="LCK_M_S" WaitTimeMs="1366" WaitCount="1"/>
          <Wait WaitType="RESERVED_MEMORY_ALLOCATION_EXT" WaitTimeMs="152" WaitCount="37550"/>
          <Wait WaitType="PAGEIOLATCH_UP" WaitTimeMs="49" WaitCount="4"/>
          <Wait WaitType="LATCH_EX" WaitTimeMs="11" WaitCount="14"/>
          <Wait WaitType="LATCH_SH" WaitTimeMs="1" WaitCount="3"/>
        </WaitStats>
        <QueryTimeStats ElapsedTime="67529" CpuTime="119828"/>

But both plans are very expensive, so you should do something about that. Options include

  1. Replacing the existing clustered index with something more useful, like adding Date to the first index and then partitioning the clustered index by date.
  2. Storing this table as a Clustered Columnstore instead of a Clustered Index
  3. Not running select * and add selected included columns to the Date index.

Leave a Reply

Your email address will not be published. Required fields are marked *