Why isn’t the primary (clustered) key being used in this query?

Posted on

Question :

I have a SQL Server 2008 R2 table whose schema structure looks as follows:

CREATE TABLE [dbo].[CDSIM_BE]
(
    [ID] [bigint] NOT NULL,
    [EquipmentID] [varchar](50) NOT NULL,
    [SerialNumber] [varchar](50) NULL,
    [PyrID] [varchar](50) NULL,
    [MeasMode] [varchar](50) NULL,
    [ReadTime] [datetime] NOT NULL,
    [SubID] [varchar](15) NULL,
    [ProbePosition] [float] NULL,
    [DataPoint] [int] NULL,

    CONSTRAINT [PK_CDSIM_BE] 
    PRIMARY KEY CLUSTERED ([ID] ASC, [EquipmentID] ASC, [ReadTime] ASC)
         WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
               IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
               ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])
) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [idx_CDSIM_BE__SubID_ProbePosition] 
ON [dbo].[CDSIM_BE] ([SubID] ASC, [ProbePosition] ASC)
INCLUDE ([EquipmentID], [ReadTime], [BECorr]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [IX_CDSIM_BE_ProbePosition] 
ON [dbo].[CDSIM_BE] ([ProbePosition] ASC)
INCLUDE ([SerialNumber], [SubID]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])

CREATE NONCLUSTERED INDEX [IX_CSDIM_Readtime] 
ON [dbo].[CDSIM_BE]([ReadTime] ASC)
INCLUDE ([EquipmentID]) 
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
          SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, 
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [MonthlyArchiveScheme9]([ReadTime])

And I’m executing this simple query:

Select Max(Id)
From dbo.CDSIM_BE

There are ~2.5B rows in the table.

The query plan shows an index scan being done on the IX_CdSIM_BE_ProbePosition index. I’m wondering why SQL Server simply wouldn’t use the clustered (and primary) index and immediately go to the last row in the table and retrieve the Id value, since that must be the maximum.

Answer :

The clustered index is partitioned on ReadTime so it couldn’t use the PK as you describe. It would need to find the Max(Id) for each partition and then find the max of those. It is possible to rewrite the query to get such a plan however.

Using an example based on the article here a possible rewrite might be

SELECT MAX(ID) AS ID
FROM   sys.partitions AS P
       CROSS APPLY (SELECT MAX(ID) AS ID
                    FROM   [dbo].[CDSIM_BE]
                    WHERE  $PARTITION.MonthlyArchiveFunction9(ReadTime) 
                                                    = P.partition_number) AS A
WHERE  P.object_id = OBJECT_ID('dbo.CDSIM_BE')
       AND P.index_id <= 1; 

To process each partition in turn.

Note the plan still has a scan (with a seek predicate to select the partition) but this is not a full scan of the partition.

The scan is in index order with direction “BACKWARD”. The TOP iterator can stop requesting rows from the scan after the first one is received.

enter image description here

Leave a Reply

Your email address will not be published. Required fields are marked *