Question :
Unless I’m misunderstanding the purpose of the column, the following code indicates that a change of the structure of the clustered index does not change the ordinal position (stats_column_id
)of the column in the sys.stats_columns DMV. (Tested in AdventureWorks2014, AdventureWorks2008R2)
select i.name, c.name, ic.column_id, ic.index_column_id
from sys.indexes i
join sys.index_columns ic
on i.object_id = ic.object_id
and i.index_id = ic.index_id
join sys.columns c
on i.object_id = c.object_id
and ic.column_id = c.column_id
where i.name = 'PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID'
order by ic.key_ordinal;
select sh.name,s.name, c.name, c.column_id, sc.column_id, sc.stats_column_id
from sys.stats s
join sys.stats_columns sc
on s.object_id = sc.object_id
and s.stats_id = sc.stats_id
join sys.columns c
on s.object_id = c.object_id
and sc.column_id = c.column_id
join sys.tables t
on s.object_id = t.object_id
join sys.schemas sh
on t.schema_id = sh.schema_id
where s.name = 'PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID'
order by sc.stats_column_id;
dbcc show_statistics('[Person].[BusinessEntityAddress]','PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID') with density_vector;
ALTER TABLE [Person].[BusinessEntityAddress] DROP CONSTRAINT [PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID]
GO
ALTER TABLE [Person].[BusinessEntityAddress] ADD CONSTRAINT [PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID] PRIMARY KEY CLUSTERED
(
AddressID ASC,
[BusinessEntityID] ASC,
[AddressTypeID] ASC
)
GO
select i.name, c.name, ic.column_id, ic.index_column_id
from sys.indexes i
join sys.index_columns ic
on i.object_id = ic.object_id
and i.index_id = ic.index_id
join sys.columns c
on i.object_id = c.object_id
and ic.column_id = c.column_id
where i.name = 'PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID'
order by ic.key_ordinal;
select sh.name,s.name, c.name, c.column_id, sc.column_id, sc.stats_column_id
from sys.stats s
join sys.stats_columns sc
on s.object_id = sc.object_id
and s.stats_id = sc.stats_id
join sys.columns c
on s.object_id = c.object_id
and sc.column_id = c.column_id
join sys.tables t
on s.object_id = t.object_id
join sys.schemas sh
on t.schema_id = sh.schema_id
where s.name = 'PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID'
order by sc.stats_column_id;
dbcc show_statistics('[Person].[BusinessEntityAddress]','PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID') with density_vector;
However, the density vectors indicate a change in the leading column of the index/statistics object. Is this a fundamental misunderstanding on my part? If so, how would I find the leading column of a statistics object using DMVs?
Tested SQL Server versions: 2008R2, 2014
Answer :
I was having the same issue while trying to reproduce the way others retrieve index information from the sys.dm views in SQL Server. I just couldn’t figure out the order of the columns in the index.
Following is a script I created to determine the order of the columns in any given index for a given table:
SELECT s.name AS Schema_name,
o.name AS Table_Name,
i.type_desc AS Index_Type,
i.name AS Index_Name,
c.name AS Table_Column,
i.fill_factor AS Indx_Fill_Factor,
ic.key_ordinal AS [Key_ordinal (IDX Column_Order)],
ic.index_column_id AS Index_column_id,
stc.stats_column_id AS Stats_Col_ID,
-- Additional info for each joined table
-- comment out what you don't need
-- 2 lines at a time
--
-- '| table object -->', -- column seperator
-- o.*,
-- '| schema object-->', -- column seperator
-- s.*,
'| index info-->', -- column seperator
i.*,
'| sys index info -->', -- column seperator
si.*,
'| indx cols info -->', -- column seperator
ic.*,
'| tab cols info -->', -- column seperator
c.*,
'| idx stats info -->', -- column seperator
st.*,
'| idx stats columns info -->', -- column seperator
stc.*
FROM sys.objects AS o
JOIN sys.schemas AS s
ON s.schema_id = o.schema_id
JOIN sys.indexes AS i
ON i.object_id = o.object_id
JOIN sys.sysindexes as si
ON si.[id] = i.object_id
AND si.indid = i.index_id
JOIN sys.index_columns AS ic
ON ic.object_id = i.object_id
AND ic.index_id = i.index_id
JOIN sys.columns AS c
ON c.object_id = ic.object_id
AND c.column_id = ic.column_id
JOIN sys.stats AS st
ON st.object_id = i.object_id
and st.stats_id = i.index_id
JOIN sys.stats_columns AS stc
ON c.column_id = stc.column_id
AND stc.stats_id = st.stats_id
AND stc.[object_id] = o.[object_id]
WHERE 1=1
--and i.type <> 1 -- Exclude Clustered Indexes. 0 = Heap; 1 = Clustered Index, 2 = Non-Clustered Index
AND s.name != 'sys' -- Exclude sys items
and o.name = 'BusinessEntityAddress'
AND i.name = 'PK_BusinessEntityAddress_BusinessEntityID_AddressID_AddressTypeID'
ORDER BY
o.object_id,
i.index_id,
ic.key_ordinal
The column key_ordinal
in the sys.index_columns table is the order in which the columns are stored in the index.
There isn’t a key_ordinal
column for the sys.stats_columns
table. The column stats_column_id
just replicates the index_column_id
column of the object it references.
There is a slight difference in the wording of the article sys.stats_columns (Transact-SQL) for the column stats_column_id
:
1-based ordinal within set of stats columns.
…and in the article sys.index_columns (Transact-SQL) for the key_ordinal
column:
Ordinal (1-based) within set of key-columns.
I reckon that the index_column_id
(sys.index_columns) and stats_column_id
(sys.stats_columns) are the equivalent of each other and that only the sys.index_columns table has an ordering column, namely key_ordinal
.
By all accounts this may be a bugged behavior in the sys.stats_columns DMV. This appears to be causing problems when a statistic is updated by way of the parent index. I believe this to be due to the mechanism with which the statistics are being updated in a constraint change.
Should you create a statistic manually and then wish to change the columns you must first drop and re-create which forces the meta-data to be updated in the DMV in question. In the operation you have demonstrated there appears to be a situation where the metadata is not updated under any circumstances (DBCC *, CHECKPOINT, server restart, statistics update through parent index change, etc) once the change has been made. From my initial testing I can find only one case when the metadata is properly updated which is the drop and re-create scenario.
You can take a look at the Connect item on the issue and up-vote as appropriate. There is a work around query posted there but its mechanism is based on matching the index name to the statistic name and utilizing the index meta-data.