I have a weird situation that I don’t quite understand.
I have a table like this kinda:
CREATE TABLE dbo.cc_demo ( id INT IDENTITY PRIMARY KEY, up_action INT, down_action INT, last_action_date DATETIME ); INSERT dbo.cc_demo ( up_action, down_action, last_action_date ) SELECT TOP 5000000 nums.num % 500000, nums.num % 500000, DATEADD(MINUTE, nums.num, GETDATE()) FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL )) AS num FROM master..spt_values AS sv CROSS JOIN master..spt_values AS sv2 ) AS nums;
If I run this query, the plan is just a regular clustered index scan.
SELECT * FROM dbo.cc_demo AS cd WHERE cd.last_action_date >= '20270601' AND cd.last_action_date < '20270901';
But if I add a computed column and index it, my plan changes for the worst.
ALTER TABLE dbo.cc_demo ADD total_actions AS up_action + down_action; CREATE INDEX ix_total_actions ON dbo.cc_demo (total_actions);
Now it scans the clustered index and then filters stuff out way later on!
It asks for an index, and that changes the plan back to what I’d expect, but why do I have to add this?
CREATE NONCLUSTERED INDEX [WORLDSTAR] ON dbo.cc_demo ( last_action_date ) INCLUDE ( up_action, down_action, total_actions );
That doesn’t seem logical.
Table 'cc_demo'. Scan count 1, logical reads 17990, SQL Server Execution Times: CPU time = 234 ms, elapsed time = 224 ms.
Post computed column + index:
Table 'cc_demo'. Scan count 1, logical reads 17990 SQL Server Execution Times: CPU time = 594 ms, elapsed time = 591 ms.
Here’s the breakdown:
- When I add the computed column, the plan doesn’t change
- When I index the computed column, it does
- I’m not too worried about the performance yet
- I’m just curious why the plan changes this way after I index the computed column
I’m just curious why the plan changes this way after I index the computed column
It’s not indexing the computed column that really matters; any additional non-covering index can produce the same effect (so long as the table occupies more than one page). With only a clustered index present (or a fully covering nonclustered index), the optimizer can produce a
TRIVIAL plan, with the filtering condition fully pushed down.
When there are non-trivial access method choices to make, the query goes through cost-based optimization. Cost-based optimization contains more complex and powerful index-matching rules, as you might expect, but the presence of multiple compute scalars due to computed column expansion and projection can prevent predicates (filters) being pushed down the execution plan, when the computed column is not
SQL Server tries to push predicates (filters) as far down the logical query tree as it can early in the compilation process, but it can be quite conservative about doing this when non-persisted computed columns are involved. A filter that is not pushed down early is unlikely to be pushed further during cost-based optimization.
DROP TABLE IF EXISTS #CC; CREATE TABLE #CC ( id integer IDENTITY PRIMARY KEY, c0 integer NULL, c1 integer NULL, c2 integer NULL, c3 AS c1 + c2 ); -- Just enough rows to fill more than one page -- (so not all data access is trivial) INSERT #CC (c0, c1, c2) SELECT SV.number, SV.number, SV.number FROM master.dbo.spt_values AS SV WHERE SV.[type] = N'P' AND SV.number BETWEEN 1 AND 324; -- Trivial plan (only a clustered index to choose from) SELECT * FROM #CC AS C WHERE C.c0 >= 1; -- Add a non-covering index *not* on the computed column CREATE INDEX ic2 ON #CC (c2); -- Filter not pushed SELECT * FROM #CC AS C WHERE C.c0 >= 1;
There are several ways to workaround this limitation in your example:
Do not project the
total_actionscolumn. If the column is not needed, computations to derive its value cannot get in the way of predicate pushing.
Make the computed column
PERSISTED. This allows the optimizer consider plans that read the persisted value from the base table, avoiding the computation.
Specify the clustered index using a hint e.g.
WITH (INDEX(1)). This allows a trivial plan, since it removes the question of access method choice.
Specify the index on the computed column using a
(INDEX(ix_total_actions))hint. While not covering (so no trivial plan), this hint also allows the column to come from storage rather than being computed. The index does not include
last_action_date, so the predicate is applied in a Key Lookup. This plan may therefore be quite inefficient.