Question :
SELECT ISNULL(SUM(CASE WHEN T4.f_in =12 THEN T4.f_money END),0) -
ISNULL(SUM(CASE WHEN T4.f_out=12 THEN T4.f_money END),0)
FROM [Credit] M WITH(NOLOCK)
INNER JOIN [Transaction] T4 WITH(NOLOCK) ON M.f_accounts=T4.f_accounts
AND (T4.f_time>=M.f_time OR M.f_time IS NULL)
AND T4.f_other IS NULL
AND (T4.f_in=12 OR T4.f_out=12)
After adding an index, one of the queries that originally finishes within 1s now takes around 12s. It seems that the index is now seeked by the query with bad estimation of only 1 row to be read. No other indexes are present in both Credit and Transaction tables.
CREATE NONCLUSTERED INDEX IX_Credit_f_time
ON Credit (f_time) INCLUDE (f_accounts,f_credit)
Above is the index added to the table.
Plan with the index, takes 12s. The highlighted parts are number of rows read, estimated number of rows and estimated number of rows to be read respectively.
Statistics are updated with fullscan but results are still the same.
From my understanding the f_time column is pretty concentrated, hash joining the table once would be much more efficient than seeking multiple times.
Full Execution Plan: https://www.brentozar.com/pastetheplan/?id=rJjwvXBA_
Backup Execution Plan: https://pastebin.com/XLwAGXSf
Answer :
I believe this index is confusing SQL Server. The join conditions include
T4.f_time>=M.f_time
This type of join predicate is generally hard for SQL Server to reason about (greater or smaller, with columns on both sides). The index makes things worse, because it’s first key column is f_time and other columns are only included. So for every row from Transaction table it seeks all rows where T4.f_time>=M.f_time
and only then applies other predicates. And here it goes wrong, because optimizer guesses that for every seek there will be one row read. Where actually it’s 57237 on average (89290306 rows read divided by 1560 executions). This is easily visible in the execution plan:
In my opinion the index should be dropped and following one created (as suggested by others in comments):
CREATE NONCLUSTERED INDEX IX_Credit
ON Credit
(f_accounts,f_time)
That way a seek can be done on both of these columns and it should be much more selective. Do note the order of the columns is important here.
As a general note I try to put columns where equality predicate is used first and columns with inequality as last in indexes.