If the answer is ‘No’, I’m OK with that…
I’m looking to see if this can be optimized any…it’s part of a much larger stored proc.
CGCode is varchar(50),
select max(id) from Table where 1=1 and cgcode = 123 and datefromparts(cast(year as char(4)),cast(month as char(2)),'01') < getdate() and totalcount > 0 group by cgcode, year, month, fein
Logical reads from actual execution plan: 1,566,473
Source table raw data 32 million+ records
Estimated rows: 640K, actual 55K, before the Group By kicks in
Implicit conversion warnings on
Year/Month/CGCode (comes in as
Execution time roughly 7.5 seconds, doing a Non-Clustered Index Seek:
End result set is 114 rows (for this
CGCode we test with…others vary)
Performance is roughly the same in Prod on significantly better hardware than the Dev box. This is only going to get worse over time, as its pulling everything older than the current month, to populate a history chart in a UI.
What other info can I provide?
The index being used currently:
CREATE NONCLUSTERED INDEX [COIX_Table_TotalCount] ON [dbo].[Table] ( [TotalCount] ASC ) INCLUDE ( [ID], [CGCode], [Year], [Month], [FEIN]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] GO
CGCode limits the data most…281 values.
year is only 3-4 years worth of data,
month only 12 options there of course.
TotalCount is a
bigint, 17K distinct values across 32 million records. No idea what the purpose of the
TotalCount column is.
First of all, I would rewrite this condition. As it is, it can’t use any index effectively.:
datefromparts(cast(year as char(4)),cast(month as char(2)),'01') < getdate()
Rewrite it so there is no functions or calculations applied to the columns:
( year < year(getdate()) or year = year(getdate()) and month <= month(getdate()) )
The rewrite would allow an index on
(cgcode, year, month) to be used – or even better a
(cgcode, year, month, fein) one, considering the
GROUP BY parts.
totalcount is always part of the
WHERE clause or thee specific
totalcount > 0 condition is always used, you might squeeze even more efficiency by adding it to the
INCLUDE part or with a more targeted, filtered index:
-- option A index (cgcode, year, month, fein, id) include (totalcount) -- option B index (cgcode, year, month, fein, id) where (totalcount > 0)
I will remove cgcode from Group By since I know what is cgcode and it is not range.So removing cgcode (varchar(60)) will give tremendous push.
So if I am correct my index will change too.
Use Variable in place of getdate()
declare @CurDate date=getdate() declare @yr int=year(@CurDate ) declare @month tinyint=month(@CurDate ) declare @cg varchar(60)='123' select @cg cgcode,max(id) from Table where cgcode = @cg ( year < @yr or [year] = @yr and [month] <= @month ) and totalcount > 0 group by year, month, fein
Also technically why
totalcount will be covering index in option A.I don’t understand,by that logic it should within index group or
option B is correct .
Found it 🙂
The clientid is being passed in as INT…causing an implicit conversion to varchar for the CGCode field.
select max(id) from Table where 1=1 and cgcode = CAST(123 as varchar(50)) ---this fixed it and datefromparts(cast(year as char(4)),cast(month as char(2)),'01') < getdate() and totalcount > 0 group by cgcode, year, month, fein
From 7-30 seconds to 100ms. I’ll take that.