Question :
If the answer is ‘No’, I’m OK with that…
I’m looking to see if this can be optimized any…it’s part of a much larger stored proc. CGCode
is varchar(50), Year
and Month
are smallint
, FEIN
is char(9)
select max(id)
from Table
where 1=1
and cgcode = 123
and datefromparts(cast(year as char(4)),cast(month as char(2)),'01') < getdate()
and totalcount > 0
group by cgcode, year, month, fein
Logical reads from actual execution plan: 1,566,473
Source table raw data 32 million+ records
Estimated rows: 640K, actual 55K, before the Group By kicks in
Implicit conversion warnings on Year/Month/CGCode
(comes in as bigint
)
Execution time roughly 7.5 seconds, doing a Non-Clustered Index Seek:
End result set is 114 rows (for this CGCode
we test with…others vary)
Performance is roughly the same in Prod on significantly better hardware than the Dev box. This is only going to get worse over time, as its pulling everything older than the current month, to populate a history chart in a UI.
What other info can I provide?
The index being used currently:
CREATE NONCLUSTERED INDEX [COIX_Table_TotalCount] ON [dbo].[Table]
(
[TotalCount] ASC
)
INCLUDE ( [ID],
[CGCode],
[Year],
[Month],
[FEIN])
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF,
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CGCode
limits the data most…281 values. year
is only 3-4 years worth of data, month
only 12 options there of course. TotalCount
is a bigint
, 17K distinct values across 32 million records. No idea what the purpose of the TotalCount
column is.
Answer :
First of all, I would rewrite this condition. As it is, it can’t use any index effectively.:
datefromparts(cast(year as char(4)),cast(month as char(2)),'01') < getdate()
Rewrite it so there is no functions or calculations applied to the columns:
( year < year(getdate())
or year = year(getdate()) and month <= month(getdate())
)
The rewrite would allow an index on (cgcode, year, month)
to be used – or even better a (cgcode, year, month, fein)
one, considering the GROUP BY
parts.
Next, if totalcount
is always part of the WHERE
clause or thee specific totalcount > 0
condition is always used, you might squeeze even more efficiency by adding it to the INCLUDE
part or with a more targeted, filtered index:
-- option A
index (cgcode, year, month, fein, id)
include (totalcount)
-- option B
index (cgcode, year, month, fein, id)
where (totalcount > 0)
I will remove cgcode from Group By since I know what is cgcode and it is not range.So removing cgcode (varchar(60)) will give tremendous push.
So if I am correct my index will change too.
Use Variable in place of getdate()
declare @CurDate date=getdate()
declare @yr int=year(@CurDate )
declare @month tinyint=month(@CurDate )
declare @cg varchar(60)='123'
select @cg cgcode,max(id)
from Table
where cgcode = @cg
( year < @yr
or [year] = @yr and [month] <= @month
)
and totalcount > 0
group by year, month, fein
Also technically why totalcount will be covering index
in option A.I don’t understand,by that logic it should within index group or option B
is correct .
Found it 🙂
The clientid is being passed in as INT…causing an implicit conversion to varchar for the CGCode field.
select max(id)
from Table
where 1=1
and cgcode = CAST(123 as varchar(50)) ---this fixed it
and datefromparts(cast(year as char(4)),cast(month as char(2)),'01') < getdate()
and totalcount > 0
group by cgcode, year, month, fein
From 7-30 seconds to 100ms. I’ll take that.