Question :
I have a slow query in SQL Server 2017. A Key Lookup (Clustered) accounts for 62% of the cost, and an Index Seek (NonClustered) for another 30%.
Is there any way I can speed this query up?
Execution plan: https://www.brentozar.com/pastetheplan/?id=r1oIcWofd
exec sp_executesql
N'
SELECT TOP(@__p_4)
[v].[County],
[v].[Community],
[v].[DataProvider],
[v].[Site],
[v].[Latitude],
[v].[Longitude],
[v].[Elevation],
[v].[Instrument],
[v].[MonitorId],
[v].[ParameterId],
[v].[Parameter],
[v].[MeasurementStartTime],
[v].[MeasuredValue],
[v].[AdjustedValue],
[v].[Units],
[v].[AdjustmentDesc],
[v].[LowerLimitCheck],
[v].[UpperLimitCheck],
[v].[RepeatingValueCheck],
[v].[SpikeCheck],
[v].[QualityPreliminary]
FROM [vw_DownloadDataSubhourly] AS [v]
WHERE
(
(([v].[CountyId] = @__Parse_0) AND ([v].[MeasurementStartTime] >= @__start_1))
AND ([v].[MeasurementStartTime] <= @__end_2)
)
AND [v].[ParameterType] IN (''PM2.5'')
ORDER BY [v].[Site], [v].[MonitorId], [v].[ParameterId], [v].[MeasurementStartTime]
',
N'@__p_4 int,@__Parse_0 int,@__start_1 datetime,@__end_2 datetime',
@__p_4=750000,@__Parse_0=34,@__start_1='2019-10-01 00:00:00',@__end_2='2020-12-31 23:59:59'
vw_DownloadDataSubhourly
Execution plan: https://www.brentozar.com/PasteThePlan/?id=BkiX1Vsz_
CREATE view [dbo].[vw_DownloadDataSubhourly] as
SELECT
DataProvider.DataProviderId,
County.CountyId,
Community.CommunityId,
County.CountyName as "County",
Community.CommunityName as "Community",
DataProvider.DataProviderShortName as "DataProvider",
Site.SiteLongName as "Site",
SiteConfig.LatitudeTrunc as "Latitude",
SiteConfig.LongitudeTrunc as "Longitude",
SiteConfig.Elevation,
Monitor.ExternalMonitorId as "MonitorId",
Parameter.ParameterId as "ParameterId",
Parameter.ParameterNamePublic "Parameter",
Parameter.ParameterType "ParameterType",
SubHourlyMeasurementData.StartDateTime "MeasurementStartTime",
SubHourlyMeasurementData.AObs as "MeasuredValue",
SubHourlyMeasurementData.AObsAdj as "AdjustedValue",
UOM.UOMAbbreviation as "Units",
AdjCode.AdjCodeDescription as "AdjustmentDesc",
concat(MonitorConfig.MonitorMfr, ' ', MonitorConfig.MonitorModel) as "Instrument",
case
when SubHourlyMeasurementData.QF06 = '-3' then 'QC not computed'
when SubHourlyMeasurementData.QF06 = '-2' then 'Not enough data'
when SubHourlyMeasurementData.QF06 = '-1' then 'Not enough site/monitor info'
when SubHourlyMeasurementData.QF06 = '0' then 'Pass'
when SubHourlyMeasurementData.QF06 = '1' then 'Fail'
else NULL
end as "LowerLimitCheck",
case
when SubHourlyMeasurementData.QF05 = '-3' then 'QC not computed'
when SubHourlyMeasurementData.QF05 = '-2' then 'Not enough data'
when SubHourlyMeasurementData.QF05 = '-1' then 'Not enough site/monitor info'
when SubHourlyMeasurementData.QF05 = '0' then 'Pass'
when SubHourlyMeasurementData.QF05 = '1' then 'Fail'
else NULL
end as "UpperLimitCheck",
case
when SubHourlyMeasurementData.QF04 = '-3' then 'QC not computed'
when SubHourlyMeasurementData.QF04 = '-2' then 'Not enough data'
when SubHourlyMeasurementData.QF04 = '-1' then 'Not enough site/monitor info'
when SubHourlyMeasurementData.QF04 = '0' then 'Pass'
when SubHourlyMeasurementData.QF04 = '1' then 'Fail'
else NULL
end as "RepeatingValueCheck",
case
when SubHourlyMeasurementData.QF07 = '-3' then 'QC not computed'
when SubHourlyMeasurementData.QF07 = '-2' then 'Not enough data'
when SubHourlyMeasurementData.QF07 = '-1' then 'Not enough site/monitor info'
when SubHourlyMeasurementData.QF07 = '0' then 'Pass'
when SubHourlyMeasurementData.QF07 = '1' then 'Fail'
else NULL
end as "SpikeCheck",
case
when SubHourlyMeasurementData.QOverall = '-3' then 'QC not computed'
when SubHourlyMeasurementData.QOverall = '-2' then 'Not enough data'
when SubHourlyMeasurementData.QOverall = '-1' then 'Not enough site/monitor info'
when SubHourlyMeasurementData.QOverall = '0' then 'Passed all checks'
when SubHourlyMeasurementData.QOverall = '1' then 'Suspect'
else NULL
end as "QualityPreliminary"
FROM SubHourlyMeasurementData
INNER JOIN
DataProvider ON SubHourlyMeasurementData.DataProviderId = DataProvider.DataProviderId
INNER JOIN
Site ON SubHourlyMeasurementData.SiteId = Site.SiteId AND Site.DataProviderId = DataProvider.DataProviderId
LEFT JOIN Community ON Site.CommunityId = Community.CommunityId
INNER JOIN
SiteConfig ON SiteConfig.SiteId = SubHourlyMeasurementData.SiteId
INNER JOIN
County ON County.CountyId = SiteConfig.CountyId
INNER JOIN
Monitor ON Monitor.MonitorId = SubHourlyMeasurementData.MonitorId
INNER JOIN
MonitorConfig ON MonitorConfig.MonitorConfigId = SubHourlyMeasurementData.MonitorConfigId
INNER JOIN
Parameter ON Parameter.ParameterId = SubHourlyMeasurementData.ParameterId
INNER JOIN
UOM ON SubHourlyMeasurementData.UOMId = UOM.UOMId
LEFT JOIN
AdjCode ON SubHourlyMeasurementData.AdjCodeId = AdjCode.AdjCodeId
INNER JOIN
StreamSegment ON SubHourlyMeasurementData.StreamSegmentId = StreamSegment.StreamSegmentId
INNER JOIN
MeasurementTechnique ON MeasurementTechnique.MeasurementTechniqueId = StreamSegment.MeasurementTechniqueId
AND MeasurementTechnique.ParameterId = SubHourlyMeasurementData.ParameterId
INNER JOIN
SamplingDuration ON SamplingDuration.SamplingDurationId = StreamSegment.SamplingDurationId
INNER JOIN
SamplingFrequencyUnits ON SamplingFrequencyUnits.SamplingFrequencyUnitsId = StreamSegment.SamplingFrequencyUnitsId
WHERE SubHourlyMeasurementData.StartDateTime >= MonitorConfig.StartDateTime and StreamSegment.IsDownloadable = 1 and SubHourlyMeasurementData.iscalculated = 1
GO
Answer :
J.D. has covered the stats bit – essentially, you might get better estimates if you have a more representative stats sample for the indexes on the SubHourlyMeasurementData
table. Right now it’s only sampling around 1% of the rows in the table:
Here we can see that the stats have been updated (ModificationCount is 0), but the SamplingPercent was only 1.41871. SQL Server picks this percentage automatically based on the number of rows in the table.
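To check the sampling rate and modification counter yourself, you can query the sys.dm_db_stats_properties DMF for every statistic on the table (a sketch; the table name is taken from the plan above):

```sql
-- Sampling percent and modification counter for each statistic on the table.
-- rows_sampled / rows gives the SamplingPercent shown in the screenshot.
SELECT
    s.name AS StatsName,
    sp.last_updated,
    sp.rows,
    sp.rows_sampled,
    CAST(100.0 * sp.rows_sampled / sp.rows AS decimal(9, 5)) AS SamplingPercent,
    sp.modification_counter AS ModificationCount
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.SubHourlyMeasurementData');
```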
You can try updating stats with FULLSCAN, which might take a while:
UPDATE STATISTICS dbo.SubHourlyMeasurementData
WITH FULLSCAN;
If that helps, you can work down to a more manageable sampling percentage that still solves the problem:
UPDATE STATISTICS dbo.SubHourlyMeasurementData
WITH SAMPLE 50 PERCENT;
Regarding the key lookup, that is required in order to check the residual predicate IsCalculated = 1:
You can avoid the lookup by adding IsCalculated to the INCLUDE list for the <IDX_DownloadDataSubHourly> index:
CREATE NONCLUSTERED INDEX [<IDX_DownloadDataSubHourly>]
ON dbo.SubHourlyMeasurementData (...)
INCLUDE (..., IsCalculated);
If that predicate is used a lot, and it's fairly selective, you might consider creating a filtered index for it (or adding WHERE IsCalculated = 1 to the current nonclustered index, though that might break other queries trying to use it).
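A filtered-index version might look like the sketch below. The key and included column names are placeholders guessed from the view's join and filter columns, not the real index definition:

```sql
-- Hypothetical filtered index: only rows with IsCalculated = 1 are stored,
-- so the residual predicate is satisfied by the index itself and no key
-- lookup is needed. The optimizer will only match it for queries that
-- also filter on IsCalculated = 1.
CREATE NONCLUSTERED INDEX IX_SubHourly_Calculated
ON dbo.SubHourlyMeasurementData (SiteId, ParameterId, StartDateTime)
INCLUDE (MonitorId, MonitorConfigId, AObs, AObsAdj)
WHERE IsCalculated = 1;
```

As a bonus, the filtered index is smaller than a full-table index, since it excludes every row where the predicate is false.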
You definitely have some wild cardinality estimate issues going on, based on your execution plan (along with some under-performing index scans and warnings, such as residual I/O, that are probably a result of those cardinality estimate issues).
See the following couple of cases where SQL Server thinks your query is only going to return a small number of rows (Estimated Rows), but the Actual Rows figure is much larger, in the tens of millions:
Unfortunately it looks like this is all coming from the query under the hood of the view you're querying, so without knowing what that query is actually doing, it's tough to debug the issue. If you could provide the definition of vw_DownloadDataSubhourly, we should be able to help you trace the source of these cardinality estimate issues.
A cardinality estimate issue occurs when the number of rows the SQL Engine believes an operation in your execution plan will return severely differs from the number of rows actually returned (as a rule of thumb, when those two figures differ by an order of magnitude or more, a cardinality estimate issue exists). The further apart they are, the worse the issue generally is, especially when Estimated Rows is less than Actual Rows, because that means the SQL Engine provisioned fewer resources (e.g. memory) than it needed to serve the data for your query, hence the slow runtime.
Barring the view definition of vw_DownloadDataSubhourly, the only thing I can recommend at this time is updating statistics on the tables exhibiting cardinality estimate issues. Here's an additional resource regarding statistics and how to update them manually and automatically. For now I would recommend manually updating them for the following tables, roughly in this order:
dbo.SubHourlyMeasurementData
dbo.AdjCode
dbo.Community
dbo.SamplingFrequencyUnits
dbo.SiteConfig
dbo.Site
dbo.SamplingDuration
dbo.MonitorConfig
dbo.DataProvider
dbo.UOM
dbo.Parameter
Unfortunately, a majority of the tables referenced by your view vw_DownloadDataSubhourly are facing cardinality estimate issues.
You'll want to update the statistics with the FULLSCAN option to ensure accurate calculations on your larger tables. Note that for larger tables, updating statistics can take a while to run and can be resource-intensive on the server during that period.
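One way to script that, assuming the table names listed above exist as written, is simply one UPDATE STATISTICS statement per table:

```sql
-- Update statistics with FULLSCAN for each table flagged above.
-- FULLSCAN reads every row of each table, so run this off-hours if possible.
UPDATE STATISTICS dbo.SubHourlyMeasurementData WITH FULLSCAN;
UPDATE STATISTICS dbo.AdjCode                  WITH FULLSCAN;
UPDATE STATISTICS dbo.Community                WITH FULLSCAN;
UPDATE STATISTICS dbo.SamplingFrequencyUnits   WITH FULLSCAN;
UPDATE STATISTICS dbo.SiteConfig               WITH FULLSCAN;
UPDATE STATISTICS dbo.Site                     WITH FULLSCAN;
UPDATE STATISTICS dbo.SamplingDuration         WITH FULLSCAN;
UPDATE STATISTICS dbo.MonitorConfig            WITH FULLSCAN;
UPDATE STATISTICS dbo.DataProvider             WITH FULLSCAN;
UPDATE STATISTICS dbo.UOM                      WITH FULLSCAN;
UPDATE STATISTICS dbo.Parameter                WITH FULLSCAN;
```

The small lookup tables will finish almost instantly; SubHourlyMeasurementData is the one worth scheduling around.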