Question :
Alright so I have a report stored procedure that was running incredibly slow. Customers were complaining that the report would not run so I started investigating exactly where in the stored procedure the problem was and I found this portion taking up 99.8% of the time.
DECLARE @xmlTemp TABLE (
CompanyID INT,
StoreID INT,
StartDate DATETIME,
DateStaID INT,
EndDate DATETIME,
DateEndID INT,
LastUpdate DATETIME)
INSERT INTO @xmlTemp
VALUES (50,
2,
'3/3/2013',
0,
'3/3/2013',
0,
'3/3/2013')
SELECT DISTINCT T.CompanyID,
CompanyName,
Z.StoreID,
StoreName,
CashedOutBy,
TransactionID,
RegisterID,
BusinessDate,
CashedOut,
Total - Isnull((SELECT Sum(DISTINCT PaymentAmount)
FROM vPullDrawerPayments
WHERE CompanyID = T.CompanyID
AND StoreID = T.StoreID
AND TransactionID = T.TransactionID
AND Isnull(PaymentType, 1) <> 1), 0) AS PaymentAmount,
'Cash' AS PaymentDesc,
CASE
WHEN Z.EndDate >= Z.LastUpdate THEN 1
ELSE 0
END AS MissingData
FROM vPullDrawerPayments AS T
INNER JOIN @xmlTemp AS Z
ON T.CompanyID = Z.CompanyID
AND T.StoreID = Z.StoreID
WHERE BusinessDate BETWEEN Z.StartDate AND Z.EndDate
UNION ALL
SELECT DISTINCT NC.CompanyID,
CompanyName,
Z.StoreID,
StoreName,
CashedOutBy,
TransactionID,
RegisterID,
BusinessDate,
CashedOut,
PaymentAmount,
PaymentDesc,
CASE
WHEN Z.EndDate >= Z.LastUpdate THEN 1
ELSE 0
END AS MissingData
FROM vPullDrawerPayments AS NC
INNER JOIN @xmlTemp AS Z
ON NC.CompanyID = Z.CompanyID
AND NC.StoreID = Z.StoreID
WHERE BusinessDate BETWEEN Z.StartDate AND Z.EndDate
AND Isnull(PaymentType, 1) <> 1
UNION ALL
SELECT DISTINCT C.CompanyID,
CompanyName,
Z.StoreID,
StoreName,
CashedOutBy,
TransactionID,
RegisterID,
BusinessDate,
CashedOut,
Sum(Abs(LineAmount)) AS PaymentAmount,
'Coupons' AS PaymentDesc,
CASE
WHEN Max(Z.EndDate) >= Max(Z.LastUpdate) THEN 1
ELSE 0
END AS MissingData
FROM vPullDrawerPayments AS C
INNER JOIN @xmlTemp AS Z
ON C.CompanyID = Z.CompanyID
AND C.StoreID = Z.StoreID
WHERE BusinessDate BETWEEN Z.StartDate AND Z.EndDate
GROUP BY C.CompanyID,
CompanyName,
Z.StoreID,
StoreName,
CashedOutBy,
TransactionID,
RegisterID,
BusinessDate,
CashedOut
The @xmlTemp portion of this query is normally used to take parameters from our web application and turn them into parameters the report can actually use. For the sake of testing this I am just inserting values that run this for one store for one day. Running this portion can take upwards of 20 minutes.
So I ran this query plan through PlanExplorer and saw it was pulling all of the data from two of my fact tables instead of filtering out just that store and that day. As seen in the picture below.
Obviously this is bad. So the next step I took is to cut out the join the @xml temp and just manually put in the values in the queries WHERE
clause to see how well that worked.
SELECT DISTINCT T.CompanyID,
CompanyName,
T.StoreID,
StoreName,
CashedOutBy,
TransactionID,
RegisterID,
BusinessDate,
CashedOut,
Total - Isnull((SELECT Sum(DISTINCT PaymentAmount)
FROM vPullDrawerPayments
WHERE CompanyID = T.CompanyID
AND StoreID = T.StoreID
AND TransactionID = T.TransactionID
AND Isnull(PaymentType, 1) <> 1), 0) AS PaymentAmount,
'Cash' AS PaymentDesc
--CASE WHEN Z.'3/3/2013' >= Z.LastUpdate THEN 1 ELSE 0 END AS MissingData
FROM vPullDrawerPayments AS T
WHERE CompanyID = 50
AND StoreID = 1
AND BusinessDate BETWEEN '3/3/2013' AND '3/3/2013'
UNION ALL
SELECT DISTINCT NC.CompanyID,
CompanyName,
NC.StoreID,
StoreName,
CashedOutBy,
TransactionID,
RegisterID,
BusinessDate,
CashedOut,
PaymentAmount,
PaymentDesc
--CASE WHEN Z.'3/3/2013' >= Z.LastUpdate THEN 1 ELSE 0 END AS MissingData
FROM vPullDrawerPayments AS NC
WHERE CompanyID = 50
AND StoreID = 1
AND BusinessDate BETWEEN '3/3/2013' AND '3/3/2013'
AND Isnull(PaymentType, 1) <> 1
UNION ALL
SELECT DISTINCT C.CompanyID,
CompanyName,
C.StoreID,
StoreName,
CashedOutBy,
TransactionID,
RegisterID,
BusinessDate,
CashedOut,
Sum(Abs(LineAmount)) AS PaymentAmount,
'Coupons' AS PaymentDesc
--CASE WHEN MAX(Z.'3/3/2013') >= MAX(Z.LastUpdate) THEN 1 ELSE 0 END AS MissingData
FROM vPullDrawerPayments AS C
WHERE CompanyID = 50
AND StoreID = 1
AND BusinessDate BETWEEN '3/3/2013' AND '3/3/2013'
GROUP BY C.CompanyID,
CompanyName,
C.StoreID,
StoreName,
CashedOutBy,
TransactionID,
RegisterID,
BusinessDate,
CashedOut
By changing this to a simple where clause it runs in 4 seconds instead of 20 minutes and up. Also the query Plan is showing the correct Is there any reason I should be seeing this behavior?
Edit here is the full Link to the QueryPlan.
Answer :
[Copying from my answer on SQLPerformance.com.]
Some very brief initial suggestions from discussions elsewhere:
- Try creating @xmlTemp as a #temp table with a clustered index on (StartDate, EndDate) instead of a table variable. This may provide SQL Server with more accurate stats information (though questionably useful if the table only has one row).
- If @xmlTemp has only one row always, use two variables instead of a table in the first place.
- Try adding the (RECOMPILE) option to the statement, especially if you convert to variables instead of the #temp table (parameter sniffing).
- Try using OPTION (MAXDOP 1) – parallelism is definitely in use, and at the lower end the threads seem partially imbalanced. I wonder if parallelism is helping or hurting here – can’t hurt to test duration with and without.
- You may need to perform more rigorous stats updates. A lot of these estimates are way, way off.
- Remove the DISTINCTs. For this set of columns I find it hard to believe this is eliminating any duplicates, but the optimizer has to work as if there are dupes to remove.
- Consider using Table-Valued Parameters (TVPs) instead of shredding XML for the different companies / stores.