Question :
We write a query that include unpivot, partition by and order by.
Query is:
SELECT PersonId
,SalaryDate
,ID
,Type
,SalaryValue
,ROW_NUMBER() OVER (
PARTITION BY PersonId ORDER BY SalaryValue
) AS rn
FROM (
SELECT lp.PersonId
,lp.SalaryDate
,lp.Salary1
,lp.Salary2
,lp.Salary3
,lp.ID
FROM rdd.Salaries AS lp WITH (NOLOCK)
WHERE lp.SalaryDate > DATEADD(day, - 31, getdate())
) AS t
unpivot(SalaryValue FOR Type IN (
lp.Salary1
,lp.Salary2
,lp.Salary3
)) AS UnpivotTable
The query returns about 68.000.000 rows and execution time is 20 minutes.
Can I improve the query’s performance or rewrite effectively?
What is the alternative of partition by?
Answer :
You may find that the following index and query rewrite performs better, because it sorts per person rather than once over the whole set, and row estimates are more likely to be accurate:
-- Index
CREATE INDEX IX_Salaries_PersonId_SalaryDate_Inc_ID_Salary1_Salary2_Salary3
ON rdd.Salaries (PersonId, SalaryDate)
INCLUDE (ID, Salary1, Salary2, Salary3);
-- Query
WITH People AS
(
SELECT DISTINCT
S.PersonId
FROM rdd.Salaries AS S
WHERE
S.SalaryDate > DATEADD(DAY, -31, GETDATE())
)
SELECT
P.PersonId,
CA.SalaryDate,
CA.ID,
CA.SalaryValue,
CA.rn
FROM People AS P
CROSS APPLY
(
SELECT
S.SalaryDate,
S.ID,
V.SalaryValue,
rn = ROW_NUMBER() OVER (ORDER BY V.SalaryValue)
FROM rdd.Salaries AS S
CROSS APPLY
(
SELECT S.Salary1 WHERE S.Salary1 IS NOT NULL
UNION ALL
SELECT S.Salary2 WHERE S.Salary2 IS NOT NULL
UNION ALL
SELECT S.Salary3 WHERE S.Salary3 IS NOT NULL
) AS V (SalaryValue)
WHERE
S.PersonId = P.PersonId
AND S.SalaryDate > DATEADD(DAY, -31, GETDATE())
) AS CA
ORDER BY
P.PersonId,
CA.rn
OPTION (QUERYTRACEON 8649);
You can omit the OPTION
clause if you find a parallel query is generated naturally, or if you find non-parallel performance is good enough. The desired plan shape is roughly as follows: