Question :
I have been given a query that takes about 1:30 to run. I was asked to optomize it as much as possible but am having issues making any headway on it.
SELECT DISTINCT B.PROV,
FACILITY=CASE B.FACILITY
WHEN '' THEN (SELECT FACILITY
FROM CLDOCTOR
WHERE DOCTOR = B.PROVCODE)
ELSE B.FACILITY
END,
C.PLNAME,
C.PFNAME,
C.PMNAME,
C.DOB,
C.SEX,
Cast(Cast(A.DATEBACK AS VARCHAR(12)) AS DATETIME) AS 'DATEBACK',
C.CITY,
C.STATE,
C.ZIP,
C.HPHONE,
C.WPHONE,
A.SSNO,
C.PTYPE,
B.ACCOUNT,
B.ORDERDATE
FROM CLLAB A,
CLORDER B,
CLMASTER C
WHERE A.COMPANY = B.COMPANY
AND A.ORDER_ID = B.ID
AND B.DEPT = 'LAB'
AND B.ACCOUNT = C.ACCOUNT
--AND DATEDIFF(MM, DATEBACK, GETDATE()) <=6
AND A.DATEBACK NOT IN (SELECT DISTINCT DATEBACK
FROM CLLAB
WHERE COMPANY = A.COMPANY
AND DEPT = 'LAB'
AND SSNO = A.SSNO
--AND DATEDIFF(MM, DATEBACK, GETDATE()) <= 6
AND OBSERVE_ID = 'REVIEWED'
)
CLLAB
contains 34,185,327 rows, CLORDER
contains 13,455,600 rows, and CLMASTER
contains 392,797 rows.
Do you see anything that jumps out as inherently wrong that could be cleaned up to help the query run quicker?
Edit 1:
There are indexes on CLLAB
‘s ORDERID
, SSNO
, and a few other unused columns.
There are indexes on CLORDERS
‘s PROV
, ORDERDATE
, and a few other unused columns.
The execution plan says Missing Index (Impace 48.1371): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] on [dbo].[CLORDER] ([DBO].[CLORDER][DEPT]) INCLUDE ([COMPANY],[ID],[PROV],[ORDERDATE],[ACCOUNT],[FACILITY],[PROVCODE])
which makes me think that there could be huge gains made by indexing CLLAB
‘s DEPT
Answer :
First of all be aware though that when you are querying on tables that large you have to expect it to take some time.
That being said assuming you are using SQL 2008 or later you could try a filtered index on CLLab, however that makes several assumptions.
- The number of rows where the Dept = “Lab” and Observe_Id = “Reviewed” is a relatively small percentage of the total number of rows.
- This is an important enough query to warrant the extra write time to CLLAB for an additional index just for it.
If the answers to both of those are Yes (and of course SQL 2008 or higher) then try this.
CREATE NONCLUSTERED INDEX "IX_CLLAB_REVIEWED_LABS"
ON CLLAB (SSNO, COMPANY, DATEBACK)
WHERE DEPT = "LAB"
AND OBSERVE_ID = "REVIEWED";
Really I would rather get rid of the subquery all together if possible but that would require more knowledge of the data and what you are trying to get out of the query.
Last but not least if you comment out the small subquery in the field list does it run significantly faster? If so that part of the query could be incorporated into the main query as a LEFT OUTER JOIN and might save you some time.
It would look like this.
-- In the field list
FACILITY = CASE WHEN B.FACILITY = '' THEN D.FACILITY ELSE B.FACILITY END
-- The FROM & WHERE Clauses
FROM CLLAB A
JOIN CLORDER B
ON A.COMPANY = B.COMPANY
AND A.ORDER_ID = B.ID
JOIN CLMASTER C
ON B.ACCOUNT = C.ACCOUNT
LEFT OUTER JOIN CLDoctor D
ON B.PROVCODE = D.DOCTOR
AND B.FACILITY = '' -- optionally. May be more efficient
-- with different indexing
WHERE B.DEPT = 'LAB'
AND A.DATEBACK NOT IN (.....)
The index is a great idea but may have a live performance impact, everything from the FROM should be something like:
FROM
CLORDER B inner join CLLAB A on A.ORDER_ID = B.ID
inner join CLMASTER C on C.ACCOUNT = B.ACCOUNT
right join (SELECT DATEBACK
FROM CLLAB
AND DEPT = 'LAB',
AND OBSERVE_ID = 'REVIEWED'
) D on D.DATEBACK = A.DATEBACK and AND D.SSNO = A.SSNO
WHERE D.DATEBACK is null