Optimizations Possible – SQLServer

Posted on

Question :

I have been given a query that takes about 1:30 to run. I was asked to optomize it as much as possible but am having issues making any headway on it.

SELECT DISTINCT B.PROV,
                FACILITY=CASE B.FACILITY
                           WHEN '' THEN (SELECT FACILITY
                                         FROM   CLDOCTOR
                                         WHERE  DOCTOR = B.PROVCODE)
                           ELSE B.FACILITY
                         END,
                C.PLNAME,
                C.PFNAME,
                C.PMNAME,
                C.DOB,
                C.SEX,
                Cast(Cast(A.DATEBACK AS VARCHAR(12)) AS DATETIME) AS 'DATEBACK',
                C.CITY,
                C.STATE,
                C.ZIP,
                C.HPHONE,
                C.WPHONE,
                A.SSNO,
                C.PTYPE,
                B.ACCOUNT,
                B.ORDERDATE
FROM   CLLAB A,
       CLORDER B,
       CLMASTER C
WHERE  A.COMPANY = B.COMPANY
       AND A.ORDER_ID = B.ID
       AND B.DEPT = 'LAB'
       AND B.ACCOUNT = C.ACCOUNT
       --AND DATEDIFF(MM, DATEBACK, GETDATE()) <=6 
       AND A.DATEBACK NOT IN (SELECT DISTINCT DATEBACK
                              FROM   CLLAB
                              WHERE  COMPANY = A.COMPANY
                                     AND DEPT = 'LAB'
                                     AND SSNO = A.SSNO
                                     --AND DATEDIFF(MM, DATEBACK, GETDATE()) <= 6
                                     AND OBSERVE_ID = 'REVIEWED'
                                     ) 

CLLAB contains 34,185,327 rows, CLORDER contains 13,455,600 rows, and CLMASTER contains 392,797 rows.

Do you see anything that jumps out as inherently wrong that could be cleaned up to help the query run quicker?

Edit 1:
There are indexes on CLLAB‘s ORDERID, SSNO, and a few other unused columns.
There are indexes on CLORDERS‘s PROV, ORDERDATE, and a few other unused columns.

The execution plan says Missing Index (Impace 48.1371): CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>] on [dbo].[CLORDER] ([DBO].[CLORDER][DEPT]) INCLUDE ([COMPANY],[ID],[PROV],[ORDERDATE],[ACCOUNT],[FACILITY],[PROVCODE]) which makes me think that there could be huge gains made by indexing CLLAB‘s DEPT

Answer :

First of all be aware though that when you are querying on tables that large you have to expect it to take some time.

That being said assuming you are using SQL 2008 or later you could try a filtered index on CLLab, however that makes several assumptions.

  1. The number of rows where the Dept = “Lab” and Observe_Id = “Reviewed” is a relatively small percentage of the total number of rows.
  2. This is an important enough query to warrant the extra write time to CLLAB for an additional index just for it.

If the answers to both of those are Yes (and of course SQL 2008 or higher) then try this.

CREATE NONCLUSTERED INDEX "IX_CLLAB_REVIEWED_LABS"
ON CLLAB (SSNO, COMPANY, DATEBACK)
WHERE DEPT = "LAB"
  AND OBSERVE_ID = "REVIEWED";

Really I would rather get rid of the subquery all together if possible but that would require more knowledge of the data and what you are trying to get out of the query.

Last but not least if you comment out the small subquery in the field list does it run significantly faster? If so that part of the query could be incorporated into the main query as a LEFT OUTER JOIN and might save you some time.

It would look like this.

-- In the field list
FACILITY = CASE WHEN B.FACILITY = '' THEN D.FACILITY ELSE B.FACILITY END

-- The FROM & WHERE Clauses
FROM CLLAB A
JOIN CLORDER B
    ON A.COMPANY = B.COMPANY
    AND A.ORDER_ID = B.ID
JOIN CLMASTER C
    ON B.ACCOUNT = C.ACCOUNT
LEFT OUTER JOIN CLDoctor D
    ON  B.PROVCODE = D.DOCTOR  
    AND B.FACILITY = ''          -- optionally. May be more efficient
                                 -- with different indexing
WHERE B.DEPT = 'LAB'
  AND A.DATEBACK NOT IN (.....)

The index is a great idea but may have a live performance impact, everything from the FROM should be something like:

   FROM
   CLORDER B inner join CLLAB A on A.ORDER_ID = B.ID
   inner join CLMASTER C on C.ACCOUNT = B.ACCOUNT
   right join (SELECT DATEBACK
      FROM   CLLAB
      AND DEPT = 'LAB',
      AND OBSERVE_ID = 'REVIEWED'
      ) D on D.DATEBACK = A.DATEBACK and AND D.SSNO = A.SSNO 
   WHERE  D.DATEBACK is null

Leave a Reply

Your email address will not be published. Required fields are marked *