Long Running Query Optimization

Posted on

Question :

I need to optimize this query as this query is taking minutes of time in execution.
Below is the query. i’m giving date parameters and shift id is optional.
as query is returning same in and out from attendance table based on shift ID.

SELECT E.EmpCode,
        E.EmpName,
        e.DesignationName,
        D.DeptName, 
        CONVERT(Date, A.TimeIn) AS Date, a.TimeIn,a.TimeOut, 
        DATEPART(WEEKDAY, CONVERT(date, A.TimeIn)) AS Day,
        sh.ShiftID,CONVERT(TIME,sh.TimeIn) AS STimeIN,
        CONVERT(TIME,sh.TimeOut) AS STimeOut,sh.ShiftName,
        CASE WHEN CONVERT(TIME,a.TimeIn) < CONVERT(TIME,DATEADD(HOUR,4,sh.TimeIn)) 
        THEN 'MarkOut' ELSE 'MarkIN' END AS Status
FROM  dbo.Attendance AS A 
INNER JOIN dbo.Employee E ON A.EmpID = E.EmpID
INNER JOIN  dbo.Department D ON E.DeptID = d.DeptID
INNER JOIN dbo.Shifts sh 
    ON sh.ShiftID = (SELECT s.ShiftID FROM dbo.Shifts s , employee emp 
                                      WHERE s.ShiftID = isnull((select ShiftID from EmpWiseShiftAllot 
                                                                where CONVERT(date,DateFrom) = CONVERT(date, A.TimeIn) 
                                                                and IsActive =1 and IsCancel =0 and empid = E.empid), 
                                                                (select top 1 ShiftID from DeptStrengthLimit L where L.DeptID = E.DeptID and L.CompanyID = 2 AND L.IsCurrent = 1            
                                                                AND CONVERT(date, A.TimeIn)  
                                                                between L.WithEffectFrom 
                                                                and CONVERT(date, A.TimeIn) order by L.shiftid desc)) 
                                            and empid = E.empid )
WHERE (A.TimeIn BETWEEN '2020-11-17'  AND '2020-11-21' ) 
AND (DATEDIFF(MINUTE, A.TimeIn, A.TimeOut) = 0) 
AND (A.EmpID NOT IN (SELECT M.EmpID FROM dbo.LeaveApplicationD AS D 
                                    INNER JOIN  dbo.LeaveApplication AS M ON D.LeaveMID = M.EmpLeaveID
                                    WHERE        (D.IsActive = 1) AND (D.IsCancel = 0) 
                                    AND (M.IsActive = 1) AND (M.IsCancel = 0) 
                                    AND (D.Date = CONVERT(date, A.TimeIn) ))) 
AND (DATEPART(WEEKDAY, CONVERT(date, A.TimeIn)) <> 1)
AND (E.CompanyID = 2) AND (E.IsActive = 1) AND (E.IsCancel = 0) 
AND sh.ShiftID = 5 
ORDER BY date;

Answer :

Similarly to as Randi pointed out, I would start looking into better ways to re-write the following parts of your code that are all possible culprits for your performance issues. (If you add your query’s Execution Plan, then we can zone in on the biggest bottlenecks and provide more specific advice on how to fix them, and I can update my answer accordingly.)

  1. You should try to avoid or limit stuffing subqueries into your JOIN and WHERE clauses. This is anti-pattern to a relational design and I would suggest this as your first place to look into to solve your performance issues. You should be able to re-write all those subqueries as JOINs to your main dataset, or at least materialize them to a temp table or variable first before using them in your main query.

  2. You should also limit the functions (e.g. ISNULL, DATEDIFF, DATEPART) you use in your JOIN and WHERE clauses as this can lead to cardinality estimate issues and result in poor performance. While this may be harder to do for all of them, it’s not always the end of the world if the only logical solution is to use a function in the JOIN and WHERE clauses. (Seeing your Execution Plan would help determine if you do have cardinality estimate issues though.)

  3. You should check that you’re getting the index usage you’re expecting (e.g. the proper indexes are being used, and index seeks are occurring appropriately), which is based on the predicates you’re using in your JOIN and WHERE clauses. This again can be determined by the Execution Plan.

  4. The ORDER BY clause can be a performance heavy operation when there’s a lot of data (especially if the field being sorted on is not part of any of the main indexes being used to fetch the data). Though this is probably least of your concerns as far as I can tell from looking at your query.

Leave a Reply

Your email address will not be published. Required fields are marked *