Question :
There is a heavily queried stored procedure that uses a bad plan once in few days. As the number of execution grows, one of the bad plans causes a 100% CPU spike. In order to mitigate this, we have changed the procedure to do a recompile every time it executes. Now, the CPU usage stayed beyond 90% constant.
-
What is a better way to deal with this problem?
-
How would I do a rollback for an immediate solution?
Answer :
Without more detailed info it is difficult to be more specific regarding what should be done, but I think I could say the following:
-
An increase in CPU is to be expected when recompiling a frequently executed stored procedure upon each execution.
-
For the moment, if it were me, I would remove the
WITH RECOMPILE
because an occasional 100% CPU utilization that can be fixed with a recompile is preferable to a sustained 90% CPU utilization. -
If the bad cached plan / parameter-sniffing issue only happens “once every few days”, then that implies one or both of the following:
-
an infrequently used value that has a widely different amount of rows in the table/index than the more frequently used values
-
statistics getting out of date.
In which case:
-
Make sure you are updating statistics regularly, at least on the table(s) referenced in this stored procedure that are causing the CPU spike.
-
If you can figure out which value(s) and which statement(s) in the stored procedure cause the CPU spike, then it might be possible to place one or two statements in an
IF
block that make use ofOPTION (RECOMPILE)
(i.e. statement-level recompile, or at least handle in a slightly different way that is more appropriate for the infrequent, problem-causing value(s).
-
Something you could try is putting the following in a SQL Server Agent job that runs every few minutes. You just need to change the variable values above the dashed line, and then the database name references below that line, both marked by inline comments.
Please keep in mind that the CPU % calculated here is not the same that you would see in Task Manager, but it seems to be close enough (the value in Task Manager is a little higher).
DECLARE @MaxCPU INT = 75; -- only proceed if SQL Server CPU is at least this much
DECLARE @MaxCpuSeconds INT = 60,
@MaxElapsedSeconds INT = 60,
@ProcName [sysname] = N'{proc_name}',
@KillSession BIT = 0;
--- --- --- --- --- --- --- --- --- --- --- --- ---
DECLARE @DefaultCPU FLOAT,
@DefaultCPUbase FLOAT;
SELECT @DefaultCPU = ISNULL(@DefaultCPU,
CASE cntr.[counter_name]
WHEN N'CPU usage %' THEN cntr.[cntr_value]
ELSE NULL END),
@DefaultCPUbase = ISNULL(@DefaultCPUbase,
CASE cntr.[counter_name]
WHEN N'CPU usage % base' THEN cntr.[cntr_value]
ELSE NULL END)
FROM sys.dm_os_performance_counters cntr
WHERE cntr.[object_name] = N'SQLServer:Workload Group Stats'
AND cntr.[counter_name] LIKE N'CPU usage [%]%'
AND cntr.[instance_name] = N'default';
SELECT @DefaultCPU, @DefaultCPUbase, 100 * (@DefaultCPU / @DefaultCPUbase) AS [Default];
IF ((100 * (@DefaultCPU / @DefaultCPUbase)) >= @MaxCPU)
BEGIN
DECLARE @SessionID INT;
SELECT @SessionID = req.[session_id]
FROM sys.dm_exec_requests req
CROSS APPLY sys.dm_exec_sql_text(req.[sql_handle]) txt
WHERE OBJECT_NAME(txt.[objectid], txt.[dbid]) = @ProcName
AND (req.[cpu_time] > (@MaxCpuSeconds * 1000)
OR req.[total_elapsed_time] > (@MaxElapsedSeconds * 1000))
AND txt.[dbid] = DB_ID(N'{db_name_for_proc}'); -- put your DB name here
IF (@SessionID IS NOT NULL)
BEGIN
EXEC {db_name_for_proc}.sys.sp_recompile @ProcName; -- put your DB name here
PRINT 'Recompiled ' + @ProcName;
IF (@KillSession = 1)
BEGIN
DECLARE @SQL NVARCHAR(4000);
SET @SQL = N'KILL ' + CONVERT(NVARCHAR(50), @SessionID);
PRINT @SQL;
EXEC (@SQL);
END;
END; -- IF (@SessionID IS NOT NULL)
END; -- IF ((100 * (@DefaultCPU / @DefaultCPUbase)) >= @MaxCPU)