Optimize delete query in SQL Server 2008 R2 SP1

Question :

I have a delete query which takes around 6 minutes to complete. I am looking for ways to optimise it and shorten its duration. Below is its anonymized (due to security constraints) estimated execution plan.

Execution plan:

(anonymized estimated execution plan image)

Query:

DELETE TOP(10000) Object1
   FROM Object1
   LEFT JOIN Object2
   ON 
    Object1.TID = Object2.TID
   WHERE 
    Object2.TID IS NULL

Record count details:

  1. sp_spaceused Object1 — 124,164,707 rows
  2. sp_spaceused Object2 — 27,799,877 rows

Index details:

1) Object1

  • CL index on TID DESC, TIndex DESC
  • No other indexes apart from the above

2) Object2

  • CL index on TID DESC
  • NC index on [NodeID] ASC
  • NC index on [DateTime] ASC
  • NC index on [TimeStamp] ASC
  • NC index on [TType] ASC

SQL Server Environment details:

  • SQL Server 2008 R2 SP1
  • Edition: Enterprise
  • Version: 10.50.2500.0

DDL:

--Object1
CREATE TABLE [dbo].[Object1](
    [TID] [bigint] NOT NULL,
    [TIndex] [tinyint] NOT NULL,
    [OID] [varchar](1000) NOT NULL,
    [OIDName] [varchar](100) NOT NULL,
    [OIDValue_ANSI] [varchar](1000) NOT NULL CONSTRAINT [DF_Object1_OIDValue_ANSI]  DEFAULT (''),
    [RawValue_ANSI] [varchar](1000) NOT NULL CONSTRAINT [DF_Object1_RawValue_ANSI]  DEFAULT (''),
    [OIDValue_Unicode] [nvarchar](1000) NULL,
    [RawValue_Unicode] [nvarchar](1000) NULL,
    [OIDValue]  AS (isnull([OIDValue_Unicode],[OIDValue_ANSI])),
    [RawValue]  AS (isnull([RawValue_Unicode],[RawValue_ANSI])),
 CONSTRAINT [PK_Object1] PRIMARY KEY CLUSTERED 
(
    [TID] DESC,
    [TIndex] DESC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

--Object2
    CREATE TABLE [dbo].[Object2](
        [TID] [bigint] IDENTITY(97196339,1) NOT NULL,
        [EID] [int] NOT NULL,
        [DateTime] [datetime] NOT NULL CONSTRAINT [DF_Object2_RecivedDateTime]  DEFAULT (getdate()),
        [IP] [varchar](50) NOT NULL CONSTRAINT [DF_Object2_IP]  DEFAULT ('0.0.0.0'),
        [C_ANSI] [varchar](255) NOT NULL CONSTRAINT [DF_Object2_C_ANSI]  DEFAULT (''),
        [C_Unicode] [nvarchar](255) NULL,
        [Community]  AS (isnull([C_Unicode],[C_ANSI])),
        [Tag_ANSI] [varchar](100) NOT NULL CONSTRAINT [DF_Object2_Tag_ANSI]  DEFAULT (''),
        [Tag_Unicode] [nvarchar](100) NULL,
        [Tag]  AS (isnull([Tag_Unicode],[Tag_ANSI])),
        [Acknowledged] [tinyint] NOT NULL CONSTRAINT [DF_Object2_Acknowledged]  DEFAULT ((0)),
        [Hname_ANSI] [varchar](255) NOT NULL CONSTRAINT [DF_Object2_Hname_ANSI]  DEFAULT (''),
        [Hname_Unicode] [nvarchar](255) NULL,
        [Hostname]  AS (isnull([Hname_Unicode],[Hname_ANSI])),
        [NodeID] [bigint] NOT NULL CONSTRAINT [DF_Object2_NodeID]  DEFAULT ((0)),
        [TType] [varchar](100) NOT NULL CONSTRAINT [DF_Object2_TType]  DEFAULT (''),
        [ColorCode] [int] NULL,
        [TimeStamp] [timestamp] NOT NULL,
     CONSTRAINT [PK_Object2] PRIMARY KEY CLUSTERED 
    (
        [TID] DESC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

I tried replacing the LEFT JOIN with NOT IN and NOT EXISTS, but it didn’t help much. NOT IN actually performed worst, and NOT EXISTS was slightly better (about 5.8 minutes), but still not by much. For all these tests I used the select query below, since I can’t run the actual delete:

SELECT Object1.TID
       FROM Object1
       LEFT JOIN Object2
       ON 
        Object1.TID = Object2.TID
       WHERE 
        Object2.TID IS NULL
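For reference, the NOT EXISTS variant I tested was along these lines (reconstructed; the exact query I ran may have differed slightly):

```sql
-- NOT EXISTS form of the same anti-join
SELECT o1.TID
FROM Object1 AS o1
WHERE NOT EXISTS (
    SELECT 1
    FROM Object2 AS o2
    WHERE o2.TID = o1.TID
);
```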

I would appreciate it if someone could provide thoughts/suggestions on it. If you need more details, I’ll provide as much as I can within the security limitations I have.

Additional info:

The delete runs daily as part of a SQL Agent job. I am doing naive batching as described in Take Care When Scripting Batches (by Michael J. Swart), looping until all of the qualifying rows are deleted.
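The loop is essentially the naive pattern from that article (a sketch; the actual job script was not shared, and the variable name is mine):

```sql
-- Naive batching: repeat the TOP (10000) delete until no rows qualify.
DECLARE @rc INT = 1;

WHILE (@rc > 0)
BEGIN
    DELETE TOP (10000) Object1
    FROM Object1
    LEFT JOIN Object2
        ON Object1.TID = Object2.TID
    WHERE Object2.TID IS NULL;

    SET @rc = @@ROWCOUNT;
END
```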

The number of rows to delete depends on the data at the time the SQL job runs. Sorry, but I don’t have the exact number, as it varies daily and I never recorded it.

The following query takes 6 min 30s:

SELECT COUNT(*)
FROM Object1
LEFT JOIN Object2
    ON Object2.TID = Object1.TID
WHERE
    Object2.TID IS NULL;

The result is 0 rows (since we delete daily, there usually aren’t many qualifying records).

The following query returns 123,529,024 records:

SELECT COUNT(*)
FROM Object1
LEFT JOIN Object2
    ON Object1.TID = Object2.TID

Answer :

Using Michael J. Swart’s Take Care When Scripting Batches code as a base, I’d like to offer another possible solution which uses a temp table to track the next set of rows to be deleted. The code keeps track of the max TID value from the previous delete and utilizes that information to keep from re-scanning the entire table over and over. I’d be curious to know if it would work for your situation.

--------------------------
--initial demo data set up
--------------------------
SET NOCOUNT ON
IF OBJECT_ID('tempdb..#Object1') IS NOT NULL
    DROP TABLE #Object1
IF OBJECT_ID('tempdb..#Object2') IS NOT NULL
    DROP TABLE #Object2
CREATE TABLE #Object1 (TID BIGINT)
CREATE TABLE #Object2 (TID BIGINT)

DECLARE @UpperLimit INT;

SET @UpperLimit = 1000;

WITH n
AS (
    SELECT x = ROW_NUMBER() OVER (
            ORDER BY s1.[object_id]
            )
    FROM sys.all_objects AS s1
    CROSS JOIN sys.all_objects AS s2
    CROSS JOIN sys.all_objects AS s3
    )
--Populate Object1 with some test data
INSERT INTO #Object1 (tid)
SELECT x
FROM n
WHERE x BETWEEN 1
        AND @UpperLimit;

SELECT *
FROM #Object1

--Populate Object2 with some test data
insert into #Object2(TID) values(10),(20),(30)


------------------------------
--the real delete code
------------------------------
DECLARE @LargestKeyProcessed BIGINT = -1
    ,@NextBatchMax INT
    ,@RC INT = 1;

IF OBJECT_ID('tempdb..#Object1RowsToDelete') IS NOT NULL
    DROP TABLE #Object1RowsToDelete

--Create a temp table to hold the TID values for
--Object1 rows where there does not exist a corresponding row
--on Object2 for TID
CREATE TABLE #Object1RowsToDelete (TID BIGINT NOT NULL)

--Careful batch loop until all intended rows are deleted
WHILE (@RC > 0)
BEGIN
    --Truncate the temp table
    TRUNCATE TABLE #Object1RowsToDelete

    --Populate the temp table with the next set of TID's 
    --that need to be deleted
    INSERT INTO #Object1RowsToDelete
    SELECT TOP (10) o1.TID      --Alter TOP as needed
    FROM #Object1 o1
    LEFT JOIN #Object2 o2 ON O1.TID = O2.TID
    WHERE o2.TID IS NULL
        AND O1.TID > @LargestKeyProcessed
    ORDER BY O1.TID ASC;

    --Delete from Object1 by joining against the temp table
    DELETE O1
    FROM #Object1 o1
    JOIN #Object1RowsToDelete d ON d.TID = o1.TID

    SET @RC = @@ROWCOUNT;

    --The max TID in the temp table is where we need to start on 
    --the next iteration of the while loop
    SET @LargestKeyProcessed = (
            SELECT max(TID)
            FROM #Object1RowsToDelete
            )
    PRINT @LargestKeyProcessed;
END

--Select the remaining rows from Object1
--In this example, only 10, 20 and 30 remain because
--those TID's were on Object2
SELECT *
FROM #Object1

I feel like the fact that you’re searching for rows to delete using the first column in the clustered index is an important distinction from the scenario in this article and might merit a slightly different approach. http://michaeljswart.com/2014/09/take-care-when-scripting-batches/

I propose first discovering all the values for TID that need to be deleted from Object1 and putting these in a temporary table. Then you’ll never need to read Object2 again and the delete from Object1 won’t be held up by any other operation.

-- First, let's prepare for the delete by identifying which keys will be deleted.
CREATE TABLE #Object1_TID_To_Delete (TID bigint NOT NULL PRIMARY KEY CLUSTERED);

-- By storing the keys to delete in a temporary table, we're reading from Object2 only a single time.
-- This should help reduce locking.
INSERT INTO #Object1_TID_To_Delete
SELECT TID 
  FROM Object1 AS o1
  WHERE NOT EXISTS (SELECT 1 FROM Object2 AS o2 WHERE o1.TID = o2.TID)
  GROUP BY TID;

WHILE EXISTS (SELECT 1 FROM #Object1_TID_To_Delete)
BEGIN
    -- Limit the delete to the first 1000 rows so that we're not locking Object1 for too long.
    -- The ORDER BY is so we can be absolutely sure which 1000 rows were picked.
    -- Matching the clustered index of the temp table with Object1 means we should obtain 
    -- a merge join in our statement plan.

    WITH del AS (
        SELECT TOP 1000 TID FROM #Object1_TID_To_Delete ORDER BY TID)
    DELETE 
      FROM o1
      FROM Object1 AS o1
      JOIN del ON o1.TID = del.TID;

    WITH del AS (
        SELECT TOP 1000 TID FROM #Object1_TID_To_Delete ORDER BY TID)
    DELETE 
       FROM del;
END

DROP TABLE #Object1_TID_To_Delete;

With the approach below, the elapsed time should drop drastically. The inner subquery drives the whole operation: thanks to the clustered index on the primary key, it is very fast to find the first 10,000 candidate rows for deletion, and as the row count decreases after each delete, the inner subquery gets faster. You may have to find a happy medium with the SET ROWCOUNT value; make sure the same value is reflected in the SELECT TOP subquery.

SET ROWCOUNT 10000;

WHILE (1 = 1)
BEGIN
    DELETE x
    FROM Object1 AS x
    JOIN
    (
        -- TOP 10000 is perhaps redundant here because of the ROWCOUNT limit
        SELECT TOP 10000 Object1.TID
        FROM Object1
        LEFT JOIN Object2
            ON Object1.TID = Object2.TID
        WHERE Object2.TID IS NULL
    ) AS y
        ON x.TID = y.TID;

    IF @@ROWCOUNT = 0
        BREAK;
END
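One caveat: SET ROWCOUNT is deprecated for limiting INSERT/UPDATE/DELETE statements (Microsoft recommends the TOP clause instead), so a more future-proof version of the same loop might look like this sketch:

```sql
-- Same batching idea, but using DELETE TOP instead of SET ROWCOUNT
WHILE (1 = 1)
BEGIN
    DELETE TOP (10000) x
    FROM Object1 AS x
    WHERE NOT EXISTS (
        SELECT 1
        FROM Object2 AS o2
        WHERE o2.TID = x.TID
    );

    IF @@ROWCOUNT = 0
        BREAK;
END
```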
