Looking for a set-based solution for a rolling 30-day period

Posted on

Question :

I have a stored procedure (MS SQL Server 2008 R2) that needs to see if 10 points/sales fall within a 30-day rolling window. So far, I have a solution that works, but it’s not a “proper” solution as it isn’t set based. I use while loops to process what’s going on. Looking at an agent’s points, I take the date of the first point, and see if 9 more fall within a 30-day window, if so, I mark them as processed and used and proceed to the next one. I’m thinking there has to be a “better” solution that is set-based.

To see what I have, I set up a SQL Fiddle. It shows at the end that we have processed 3 agents and there are 4 incentives to be given. Is there a better way to do this without using while loops that’s more set-based?

SQL Fiddle

Answer :

I tried to put together an approach that would work on 2008 and is more set-based. Here is what I came up with.

It did end up being more complex than I had hoped, but it might be an interesting approach for you to benchmark against your current approach on larger data sets. For what it’s worth, this script runs in about 15ms for the provided data set on my machine (vs. 75ms for the original script).

As others have mentioned, there are probably other better approaches if you were able to use window functions in 2012+ or maybe a natively compiled procedure in 2014. But it can be fun to think about how to do things without the newer features sometimes!

http://sqlfiddle.com/#!3/ad2be/7

-- Assign each point a sequential rank within each agent's history
SELECT p.internalID, ROW_NUMBER() OVER (PARTITION BY internalID ORDER BY date) AS recordRank, date
INTO #orderedPoints
FROM points p
-- Sort the data for efficient lookup of potential incentives
ALTER TABLE #orderedPoints
ADD UNIQUE CLUSTERED (internalId, recordRank)

-- Identify a potential incentive for any point that has 9+ points in the following 30 days
SELECT s.internalId, s.recordRank, ROW_NUMBER() OVER (PARTITION BY s.internalId ORDER BY s.recordRank) AS potentialIncentiveRank
INTO #potentialIncentives
FROM #orderedPoints s
JOIN #orderedPoints e
    ON e.internalId = s.internalId
    AND e.recordRank = s.recordRank + 9
    AND e.date < DATEADD(dd, 30, s.date)
-- Sort the data to allow for efficient retrieval of subsequent incentives
ALTER TABLE #potentialIncentives
ADD UNIQUE CLUSTERED (internalId, recordRank)

-- A table to hold the incentives achieved
CREATE TABLE #incentives (internalId INT NOT NULL, recordRank INT NOT NULL)
-- A couple transient tables to hold the current "fringe" of incentives that were just inserted
CREATE TABLE #newlyProcessedIncentives (internalId INT NOT NULL, recordRank INT NOT NULL)
CREATE TABLE #newlyProcessedIncentives_forFromClause (internalId INT NOT NULL, recordRank INT NOT NULL)

-- Identify the first incentive for each agent
-- Note that TOP clauses and aggregate functions are not allowed in the recursive portion of a CTE
-- If that were allowed, this could serve as the anchor of a recursive CTE and the loop below would be the recursive portion
INSERT INTO #incentives (internalId, recordRank)
OUTPUT inserted.internalId, inserted.recordRank INTO #newlyProcessedIncentives (internalId, recordRank)
SELECT internalId, recordRank
FROM #potentialIncentives
WHERE potentialIncentiveRank = 1

-- Identify the next incentive for each agent, stopping when no further incentives are identified
WHILE EXISTS (SELECT TOP 1 * FROM #newlyProcessedIncentives)
BEGIN
    -- Transfer the most recently identified incentives, so that we can truncate the table to capture the next iteration of incentives
    TRUNCATE TABLE #newlyProcessedIncentives_forFromClause
    INSERT INTO #newlyProcessedIncentives_forFromClause (internalId, recordRank) SELECT internalId, recordRank FROM #newlyProcessedIncentives
    TRUNCATE TABLE #newlyProcessedIncentives

    -- Identify the next incentive for each agent
    INSERT INTO #incentives (internalId, recordRank)
    OUTPUT inserted.internalId, inserted.recordRank INTO #newlyProcessedIncentives (internalId, recordRank)
    SELECT nextIncentive.internalId, nextIncentive.recordRank
    FROM #newlyProcessedIncentives_forFromClause f
    CROSS APPLY (
        SELECT TOP 1 p.*
        FROM #potentialIncentives p
        WHERE p.internalId = f.internalId
            AND p.recordRank >= f.recordRank + 10 -- A new incentive can only start after all 10 points from the previous incentive
        ORDER BY p.recordRank
    ) nextIncentive 
END

-- Present the final results in the same format as the original implementation
SELECT a.internalId, a.points, i.incentives 
FROM (  -- Tabulate points
    SELECT internalId, MAX(recordRank) AS points
    FROM #orderedPoints
    GROUP BY internalId
) a
JOIN (  -- Tabulate incentives achieved
    SELECT internalId, COUNT(*) AS incentives
    FROM #incentives
    GROUP BY internalId
) i
    ON i.internalId = a.internalId

Leave a Reply

Your email address will not be published. Required fields are marked *