Optimising a query of an SSRS dataset [closed]

Posted on

Question :

Consider the query given below:

SELECT * FROM ((SELECT          c.cust_id ,
                c.username ,
                REPLACE(qs.seg_type_ref_key_02, 'N/A', 'Non Vip') casino_group,
                REPLACE(qs.seg_type_ref_key_03, 'N/A', 'Non Vip') bingo_group,
                REPLACE(qs.seg_type_ref_key_04, 'N/A', 'Non Vip') games_group,
                REPLACE(qs.seg_type_ref_key_12, 'N/A', 'Non Vip') poker_group,
                REPLACE(qs.seg_type_ref_key_01, 'N/A', 'Non ViP') sportsbook_group,
                c.country
                , c.contactable
                , c.email
                , c.dob
                , c.[status]
                , c.first_name
                , c.last_name
                , c.[master]
                , c.[language]
                , c.gender
FROM  warehouse.dbo.dim_customer c (nolock)

INNER JOIN warehouse . dbo . dim_segmentationcodehistory ( nolock )sc
 ON sc . cust_id = c . cust_id
INNER JOIN warehouse . dbo . q_dim_segments qs ( nolock )
 ON sc . seg_code_ref_key = qs . seg_code_ref_key
WHERE SC.active=1 and qs.seg_type_ref_key_04 <>'N/A' AND c.active = 1
and (qs.seg_type_ref_key_02 <> 'Prospect' and qs.seg_type_ref_key_03 <> 'Prospect' and qs.seg_type_ref_key_04 <> 'Prospect' and qs.seg_type_ref_key_12 <> 'Prospect' and qs.seg_type_ref_key_01 <> 'Prospect')) A
LEFT JOIN( SELECT   c.cust_id cust_dup,
                    SUM(fc.turnover) AS Turnover_GBP,
                    SUM(fc.grosswin) AS GrossWin_GBP,
                    SUM(fc.chip_purch_amount_gbp) AS chip_purch_amount_gbp
FROM   warehouse.dbo.fact_games fc (nolock)
INNER JOIN warehouse.dbo.dim_date d (nolock)
 ON d.date_key = fc.date_ref_key
INNER JOIN warehouse.dbo.dim_customer c (nolock)
 ON c.cust_ref_key = fc.cust_ref_key
INNER JOIN warehouse.dbo.dim_gamesgame gg(nolock)
 ON gg.games_game_ref_key = fc.game_ref_key
WHERE d.[date] between getdate()- 10 AND getdate()-9
            AND gg.Game_Group_Description <> 'Bingo'
GROUP BY c.cust_id )B
ON A.cust_id = B.cust_dup)

This query takes a little more than an hour. However, I require that this completes in as little time as possible.

Below is the level to which I have been able to optimize it:

IF OBJECT_ID('tempdb..#temp_shash_A') IS NOT NULL
DROP TABLE #temp_shash_A

IF OBJECT_ID('tempdb..#temp_shash_B') IS NOT NULL
DROP TABLE #temp_shash_B

-- A
(SELECT          c.cust_id ,
            c.username ,
            REPLACE(qs.seg_type_ref_key_02, 'N/A', 'Non Vip')     casino_group,
            REPLACE(qs.seg_type_ref_key_03, 'N/A', 'Non Vip')     bingo_group,
            REPLACE(qs.seg_type_ref_key_04, 'N/A', 'Non Vip')     games_group,
            REPLACE(qs.seg_type_ref_key_12, 'N/A', 'Non Vip') poker_group,
            REPLACE(qs.seg_type_ref_key_01, 'N/A', 'Non ViP') sportsbook_group,
            c.country
            , c.contactable
            , c.email
            , c.dob
            , c.[status]
            , c.first_name
            , c.last_name
            , c.[master]
            , c.[language]
            , c.gender
INTO #temp_shash_A              
FROM  warehouse.dbo.dim_customer c (nolock)

INNER JOIN warehouse . dbo . dim_segmentationcodehistory ( nolock )sc
ON sc . cust_id = c . cust_id
INNER JOIN warehouse . dbo . q_dim_segments qs ( nolock )
ON sc . seg_code_ref_key = qs . seg_code_ref_key
WHERE SC.active=1 and qs.seg_type_ref_key_04 <>'N/A'  AND c.active = 1
and (qs.seg_type_ref_key_02 <> 'Prospect' and qs.seg_type_ref_key_03 <> 'Prospect' and      qs.seg_type_ref_key_04 <> 'Prospect' and qs.seg_type_ref_key_12 <> 'Prospect' and      qs.seg_type_ref_key_01 <> 'Prospect')
)

create clustered index S_1 on #temp_shash_A (cust_id)

-- B

( SELECT   c.cust_id cust_dup,
                SUM(fc.turnover) AS Turnover_GBP,
                SUM(fc.grosswin) AS GrossWin_GBP,
                SUM(fc.chip_purch_amount_gbp) AS chip_purch_amount_gbp
INTO #temp_shash_B
FROM   warehouse.dbo.fact_games fc (nolock)
INNER JOIN warehouse.dbo.dim_date d (nolock)
ON d.date_key = fc.date_ref_key
INNER JOIN warehouse.dbo.dim_customer c (nolock)
ON c.cust_ref_key = fc.cust_ref_key
INNER JOIN warehouse.dbo.dim_gamesgame gg(nolock)
ON gg.games_game_ref_key = fc.game_ref_key
WHERE d.[date] between getdate()- 10 AND getdate()-9
        AND gg.Game_Group_Description <> 'Bingo'
GROUP BY c.cust_id )


create clustered index S_2 on #temp_shash_B (cust_dup)


SELECT * FROM #temp_shash_A A
LEFT JOIN #temp_shash_B B
ON A.cust_id = B.cust_dup

This took just around 5-6 minutes when ran initially. However, it took around 35 minutes when ran today. Can anyone suggest a way for me to optimize this? Any help appreciated.

PS: I’m working on SQL Server 2008 R2 DB. The query is a dataset query for an SSRS report.

Answer :

from some research I have found the <> or does not equal takes longer to run than = operations

in this answer to SQL Server “<>” operator is very slow compared to “=” on table with a few million rows the Poster goes into Detail about the <> operator and why it takes longer than the = operator.

the Poster suggests a LEFT JOIN for the OP’s Query that would make it so that they shouldn’t have to use the <> operator.

if you could do something similar to weed out the records that you don’t want using a LEFT JOIN or something of the sort. I think the post that I linked to should set you in the right direction.

P.S.

it sounds like the same thing happens when you use NOT IN

Leave a Reply

Your email address will not be published. Required fields are marked *