Oracle query optimization

Posted on

Question :

We have a query that runs on a view and its execution time is about 6 min.

SELECT COUNT(*) 
FROM someschema.item_total_violations 
WHERE channel_id = 122507833
AND item_date >= TIMESTAMP'2015-02-01 00:00:00';

Its not a materialized view so there is no index on the item_date.

The base query of the view is below:

SELECT i.id
    ,i.title
    ,i.body
    ,i.description
    ,i.item_date
    ,i.created_date
    ,i.channel_id
    ,i.community_id
    ,i.person_id
    ,i.message_type
    ,ch.client_id
    ,upper(p.NAME) AS person_name_upper
    ,upper(ch.NAME) AS channel_name_upper
    ,p.NAME AS person_name
    ,ch.NAME AS channel_name
    ,ch.connector_type AS channel_type
    ,cm.NAME AS community_name
    ,cm.network_id
    ,nvl(ct2.cnt, 0) + nvl(ct4.cnt, 0) AS total_violations_count
    ,nvl(ct1.cnt, 0) + nvl(ct3.cnt, 0) AS quarantined_violations_count
    ,nvl(ct5.cnt, 0) AS quarantined_attributes_count
    ,nvl(lh_info.cnt_lh, 0) AS legal_holds_count
    ,nvl(lh_info.cnt_lh_w_no_date, 0) AS legal_holds_count_w_no_date
    ,nvl(lh_info.max_hold_until, to_date('1970-01-01', 'yyyy-mm-dd')) AS legal_hold_max_hold_until
FROM item i
    ,channel ch
    ,person p
    ,community cm
    ,(
        SELECT /*+ NO_MERGE */ pv.item_id
            ,count(*) AS cnt
        FROM policy_violation pv
        WHERE pv.item_id IS NOT NULL
            AND pv.quarantine_status = 'QUARANTINED'
        GROUP BY pv.item_id
        ) ct1
    ,(
        SELECT /*+ NO_MERGE */ pv.item_id
            ,count(*) AS cnt
        FROM policy_violation pv
        WHERE pv.item_id IS NOT NULL
        GROUP BY pv.item_id
        ) ct2
    ,(
        SELECT /*+ NO_MERGE */ aa.item_id AS item_id
            ,COUNT(*) AS cnt
        FROM item_attachment aa
            ,policy_violation pv
        WHERE aa.id = pv.item_attachment_id
            AND pv.quarantine_status = 'QUARANTINED'
        GROUP BY aa.item_id
        ) ct3
    ,(
        SELECT /*+ NO_MERGE */ aa.item_id AS item_id
            ,COUNT(*) AS cnt
        FROM item_attachment aa
            ,policy_violation pv
        WHERE aa.id = pv.item_attachment_id
            AND pv.item_id IS NULL
        GROUP BY aa.item_id
        ) ct4
    ,(
        SELECT /*+ NO_MERGE */ pv.item_id
            ,count(DISTINCT (pv.item_attribute_id)) AS cnt
        FROM policy_violation pv
        WHERE pv.item_id IS NOT NULL
            AND pv.item_attribute_id IS NOT NULL
            AND pv.quarantine_status = 'QUARANTINED'
        GROUP BY pv.item_id
        ) ct5
    ,(
        SELECT lh_rel.item_id AS item_id
            ,count(DISTINCT (lh.id)) AS cnt_lh
            ,max(lh.HOLD_UNTIL) AS max_hold_until
            ,sum(CASE 
                    WHEN lh.hold_until IS NULL
                        THEN 1
                    ELSE 0
                    END) AS cnt_lh_w_no_date
        FROM LEGAL_HOLD_RELATION lh_rel
            ,LEGAL_HOLD lh
        WHERE lh.id = lh_rel.legal_hold_id
            AND lh.STATUS = 'A'
        GROUP BY lh_rel.item_id
        ) lh_info
WHERE ch.STATUS = 'A'
    AND i.channel_id = ch.id
    AND i.person_id = p.id
    AND i.community_id = cm.id(+)
    AND i.id = ct1.item_id(+)
    AND i.id = ct2.item_id(+)
    AND i.id = ct3.item_id(+)
    AND i.id = ct4.item_id(+)
    AND i.id = ct5.item_id(+)
    AND i.id = lh_info.item_id(+)

The item table is the biggest and we have tried partitioning the table in chaannel_id but that didn’t help our query. I understand that there is no index on item_date in original query and it has effect on performance. Apart from trying out materialized view, is there something that I can improve on the base query of the view?

Answer :

Why do you use this view at all for this query?

The view collects violations and other stuff, but your query does not care about those at all, just the number of items. The view lists all items regardless of these because of the outer joins, so you basically perform a lot of unnecessary extra work to collect violations and other stuff (the NO_MERGE hint makes this even worse), then you throw it away and count the items based on a totally different criteria.

The query should be something like this:

select
  count(*) 
from
  item i
  join channel ch on (i.channel_id = ch.id)
  join person p on (i.person_id = p.id)
where
  ch.status = 'A'
  and i.channel_id = 122507833
  and i.item_date >= timestamp'2015-02-01 00:00:00';

Additionally, you don’t need materialized views for indexing a column. Just create the index on the column(s) of the base table(s).

On top of the must-have indexes on primary and foreign keys, another possible candidate is item_date column depending on your data distribution.

You can improve efficiency of your query by (for example) replacing:

    ,(
        SELECT /*+ NO_MERGE */ pv.item_id
            ,count(*) AS cnt
        FROM policy_violation pv
        WHERE pv.item_id IS NOT NULL
            AND pv.quarantine_status = 'QUARANTINED'
        GROUP BY pv.item_id
        ) ct1
    ,(
        SELECT /*+ NO_MERGE */ pv.item_id
            ,count(*) AS cnt
        FROM policy_violation pv
        WHERE pv.item_id IS NOT NULL
        GROUP BY pv.item_id
        ) ct2
    ,

with:

    ,(
        SELECT /*+ NO_MERGE */ pv.item_id
            ,count(*) AS cnt
            ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined
        FROM policy_violation pv
        WHERE pv.item_id IS NOT NULL
        GROUP BY pv.item_id
        ) ct2
    ,

This rewrite can be applied for other sub-selects in your query as well. That should give us:

 [...]
 ,nvl(ct2.cnt_quarantined, 0) + nvl(ct4.cnt_quarantined, 0) AS quarantined_violations_count
 [...]
,(
    SELECT /*+ NO_MERGE */ pv.item_id
        ,count(*) AS cnt
        ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined 
    FROM policy_violation pv
    WHERE pv.item_id IS NOT NULL
    GROUP BY pv.item_id
    ) ct2
,(
    SELECT /*+ NO_MERGE */ aa.item_id AS item_id
        ,COUNT(*) AS cnt
        ,count(case when pv.quarantine_status = 'QUARANTINED' then 1 end) as cnt_quarantined   
    FROM item_attachment aa
        ,policy_violation pv
    WHERE aa.id = pv.item_attachment_id
        AND pv.item_id IS NULL
    GROUP BY aa.item_id
    ) ct4

You might be able to further merge these two rewrite with the use of a left join, but it is difficult to say without knowing more details of you model. Anyhow, this rewrite should reduce the amount of work that your query has to do.

As an advise I would suggest that you for such large queries use ansi joins since it makes it easier to read and understand the query

Leave a Reply

Your email address will not be published. Required fields are marked *