Performance issue when grouping by multiple columns of the same table

Question :

I am dealing with a large amount of data: approximately 1 million rows with hundreds of columns.
I have a stored procedure that performs some calculations over this data, grouped by column1.

The same calculation is then performed over this data grouped by column1, column2.

I can optimise the whole operation by creating indexes on column1 and column2.
But how can I achieve the same performance if the grouping columns are dynamic and can extend up to the nth column?

Example:
For n columns, the GROUP BY operations would be as follows:

Operation 1 : group by column1
Operation 2 : group by column1, column2

Operation n : group by column1, column2, … up to columnN

Answer :

If you need all those aggregates at once, use ROLLUP or GROUPING SETS to calculate multiple aggregate grains in a single scan.
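As a rough sketch of what that could look like for three grouping columns, assuming SQL Server: the table name dbo.BigTable, the measure column amount, and the SUM aggregate are all hypothetical placeholders for your own table and calculation.

```sql
-- Hypothetical table (dbo.BigTable) and measure (amount), for illustration only.
-- One statement returns the three grains from the question:
--   (column1), (column1, column2), (column1, column2, column3)
SELECT
    column1,
    column2,
    column3,
    SUM(amount) AS total_amount,
    -- GROUPING_ID tells the grains apart:
    --   3 = grouped by column1 only
    --   1 = grouped by column1, column2
    --   0 = grouped by column1, column2, column3
    GROUPING_ID(column1, column2, column3) AS grain_id
FROM dbo.BigTable
GROUP BY GROUPING SETS (
    (column1),
    (column1, column2),
    (column1, column2, column3)
);
```

GROUP BY ROLLUP (column1, column2, column3) would return the same three grains plus one extra grand-total row, which can be filtered out with HAVING GROUPING(column1) = 0. If the column list is only known at run time, the statement would have to be built as dynamic SQL, with each column name passed through QUOTENAME.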

If you want to optimize a large table for many different aggregates, use a Columnstore index.
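A minimal sketch of creating one, again assuming SQL Server and using hypothetical table, column, and index names:

```sql
-- Hypothetical names, for illustration only.
-- A nonclustered columnstore index can sit alongside the existing rowstore
-- table and covers the columns used for grouping and aggregation.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_BigTable_Aggregates
ON dbo.BigTable (column1, column2, column3, amount);
```

Because the data is stored column by column, a query that groups by any subset of the indexed columns only reads those columns, so no separate index per grouping combination is needed. Note that in SQL Server 2012 and 2014 a nonclustered columnstore index makes the table read-only; from SQL Server 2016 onward it is updatable.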
