Using aggregate function on data masked column returns zeroes

Posted on

Question :

Recently we have implemented dynamic data masking in SQL Server 2019 database in order to hide sensitive information from developers. However, for testing purposes I’d like them to see close-to-real values, so my mask looks like this:

CREATE TABLE fin.Salaries
(
    TargetMonth    DATE            NOT NULL,
    Login          VARCHAR(100)    NOT NULL,
    Department     VARCHAR(100)    NOT NULL,
    AmountUSD      MONEY           MASKED WITH (FUNCTION = 'random(500, 5000)') NOT NULL,
    CONSTRAINT PK_Salaries PRIMARY KEY CLUSTERED (TargetMonth, Login, Department)
)

It works fine for simple queries, however aggregate functions seem to always return zero values. I haven’t found anything on this in the documentation, so the question is whether this is an intended behavior and is there any way to get non-zero values from aggregated data masked column?

SELECT   TargetMonth,
         AVG(AmountUSD) -- Returns all zeroes
FROM     fin.Salaries
GROUP BY TargetMonth

Answer :

Since dynamic data masking is performed right before returning the query result to the user, applying functions such as AVG will not have the masked data as a resultset.

You can see this in the execution plan, the masking function appears right before the select (in other words, returning the data to the user):

enter image description here

While the actual aggreggate happens in the compute scalar right behind the masking function:

enter image description here

With [EXPR1008] and [EXPR1009] being a count and a sum:

<ColumnReference Column="Expr1008" />
<ScalarOperator ScalarString="Count(*)">
<Aggregate AggType="countstar" Distinct="false" />
<ColumnReference Column="Expr1009" />
<ScalarOperator ScalarString="SUM([Test].[dbo].[Salaries].[AmountUSD])">

And, with the datamask function being applied after the aggregate functions, it is set to 0 as a result of the DataMask([Expr1002],0x05000000,(1),(0),(0),(0)) function.

This does not mean that all DataMask function executions on aggreggates and the corresponding datatypes will return 0, only that in this case it does.

Leave a Reply

Your email address will not be published. Required fields are marked *