Recently we have implemented dynamic data masking in SQL Server 2019 database in order to hide sensitive information from developers. However, for testing purposes I’d like them to see close-to-real values, so my mask looks like this:
CREATE TABLE fin.Salaries ( TargetMonth DATE NOT NULL, Login VARCHAR(100) NOT NULL, Department VARCHAR(100) NOT NULL, AmountUSD MONEY MASKED WITH (FUNCTION = 'random(500, 5000)') NOT NULL, CONSTRAINT PK_Salaries PRIMARY KEY CLUSTERED (TargetMonth, Login, Department) )
It works fine for simple queries, however aggregate functions seem to always return zero values. I haven’t found anything on this in the documentation, so the question is whether this is an intended behavior and is there any way to get non-zero values from aggregated data masked column?
SELECT TargetMonth, AVG(AmountUSD) -- Returns all zeroes FROM fin.Salaries GROUP BY TargetMonth
Since dynamic data masking is performed right before returning the query result to the user, applying functions such as
AVG will not have the masked data as a resultset.
You can see this in the execution plan, the masking function appears right before the select (in other words, returning the data to the user):
While the actual aggreggate happens in the compute scalar right behind the masking function:
[EXPR1009] being a count and a sum:
<ColumnReference Column="Expr1008" /> <ScalarOperator ScalarString="Count(*)"> <Aggregate AggType="countstar" Distinct="false" /> <ColumnReference Column="Expr1009" /> <ScalarOperator ScalarString="SUM([Test].[dbo].[Salaries].[AmountUSD])">
And, with the datamask function being applied after the aggregate functions, it is set to 0 as a result of the
This does not mean that all
DataMask function executions on aggreggates and the corresponding datatypes will return 0, only that in this case it does.