EMP_DEPT database. Try to SELECT records with smallest salary, NULL column record is missing

Posted on

Question :

SQL Server 2017: I have the EMP_DEPT (SCOTT schema) database with a few additional records inserted. Two of them have ZERO base salary (PORTMAN, RENO) and one employee (MCQUEEN) has NULL in the salary column.

It looks like this:

enter image description here

Then I’m learning SQL on these records and I want to SELECT the employees with the highest and lowest salaries.

To do this I write 3 queries:

/*      Return employee record with max/min salary.    */
        SELECT * FROM EMP
        WHERE SAL = (SELECT MAX(SAL) FROM EMP);

        SELECT * FROM EMP
        WHERE SAL = (SELECT MIN(SAL) FROM EMP);

        SELECT * FROM EMP
        WHERE SAL = (SELECT MIN(ISNULL(SAL, 0)) FROM EMP);

        --ISNULL(source column, replacement_value) function transforms NULL value into
        --any other value specified in the second ISNULL attribute.

The results are as follows:

enter image description here

The first query results are not a surprise – I understand them.

The second query results are not a surprise – I understand them.

But in the third query I would have expected MCQUEEN returned as well, but he’s not!

The question is why MCQUEEN is not returned in the third query results??

I’ve treated SAL column with ISNULL function to change this NULL value to ZERO.

So why isn’t MCQUEEN among PORTMAN and RENO employees with zero salary??

Answer :

So why isn’t MCQUEEN among PORTMAN and RENO employees with zero
salary?

The issue is this predicate:

WHERE SAL = (SELECT MIN(ISNULL(SAL, 0)) FROM EMP);

At run time, the above comparison for the MCQUEEN row with NULL salary is:

WHERE NULL = 0;

The result of any comparison to NULL is UNKNOWN per the ISO SQL standard. The UNKNOWN result is neither TRUE nor FALSE so the row is eliminated from the result regardless of the value returned by the subquery.

Even though the right side subquery value will never be NULL due the the subquery ISNULL(SAL, 0) expression, the left side SAL value may be NULL so you need to similarly consider NULL values for SAL:

WHERE ISNULL(SAL, 0) = (SELECT MIN(ISNULL(SAL, 0)) FROM EMP);

Rows with NULL or zero SAL values will then be returned unless you have employees with negative salaries. Similarly, one could use COALESCE(SAL, 0), CASE WHEN SAL IS NULL THEN 0 ELSE SAL END), etc. to coerce a non-NULL for NULL values.

A better approach to include NULL may be a UNION ALL query:

SELECT * FROM EMP
WHERE SAL = (SELECT MIN(SAL) FROM EMP)
UNION ALL
SELECT * FROM EMP
WHERE SAL IS NULL;

The advantage of this technique is the queries can leverage an index on the SAL column (if one exists) to execute more efficiently.

I hope this helps you understand the technical reason why the row isn’t returned. In the real world, the semantics of a NULL value need to be considered. For example, does the company really have employees with unknown salaries? If not, it’s a flaw in the data model. If unknown salaries are allowed, why should a query for minimum salaries return unknown salaries since the actual employee salary could be higher?

You have an NULL value in your table, so your last query for given row is like this:

Select * from EMP WHERE NULL = 0. 

Where NULL is the value in the table and the 0 is returned as the minsalary value.

Correct would be:

Select * from EMP
where coalesce(SAL,0) = coalesce((select min(SAL) from EMP),0)

Leave a Reply

Your email address will not be published. Required fields are marked *