Is a join optimized to a where clause at runtime?

Posted on

Question :

When I write a query like this…

select *
from table1 t1
join table2 t2
on t1.id = t2.id

Does the SQL optimizer, not sure if that is the correct term, translate that to…

select *
from table1 t1, table2 t2
where t1.id = t2.id

Essentially, is the Join statement in SQL Server just an easier way to write sql? Or is it actually used at run-time?

Edit: I almost always, and will almost always, use the Join syntax. I am just curious what happens.

Answer :

Those will usually collapse to the same thing internally. The former is the one you should always write, IMHO. More importantly, why does it matter? They’re identical in terms of execution plan and performance (assuming you don’t mess it up, which is easier to do with the lazy, old-style syntax).

Here is proof using AdventureWorks that there is no CROSS JOIN and filter going on.


The explicit join:

enter image description here


The implicit join:

enter image description here


Look, ma! Identical plans, identical results, no cross joins or filters seen anywhere.

(For clarity, the warning on the SELECT operator in both cases is a cardinality-affecting implicit convert, nothing to do with the join in either case.)

Strictly speaking, there is a difference in the input to the query optimizer between the two forms:

-- Input tree (ISO-89)
SELECT
    p.Name,
    Total = SUM(inv.Quantity)
FROM 
    Production.Product AS p,
    Production.ProductInventory AS inv
WHERE
    inv.ProductID = p.ProductID
GROUP BY
    p.Name
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 3604);

ISO-89 input tree

-- Input tree (ISO-92)
SELECT
    p.Name,
    Total = SUM(inv.Quantity)
FROM Production.Product AS p
JOIN Production.ProductInventory AS inv ON
    inv.ProductID = p.ProductID
GROUP BY
    p.Name
OPTION (RECOMPILE, QUERYTRACEON 8605, QUERYTRACEON 3604);

ISO-92 input tree

As you can see, the ON clause predicate is tightly bound to the join using the modern syntax. With the older syntax, there is a logical cross join followed by a relational select (a row filter).

The query optimizer almost always collapses the relational select into the join during optimization, meaning the two forms will very likely produce equivalent query plans, but there is no actual guarantee.

For inner join they are interchangeable, but for Outer Joins they have different meanings – the ON is matching and WHERE is simple filtering. So its better to stick to the correct JOIN syntax matching on the ON.

OK, I was curious so I did a test. I got actual execution plans for the following.

select * 
from sys.database_principals prin, sys.database_permissions perm
WHERE prin.principal_id = perm.grantee_principal_id

and

select * 
from sys.database_principals prin
JOIN sys.database_permissions perm
    ON prin.principal_id = perm.grantee_principal_id

I compared them object by object and they were identical. So at least for a very simple example, they came out to the same thing. I also checked statistics IO and time and they were close enough to be the same thing.

That being said, you should use the JOIN syntax because it’s easier to read and you are less likely to make mistakes, particularly in complicated queries. And the *= / =* syntax for OUTER joins has already been removed as of SQL-Server 2005.

Leave a Reply

Your email address will not be published. Required fields are marked *