Question :
The following EXCEPT query produces a logical plan with a seemingly purposeless projection. This also occurs for INTERSECT.
What is the purpose of the projection? e.g. Is there a different EXCEPT query where the outer projection would specify something?
Query:
use AdventureWorks2017
select p.ProductId
from Production.Product as p
except
select pinv.ProductID
from Production.ProductInventory as pinv
option (recompile, querytraceon 8605, querytraceon 3604)
Converted Tree:
LogOp_Select
LogOp_GbAgg OUT(QCOL: [p].ProductID,) BY(QCOL: [p].ProductID,)
LogOp_Project -- << ?? PASSIVE PROJECTION ??
LogOp_Project
LogOp_Get TBL: Production.Product(alias TBL: p) Production.Product TableID=482100758 TableReferenceID=0 IsRow: COL: IsBaseRow1000
AncOp_PrjList
AncOp_PrjList
AncOp_PrjList
ScaOp_Exists
LogOp_Select
LogOp_Project
LogOp_Get TBL: Production.ProductInventory(alias TBL: pinv) Production.ProductInventory TableID=914102297 TableReferenceID=0 IsRow: COL: IsBaseRow1001
AncOp_PrjList
ScaOp_Comp x_cmpEq
ScaOp_Identifier QCOL: [p].ProductID
ScaOp_Identifier QCOL: [pinv].ProductID
Answer :
There are various things we do inside the query optimizer that don’t really have a reason that we can explain externally. The projects you see in the optimizer eventually get rewritten at the end of the optimizer to “flatten” the expressions. So, the optimizer has things in the search that are part of how the code is implemented that may introduce extra projects that don’t really do anything functionally but do allow us to stitch things together where one subtree was created in one part of the optimizer but is used later in a place where it was not originally intended. There are some physical implementation details that can cause this to happen. None of them matter for users in terms of plan quality (in almost any case – we have done work to make these not matter in the search).
So, I’d just wave my hand and say “you don’t need to see his identification. He can go about his business. Move along” ;).
I don’t have a fully satisfying answer to this, but DISTINCT
is translated to a project plus group-by aggregate.
Both EXCEPT
and INTERSECT
come with an implied DISTINCT
on the first table expression. It is this DISTINCT
that results in the ‘blank’ project in the tree. It is harmless.
If you write a DISTINCT
as the equivalent GROUP BY
, you don’t see the blank project. The project appears after the aggregate:
SELECT DISTINCT P.ProductID
FROM Production.Product AS P
OPTION (QUERYTRACEON 3604, QUERYTRACEON 8605);
LogOp_GbAgg OUT(QCOL: [P].ProductID,) BY(QCOL: [P].ProductID,)
LogOp_Project
LogOp_Project QCOL: [P].ProductID
LogOp_Get TBL: Production.Product(alias TBL: P)
AncOp_PrjList
AncOp_PrjList
AncOp_PrjList
SELECT P.ProductID
FROM Production.Product AS P
GROUP BY p.ProductID
OPTION (QUERYTRACEON 3604, QUERYTRACEON 8605);
LogOp_Project QCOL: [P].ProductID
LogOp_GbAgg OUT(QCOL: [P].ProductID,) BY(QCOL: [P].ProductID,)
LogOp_Project
LogOp_Get TBL: Production.Product(alias TBL: P)
AncOp_PrjList
AncOp_PrjList
AncOp_PrjList