I’m trying to identify an issue I sometime experience on a stored procedure.
I have to say that the stored procedure uses an UDF in the select clause (perhaps that’s the cause of the problem) but anyway, whether it’s not, here below a part of the execution plan:
where both the output estimated rows of the Hash Match and the estimated of the Index_Seek on the non clustered index IX_Job_OperationID are completely wrong:
I’ve tried to update the statistic related to the index IX_Job_OperationID, even with fullscan option, but it didn’t help.
UPDATE STATISTICS Job [IX_Job_OperationID] with fullscan
I’m also using the
recompile option in the stored because given the parameters provided the dataset involved may change a lot.
Can someone point me to the right direction and help me understand why even though statistics have been updated the Estimated values are so far from Actual?
Here a link to the actual execution plan
This execution plan causes a total of 8,610,665 logical reads, while other better plan that somehow are chosen can be around 122,692
Based on your description of the problem it is unlikely that the UDF is the cause of the cardinality estimation errors. You can try commenting out the UDF and checking if the join estimate changes if you want to test that.
It is possible to experience cardinality estimation errors even when statistics have been updated with
FULLSCAN. The histograms created by SQL Server may not model your data in a way for the optimizer to create good enough plans, SQL Server may not have histograms that exactly match the joined data (you could be filtering on the inner and/or outer table), you may be joining on multiple columns which complicate estimations, and so on. For example, Microsoft changed some of the join cardinality estimate calculations with the new CE released in SQL Server 2014. Some tables will have data that better matches the assumptions for the legacy CE and some tables will have data that better matches the assumptions for the new CE.
For the first estimation issue (38999 estimated rows versus 701680 actual rows), based on what you’ve provided I can only recommend reading about join cardinality calculations. Microsoft released a white paper for the new cardinality estimator in SQL Server 2014. I also know of a blog post that dives into the internals of join cardinality quite a bit. As the question is written there are too many possible causes to give a good answer. Please post an actual execution plan if you’d like further guidance.
For the second estimation issue (1 estimated row versus 1546894 actual rows), note that the estimated number of rows is for a single iteration of the inner loop for the join. The actual number of rows is for all rows returned from the table over all iterations of the join, so they will not often match for nested loop joins. What you are seeing is common and not necessarily a sign of a problem.