I know that matching data types is good and that mismatched types hurt performance. But sometimes the mismatch seems to have very little impact, and fixing it feels like a waste of time; other times it has a huge impact and performance improves dramatically.
Of course there are many scenarios, e.g. an INSERT ... SELECT where the data type of the selected column is compatible with, but not the same as, the insert column. Or the mismatch could be in the join predicates.
Where does implicit conversion hurt the most?
Also, it seems one gets an implicit conversion when the lengths don't match, e.g. joining varchar(10) with varchar(20). How big a deal are these scenarios when only the lengths of the columns differ and not their types?
Basically, I would like to know when I should worry about the Compute Scalar operator in the execution plan. How can I tell whether removing it will have a significant impact?
There are two main issues which cause implicit (or explicit) conversion to make a big difference to the query plan:
The main source of problems is where a join or filter predicate, or an ordering or grouping, is over a converted column. Note that this applies equally to any function on a column used in such a context. A column that is merely selected has far less impact.
This means firstly, that the compiler cannot use an index lookup to get the data, nor can it rely on the conversion output to be in the correct order for operations such as a Merge join or Stream Aggregate.
This particular point is far more insidious than some realize. The compiler understands some conversions, for example to float, to be order-preserving. But conversions such as to nvarchar have no such semantics. Therefore, when joining on, say, an nvarchar column, it is often necessary to decide which table will be the one to be seeked, and an explicit conversion is often better.
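A sketch of that seek-side decision, with made-up table and column names (`dbo.Orders` holding a varchar code, `dbo.Customers` an nvarchar one):

```sql
-- Implicit: the optimizer picks which side to convert, which may
-- prevent an index seek on the table you wanted seeked.
SELECT o.OrderID
FROM dbo.Orders AS o
JOIN dbo.Customers AS c
    ON o.CustomerCode = c.CustomerCode;

-- Explicit: convert the varchar side up to nvarchar yourself, so the
-- nvarchar index on Customers remains usable for a seek.
SELECT o.OrderID
FROM dbo.Orders AS o
JOIN dbo.Customers AS c
    ON CONVERT(nvarchar(20), o.CustomerCode) = c.CustomerCode;
```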
The second issue is the better-known one: no statistics. The compiler has no understanding of what the result of the conversion implies, so it cannot use statistics to determine estimated row counts and densities. This can affect the whole plan, and it often makes a bigger difference the deeper in the plan it occurs.
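A minimal illustration of the statistics point, with hypothetical names: wrapping the column in a conversion hides its histogram from the optimizer, while rewriting the predicate to touch the raw column keeps the statistics usable.

```sql
-- Estimates from the histogram on OrderDate are lost here, because
-- the predicate is on the *result* of the conversion, not the column:
SELECT COUNT(*)
FROM dbo.Orders
WHERE CONVERT(varchar(10), OrderDate, 120) = '2023-01-15';

-- A range predicate on the raw column lets the optimizer use the
-- statistics (and an index) on OrderDate:
SELECT COUNT(*)
FROM dbo.Orders
WHERE OrderDate >= '2023-01-15'
  AND OrderDate <  '2023-01-16';
```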
If the conversion or function appears only in the select (and is not used as a predicate in outer parts of the query), there is far less impact, even though the same warning occurs. In this case the conversion normally happens just before the row is output from the query, and it does not normally affect the query plan.
Two cases where a conversion in the select can make a difference are:
- Conversions to much larger types, such as the LOB max types. Here it is beneficial to ensure that the conversion happens as late as possible.
- Very complex nested string functions that are passed to each other via CROSS APPLY. Here the opposite is true: do the calculation earlier, and prevent the compiler from nesting the whole thing as one big Compute Scalar. OUTER APPLY is your friend; it seems to give an optimization-fence effect.
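A sketch of that second pattern, with hypothetical names: each intermediate string step gets its own OUTER APPLY, which seems to keep the steps from being folded into one large Compute Scalar.

```sql
-- Each OUTER APPLY names one intermediate result; later steps refer
-- to the earlier alias rather than nesting the function calls inline:
SELECT s3.Val
FROM dbo.Source AS t
OUTER APPLY (SELECT LTRIM(RTRIM(t.RawText))       AS Val) AS s1
OUTER APPLY (SELECT REPLACE(s1.Val, CHAR(9), ' ') AS Val) AS s2
OUTER APPLY (SELECT LEFT(s2.Val, 50)              AS Val) AS s3;
```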
As Justin implies, when it's a problem will vary from query to query and execution plan to execution plan. There's no definitive list of scenarios that guarantee you'll experience performance issues as a result of implicit conversion.
One way to tell whether a query with an implicit conversion is experiencing performance issues is to check for cardinality estimate problems; the two generally go hand in hand when the implicit conversion is the cause. Compare the Actual Number of Rows against the Estimated Number of Rows for each operator in the query's execution plan: a large deviation between the two (generally an order of magnitude or more) indicates a cardinality estimate problem. You can look for the operators with the thickest lines to quickly spot which ones to check first. As Justin also mentioned, this generally happens when the implicit conversion occurs on an indexed field, specifically in a predicate (such as the HAVING clause) of the query.
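One way to capture both counts for that comparison (a sketch; the query and names are placeholders) is to request the actual plan for the statement:

```sql
-- Returns the actual execution plan XML alongside the results; each
-- operator in the plan records its actual row count, which you can
-- compare against its EstimateRows value.
SET STATISTICS XML ON;

SELECT o.OrderID
FROM dbo.Orders AS o
WHERE o.CustomerCode = N'ABC';

SET STATISTICS XML OFF;
```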
But not all implicit conversion warnings are worth worrying about. If you see such a warning but your query is running optimally, you likely don't need to spend time debugging it, especially if the execution plan doesn't also show cardinality estimate issues.
I always pay attention to data types because of implicit conversion. In the current query the conversion may be something that does not affect performance, but in future reviews and changes to that query, the dormant implicit conversion can rise again.
SELECT * FROM table WHERE varchar_filed = 2
In this case SQL Server does an implicit conversion: since int has higher data type precedence than varchar, it converts varchar_filed to int, a quick Compute Scalar.
But someone may change this query in the future; they might replace the filter with a join, without noticing that varchar_filed is not an int, creating a possible issue.
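The safer habit is to make the literal match the column's type up front, so any later edit starts from a clean predicate (keeping the same hypothetical varchar_filed column):

```sql
-- Matching the literal to the column's type avoids the conversion
-- entirely, and leaves nothing dormant for a future edit to trip over:
SELECT * FROM dbo.SomeTable WHERE varchar_filed = '2';
```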