Using COLLATE with UNION

Posted on

Question :

How to use COLLATE with UNION? I want to union 2 tables (both have the same columns and the same types: varchar,int, int, decimal).

I got the following error:

sg 468, Level 16, State 9, Line 1
Cannot resolve the collation conflict between “Serbian_Latin_100_CI_AS” and “Croatian_CI_AS” in the UNION operation.

My SQL statement:

select * from #IA_BIH
union 
select * from #IA_MNE  

Where should I insert collate database_default? I tried different combinations, but it didn’t work.

Answer :

Based on the collation names I assume that you are using Microsoft SQL Server.

COLLATE can be used at the database level or the column level. Since you are trying to UNION two tables, using the column collation on the needed columns will resolve your query.

Here is a sample bit of code to help you:

use testdb
GO
CREATE TABLE dbo.Serbian  (Name VARCHAR(20) COLLATE Serbian_Latin_100_CI_AS);
CREATE TABLE dbo.Croatian (Name VARCHAR(20) COLLATE  Croatian_CI_AS);
GO
INSERT INTO dbo.Serbian VALUES ('serbian');
INSERT INTO dbo.Croatian VALUES ('croation');
GO

-- Collate to a particular named collation
SELECT Name COLLATE Serbian_Latin_100_CI_AS as CollatedNameSerbian from dbo.Serbian 
UNION ALL
SELECT Name COLLATE Serbian_Latin_100_CI_AS from dbo.Croatian 
GO
-- Collate to the database default collation
SELECT Name  COLLATE database_default as CollatedNameDBDefault from dbo.Serbian 
UNION ALL
SELECT Name COLLATE database_default from dbo.Croatian 
GO

DROP TABLE dbo.Serbian;
DROP TABLE dbo.Croatian;
GO

Of course, if you have several columns with conflicting collations you will need to define their collations as well.

Collations can be a bit tricky, especially for VARCHAR data. For both NVARCHAR and VARCHAR data, Collations control sorting and comparison rules. But for VARCHAR data, Collations also control the character set (i.e. which code page is used to determine what character each 8-bit value represents). So, changing Collations on VARCHAR data can lead to data loss:

SELECT CHAR(230) AS [DB-Collation],
       CHAR(230) COLLATE Korean_100_CI_AS AS [Korean],
       CHAR(230) COLLATE Albanian_CI_AS AS [Albanian],
       CHAR(230) COLLATE Greek_100_CI_AS AS [Greek];
-- æ            a   ?

However, since you have a collation mismatch, you need to change one of them. I would advise against using database_default in this particular case since that will be relative to the database that the query is being executed from, which might lead to inconsistent behavior.

Fortunately, both Collations that are mentioned in the error have the same code page: 1250. You can find this info out using the following query:

SELECT col.[name], COLLATIONPROPERTY(col.[name], 'CodePage') AS [CodePage]
FROM   sys.fn_helpcollations() col
WHERE  [name] IN (N'Serbian_Latin_100_CI_AS', N'Croatian_CI_AS');

At this point, you need to pick one of those two to force the other one to be the same. Because both Collations use the same code page, the choice won’t affect the character set being used. The only difference you need to be concerned about is the rules for sorting and comparison between Serbian and Croatian; pick the one that most closely matches the end-users’ expectations.

One option is to force the Collation in the SELECT statement as it appears you are trying to do, and as is shown in @RLF’s answer. The downside here is that you can no longer use SELECT * (if this is code in a stored procedure it is probably best to not use SELECT * anyway).

Another option is to force the Collation for one of those tables when you create the temporary table, whether that is done using CREATE TABLE or SELECT INTO:

SELECT ac.*
FROM   sys.all_columns ac
WHERE  ac.[object_id] = OBJECT_ID(N'sys.objects')
AND    ac.[name] = N'name';
-- SQL_Latin1_General_CP1_CI_AS (on my system, at least)

SELECT [name] COLLATE Hebrew_100_CI_AS AS [name]
INTO   #coltest
FROM   sys.objects;

SELECT sc.*
FROM   [tempdb].sys.columns sc
WHERE  sc.[object_id] = OBJECT_ID(N'tempdb..#coltest');
-- Hebrew_100_CI_AS

The benefits here are that you can then:

  1. use SELECT * in your UNION query.
  2. execute several queries against these tables without needing to add a COLLATE option to each one.

And, assuming that the default Collation, for the database in which the CREATE TABLE or SELECT INTO is being executed, is one of the two Collations noted in the error message, then you can go ahead and use COLLATE database_default.

Leave a Reply

Your email address will not be published. Required fields are marked *