Conditional Connection in SSIS based on Availability Group status

Posted on

Question :

I’m working on an SSIS package that will refresh dev databases from production backups. The package has been in place for years, but we’re deploying Availability Groups on the dev instance, so the tasks have to be changed to work with both the primary and secondary replicas.
Background: All instances are SQL Server 2014 SP2. One production instance (PD1), two dev instances (DV3 & DV7) and one utility instance (DV1), where the SSIS package is deployed to SSISDB and where the SQL Agent job runs.
In the SSIS package, after the full backups are obtained and put on a network share from production, the next set of tasks (in a separate SSIS package) needs to figure out which node hosts the primary replica. I’m following the steps listed here to go about this work, and everything works great if it’s hard-coded (which, of course won’t work if/when a failover occurs).
The only way I’ve been able to think about approaching this is, at runtime, evaluate which instance is primary and then proceed with the tasks (which, by the way are all “Execute SQL Task” objects–requiring their own connection definition). I’ve seen a few examples of how to use expressions in ConnectionStrings and variables, but can’t figure out how to set the value returned from the below query in SQL Server to return the correct instance to set the connection string in the package.

    select cs.replica_server_name
from sys.dm_hadr_availability_replica_states rs
join sys.dm_hadr_availability_replica_cluster_states cs
    on rs.replica_id = cs.replica_id
join sys.dm_hadr_name_id_map n
    on rs.group_id = n.ag_id
where rs.role = 2
and n.ag_name = 'DVAG001'

Ideally, there would be two of these connection managers, one for the primary and one for the secondary, which I could then use in the “Execute SQL Task” objects. Then everything would work perfectly.
There very well may be other ways to do this, but I’m not very experienced in SSIS or development outside of TSQL.

UPDATE: The above process is done every day…which is why I’m using an SSIS package and SQL Agent. So, the whole process has to be automated and have all the values necessary–especially the determination of the replica states (primary and secondary)–inside the package so that no manual intervention is required.

Answer :

Let’s assume that you have an availability group with only one secondary replica (you could easily expand this sample for more replicas)

1. Connections

You will need 3 connections:

  • AG listener (to run the query that will return which server is holding the primary an secondary)
  • Primary (to be parameterized with the primary server)
  • Secondary (to be parameterized with the secondary server)

2. Variables

Create 2 variables with data type String

  • ServernamePrimary
  • ServernameSecondary

enter image description here

3. Query

To map a query result set to a variable using Execute SQL Task, your query needs to return only one row. You will map each column to the variable that you wish. Knowing that, I’ve changed you query to the following:

WITH CTE_AGStatus as (
    select cs.replica_server_name, rs.role
    from sys.dm_hadr_availability_replica_states rs
    join sys.dm_hadr_availability_replica_cluster_states cs
        on rs.replica_id = cs.replica_id
    join sys.dm_hadr_name_id_map n
        on rs.group_id = n.ag_id
    where n.ag_name = 'DVAG001'
)

select  ServernamePrimary = MAX(case when role = 1 then replica_server_name end)
,       ServernameSecondary = MAX(case when role = 2 then replica_server_name end)
from CTE_AGStatus

(Feel free to change it as you wish, just remember to return only one row)

4. Mapping the result set to variables

On the execute SQL Task Editor, paste the query above, set the OLEDB connection ‘AG listener’ and change the result set to ‘Single row’
Execute SQL Task Editor - General tab

Then go to the ‘Result Set’ tab and map the columns to the variables
Result Set map to variable

5. Change the connection strings

Click on the ‘Primary’ connection manager and go to the ‘Properties’ tab, then click on ‘Expressions’, select the property ‘ServerName’ and add your variable on the Expression.

enter image description here

Repeat the process to the ‘Secondary’ connection manager

6. Check the process

To validate, you could add other steps and use breakpoints to check how your variables are running. The next steps are up to you. Now you have one connection pointing to each server of your AG.
enter image description here

As an alternate strategy, I might consider simplify things without the need for SSIS parameters using T-SQL execution (no parameters necessary). SSIS, especially legacy packages, always seem so potentially error prone and buggy that I just don’t want to touch them if it’s not needed:

A To begin, consider running your “Primary AG Check” on the production servers themselves rather than the dev servers.

B. Then, import your SSIS packages to production if they’re not there already–to be run there. One might consider dev restores from prod more of a production process anyway.

C. And finally, avoid changing your legacy SSIS packages as much as possible by running T-SQL Agent job that invokes the SSIS package execution (database restore) from each “potential” primary AG using the following code–or something like it — as shown in Listing A.

Listing A: Execute SSIS package (database restore) only when the SQL Server is the primary–via a SQL Agent job.

--If current server is the primary AG then run the legacy package

if (select a.role_desc FROM sys.dm_hadr_availability_replica_states AS a JOIN sys.availability_replicas AS b  ON b.replica_id = a.replica_id WHERE b.replica_server_name like @@Servername) like 'Primary' 
begin
            Declare @execution_id bigint  
            EXEC [SSISDB].[catalog].[create_execution] @package_name=N'Package.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'AlwaysOn', @project_name=N'MyTestPackage', @use32bitruntime=False, @reference_id=Null  
                exec ssisdb.catalog.start_execution @execution_id 
end

Note:

If I’m understanding you right, the dev servers will always be named the same. It’s just the production server names that have the potential to change. Given that, I believe the code show in Listing A will allow you to accomplish what you need to with a bit less time and effort. You could even piggy-back this execution at the end of your SQL backups to get a more immediate refresh of the dev environments.

Leave a Reply

Your email address will not be published. Required fields are marked *