I’m working on an SSIS package that will refresh dev databases from production backups. The package has been in place for years, but we’re deploying Availability Groups on the dev instance, so the tasks have to be changed to work with both the primary and secondary replicas.
Background: All instances are SQL Server 2014 SP2. One production instance (PD1), two dev instances (DV3 & DV7) and one utility instance (DV1), where the SSIS package is deployed to SSISDB and where the SQL Agent job runs.
In the SSIS package, after the full backups are obtained and put on a network share from production, the next set of tasks (in a separate SSIS package) needs to figure out which node hosts the primary replica. I’m following the steps listed here to go about this work, and everything works great if it’s hard-coded (which, of course won’t work if/when a failover occurs).
The only way I’ve been able to think about approaching this is, at runtime, evaluate which instance is primary and then proceed with the tasks (which, by the way are all “Execute SQL Task” objects–requiring their own connection definition). I’ve seen a few examples of how to use expressions in ConnectionStrings and variables, but can’t figure out how to set the value returned from the below query in SQL Server to return the correct instance to set the connection string in the package.
select cs.replica_server_name from sys.dm_hadr_availability_replica_states rs join sys.dm_hadr_availability_replica_cluster_states cs on rs.replica_id = cs.replica_id join sys.dm_hadr_name_id_map n on rs.group_id = n.ag_id where rs.role = 2 and n.ag_name = 'DVAG001'
Ideally, there would be two of these connection managers, one for the primary and one for the secondary, which I could then use in the “Execute SQL Task” objects. Then everything would work perfectly.
There very well may be other ways to do this, but I’m not very experienced in SSIS or development outside of TSQL.
UPDATE: The above process is done every day…which is why I’m using an SSIS package and SQL Agent. So, the whole process has to be automated and have all the values necessary–especially the determination of the replica states (primary and secondary)–inside the package so that no manual intervention is required.
Let’s assume that you have an availability group with only one secondary replica (you could easily expand this sample for more replicas)
You will need 3 connections:
- AG listener (to run the query that will return which server is holding the primary an secondary)
- Primary (to be parameterized with the primary server)
- Secondary (to be parameterized with the secondary server)
Create 2 variables with data type String
To map a query result set to a variable using Execute SQL Task, your query needs to return only one row. You will map each column to the variable that you wish. Knowing that, I’ve changed you query to the following:
WITH CTE_AGStatus as ( select cs.replica_server_name, rs.role from sys.dm_hadr_availability_replica_states rs join sys.dm_hadr_availability_replica_cluster_states cs on rs.replica_id = cs.replica_id join sys.dm_hadr_name_id_map n on rs.group_id = n.ag_id where n.ag_name = 'DVAG001' ) select ServernamePrimary = MAX(case when role = 1 then replica_server_name end) , ServernameSecondary = MAX(case when role = 2 then replica_server_name end) from CTE_AGStatus
(Feel free to change it as you wish, just remember to return only one row)
4. Mapping the result set to variables
5. Change the connection strings
Click on the ‘Primary’ connection manager and go to the ‘Properties’ tab, then click on ‘Expressions’, select the property ‘ServerName’ and add your variable on the Expression.
Repeat the process to the ‘Secondary’ connection manager
6. Check the process
As an alternate strategy, I might consider simplify things without the need for SSIS parameters using T-SQL execution (no parameters necessary). SSIS, especially legacy packages, always seem so potentially error prone and buggy that I just don’t want to touch them if it’s not needed:
A To begin, consider running your “Primary AG Check” on the production servers themselves rather than the dev servers.
B. Then, import your SSIS packages to production if they’re not there already–to be run there. One might consider dev restores from prod more of a production process anyway.
C. And finally, avoid changing your legacy SSIS packages as much as possible by running T-SQL Agent job that invokes the SSIS package execution (database restore) from each “potential” primary AG using the following code–or something like it — as shown in Listing A.
Listing A: Execute SSIS package (database restore) only when the SQL Server is the primary–via a SQL Agent job.
--If current server is the primary AG then run the legacy package if (select a.role_desc FROM sys.dm_hadr_availability_replica_states AS a JOIN sys.availability_replicas AS b ON b.replica_id = a.replica_id WHERE b.replica_server_name like @@Servername) like 'Primary' begin Declare @execution_id bigint EXEC [SSISDB].[catalog].[create_execution] @package_name=N'Package.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'AlwaysOn', @project_name=N'MyTestPackage', @use32bitruntime=False, @reference_id=Null exec ssisdb.catalog.start_execution @execution_id end
If I’m understanding you right, the dev servers will always be named the same. It’s just the production server names that have the potential to change. Given that, I believe the code show in Listing A will allow you to accomplish what you need to with a bit less time and effort. You could even piggy-back this execution at the end of your SQL backups to get a more immediate refresh of the dev environments.