Question :
Our company is looking for built-in functionality or an automated way to archive data in SQL Server.
Currently, we create archive tables with an identity surrogate key and develop stored procedures to move the data. We are doing this for 100+ tables, archiving data that is 2 years or older. Does SQL Server have any functionality or automated way to archive data (a wizard or UI tool), or does a person have to develop stored procedures? I am thinking of using dynamic SQL to generate the archive scripts for the 100+ tables.
Thanks,
-- If no built-in functionality exists, I will create dynamic SQL for the tables using the template below
create table dbo.CustomerTransaction
(
CustomerTransactionId bigint primary key identity(1,1) not null ,
CustomerName varchar(255) null ,
ProductName varchar(255) null,
Quantity int null ,
CreateDatetimestamp datetime not null
)
create table dbo.CustomerTransactionArchive
(
CustomerTransactionArchiveId bigint primary key identity(1,1) not null ,
CustomerTransactionId bigint null,
CustomerName varchar(255) null ,
ProductName varchar(255) null,
Quantity int null ,
CreateDatetimestamp datetime not null
)
create procedure dbo.ArchiveCustomerTransaction   -- renamed so it does not collide with the dbo.CustomerTransactionArchive table
as
begin
    declare @cutoff datetime = dateadd(year, -2, getdate())   -- rows older than 2 years
    declare @rowmin bigint = (select min(CustomerTransactionId) from dbo.CustomerTransaction where CreateDatetimestamp < @cutoff)
    declare @rowmax bigint = (select max(CustomerTransactionId) from dbo.CustomerTransaction where CreateDatetimestamp < @cutoff)
    declare @rowcounter bigint = @rowmin
    -- insert in 100,000-row blocks; do not want to lock too many pages in the table
    while @rowcounter <= @rowmax
    begin
        insert into dbo.CustomerTransactionArchive (CustomerTransactionId, CustomerName, ProductName, Quantity, CreateDatetimestamp)
        select CustomerTransactionId, CustomerName, ProductName, Quantity, CreateDatetimestamp
        from dbo.CustomerTransaction
        where CustomerTransactionId between @rowcounter and (@rowcounter + 99999)
          and CreateDatetimestamp < @cutoff
        set @rowcounter = @rowcounter + 100000
    end
end
The archive tables will be in a different database.
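Roughly, the dynamic SQL I have in mind would read the catalog views and generate one insert per table. A sketch only: it assumes every table has a CreateDatetimestamp column to filter on and a matching <TableName>Archive table in the same database; for a separate archive database the target would become a three-part name.
-- Sketch only: generate one INSERT ... SELECT per table that has a matching
-- <TableName>Archive table; assumes every table has a CreateDatetimestamp column.
declare @sql nvarchar(max) = N''
select @sql = @sql
    + N'insert into dbo.' + quotename(t.name + 'Archive') + N' (' + c.cols + N')' + char(13)
    + N'select ' + c.cols + N' from dbo.' + quotename(t.name) + char(13)
    + N'where CreateDatetimestamp < dateadd(year, -2, getdate());' + char(13)
from sys.tables t
cross apply (
    select cols = stuff((
        select N', ' + quotename(col.name)
        from sys.columns col
        where col.object_id = t.object_id
        order by col.column_id
        for xml path(''), type
    ).value('.', 'nvarchar(max)'), 1, 2, N'')
) c
where exists (select 1 from sys.tables a where a.name = t.name + 'Archive')
  and t.name not like '%Archive'
print @sql          -- review the generated script before executing
-- exec sp_executesql @sql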
Answer :
There is no built-in functionality.
The stored procedure route will work, but you might also want to look into an ETL package (like SSIS) to do this as it may be easier to maintain.
You could also look into partitioning on CreateDate for all your tables in order to efficiently drop older data, but that would still require some custom code to rotate the partitions. They could, however, use a common partition scheme and function if the DDL on the CreateDate is the same for each.
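For illustration, a minimal sketch of a yearly partition function and scheme on the create date; the names, boundary dates, and filegroup mapping are placeholders, not a recommendation:
-- Sketch only: yearly partition function/scheme that several tables could share.
create partition function pfCreateDate (datetime)
as range right for values ('2014-01-01', '2015-01-01', '2016-01-01')

create partition scheme psCreateDate
as partition pfCreateDate all to ([PRIMARY])   -- or map each range to its own filegroup

-- A table built (or rebuilt) on psCreateDate(CreateDatetimestamp) can then age data
-- out with a metadata-only switch into a staging table on the same filegroup, e.g.:
-- alter table dbo.CustomerTransaction switch partition 1 to dbo.CustomerTransaction_Stage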
If you were to upgrade, you could also look at Temporal Tables or Stretch Database in SQL Server 2016, though neither is an out-of-the-box auto-archive feature.
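For reference, a minimal sketch of a system-versioned temporal table in SQL Server 2016+; note that it versions row changes into a history table rather than archiving rows by age, so it only partially fits this requirement:
-- Sketch only: SQL Server 2016+ system-versioned (temporal) table.
create table dbo.CustomerTransactionTemporal
(
    CustomerTransactionId bigint primary key identity(1,1) not null,
    CustomerName varchar(255) null,
    ProductName varchar(255) null,
    Quantity int null,
    CreateDatetimestamp datetime not null,
    ValidFrom datetime2 generated always as row start not null,
    ValidTo datetime2 generated always as row end not null,
    period for system_time (ValidFrom, ValidTo)
)
with (system_versioning = on (history_table = dbo.CustomerTransactionTemporalHistory))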
Here is how I would approach it:

- If the data needs to reside in the same database:
- If you are using Enterprise Edition:
- Use table partitioning (since you are using SQL Server 2012). For your case, you can partition by CreateDatetimestamp (daily, monthly, or yearly). Partitioning has some requirements, so refer to my answer here, where I have also described how to partition an existing non-partitioned table.
- You can also use a partition management utility to automate the process.
- You can create the tables on separate filegroups. This way, when restoring, you can do a piecemeal restore, and the unrestored filegroups can be brought back at a later time.
- If partitioning is not suited to your environment, then stick with what you are doing currently: creating archive tables and building a process to move the data into them as per your needs.
- In your approach, I want to highlight a few things:
- Create the archive tables on separate filegroups.
- After you move the data from the main tables to the archive tables, make sure to update statistics. This way the optimizer can create better plans based on the latest statistics.
- Use a view to tie the newer and archived data together (see the sketch after this list). This way your application just references the view, and you don't have to worry about the underlying main and archive tables.
- If the data can reside in a different database, e.g. dbname_archive, you can develop an ETL process to periodically move data from your main database to the archive database using SSIS. You can reference the old data using a three-part name: dbname.schema.tableName. This way you can keep your archive database separate from the main database, on a different filesystem, and possibly make it read-only. Your main database will stay small, depending on how often you archive your data, which will speed up restore/recovery times along with maintenance (backups, index/statistics work, etc.) for your main database.
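As a minimal sketch of the view idea above, assuming the archive table sits in a hypothetical MainDb_Archive database and keeps the original CustomerTransactionId:
-- Sketch only: one view over current and archived rows so the application
-- never has to know where a row physically lives.
create view dbo.CustomerTransactionAll
as
select CustomerTransactionId, CustomerName, ProductName, Quantity, CreateDatetimestamp
from dbo.CustomerTransaction
union all
select CustomerTransactionId, CustomerName, ProductName, Quantity, CreateDatetimestamp
from MainDb_Archive.dbo.CustomerTransactionArchive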
Additionally, you can use data compression (since you are on SQL Server 2012) to further compress the data in the archive/history tables.
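For example (table name taken from the question; page compression is an Enterprise Edition feature on SQL Server 2012):
-- Rebuild the archive table with page compression to reduce its storage footprint.
alter table dbo.CustomerTransactionArchive rebuild with (data_compression = page)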