Understanding Backup Types

Posted on

Question :

I have recently started a role as a junior DBA… Yesterday I created a new database for a co-worker and my boss asked me to make sure we get a backup plan on it asap.

I didn’t think much about it, went into the agent, created a job that took a full backup every night @ 1AM. Being proud of myself I went to bed and didn’t think anymore about it. This morning I started thinking that was not “Good enough” – if the database died at 12AM, then they would lose almost 23 hours of data, which would probably get me fired. 🙂

So to make sure I understand this – I think I need to do Differential Backups in addition to my full backup.. After looking at this tutorial, I had a few questions:

  1. Would I just go into the agent and add a job that happens at every hour (for example) that creates a differential backup?
  2. If my understanding is correct, that would backup the Transactional log every hour until 1AM when it does a Full backup, then it would “Reset” the T-Log and start over again from the next day, so at the most would be 1 hour of data loss – Is that correct?
  3. So in the end, I would have 2 jobs on the agent, one that fires off every day @ 1AM to do a “FULL” backup, and one that fires off every hour that does a differential backup?

Answer :

Kin is correct to point you to Ola Hallengren’s backup and maintenance solution.

It sounds like you’re new, so also consider looking into:

Aaron Bertrand is correct to point you to Why Does the Transaction Log Keep Growing or Run Out of Space?.

Usr is correct; until you test restores, assume your backups are worthless.

The commenters are also correct, you must consult with the business.

In more detail, you need:

  • RPO: Recovery Point Objective. How much data loss is acceptable – i.e. restore back to within an hour before the loss? A day?
  • RTO: Recovery Time Objective. How long it takes to get the system back up again.
  • At least a skeleton DR plan, in particular, what types of “disasters” are your RPO and RTO good for?
    • corrupt database
    • hard drive crash
    • multiple drive crash resulting in RAIDset loss
    • accidental file deletion
    • server dropped during move
    • server stolen
    • building burned down
    • regional natural disaster (large hurricanes, earthquakes, typhoons, tsunami, major volcanic eruptions, major flooding)
  • Budget
    • After you get this, you can go back and renegotiate the previous points.

As far as a general backup plan, I would start with considering:

  • Search for anywhere someone’s truncating the log… and stop that! You don’t want to breaking your log chain!
    • Likewise if you’re switching in and out of SIMPLE recovery model.
  • Figure out where your backups need to go; backing up to the same disk spindles that hold your databases or logs is pretty pointless by itself; lose the disks, and you lose both the active database and the backups all in one fell swoop.
  • In all cases, set CHECKSUM on.
  • Full backups for everything, regularly.
    • Maybe this is daily, maybe it’s weekly, maybe it’s biweekly, maybe even monthly.
    • NOTE: Keeping more than one around is useful; for FULL (and BULK-LOGGED if there are no bulk-logged changes) recovery model databases, you can skip a corrupt full backup if you have a prior full backup and an unbroken log chain.
  • Full backups are the ONLY backup allowed for Master.
    • Timing on this must be at least as frequent as your RPO.
  • Don’t forget to back up msdb too, and you might as well toss model in.
    • yes, model. Sometimes it has user defined types, etc.
  • Differential backups on SIMPLE mode databases
    • Timing on this must be at least as frequent as your RPO.
  • OPTIONAL: Differential backups on FULL and BULK-LOGGED databases
    • these can allow you to have faster restores
    • these can also allow you to “skip over” damaged/lost/corrupt transaction log backups that are BETWEEN your full backup and a DIFFERENTIAL, after which you can continue restoring transaction logs as long as your log chain is unbroken from that point.
  • Log backups on all FULL and BULK-LOGGED database.
    • Timing on this must be at least as frequent as your RPO.
    • This is MANDATORY in order to keep t-log size down
  • Run test restores; it doesn’t matter if they’re on the same server or a different server, just run them.
  • Who has access to the backups
  • Encryption of the backups
    • Key management of the encryption
  • Offsite storage
    • And how that affects RTO
      • during the “larger” disasters; even so much as a blizzard or mudslide can add hours or days to this.

I think this the most popular question: “How often should I backup my databases”. So, here are some simple answers: “Everything depends on how large your database is? How often you make changes to your database? and How important your data is?”

For example, you can make a full backup every 24 hours (as you do), differential backup every 6 hours and backup your transaction log every hour.
So, now let’s dig deeper:

Full database backup:

A full database backup backs up all data files and active part of the transaction log. The active part of the transaction log is necessary to restore a database to a transactionally consistent point. Remember: if you use simple recovery model the transaction log will truncate during the checkpoint process.And point-in-time recovery is not possible.

BACKUP DATABASE database TO DISK = 'd:/full.bak';

Differential database backup:

A differential backup is created similarly to a full backup, but with one important difference – the differential backup only contains the data that has changed since the last full backup (the active portion of the transaction log). Differential backups are cumulative not incremental. This means that a differential backup contains all changes made since the last full backup, in spite of the fact that they have already been included in previous differential backups.

BACKUP DATABASE database TO DISK = 'd:/diff.bak' WITH DIFFERENTIAL;

Transaction log backup

A transaction log backup contains all transaction log records that have been made between the last transaction log backup or the first full backup and the last log record that is created upon completion of the backup process. The transaction log backup allows to restore a database to a particular point-in-time before the failure has occurred. It is incremental, meaning that in order to restore a database to a certain point-in-time, all transaction log records are required to replay database changes up to that particular point-in-time. Please note that transaction log backup is available only for full or bulk-logged recovery models.

BACKUP LOG database TO DISK = 'd:/log.bak';

Assume that you use such database backup schedule with full backup every 24 hours, differential backup every 6 hours, and transaction log every hour. And your database runs under full recovery model. Look at the picture below:

enter image description here

Some crucial data were deleted at 20:30, the best we can do in this case is to restore the database to 20:29. The restore process will be next:

RESTORE DATABASE database FROM DISK = 'd:/full_10_00.bak' WITH NORECOVERY, REPLACE
RESTORE DATABASE database FROM DISK = 'd:/diff_16_00.bak' WITH NORECOVERY
RESTORE LOG database FROM DISK = 'd:/log_17_00.bak' WITH NORECOVERY
RESTORE LOG database FROM DISK = 'd:/log_18_00.bak' WITH NORECOVERY
RESTORE LOG database FROM DISK = 'd:/log_19_00.bak' WITH NORECOVERY
RESTORE LOG database FROM DISK = 'd:/log_20_00.bak' WITH NORECOVERY
RESTORE LOG database FROM DISK = 'd:/log_21_00.bak' WITH 
    STOPAT = '2015-11-26 20:29:00.000', RECOVERY

In this case data from 20:29 till 21:00 will be lost.

Leave a Reply

Your email address will not be published. Required fields are marked *