You want to protect your VM’s and databases in case of disaster. You implement a backup strategy to meet your requirements, but they’re separate technologies. What can go wrong? What should you watch out for? Let me tell you about something I came across recently…
We were alerted to an issue with a client’s backups where the differential backup job had failed. Their backup routine was to perform a FULL backup for all databases once a week and take a differential at the same time for the remaining six days. We were alerted to the differential backup job failing midweek. The client’s SQL instance utilised Ola Hallengren’s maintenance solution and upon reviewing the job’s output logs, we could see that for all databases on the instance the differential backup could not be taken, returning the following error message:
“Cannot perform a differential backup for database "DATABASE_NAME", because a current database backup does not exist. Perform a full database backup by reissuing BACKUP DATABASE, omitting the WITH DIFFERENTIAL option. BACKUP DATABASE is terminating abnormally.”
We then reviewed the backup history for one of these databases, using the following script:
DECLARE @DbName sysname;
SET @DbName = 'DATABASE_NAME';
d.[name] AS [Database name],
CASE b.[type] WHEN 'L' THEN 'LOG' WHEN 'D' THEN 'FULL' WHEN 'I' THEN 'DIFF' ELSE b.[type] END AS [Type],
b.backup_finish_date AS [Backup finish date],
CAST(DATEDIFF(minute, b.backup_finish_date, GETDATE()) / 60.0 AS decimal(10,1)) AS [Backup age (hours)],
b.[user_name] AS [Username],
m.physical_device_name AS [Backup device],
FROM [master].sys.databases d
LEFT OUTER JOIN msdb.dbo.backupset b
ON d.name COLLATE SQL_Latin1_General_CP1_CI_AS = b.database_name COLLATE SQL_Latin1_General_CP1_CI_AS
LEFT OUTER JOIN msdb.dbo.backupmediafamily AS m
ON b.media_set_id = m.media_set_id
WHERE d.is_in_standby = 0 /* Not a log shipping target database */
AND d.state_desc = 'ONLINE'
AND d.source_database_id IS NULL /* Excludes database snapshots */
AND d.[state] <> 1
AND b.[type] <> 'L' /* exclude logs */
AND d.[name] = CASE WHEN @DbName = '' THEN DB_NAME() ELSE @DbName END
ORDER BY [Backup finish date] DESC;
Interestingly, amongst the backups that were taken by the Ola Hallengren solution job, we could see that there were additional FULL backups which were not copy-only being taken to a GUID location, as below:
This type of GUID indicates that the backup was taken via VSS (Volume Shadow copy Service). SQL Server supports virtualisation-aware backup solutions that use VSS to take volume snapshots. Examples of such tools include, but is not limited to:
- Hyper-V backup
Our suspicions were confirmed when reviewing the SQL error logs where we could see messages around the time that these backups were taken, such as
“I/O is frozen on database DATABASE_NAME. No user action is required. However, if I/O is not resumed promptly, you could cancel the backup.”
This, with the knowledge that this was an Azure VM, looked to be a VM snapshot.
Snapshots taken at the VM level can potentially interrupt the differential bitmap and backup chain.
In this case, a VM snapshot was taken mid-week on their Azure VM. This mid-week VM snapshot included the SQL instance and took a full backup of all databases. As these backups were not copy-only backups, this meant that any database recovery would require this backup to be utilised, thus breaking the differential backup chain set up as part of the existing Ola Hallengren backup solution.
To rectify this issue, we needed to implement a registry edit for the server in question. This is as follows:
This registry setting directs VSS to take full backups as copy-only as a part of its snapshots. This enables these VM snapshots to be taken without disrupting the backup chain.
This is documented by Microsoft at the following link:
After applying this registry edit, we could see that the backups were being taken by the SQL server backup jobs uninterrupted, and the now daily scheduled snapshots were taking copy-only backups each morning.
As you can see, these technologies do not always play nicely together, but they can very easily be made to. We have given you the (registry) key to unlock this problem and avoid it in your own implementation…
Have you encountered this issue? Got any questions? Let us know in the comments or contact us at firstname.lastname@example.org