+44 (0)20 3051 3595 | info@coeo.com | Client portal login

Return of the (DBCC) Clones

Andy Jones

SQL Server 2014 SP 2 introduced an interesting feature to provide a data free copy of a production database.

A little late to the party, I was using SQL Server 2016 CU1 before starting my investigation. Nevertheless, I was excited to see what DBCC CLONEDATABASE had to offer. Typing out this fresh syntax and clicking F5 gave:

Msg 2526, Level 16, State 3, Line 1
Incorrect DBCC statement. Check the documentation for the correct DBCC syntax and options.

 

#sqlhelp

 

The unwelcome red text informed me of a typo. After double and triple checking, I did what any good DBA does and took to Twitter:

 

clone-database-twitter

 

Ever #helpful, Twitter soon notified me that the command had been removed from the product in SQL Server 2016 RTM.

 

SP1

 

Fast forward 3 short months and the release of SP1 (https://www.microsoft.com/en-us/download/details.aspx?id=54276) for SQL Server 2016. In addition to many appealing announcements, DBCC CLONEDATABASE (https://support.microsoft.com/en-gb/kb/3177838) made a welcome return.

DBCC CLONEDATABASE will create a copy of a database without the data but containing data distribution statistics.

 

Create Statistics

 

To demonstrate, I will create 1,000,001 customers, 1 of whom is inactive, and a stored procedure to select them.

 

USE CloneDemo;
DROP TABLE IF EXISTS dbo.Customer;
GO
CREATE TABLE dbo.Customer(
CustomerID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED,
Active CHAR(1) NOT NULL,
Created DATE NOT NULL);
GO
INSERT INTO dbo.Customer(Active,Created)
SELECT TOP 1000000 'Y','20160101'
FROM DBA.dbo.Numbers n1 CROSS JOIN DBA.dbo.Numbers n2;
GO
INSERT INTO dbo.Customer(Active,Created) VALUES ('N','20160101');
GO
CREATE NONCLUSTERED INDEX ix_Customer_Active ON dbo.Customer (Active);
GO
DROP PROCEDURE IF EXISTS dbo.CustomerSelect;
GO
CREATE PROCEDURE dbo.CustomerSelect (@Active CHAR(1)) AS
SELECT CustomerID,
Active,
Created
FROM dbo.Customer pp
WHERE pp.Active = @Active
GO
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
EXECUTE dbo.CustomerSelect @Active = 'Y';
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

 

Clone

 

A copy of CloneDemo is created via

DBCC CLONEDATABASE ('CloneDemo','CloneDemo_Copy');

Resulting in a read-only database named CloneDemo_Copy and successful output of:

Database cloning for 'CloneDemo' has started with target as 'CloneDemo_Copy'.
Database cloning for 'CloneDemo' has finished. Cloned database is 'CloneDemo_Copy'.
Database 'CloneDemo_Copy' is a cloned database. A cloned database should be used for diagnostic purposes only and is not supported for use in a production environment.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

 

clone-database-list

 

Note the warning "not supported for use in a production environment". It is a good idea to backup and restore the cloned database to a non-production server for further analysis.

 

Statistics Retained

 

Repeating the customer select query

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
EXECUTE dbo.CustomerSelect @Active = 'Y';
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
Shows minimal elapsed time and zero logical reads.
SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.
Table 'Customer'. Scan count 1, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
It does however, show an estimated number of active customers of 1,000,000
clone-database-query

 

DBCC SHOW_STATISTICS

USE CloneDemo_Copy;

DBCC SHOW_STATISTICS('dbo.Customer','ix_Customer_Active');

Viewing the statistics in the cloned database reveals the production data distribution even though no data is present

 

dbcc-show-statistics-in-clone

 

No data means some data

 

Although no user data exists in the cloned database, good security practice still needs to be implemented.

Table and column names could be vital information for a would-be hacker to attempt a data breach elsewhere.

The bucket values within statistics contain the actual data values. In our example, this is simply 'N' or 'Y' to indicate customer status. This could just as easily be our customer names or email addresses so data governance for cloned databases should be as strict as elsewhere.

 

Use cases

From https://support.microsoft.com/en-gb/kb/3177838,  "DBCC CLONEDATABASE should be used to create a schema and statistics only copy of a production database in order to investigate query performance issues."

You should proceed with caution before using this method for others means. Objects are copied from the source database with some limitations as listed at the same link. An example is "Encrypted objects get copied but aren’t supported in clone in this release of SQL Server."

If production statistics are not required, my preference for creating a database with no data would be a "get" from a current source code control system.

Nevertheless, some reasons why cloning a database might prove useful are:

  1. Test query execution plans in non-production environments using representative statistics.
  2. To provide developers with a copy of the live schema without sensitive data. This could be combined with a third-party tool to generate realistic test data after making the database read-write.
  3. Create multiple test and development environments where disk space is at a premium.
  4. Run Microsoft Data Migration Assistant (https://www.microsoft.com/en-us/download/details.aspx?id=53595) to test your schema for compatibility issues prior to migration.
  5. Comparison of the live schema to source code control.
  6. Testing schema modification scripts after making the database read-write.

 

Summary

 

DBCC CLONEDATABASE is a useful addition to the DBA toolset. It can quickly create a production schema copy without requiring large amounts of I/O, time and disk space.

The database schema can be copied without exposing the most valuable of assets, the data.

 

 

Subscribe to Email Updates

Back to top