Introduction
In this article, I would like to provide an in-depth look at change detection in SQL Server. SQL Server 2012 provides two features that can be used for identification of modified records. These features track DML changes (insert, update and delete operations) in a database.
Data change detection can be used for many purposes, such as auditing, handling concurrency issues and tracking content changes/manipulation. There is no need to add columns, add triggers, or create a side table in which to track deleted rows or to store change tracking information if columns cannot be added to the user tables. You do not have to develop a custom solution. In this article I will discuss two mechanisms for data capture and tracking data changes. Another two will be discussed in my next article. Two mechanisms for data capture and tracking data changes are:
ROWVERSION
ROWVERSION is just an incrementing number and does not store date/time. Each database has a counter that increments for every insert, update or delete operation performed on a table. A table can have only one ROWVERSION column. This is a mechanism provided by SQL Server for version-stamping of rows. Let's see an example of how ROWVERSION works.
Create a table with one of the column data type as ROWVERSION. Use the following code snippet to make the table:
--Create a test table
CREATE TABLE RowVersionExample
(
Id INT IDENTITY (1,1),
EmployeeName VARCHAR(50),
City VARCHAR(50),
VersionData ROWVERSION
)
GO
--Insert some test data
INSERT INTO RowVersionExample (EmployeeName,City)
SELECT 'John','Boston' UNION ALL
SELECT 'Tim','London' UNION ALL
SELECT 'Robert','New York' UNION ALL
SELECT 'Jim','Orlando'
GO
Now use SELECT statement to have a look at the ROWVERSION values.
Let us update one of the records, and then have a look at the value of the VersionData column.
--Modify some data,
UPDATE RowVersionExample
SET City = 'Paris'
WHERE EmployeeName = 'Robert';
GO
--Check RowVersions
SELECT * FROM RowVersionExample
GO
OUTPUT:
You can see in the above image, the VersionData column has changed for the updated record, since ROWVERSION detects changes. But what happens if you run the preceding query again.
--FALSE UPDATE SCENARIO:
--Update the record value with itself and
--then check the RowVersion
UPDATE RowVersionExample
SET City = 'Paris'
WHERE EmployeeName = 'Robert';
GO
SELECT * FROM RowVersionExample
GO
OUTPUT:
False updates also change the ROWVERSION. So it fails to identify if the data actually changed. So its not a good choice for keys. This problem is removed by CHECKSUM. Let's discuss, how CHECKSUM works.
CHECKSUM
In the above example, one of the limitations with ROWVERSION is the detection of "real" data changes. CHECKSUM is a builtin function that returns the checksum value over the row of a table. It is just a hash value generated over given arguments. Use the following code snippet to make the table:
--Create a test table
CREATE TABLE CHECKSUMExample
(
Id INT IDENTITY (1,1),
EmployeeName VARCHAR(50),
City VARCHAR(50),
CHECKSUMVALUE AS CHECKSUM (EmployeeName,City)
)
GO
--Insert some test data
INSERT INTO CHECKSUMExample (EmployeeName,City)
SELECT 'John','Boston' UNION ALL
SELECT 'Tim','London' UNION ALL
SELECT 'Robert','New York' UNION ALL
SELECT 'Jim','Orlando'
GO
Now use a SELECT statement to have a look at the CHECKSUM values.
Let us update one of the records, and then have a look at the value of the CHECKSUMVALUE column. Use the following code snippet:
UPDATE CHECKSUMExample
SET City = 'Paris'
WHERE EmployeeName = 'Robert';
GO
SELECT * FROM CHECKSUMExample
GO
OUTPUT:
You can see in the above image, the CHECKSUMVALUE column has changed for the updated record, since CHECKSUM detects data modifications. But what happens if you run the preceding query again?
--Update the record value with itself
--Check CHECKSUMVALUE column
UPDATE CHECKSUMExample
SET City = 'Paris'
WHERE EmployeeName = 'Robert';
GO
SELECT * FROM CHECKSUMExample
GO
OUTPUT:
CHECKSUM is actually able to distinguish between an actual data change and a false data change. So this method can be used to track data changes.