Capturing Changes in SQL Server Tables

For capturing the changes of DML statements we must be familiar with the Merge statement and Output clause in SQL Server. I am giving a general overview of these statements.

The Merge statement is a new feature of SQL Server 2008. It allows modification of data in one table based upon the data in the other table by applying a join condition between the tables. By using the merge statement it is possible to do inserts, deletes and updates by using a single statement that otherwise might be done separately.

Merge statement Syntax

MERGE target

USING source

ON join

When Matched

Update | Delete

When Not Matched [By Target]

Insert

When Not Matched By Source

Update | Delete

Now to show a demo, we will create a table named employee as in the following:

  1. CREATE TABLE [dbo].[Employee]  
  2. (  
  3.     [Id] INT NOT NULL,  
  4.     [FirstName] VARCHAR(50) NOT NULL ,  
  5.     [LastName] VARCHAR(50) NOT NULL ,  
  6.     [Email] VARCHAR(100) NOT NULL   

Since we need another table for comparing the data we will create a temporary table for that.

  1. Create table #EmployeeTemp  
  2. (  
  3.     Id int,  
  4.     FirstName varchar(50),  
  5.     LastName varchar(50),  
  6.     Email varchar(80)  

For demonstration purposes I am writing insert queries for inserting data into the temporary table that will be compared with the Employee table. If matches are found then records will be updated in the Employee table, otherwise the records will be inserted into the Employee table. In real scenarios we can use a table value parameter that can be populated by the client application and send that parameter in a Stored Procedure and that table value parameter can be used instead of a temporary table.

  1. insert into #EmployeeTemp values(1,'Test1','M','[email protected]')  
  2. insert into #EmployeeTemp values(2,'Test2','P',[email protected]')  
  3. insert into #EmployeeTemp values(3,'Test3','R',[email protected]'

The following  is a samle of a Merge statement for the employee table.

  1. Merge Employee emp  
  2.  using #EmployeeTemp temp  
  3.  on emp.Id=temp.Id  
  4.  When Matched And (emp.FirstName!=temp.FirstName or emp.LastName!=temp.LastName or emp.Email!=temp.Email)Then  
  5.     Update set emp.FirstName=temp.FirstName,emp.LastName=temp.LastName,emp.Email=temp.Email  
  6. When Not Matched Then  
  7.     insert values(temp.Id,temp.FirstName,temp.LastName,temp.Email)  
  8.       
  9.     Output $action,inserted.*,deleted.*  
  10.     ; 

In the preceding statement we are using Employee as the target table and our temporary table as source and we are applying an inner join condition on them. Then we will check if a match has occurred. If the data is different then the Employee table will be updated. If a match does not occur then the target table must be inserted with the new record.

Note: When we use When Not Matched in a merge statement it means that it is not matched by the target table. If we want to check the preceding condition with the source table then we will write When Not Matched by Source.

Output Clause

The Output clause in SQL Server is used to track the changes done by DML statements. The Output clause uses inserted and deleted pseudo tables just like triggers and returns the affected rows by insert, update and delete statements. The Output clause can be used with insert, delete, update and merge statements in SQL Server.

$action returns the type of action that has been performed by a DML statement. It is only available in the merge statement.

Capturing Data using Output into

The Output clause is good for tracking the data since it displays all the records that have been changed along with the type of change in them but it does not persist the changes, it only displays that. In order to save all the changes we can use an output into clause and create the table where we want to store the changes. For example we have created an EmployeeHistory table for storing all the changes that have been made on our table Employee.

  1. CREATE TABLE [dbo].[EmployeeHistory]  
  2. (  
  3.         [Optionvarchar(50),  
  4.         [Id] INT,  
  5.         [FirstName] VARCHAR(50) NULL ,  
  6.         [LastName] VARCHAR(50) NULL ,  
  7.         [Email] VARCHAR(100) NULL,   
  8.            [OldID] INT NULL,   
  9.         [OldFirstName] VARCHAR(50) NULL,   
  10.         [OldLastName] VARCHAR(50) NULL,   
  11.         [OldEmail] VARCHAR(100) NULL   

From table design we can observe that we are storing the record before and after an update through which we can easily see what value was modified. The Merge statement for storing history is:

  1. Merge Employee emp  
  2.  using #EmployeeTemp temp  
  3.  on emp.Id=temp.Id  
  4.  When Matched And (emp.FirstName!=temp.FirstName or emp.LastName!=temp.LastName or emp.Email!=temp.Email)Then  
  5.     Update set emp.FirstName=temp.FirstName,emp.LastName=temp.LastName,emp.Email=temp.Email  
  6. When Not Matched Then  
  7.     insert values(temp.Id,temp.FirstName,temp.LastName,temp.Email)  
  8.       
  9.     Output $action,inserted.*,deleted.*  
  10.     into EmployeeHistory  
  11.     ; 

The preceding approach has some problems, such as if in the employee table their occurs more insertions and updates then we will have a table (EmployeeHistory) that stores more inserted rows (new records) than updates and the EmployeeHistory table becomes a kind of replica for the Employee table that may not be our intentions. And the second problem is that we need to create another table that stores all the changes with many duplicate columns. So a better approach would be if we store only updates but not insertions, for that we need to filter out in an output clause for the changes that occurred.

For filtering the changes made by DML statements I will update the employee table by creating a new column ArchivedAt(of datetime) to track all the changes that have been made to a record. Whenever a change has occurred in any record we will update that record and also store the old record with ArchivedAt date. It will duplicate the record but it prevents the creation of a new table. So in my Employee table I will have duplicate data with Archived date that indicate that this is an old record and changes have been made on this date. For doing that we will use an insert over DML statement that uses the changes clause on which we can apply a where condition.

  1. insert into Employee  
  2. Select Id,FirstName,LastNAme,Email,SysDateTime()  
  3. From  
  4. (Merge Employee emp  
  5.  using #EmployeeTemp temp  
  6.  on emp.Id=temp.Id  
  7.  When Matched And (emp.FirstName!=temp.FirstName or emp.LastName!=temp.LastName or emp.Email!=temp.Email)Then  
  8.     Update set emp.FirstName=temp.FirstName,emp.LastName=temp.LastName,emp.Email=temp.Email  
  9. When Not Matched Then  
  10.     insert(Id,FirstName,LastName,Email) values(temp.Id,temp.FirstName,temp.LastName,temp.Email)  
  11.       
  12.     Output $action,temp.Id,Deleted.FirstName,Deleted.LastName,Deleted.Email  
  13.  )Changes(MergeAction,ID,FirstName,LastName,Email)  
  14. where MergeAction='UPDATE'

By the above method we can maintain all the changes made to the Employee table.

Up Next
    Ebook Download
    View all
    Learn
    View all