Data Compression in SQL Server 2008

Dhananjay Kumar

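SQL Server 2008 can compress the data stored in a database. Compression is not applied automatically; it has to be enabled for a table or index. The compression SQL Server performs is lossless.

For page-level compression, SQL Server uses a dictionary-based compression algorithm.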

Row Level Data Compression

For row-level data compression, SQL Server does not use any standard compression algorithm. It relies on a very simple technique. For example:

Suppose you create a column of type CHAR(50).
Normally SQL Server reserves 50 bytes for it, regardless of how many bytes your data actually needs.
If you store "DEBUG MODE" in that column, you really need only 10 bytes of storage.

So with row-level data compression, SQL Server stores data in a variable-length format rather than a fixed-length format.
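The gap between the declared size and the size actually needed can be seen with DATALENGTH. This is only an illustration: DATALENGTH reports the byte length of a value, not its on-disk footprint, and the variable name @fixed is made up for the example.

```sql
-- CHAR(50) pads the value with trailing spaces up to the declared length.
DECLARE @fixed CHAR(50) = 'DEBUG MODE';

SELECT
    DATALENGTH(@fixed)        AS PaddedBytes,  -- 50: the full declared length
    DATALENGTH(RTRIM(@fixed)) AS NeededBytes;  -- 10: the characters actually stored
```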

We specify DATA_COMPRESSION = ROW when creating the table.
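As a minimal sketch, a table with row-level compression could be created as follows. The table and column names here are assumptions chosen for illustration.

```sql
-- Create a table whose rows are stored with row-level compression.
CREATE TABLE dbo.TempTable
(
    Id   INT      NOT NULL,
    Mode CHAR(50) NOT NULL   -- fixed-length column that benefits from ROW compression
)
WITH (DATA_COMPRESSION = ROW);
```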

Estimate Row Level Compression Savings
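The estimate can be obtained with the system stored procedure sp_estimate_data_compression_savings. The call below is a sketch, assuming a table named dbo.TempTable already exists; the two NULL arguments ask for all indexes and all partitions.

```sql
-- Estimate how much space dbo.TempTable would use with ROW compression.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'TempTable',
    @index_id         = NULL,   -- NULL: all indexes
    @partition_number = NULL,   -- NULL: all partitions
    @data_compression = 'ROW';
```

The result set reports the size with the current compression setting next to the size with the requested setting, so the two can be compared directly.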

In the above query, dbo is the name of the schema, TempTable is the name of the table, and the ROW parameter requests an estimate for row-level compression.

Page Level Data Compression

SQL Server 2005 did not offer page-level compression. SQL Server 2008 introduces it, and it is performed by

Reducing Data Redundancy
Lossless Data Compression algorithm
Column Prefix compression

With column prefix compression, SQL Server first identifies byte sequences that repeat at the beginning of a column's values across all rows on the page. If the same column starts with the same byte pattern in more than one row, SQL Server stores that byte pattern once and replaces the other occurrences with a pointer to it.

SQL Server also creates a dictionary per page, stores the values that repeat on the page in that dictionary, and replaces the repeated values with references to the dictionary entries.

The compression saving grows with the number of repeated byte patterns on the page.
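Page-level compression can also be applied to a table that already exists by rebuilding it. The following is a sketch, assuming a table named dbo.TempTable1 exists; the second statement only checks which setting is in effect.

```sql
-- Rebuild an existing table with page-level compression.
ALTER TABLE dbo.TempTable1
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Check which compression setting each partition of the table now uses.
SELECT partition_number, data_compression_desc
FROM sys.partitions
WHERE object_id = OBJECT_ID('dbo.TempTable1');
```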

Estimate Page Level Compression Savings
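The same system stored procedure, sp_estimate_data_compression_savings, accepts PAGE as the requested setting. The call below is a sketch, assuming a table named dbo.TempTable1 already exists.

```sql
-- Estimate how much space dbo.TempTable1 would use with PAGE compression.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'TempTable1',
    @index_id         = NULL,   -- NULL: all indexes
    @partition_number = NULL,   -- NULL: all partitions
    @data_compression = 'PAGE';
```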

In the above query, dbo is the name of the schema, TempTable1 is the name of the table, and the PAGE parameter requests an estimate for page-level compression.