SQL Server 2008 automatically compresses data stored in database. SQL server does Lossless data compression.
SQL server uses Dictionary based compression algorithm.
Row Level Data Compression
For Row level Data Compression SQL Server does not use explicitly any standard compression algorithm. It works on very simple algorithm. Say,
- You created a column of CHAR(50)
- Normally SQL server requires 50 bytes regardless of the actual byte needed by your Data.
- If you are storing "DEBUG MODE" in that column then you really need 10 bytes of storage.
So in Row Level Data Compression, rather than fixed format data storage, SQL Server stores data in a variable format.
We need to say at time of table creation that DATA_COMPRESSION = ROW
Estimate Row Level compression saving
In above query dbo is name of the schema and TempTable is name of the table and ROW parameter says to estimate Row level estimation.
Page Level Data Compression
In SQL server 2005 page level compression was done by minimizing data redundancy but in SQL Server 2008 it is performed by
- Reducing Data Redundancy
- Lossless Data Compression algorithm
- Column Prefix compression
With column prefix compression first SQL Server identifies repeated byte sequence in beginning of column in all rows on the page. If the same column has the same byte pattern in more than one row then SQL server stores the byte pattern once and replaces the other byte patterns with the pointer.
SQL Server creates Dictionary per page and stores repeating vales of the page in the dictionary and performs the compression on Dictionary.
Compression Saving is directly proportional to repeated byte patterns
Estimate Page Level compression saving
In the above query dbo is the name of the schema and TempTable1 is the name of the table and PAGE parameter says to estimate Row level estimation.