SQL Server Performance Tuning: Data Compression - Part One

Pankaj Kumar Choudhary
May 08, 2016
10.1k
0
0
Article

Database is an important part of any application or business. If performance of database server decreases, then it generate an adverse effect over all systems. So tuning of a database system is very crucial. System tuning is not a single step task, it is a series of tasks (steps) that helps you to increase the performance of your database. Now, I will try to post a series of articles related to this topic and try to explain, “How do we tune up the performance of SQL Server?” .

Today, we will learn about data compression in SQL Server.

What is Data Compression

By taking advantage of SQL Server's data compression feature, we can increase SQL Server's performance without changing any application code. Generally, the performance of SQL Server is decided by the disk I/O efficiency so we can increase the performance of SQL Server by improving the I/O performance.

Generally data compression reduce the space occupied by the data. Data compression can help improve performance of I/O intensive workloads because the data is stored in fewer pages and queries need to read fewer pages from disk. Data compression can be performed for a table, clustered index, non-clustered index.

Facts about Data Compression

New pages added for DML will not use the compression until the heap is rebuilt.
ROW and PAGE level compression can be Enabled and Disabled either offline or online.
If setting for heap is changed then all Nonclustered indexes are require to be rebuilt.
In compressed index leaf level pages can be compressed both Row and Page compression but Non-leaf level doesn’t perform page compression.
Large data types are stored separately and don't contain data compression.
When partition is split using the ALTER Partition then each partition inherits the compression of original partition.
When two partitions are merged, the resultant partition inherits the data compression attribute of the destination partition.
When a clustered index is dropped, the corresponding heap partitions retain their data compression setting unless the partitioning scheme is modified. To overcome this issue first drop the clustered index and after that modify table using the ALTER TABLE with REBUILD option that specify the compression option.
Data compression change the query plan because the data is compressed in different number pages and rows.

Types of Data Compression:

SQL Server supports two type of data compression: row compression and page compression.

Row Compression

Row compression is an inner level of compression that stores the fixed character strings by using variable-length format by not storing the blank characters. The following steps are performed in row level compression.

All numeric data types like int, float, decimal and money are converted into variable length data types. For example, 125 is stored in column and data type of the column is integer then we know that 4 bytes are used to store the integer value. But 125 can be stored in 1 byte because 1 byte can store values from 0 to 255. So, 125 can be store as tiny int. such that 3 bytes can be saved.
Char and Nchar data types are stored as variable length data types. For example, “SQL” is stored in char(20) type column. But after compression only 3 bytes will use. After the data compression no blank character is stored with this type of data.
The metadata of record is reduced.
NULL and 0 values are optimized and no space consumed.

Page Compression

Page compression is an advancedlevel of data compression. A page compression also implements the row level compression automatically. Page compression is categorized into two type prefix compression and dictionary compression.

Prefix Compression

In prefix compression for each page, for each column in the page a common value is retrieved from all rows and stored below the header in each column. Now in each row instead of common value areference of that value is stored.

Dictionary Compression

Dictionary compression is similar to prefix compression but common values are retrieved from all columns and stored in second row after the header. Dictionary compression looks for exact value matches across all the columns and rows on each page.

We can perform row and page level compression for following database objects.

Table stored in heap.
Whole table stored as clustered index.
Indexed view.
Non Clustered index.
Partitioned indexes and tables.

Note: We can perform data compression either at the time of creation like CREATE TABLE, CREATE INDEX or after the creation using ALTER command with REBUILD option like ALTER TABLE …. REBUILD WITH.

Let us take some example and check that how data compression reduces the size of data.

Firstly, create a table.

CREATE TABLE Compress_Table
(
Id int,
Name [char](50)
);

Now insert some data into table.

INSERT INTO dbo.Compress_Table
(
Id,
Name
)
VALUES
(
125,
'Demo Data '
)
GO 10000

Now we check the space used by the data of table.

table

Now first we perform the ROW level compression and check how much data is compressed.

code

We can see that reserved space of table is reduced from 712KB to 344KB and unused space is reduced from 48KB to 40KB.

Now we perform the page level compression and check the difference between row level and page level compression.

code

Above image shows that in page level data compression, size of data is less than compare to row level data compression.

Limitations of Data Compression

Data compression is not available for every edition of SQL Server. Only available for Enterprise edition.
Data compression not allow for data tables.
Data compression can’t change the maximum row size and table can’t be compressed if maximum row size plus the compression overhead exceed the row size of 8060byte.
If specified partition exceed the range then error will occur.
Non clustered index doesn’t inherit the data compression until we don’t define compression property explicitly.
Data compression in database with ROW/PAGE compression can’t be restored ,attached or used in other SQL Server edition.
Data compression is incompatible with sparse columns so table can’t be compressed.
When table is compressed and data is inserted into table then CPU usage increased compare to uncompressed table.

Conclusion

Data Compression is a very efficient technique to reduce the size of data, reduced data requires less I/O processes but when data is inserted in compressed table then it requires more efforts compared to uncompressed tables So before bulk insertion is uncompressed the data performs insertion operation after that. After insertion, the operation again compresses the data.

Thanks for reading the article, if you have any query or doubt, then don’t hesitate and ask your query in the comment section.