name="msvalidate.01" content="439BE949614BBF57E29AEDBC5087737D" /> learning sql server: Data and Backup Compression

Rabu, 03 Juni 2009

Data and Backup Compression

Data and Backup Compression
SQL Server Enterprise (and thus Developer) edition now allows you to effortlessly implement
data and backup compression.
Data Compression
Data compression can compress table data, nonclustered indexes, and indexed views. Of
course, because a clustered index contains the table data in its leaf level, clustered indexes
are compressed when a table is compressed. There are no changes required to your code
to access data that is in a compressed state because the compression and decompression is
done behind the scenes.
Row Compression
Row compression uses variable length storage to reduce row storage requirements. Not all
data types can be compressed. For example, an INT of 4 bytes can be reduced such that only
the bytes needed for storage are used. So if an INT column has a value of 12,345, then it will
Chapter 2 Performance 47
use only 2 bytes (because a 2 byte integer can hold values from –32,768 to 32,767). The topic
“Row Compression Implementation” in SQL Server Books Online describes all the data types
and if and how much they can be compressed.
Note When compression is implemented, 0 (numeric) and NULL values do not take up any
space.
Page Compression
Page compression is actually a combination of three different compression types: row, prefix,
and dictionary. First a row compression operation is done, in an attempt to reduce the data
type storage.
Note When using page compression on a nonclustered index, the non-leaf nodes of the index
are compressed using only row compression. The leaf nodes use all of the three compressions
(discussed in this section).
Next, a prefix compression operation is performed. Prefix compression looks for patterns in
the start of the data and repeated prefix values in the page header. For example, review the
pieces of data shown in Figure 2-2.
Column 1 Column 2
Data abcdefg abcefg abcd abcdefg xxyyzzz xxyyz
FIGURE 2-2 Data before compression
Note The prefix and dictionary values are actually stored in a compression information (CI)
structure in an area just after the page header.
When prefix compression is applied, different initial patterns within the same column are
moved to the CI structure, and the data is changed to reference the values in the CI structure
(pointers are represented here by the equal border color), as shown in Figure 2-3.
48 Introducing SQL Server 2008
abcd xxyyz
Data 4efg 3efg [empty] 4efg 5zz [empty]
Column 1 Column 2
CIStructure
FIGURE 2-3 Data after prefix compression
A value of [empty] in the data means that data is an exact match and just a pointer to the
prefix value in the CI structure is stored. The value 4efg replaces the prefix of abcd (four characters)
and points to the prefix value in the CI structure. The value 3efg only uses three of the
characters from the prefix in the CI structure.
Now that the prefix compression has been completed, a third compression operation occurs.
Dictionary compression looks for repeated patterns throughout the data and replaces them,
as shown in Figure 2-4.
abcd xxyyz
Data [empty]

Tidak ada komentar:

Posting Komentar