DC CH1
Data Compression
Chapter 1 Introduction
Anokhi Shah
4/15/12
Data Compression
Data compression is the art or science of representing information in a compact form.
The compact form is created by identifying and using structure that exists in the data.
Structure could be any pattern, redundancy, or model.
Example
(Every
required
to
represent
Examples
ZIP
Code
Compression helps reduce the consumption of expensive resources such as hard disk space and bandwidth.
Compression is a transformation of the data.
Compressed data can only be understood if the decoding method is known by the receiver.
Compressed data must be decompressed to be viewed or heard.
generates By
Reconstruc
Lossless Compression
No information is lost.
Lossless compression is reversible, so the original data can be reconstructed exactly.
It is generally used for applications that cannot tolerate any difference between the original and reconstructed data.
Lossless compression algorithms usually exploit statistical redundancy.
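In such applications it is very important that the reconstruction is identical to the original.
For example, in a bank application, consider the sentences "Do not send money" and "Do now send money": a one-letter difference reverses the meaning.
Other examples: radiological images and satellite images.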
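The reversibility of lossless compression can be checked with a short round trip. This is a minimal sketch using Python's standard zlib module (an implementation of DEFLATE, the method commonly used inside ZIP files). The repeated sentence is an illustrative input chosen here, not something taken from the slides.

```python
import zlib

# Repetitive text, so there is plenty of statistical redundancy to exploit.
original = b"Do not send money. " * 50

compressed = zlib.compress(original)
reconstructed = zlib.decompress(compressed)

# Lossless: the reconstruction is byte-for-byte identical to the original.
assert reconstructed == original
print(len(original), "bytes before,", len(compressed), "bytes after")
```

If even one byte differed after decompression, the equality check would fail, which is exactly the guarantee a bank application needs.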
Lossy Compression
Some information is lost.
Compressed data cannot be recovered or reconstructed exactly.
It obtains a much higher compression ratio than is possible with lossless compression.
It is generally used for applications that can tolerate a difference between the original and reconstructed data.
Examples where lossy compression is acceptable:
Image / Video: a viewer of a video scene might not notice if some of its finest details are removed or not represented perfectly.
Sound / Speech
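To make "cannot be reconstructed exactly" concrete, here is a small sketch of uniform quantization, one simple lossy technique that the slides do not name. The sample values and the step size of 16 are assumptions chosen only for illustration.

```python
# Hypothetical 8-bit samples (illustrative values, not from the slides).
samples = [12, 13, 200, 201, 97, 98, 54, 55]

STEP = 16  # assumed quantizer step size; a larger step discards more detail

# Encoder: keep only the index of the interval each sample falls in.
# With STEP = 16, indices range over 0..15, so 4 bits suffice instead of 8.
indices = [s // STEP for s in samples]

# Decoder: map each index back to the midpoint of its interval.
reconstructed = [i * STEP + STEP // 2 for i in indices]

# Per-sample difference between reconstruction and original.
errors = [r - s for s, r in zip(samples, reconstructed)]
print(reconstructed)
print(errors)
```

The reconstruction is close to the original but not identical: fine detail (the difference between 12 and 13, for instance) is gone, in exchange for halving the number of bits per sample.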
A compression algorithm can be evaluated by:
Relative complexity of the algorithm
Memory required to implement the algorithm
How fast the algorithm performs on a single machine
Amount of compression
How closely the reconstruction resembles the original
Compression Ratio
The ratio of the number of bits required to represent the data before compression to the number of bits required to represent the data after compression.
Suppose an image requires 65536 bytes for storage. After compression it requires 16384 bytes. The compression ratio is 4:1.
Compression ratio can also be expressed as the reduction in the amount of data required, as a percentage of the size of the original data. For this image the reduction is 75%.
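The arithmetic for the image example can be written out as a short sketch. The function names are hypothetical helpers introduced here for illustration.

```python
def compression_ratio(size_before: int, size_after: int) -> float:
    """Size before compression divided by size after compression."""
    return size_before / size_after

def percent_reduction(size_before: int, size_after: int) -> float:
    """Reduction in data size as a percentage of the original size."""
    return 100.0 * (size_before - size_after) / size_before

original_bytes = 65536    # image size before compression (from the example)
compressed_bytes = 16384  # image size after compression (from the example)

ratio = compression_ratio(original_bytes, compressed_bytes)
reduction = percent_reduction(original_bytes, compressed_bytes)
print(f"{ratio:.0f}:1, {reduction:.0f}% reduction")
```

Both sizes must be in the same unit (bytes here); the ratio is the same whether it is computed on bytes or on bits.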
Compression Rate
The average number of bits required to represent a single sample.
If we assume 8 bits per byte (pixel) and the average number of bits per pixel in the compressed representation is 2,
then we would say that the rate is 2 bits per pixel.
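As a rough check, the rate can be computed from the image figures used in the compression ratio example (65536 bytes before, 16384 bytes after). The sketch assumes one pixel per byte in the original image, and the function name is a hypothetical helper.

```python
def rate_bits_per_sample(compressed_bytes: int, num_samples: int,
                         bits_per_byte: int = 8) -> float:
    """Average number of bits used per sample in the compressed representation."""
    return compressed_bytes * bits_per_byte / num_samples

num_pixels = 65536        # assumption: one pixel per byte in the original image
compressed_bytes = 16384  # compressed size from the compression ratio example

rate = rate_bits_per_sample(compressed_bytes, num_pixels)
print(rate, "bits per pixel")
```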
The difference between the original and the reconstruction is called the distortion.
But when you talk about the difference between the reconstruction and the original, it is called fidelity or quality,
i.e. X - Y = Distortion, Y - X = Fidelity or Quality (X: original, Y: reconstruction).
When the fidelity or quality of a reconstruction is high, the difference between the reconstruction and the original is small.
The choice of a compression scheme depends on various factors, such as the characteristics of the data.
A compression technique that works well on text may not work well on images, because they are different kinds of data.
The development of a data compression algorithm has 2 phases:
Modeling
Coding
In the modeling phase, we try to extract information about any redundancy that exists in the data and describe the redundancy in the form of a model.
In the coding phase, a description of the model and a description of how the data differs from the model are encoded, generally using a binary alphabet.
The difference between the data and the model is referred to as the residual.
Example 1.2.1
On page no. 7: we are required to store the binary representation of the numbers.
The maximum number is 21, which requires 5 bits because the binary representation of 21 is 10101.
So the rate is 5 bits per sample.
Plotting the data, as seen in Figure 1.2 (page no. 7), shows that the data fall approximately on a straight line with the equation X̂n = n + 8.
X̂n is the model and En = Xn - X̂n is the residual.
The residual sequence consists of only 3 distinct values (-1, 0, 1), each of which can be represented in binary using 2 bits.

n               1   2   3   4   5   6
Xn              9  11  11  11  14  13
X̂n = n + 8      9  10  11  12  13  14
En = Xn - X̂n    0   1   0  -1   1  -1

Residual  Code
-1        00
 0        01
 1        10

We need to make use of the structure that exists in the data to predict the value of each element in the sequence and then encode the residual.
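The predict-then-encode idea of Example 1.2.1 can be sketched in a few lines. This uses only the six samples visible in the table, the model n + 8, and the 2-bit residual code given above; the function names are hypothetical, and the decoder is assumed to already know the model.

```python
# First six samples of the sequence, as listed in the table (n = 1..6).
x = [9, 11, 11, 11, 14, 13]

# 2-bit code for the three residual values, as given in the text.
CODE = {-1: "00", 0: "01", 1: "10"}
DECODE = {bits: value for value, bits in CODE.items()}

def model(n: int) -> int:
    """Straight-line model: predicted value of the n-th sample (n starts at 1)."""
    return n + 8

def encode(samples):
    """Encode each sample as the 2-bit code of its residual from the model."""
    residuals = [xn - model(n) for n, xn in enumerate(samples, start=1)]
    return "".join(CODE[e] for e in residuals)

def decode(bitstring):
    """Rebuild the samples by adding each decoded residual to the prediction."""
    samples = []
    for n, i in enumerate(range(0, len(bitstring), 2), start=1):
        samples.append(model(n) + DECODE[bitstring[i:i + 2]])
    return samples

bits = encode(x)
assert decode(bits) == x  # the scheme is lossless
print(bits, "->", len(bits), "bits instead of", 5 * len(x))
```

Six samples cost 12 bits with the residual code, against 30 bits at 5 bits per sample. In a complete scheme a description of the model itself would also have to be stored or sent, as noted in the coding phase.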