Unit 4 Compression 11 Oct 22

Image compression addresses the problem of reducing the amount of data required to represent a digital image. It aims to yield a compact representation of an image while reducing storage and transmission requirements. There are three main types of redundancy in images - coding, interpixel, and psychovisual - that compression techniques exploit. Lossy compression can achieve higher compression ratios by removing psychovisual redundancy, at the cost of some loss in visual quality, while lossless compression preserves exact image data by removing only coding and interpixel redundancies. Common algorithms and standards are used to compress images and video for efficient storage and transmission.


Image Compression:

 Image compression model,


 Type of redundancy,
 Compression algorithms and their types,
 Lossless compression algorithms,
 Lossy compression algorithms,
 Image and video compression standards.
IMAGE COMPRESSION
Data compression refers to the process of reducing the amount of data required to
represent a given quantity of information.
Sometimes the given data contains portions that carry no relevant information or that
repeat known information. Such data is said to contain data redundancy.

Image compression addresses the problem of reducing the amount of data required to represent a
digital image. It is a process intended to yield a compact representation of an image, thereby
reducing the image storage/transmission requirements.

Need of Compression

Storage: The storage requirements of imaging applications are very high. The goal of data
compression is to reduce the amount of memory by reducing the number of bits, while at the
same time retaining the data needed to reconstruct the image. The reduction of data reduces
the memory requirement and hence the money spent on storage.

Transmission: The transmission time of the image is directly proportional to the size of the
image. Image compression aims to reduce the transmission time by reducing the size of the
image. The reduction of data leads to easier and faster transportation of data.

Faster computation: Reduction of data often simplifies the algorithm design and facilitates faster
execution of the algorithms.

Difference between lossless and lossy compression

Lossless compression:
• This is a reversible process and no information is lost.
• The compression ratio is usually low.
• It is used for data that humans handle directly, such as text. Compression is independent of the psychovisual system.
• It is required in domains where reliability is crucial, such as executable files and medical data.

Lossy compression:
• This is a non-reversible process and information is lost.
• The compression ratio is very high.
• It is useful for diffused data that humans cannot understand or interpret directly. Compression depends on psychovisual characteristics.
• It is useful in domains where some loss of data is acceptable.

Image Compression Model

In the first stage, the mapper transforms the input image into a format designed to reduce interpixel
redundancies. In the second stage, the quantizer block reduces the accuracy of the mapper's output in
accordance with a predefined criterion. In the third and final stage, a symbol encoder creates a code
for the quantizer output and maps the output in accordance with that code. The decoder blocks perform,
in reverse order, the inverse operations of the encoder's symbol encoder and mapper blocks. As
quantization is irreversible, an inverse quantizer is not included.

The Source Encoder reduces/eliminates any coding, interpixel or psychovisual

redundancies. The Source Encoder contains 3 processes:
• Mapper: Transforms the image into an array of coefficients, reducing interpixel
redundancies. This is a reversible process which is not lossy.
• Quantizer: This process reduces the accuracy and hence psychovisual redundancies of a
given image. This process is irreversible and therefore lossy.
• Symbol Encoder: This is the source encoding process where fixed or variable-length
code is used to represent mapped and quantized data sets. This is a reversible process.
Removes coding redundancy by assigning the shortest codes to the most frequently
occurring output values.

The Source Decoder contains two components.


• Symbol Decoder: This is the inverse of the symbol encoder; the reverse of the variable-
length coding is applied.
• Inverse Mapper: This reverses the mapping operation, restoring the interpixel redundancy removed at the encoder.
The only lossy element is the Quantizer, which removes the psychovisual redundancies, causing
irreversible loss. Every lossy compression method contains the quantizer module.
If error-free compression is desired, the quantizer module is removed.

Compression Measures

Compression Ratio: This is defined as

Compression ratio = (number of bits in the original image) / (number of bits in the compressed image) = N1 / N2

This is expressed explicitly as N1:N2. A compression ratio of 4:1, for example, means that 4 units of data
in the original image are represented by 1 unit of data in the compressed image.

Saving percentage: This is defined as

Saving percentage = 1 - (N2 / N1), usually expressed as a percentage.

Bit rate, in the transmission sense, describes the rate at which bits are transferred from the sender to the
receiver and indicates the efficiency of the compression algorithm by specifying how much data is transmitted
in a given amount of time; it is often given as bits per second (bps), kilobits per second (Kbps),
or megabits per second (Mbps). For a stored image, the bit rate specifies the average number of bits per
stored pixel and is given as

Bit rate = N2 / N (bits per pixel)

where N2 is the number of bits in the compressed image and N is the number of pixels in the image.
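As an illustration of these definitions (not part of the notes above), a minimal Python sketch that computes the three measures from the bit counts N1 and N2 and the number of pixels could look as follows; the function and variable names are arbitrary.

def compression_measures(n1_bits, n2_bits, num_pixels):
    """Compute compression ratio, saving percentage and bit rate.

    n1_bits: number of bits in the original image (N1)
    n2_bits: number of bits in the compressed image (N2)
    num_pixels: number of pixels in the image (N)
    """
    compression_ratio = n1_bits / n2_bits            # N1 / N2
    saving_percentage = (1 - n2_bits / n1_bits) * 100
    bit_rate = n2_bits / num_pixels                   # average bits per stored pixel
    return compression_ratio, saving_percentage, bit_rate

# Example: a 256x256, 8-bit image (524288 bits) compressed to 131072 bits
print(compression_measures(524288, 131072, 256 * 256))   # (4.0, 75.0, 2.0)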

1.1 Redundancy

Image compression addresses the problem of reducing the amount of data required to represent a
digital image. It is a process intended to yield a compact representation of an image, thereby
reducing the image storage/transmission requirements. Compression is achieved by the removal
of one or more of the following basic data redundancies:

1. Coding Redundancy
2. Interpixel Redundancy
3. Psychovisual Redundancy
4. Chromatic Redundancy

Coding redundancy is present when less than optimal code words are used.
Interpixel redundancy results from correlations between the pixels of an image.
Psychovisual redundancy is due to data that is ignored by the human visual system (i.e. visually
non-essential information).
Chromatic redundancy refers to the presence of unnecessary colours in an image.

Image compression techniques reduce the number of bits required to represent an image by
taking advantage of these redundancies. An inverse process called decompression (decoding) is
applied to the compressed data to get the reconstructed image. The objective of compression is to
reduce the number of bits as much as possible, while keeping the resolution and the visual
quality of the reconstructed image as close to the original image as possible.

Coding Redundancy:
• Coding redundancy is associated with the representation of information.
• The information is represented in the form of codes.
• If the gray levels of an image are coded in a way that uses more code symbols than
absolutely necessary to represent each gray level then the resulting image is said to
contain coding redundancy.

Inter-pixel Spatial Redundancy:


• Interpixel redundancy is due to the correlation between the neighboring pixels in an
image.
• That means neighboring pixels are not statistically independent. The gray levels are not
equally probable.
• The value of any given pixel can be predicted from the values of its neighbors; that is,
they are highly correlated.
• The information carried by individual pixel is relatively small. To reduce the interpixel
redundancy the difference between adjacent pixels can be used to represent an image.

Inter-pixel Temporal Redundancy:


• Interpixel temporal redundancy is the statistical correlation between pixels from
successive frames in a video sequence.
• Temporal redundancy is also called interframe redundancy. Temporal redundancy can
be exploited using motion compensated predictive coding.
• Removing a large amount of redundancy leads to efficient video compression.

Psychovisual Redundancy:
• Psychovisual redundancies exist because human perception does not involve
quantitative analysis of every pixel or luminance value in the image.
• Its elimination is possible only because the information itself is not essential for
normal visual processing.
• Psychovisual redundancy is related to visual information. Its elimination results in a
loss of quantitative information; however, the psychovisual loss is negligible. Removing
this type of redundancy is a lossy process and the lost information cannot be recovered.

Chromatic redundancy
• Chromatic redundancy refers to the presence of unnecessary colours in an image. The
colour channels of a colour image are highly correlated and the human visual system
cannot perceive millions of colours. Hence the colours that are not perceived by the
human visual system can be removed without affecting the quality of the image.

BENEFITS OF COMPRESSION
• It provides potential cost savings associated with sending less data over a switched
telephone network, where the cost of a call is usually based upon its duration.
• It not only reduces storage requirements but also overall execution time.
• It also reduces the probability of transmission errors since fewer bits are transferred.
• It also provides a level of security against illicit monitoring.

Information theory- Entropy


The information in an image can be modeled as a probabilistic process: we first develop a
statistical model of the image generation process, and the information content (entropy) can be
estimated based on this model.
The information per source symbol (or pixel), also referred to as entropy, is calculated by

H = - Σ (j = 1 to J) P(aj) log2 P(aj)

where P(aj) refers to the source symbol/pixel probabilities and J refers to the number of symbols or
different pixel values.

For example, for a given 8-bit image segment, the probability P(aj) of each grey level is estimated
from the normalised histogram of the segment and substituted into the formula above, giving the
entropy of the segment in bits per pixel.
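As a hedged illustration, the sketch below estimates the entropy of a small, hypothetical 8-bit segment from its normalised histogram; the pixel values are chosen only for this example and are not taken from the original (missing) figure.

import numpy as np

def image_entropy(image):
    """Estimate H = -sum P(a_j) * log2 P(a_j) from the normalised histogram."""
    values, counts = np.unique(image, return_counts=True)
    p = counts / counts.sum()            # P(a_j) for each grey level present
    return -np.sum(p * np.log2(p))       # entropy in bits per pixel

# Hypothetical 8-bit image segment, used only for illustration
segment = np.array([[21,  21,  21,  95],
                    [169, 243, 243, 243],
                    [21,  21,  21,  95],
                    [169, 243, 243, 243]], dtype=np.uint8)
print(image_entropy(segment))            # about 1.81 bits per pixel for this segment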

2. IMAGE COMPRESSION TECHNIQUES


The image compression techniques are broadly classified into two categories depending on whether
or not an exact replica of the original image can be reconstructed from the compressed image.
These are:

1. Lossless technique
2. Lossy technique

2.1 Lossless compression technique


In lossless compression techniques, the original image can be perfectly recovered from the
compressed (encoded) image. These techniques are also called noiseless, since they do not add noise to the
signal (image). Lossless compression is also known as entropy coding, since it uses statistical/decomposition
techniques to eliminate or minimize redundancy. Lossless compression is used only for a few applications
with stringent requirements, such as medical imaging.

Following techniques are included in lossless compression:


1. Run length encoding
2. Huffman encoding
3. Shannon-Fano Coding
4. Bit plane coding
5. Arithmetic coding
6. Dictionary based coding
7. Lossless predictive coding

2.2 Lossy compression technique


Lossy schemes provide much higher compression ratios than lossless schemes. Lossy schemes
are widely used since the quality of the reconstructed images is adequate for most applications.
By this scheme, the decompressed image is not identical to the original image, but reasonably
close to it. The quantization process results in loss of information. The entropy coding after the
quantization step, however, is lossless. Decoding is the reverse process: first, entropy
decoding is applied to the compressed data to get the quantized data; then dequantization is
applied to it, and finally the inverse transformation is applied to get the reconstructed image.

Major performance considerations of a lossy compression scheme include:


1. Compression ratio
2. Signal - to – noise ratio
3. Speed of encoding & decoding.

Lossy compression techniques includes following schemes:


1. Transformation coding
2. Lossy predictive coding
3. Vector quantization
4. Fractal coding
5. Block Transform Coding
6. Subband coding

2.3 LOSSLESS COMPRESSION TECHNIQUES

2.3.1 Run Length Encoding


This is a very simple compression method used for sequential data. It is very useful in the case of
repetitive data. This technique replaces sequences of identical symbols (pixels), called runs, by
shorter codes. The run-length code for a gray-scale image is represented by a sequence {Vi,
Ri} where Vi is the intensity of the pixel and Ri refers to the number of consecutive pixels with the
intensity Vi, as shown in the figure. If both Vi and Ri are represented by one byte, a span of 12
pixels coded as four {Vi, Ri} pairs uses eight bytes, giving a compression ratio of 12:8 = 1.5:1.
• Let us take a hypothetical single scan line, with B representing a black pixel and W
representing white:

WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWW
WWWWWWWWWWWWWBWWWWWWWWWWWWWW
If we apply the run-length encoding (RLE) data compression algorithm to the above
hypothetical scan line, we get the following:12W1B12W3B24W1B14W
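A minimal run-length encoder/decoder sketch in Python, assuming the simple {symbol, run-length} representation described above; the function names are illustrative only.

def rle_encode(scanline):
    """Return a list of (symbol, run_length) pairs for a scan line."""
    runs = []
    for symbol in scanline:
        if runs and runs[-1][0] == symbol:
            runs[-1][1] += 1              # extend the current run
        else:
            runs.append([symbol, 1])      # start a new run
    return [(s, r) for s, r in runs]

def rle_decode(runs):
    """Inverse operation: expand the (symbol, run_length) pairs."""
    return "".join(symbol * length for symbol, length in runs)

# The hypothetical scan line from the example above
line = "W" * 12 + "B" + "W" * 12 + "B" * 3 + "W" * 24 + "B" + "W" * 14
runs = rle_encode(line)
print("".join(f"{length}{symbol}" for symbol, length in runs))   # 12W1B12W3B24W1B14W
assert rle_decode(runs) == line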

Fig: Run-length coding of a sample image. Horizontal scanning of the example gives a ratio of
25/24 = 1.042:1; vertical line scanning of the same image yields a different set of runs, and the
scan order can also be changed to a zigzag pattern.

2.3.2 Huffman Encoding
This is a general technique for coding symbols based on their statistical occurrence frequencies
(probabilities). The pixels in the image are treated as symbols. The symbols that occur more
frequently are assigned a smaller number of bits, while the symbols that occur less frequently are
assigned a relatively larger number of bits. Huffman code is a prefix code. This means that the
(binary) code of any symbol is not the prefix of the code of any other symbol. Most image
coding standards use lossy techniques in the earlier stages of compression and use Huffman
coding as the final step.

The simplest construction algorithm uses a priority queue where the node with lowest probability
is given highest priority:
• Create a leaf node for each symbol and add it to the priority queue. While there is
more than one node in the queue, remove the node of highest priority (lowest
probability) twice to get two nodes.
• Create a new internal node with these two nodes as children and with probability
equal to the sum of the two nodes' probabilities. Add the new node to the queue.
• Repeat step 2 until the remaining node is the root and the tree is complete.
• For example, a source generates six different symbols {a1, a2, a3, a4, a5, a6} with
probabilities {0.1, 0.4, 0.06, 0.1, 0.04, 0.3}.

Fig: Huffman coding of the above source (error-free compression)

The first code assignment is made for a2, which has the highest probability, and the last assignments
are made for a3 and a5, which have the lowest probabilities.
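The priority-queue construction described above can be sketched in Python as follows; this is an illustrative implementation, not the exact procedure used by any particular standard.

import heapq, itertools

def huffman_code(probabilities):
    """Build a Huffman code for a {symbol: probability} table."""
    tie = itertools.count()                         # tie-breaker so dicts are never compared
    heap = [(p, next(tie), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)           # node with the lowest probability
        p2, _, right = heapq.heappop(heap)          # second lowest
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

# The six-symbol source from the example above
probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
print(huffman_code(probs))   # a2 receives the shortest code, a3 and a5 the longest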
 Shannon-Fano Coding
 Bit plane coding (class notes)

Arithmetic coding
Arithmetic coding is a form of entropy encoding used in lossless data compression.
Normally, a string of characters such as the words "hello there" is represented using a fixed
number of bits per character, as in the ASCII code. When a string is converted to arithmetic
encoding, frequently used characters will be stored with fewer bits and not-so-frequently
occurring characters will be stored with more bits, resulting in fewer bits used in total.
Arithmetic coding differs from other forms of entropy encoding, such as Huffman coding, in
that rather than separating the input into component symbols and replacing each with a code,
arithmetic coding encodes the entire message into a single number, an arbitrary-
precision fraction n in the interval [0.0, 1.0).
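To make the interval-narrowing idea concrete, here is a minimal floating-point sketch that computes the final interval for a short message; a practical coder would use integer arithmetic with renormalisation, so treat this purely as an illustration.

def arithmetic_interval(message, probs):
    """Return the final [low, high) interval that identifies `message`."""
    # cumulative sub-intervals of [0, 1) for each symbol
    ranges, cum = {}, 0.0
    for sym, p in probs.items():
        ranges[sym] = (cum, cum + p)
        cum += p

    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        sym_low, sym_high = ranges[sym]
        high = low + span * sym_high     # narrow the interval to the symbol's range
        low = low + span * sym_low
    return low, high                     # any number in [low, high) encodes the message

print(arithmetic_interval("aab", {"a": 0.6, "b": 0.4}))   # (0.216, 0.36)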

Dictionary based coding (LZW coding)

LZW (Lempel-Ziv-Welch) is a dictionary-based coding scheme. Dictionary-based coding can be
static or dynamic. In static dictionary coding, the dictionary is fixed during the encoding and
decoding processes. In dynamic dictionary coding, the dictionary is updated on the fly. LZW is
widely used in the computer industry and is implemented, for example, as the compress command on UNIX.
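A small dynamic-dictionary LZW encoder sketch, for illustration only; the real compress utility and image formats add details such as fixed code widths and dictionary resets.

def lzw_encode(data):
    """Encode a string into a list of dictionary indices (dynamic LZW)."""
    # start with a dictionary of the single characters that occur in the input
    dictionary = {ch: i for i, ch in enumerate(sorted(set(data)))}
    next_code = len(dictionary)

    current, codes = "", []
    for ch in data:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                      # keep extending the match
        else:
            codes.append(dictionary[current])        # emit code of the longest match
            dictionary[candidate] = next_code        # grow the dictionary on the fly
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes, dictionary

codes, _ = lzw_encode("wabbawabba")
print(codes)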
Lossless predictive coding

The system consists of an encoder and a decoder, each containing an identical predictor. As
each successive pixel of the input image is introduced to the encoder, the predictor generates
the anticipated value of that pixel based on some number of past inputs. The output of the
predictor is then rounded to the nearest integer:

f̂n = round( Σ (i = 1 to m) αi fn-i )

where m is the order of the linear predictor, round() denotes the rounding or nearest-integer
operation, and the αi for i = 1, 2, …, m are the prediction coefficients. The prediction error

en = fn - f̂n

is sent across the channel. The same predictor is used on the decoder side to predict the
value, and the reconstructed image is

fn = en + f̂n
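A first-order (m = 1, α1 = 1) sketch of this encoder/decoder pair for a single image row; it illustrates the equations above rather than a complete codec.

import numpy as np

def predictive_encode(row):
    """Transmit the first pixel as-is, then only the prediction errors e(n)."""
    row = row.astype(np.int32)
    errors = np.empty_like(row)
    errors[0] = row[0]
    for n in range(1, len(row)):
        prediction = row[n - 1]          # f̂(n) = round(1.0 * f(n-1))
        errors[n] = row[n] - prediction  # e(n) = f(n) - f̂(n): small for correlated pixels
    return errors

def predictive_decode(errors):
    """Reconstruct exactly: f(n) = e(n) + f̂(n), using the same predictor."""
    row = np.empty_like(errors)
    row[0] = errors[0]
    for n in range(1, len(errors)):
        row[n] = errors[n] + row[n - 1]
    return row

row = np.array([100, 102, 101, 105, 110, 110])
errors = predictive_encode(row)
assert np.array_equal(predictive_decode(errors), row)    # lossless reconstruction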
LOSSY COMPRESSION TECHNIQUES

Unlike the error-free compression, lossy encoding is based on the concept of compromising the
accuracy of the reconstructed image in exchange for increased compression.

The lossy compression method produces distortion which is irreversible. On the other hand, very
high compression ratios, ranging from 10:1 to 50:1, can be achieved with results that are visually
indistinguishable from the original. Error-free methods rarely give ratios of more than 3:1.

Predictive Lossy coding

Vector Quantization
The basic idea in this technique is to develop a dictionary of fixed-size vectors, called code
vectors. A vector is usually a block of pixel values. A given image is then partitioned into non-
overlapping blocks (vectors) called image vectors. For each image vector, the closest matching
code vector in the dictionary is determined and its index in the dictionary is used as the encoding
of the original image vector. Thus, each image is represented by a sequence of indices that can be
further entropy coded.
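A sketch of the encoding/decoding step described above, assuming a codebook of 4x4 code vectors is already available; how the codebook is trained (for example by k-means) is outside this sketch.

import numpy as np

def vq_encode(image, codebook, block=4):
    """Replace each non-overlapping block by the index of its nearest code vector."""
    h, w = image.shape
    indices = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            vec = image[r:r + block, c:c + block].reshape(-1).astype(float)
            dists = np.sum((codebook - vec) ** 2, axis=1)   # distance to every code vector
            indices.append(int(np.argmin(dists)))
    return indices

def vq_decode(indices, codebook, shape, block=4):
    """Rebuild an approximation of the image from the stored indices."""
    out = np.zeros(shape)
    it = iter(indices)
    for r in range(0, shape[0], block):
        for c in range(0, shape[1], block):
            out[r:r + block, c:c + block] = codebook[next(it)].reshape(block, block)
    return out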
Block Transformation Coding
In this coding scheme, transforms such as the DFT (Discrete Fourier Transform) and DCT (Discrete
Cosine Transform) are used to change the pixels of the original image into frequency-domain
coefficients (called transform coefficients). These coefficients have several desirable properties.
One is the energy compaction property that results in most of the energy of the original data
being concentrated in only a few of the significant transform coefficients. This is the basis of
achieving the compression. Only those few significant coefficients are selected and the
remaining are discarded. The selected coefficients are considered for further quantization and
entropy encoding. DCT coding has been the most common approach to transform coding. It is
also adopted in the JPEG image compression standard.
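The energy-compaction idea can be illustrated with a 2-D DCT on a smooth 8x8 block, keeping only a few large coefficients; this sketch uses scipy's dct and is only a demonstration of the principle, not the JPEG procedure itself.

import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    """2-D DCT of a block (orthonormal form)."""
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(coeffs):
    """Inverse 2-D DCT."""
    return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

# A smooth 8x8 block (a ramp), typical of natural image content
block = 10.0 * np.add.outer(np.arange(8), np.arange(8))
coeffs = dct2(block)

# Keep only the 10 largest-magnitude coefficients and discard the rest
threshold = np.sort(np.abs(coeffs).ravel())[-10]
kept = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)

approx = idct2(kept)
print(np.abs(block - approx).max())   # error stays small relative to the 0-140 pixel range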
DCT-based JPEG (Joint Photographic Expert Group) Standard

Step 1. Image Preparation: The image is divided into smaller blocks. For example, if the RGB image is
1024x1024 it can be divided into sub-blocks of 4x4, 8x8, or 16x16. If the chosen sub-block size is 8x8,
there will be 1024/8 x 1024/8 = 128x128 blocks in the horizontal and vertical directions.

Step 2. Quantization:

After the frequency coefficients are obtained, not all of the frequency components are necessary,
since the human eye is mainly sensitive to the low-frequency components. For this purpose a threshold
value is applied: the frequency components whose values are less than the threshold value are
discarded. The quantization table is used for this purpose. If the value in the quantization table is
10 and the frequency coefficient is 43, the quantized value is 43/10 = 4.3. This is
approximated to the nearest integer, i.e. 4. Depending on the threshold value, each coefficient is
either retained or discarded.
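A sketch of this division-and-rounding step; the flat table of 10s is purely illustrative (the tables in the JPEG standard have different entries for different frequencies).

import numpy as np

# Illustrative 8x8 quantization table; real JPEG tables grow towards high frequencies
Q = np.full((8, 8), 10.0)

def quantize(dct_block, q_table):
    """Divide each DCT coefficient by its table entry and round to the nearest integer."""
    return np.round(dct_block / q_table).astype(int)

def dequantize(q_block, q_table):
    """Approximate reconstruction of the coefficients at the decoder."""
    return q_block * q_table

coeffs = np.zeros((8, 8))
coeffs[0, 0] = 43.0
print(quantize(coeffs, Q)[0, 0])     # 43/10 = 4.3, rounded to 4 as in the example above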

Step 3. Coefficient to Symbol Mapping:

The quantized coefficients are reordered (typically in a zigzag sequence) and mapped to symbols for
entropy coding; once the encoded file is received, decoding applies the inverse of this mapping.

Step 4. Entropy Coding:


The quantized values are encoded using entropy coding techniques. If an 8x8 block is assumed,
then the first element (0,0) is called the DC coefficient and the other 63 elements are called AC
coefficients. Run-length coding is first applied to the sequence; then the output of the RLC
algorithm is Huffman coded for further compression.

JPEG Decoder
The JPEG format offers the following four modes of operation:
1. Sequential DCT-based mode
2. Lossless mode
3. Progressive mode
4. Hierarchical mode

1. Sequential DCT-based mode: It consists of the following steps:


i. Image preparation
ii. Application of Image transform
iii. Quantization
iv. Entropy Encoding
v. Frame building

i. Image Preparation: Convert the RGB image into a YCbCr (luminance-chrominance
colour space) image, so that the colour components are separated from the luminance part.
The image is then divided into smaller parts.
ii. Image Transform: An image transform such as the DCT or a wavelet transform is applied.
iii. Quantization: The human eye is mainly sensitive to the low-frequency components; for this
purpose a threshold value is applied, and the frequency components whose values are less
than the threshold value are discarded.
iv. Entropy Encoding: The quantized values are encoded using entropy coding techniques.
If an 8x8 block is assumed, then the first element (0,0) is called the DC coefficient and
the other 63 elements are called AC coefficients. Differential encoding is used for
encoding the DC coefficient. The remaining 63 AC coefficients are arranged in a zigzag
sequence. RLC is first applied to the sequence and its output is Huffman coded for further
compression.

Fig: JPEG entropy coding (sample zigzag mask)

v. Frame building: The frame header is created with additional information such as start
bit, end bit, data type and nature of the image. The compressed data is packed and the
frames are sent across the channel.

The JPEG decoder performs the inverse of the encoder's operations. It consists of the following components:
a. Frame decoder
b. Entropy decoder
c. Dequantization
d. Inverse DCT transform
e. Image builder
Due to quantization, information loss occurs and hence perfect reconstruction is not possible.
However, the thresholds are chosen such that the loss is barely perceptible to the human visual system.

2. Loss Less Mode


JPEG also offers lossless compression. This mode creates a perfect duplicate of the original
image. Here a pixel X is predicted using the neighbouring pixels A, B and C.
The possible ways of predicting the unknown pixel X using the neighbouring pixels A, B and C are
defined in a predictor table.
3. Progressive Encoding
The main idea of progressive encoding is gradual compression using the priority of the
pixels. First a coarse version of the image is sent to the receiver; then additional
information is sent so that the quality of the image is progressively refined. This mode is
similar to the sequential DCT mode in all other aspects. The sub-blocks are DCT coded, and the
frequency coefficients are divided into multiple spectral bands. The spectral bands are decided
by the zigzag scanning order. This mode uses a progressive scheme known as spectral selection,
where the low-frequency coefficients are sent first and then the remaining coefficients.

4. Hierarchical mode
This scheme uses a pyramidal data structure that stores the image at several different
resolutions. Its advantage is that it allows the user to negotiate with the application at the
required resolution. The bottom layer of the pyramid is the original image, and
the subsequent layers are versions of the image subsampled by a factor of 2.
Predictive coding is used to encode the differential frames. Hence hierarchical coding
supports lossless coding as well as progressive coding.
Video compression- MPEG

A sequence of still images associated with a time index is called video. Video includes both still
image data and audio information. Video compression is based on two important aspects: the spatial
redundancy that is present in a single frame of the video and the temporal redundancy that is present
among the frames.

 The data hierarchy of MPEG is given as follows and is illustrated in Fig.


1. Video sequence
2. Group Picture
3. Picture
4. Slice
5. Macroblock
6. Block

Fig. Data hierarchy of MPEG


Macroblock formation
It can be seen from Fig 6.22 that the video sequence has a set of pictures. A collection of selected
pictures is called a group of pictures. Each picture can be divided into slices, which can be
divided into macroblocks. Each macroblock is then divided into blocks, where a block has 8x8
pixels. Compression is done only at the macroblock level.

Frame Construction
1. The first frame or key frame is called the I-frame. These frames are very important as they
have more information.
2. Forward prediction schemes compute the difference between the current and the previous
frames. Backward prediction schemes compute the difference between the current and the
next frames. P-pictures or predictive pictures are obtained from the difference between each
frame and its previous frame. Bidirectional schemes predict the difference between the current
frame and both the previous and the subsequent frames. B-frames are obtained by predicting
from both the previous and the later frames and then interpolating them to get the complete frame.

I-frames are encoded using intra-frame compression algorithms. P-frames can be coded as
follows:

1. The difference between the current frame and the preceding frame is calculated.
2. The difference is encoded using the DCT. Then the frequency coefficients are quantized and
RLC is used for coding the sequence along with the motion vector.

B-frames use both past and future frames, so the reference frames must be available before a B-frame
can be reconstructed. The frames are therefore reordered before being sent to the decoder so that it
can process them in an efficient manner.

Motion Estimation
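The motion vectors mentioned above are typically found by block matching. The sketch below is a minimal exhaustive (full-search) matcher using the sum of absolute differences (SAD); real encoders use faster search strategies, so this is illustrative only.

import numpy as np

def block_match(current_block, reference, top, left, search=8):
    """Find the displacement (dy, dx) that best matches `current_block`
    against the reference frame, within +/- `search` pixels of (top, left)."""
    n = current_block.shape[0]
    best_sad, best_mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r, c = top + dy, left + dx
            if r < 0 or c < 0 or r + n > reference.shape[0] or c + n > reference.shape[1]:
                continue                                  # candidate falls outside the frame
            candidate = reference[r:r + n, c:c + n]
            sad = np.abs(current_block.astype(int) - candidate.astype(int)).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad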
Audio Compression
PCM encoder: This takes the audio signal and generates 32 samples.

Filter bank: This employs the 32-point FFT, and its 32 frequency coefficients are considered as
32 sub-bands.

Psychoacoustic models: The psychoacoustic model is an attempt to take advantage of the


human auditory system. This model, based on the limitation of the human auditory system,
determines the masking coefficients of the various bands. The signal-to-mask ratios are then
obtained. These ratios determine the frequency components to be retained.

The audio compression data flow is shown in figs 6.25(a) and 6.25(b).
Fig. Audio compression data flow.
(a) Sender side: read audio signal, transform to frequency domain, bit allocation, quantization, entropy encoding, transmit.
(b) Receiver side: read compressed audio data, entropy decoding, unpack and dequantize, frequency sample reconstruction, transform to time domain, output audio signal.
