
UNIT-V

IMAGE COMPRESSION

Definition: Image compression deals with reducing the amount of data required to represent a
digital image by removing redundant data.
Images can be represented in digital format in many ways. Encoding the contents of a
2-D image in a raw bitmap (raster) format is usually not economical and may result in very
large files. Since raw image representations usually require a large amount of storage space
(and proportionally long transmission times in the case of file uploads/ downloads), most
image file formats employ some type of compression. The need to save storage space and
shorten transmission time, as well as the human visual system's tolerance to a modest amount of
loss, have been the driving factors behind image compression techniques.

Goal of image compression: The goal of image compression is to reduce the amount of data
required to represent a digital image.

Data ≠ Information:

 Data and information are not synonymous terms!


 Data is the means by which information is conveyed.
 Data compression aims to reduce the amount of data required to represent a given
quantity of information while preserving as much information as possible.
 The same amount of information can be represented by various amounts of data.
Ex1: You have an extra class after 3.50 p.m.
Ex2: An extra class has been scheduled for you after the 7th hour.
Ex3: After 3.50 p.m. you should attend the extra class.

Definition of compression ratio:
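A standard definition (stated here for completeness, since the original formula is not reproduced) is

C_R = n1 / n2,

where n1 is the number of information-carrying units (e.g., bits) in the original representation and n2 is the number of units in the compressed representation.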

Definitions of Data Redundancy:
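A commonly used definition (again stated for completeness) is the relative data redundancy

R_D = 1 - 1/C_R,

so that R_D = 0 when the compressed and original representations carry the same amount of data (C_R = 1), and R_D approaches 1 as C_R becomes large.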

Coding redundancy:
 Code: a list of symbols (letters, numbers, bits, etc.)
 Code word: a sequence of symbols used to represent a piece of information or an event
(e.g., gray levels).
 Code word length: the number of symbols in each code word.

COMPRESSION METHODS OF IMAGES:

Compression methods can be lossy, when a tolerable degree of deterioration in the visual
quality of the resulting image is acceptable, or lossless, when the image is encoded in its full
quality. The overall results of the compression process, both in terms of storage savings –
usually expressed numerically in terms of compression ratio (CR) or bits per pixel (bpp) –
and in terms of resulting quality loss (for the case of lossy techniques), may vary depending
on the technique, format, options (such as the quality setting for JPEG), and the image contents.

As a general guideline, lossy compression should be used for general-purpose photographic
images, whereas lossless compression should be preferred when dealing with line art,
technical drawings, cartoons, etc., or images in which no loss of detail is tolerable (most
notably, space images and medical images).

Fundamentals of visual data compression


The general problem of image compression is to reduce the amount of data required to
represent a digital image or video and the underlying basis of the reduction process is the
removal of redundant data. Mathematically, visual data compression typically involves

transforming (encoding) a 2-D pixel array into a statistically uncorrelated data set. This
transformation is applied prior to storage or transmission. At some later time, the compressed
image is decompressed to reconstruct the original image information (preserving or lossless
techniques) or an approximation of it (lossy techniques).

Redundancy
Data compression is the process of reducing the amount of data required to represent a given
quantity of information. Different amounts of data might be used to communicate the same
amount of information. If the same information can be represented using different amounts of
data, it is reasonable to believe that the representation that requires more data contains what is
technically called data redundancy.

Image compression and coding techniques exploit three types of redundancy: coding
redundancy, interpixel (spatial) redundancy, and psychovisual redundancy. The way each of
them is exploited is briefly described below.

 Coding redundancy: consists of using variable-length codewords selected to match the
statistics of the original source, in this case the image itself or a processed version of its pixel
values. This type of coding is always reversible and is usually implemented using look-up
tables (LUTs). Examples of image coding schemes that exploit coding redundancy are the
Huffman codes and the arithmetic coding technique.
 Interpixel redundancy: this type of redundancy – sometimes called spatial redundancy,
interframe redundancy, or geometric redundancy – exploits the fact that an image very often
contains strongly correlated pixels, in other words, large regions whose pixel values are the
same or almost the same. This redundancy can be exploited in several ways, one of which is
by predicting a pixel value based on the values of its neighboring pixels. In order to do so, the
original 2-D array of pixels is usually mapped into a different format, e.g., an array of
differences between adjacent pixels. If the original image pixels can be reconstructed from
the transformed data set, the mapping is said to be reversible. Examples of compression
techniques that exploit interpixel redundancy include Constant Area Coding (CAC), (1-D or
2-D) Run-Length Encoding (RLE) techniques, and many predictive coding algorithms such
as Differential Pulse Code Modulation (DPCM).

 Psychovisual redundancy: many experiments on the psychophysical aspects of human
vision have proven that the human eye does not respond with equal sensitivity to all
incoming visual information; some pieces of information are more important than others. The
knowledge of which particular types of information are more or less relevant to the final
human user has led to image and video compression techniques that aim at eliminating or
reducing any amount of data that is psychovisually redundant. The end result of applying
these techniques is a compressed image file whose size is smaller than the original, but whose
resulting quality is still acceptable for the application at hand.

The loss of quality that ensues as a byproduct of such techniques is frequently called
quantization, indicating that a wider range of input values is normally mapped into a
narrower range of output values through an irreversible process. In order to establish the
nature and extent of the information loss, different fidelity criteria can be used (some
objective, such as root mean square (RMS) error; some subjective, such as pairwise
comparison of two images encoded with different quality settings). Most of the image coding
algorithms in use today exploit this type of redundancy, such as the Discrete Cosine
Transform (DCT)-based algorithm at the heart of the JPEG encoding standard.

IMAGE COMPRESSION AND CODING MODELS

Figure 1 shows a general image compression model. It consists of a source encoder, a
channel encoder, the storage or transmission media (also referred to as the channel), a channel
decoder, and a source decoder. The source encoder reduces or eliminates any redundancies in
the input image, which usually leads to bit savings. Source encoding techniques are the
primary focus of this discussion. The channel encoder increases the noise immunity of the
source encoder's output, usually adding extra bits to achieve its goals. If the channel is
noise-free, the channel encoder and decoder may be omitted. At the receiver's side, the channel
and source decoders perform the opposite functions and ultimately recover (an approximation
of) the original image.

Figure 2 shows the source encoder in further detail. Its main components are:

 Mapper: transforms the input data into a (usually nonvisual) format designed to
reduce interpixel redundancies in the input image. This operation is generally
reversible and may or may not directly reduce the amount of data required to
represent the image.

 Quantizer: reduces the accuracy of the mapper’s output in accordance with some pre-
established fidelity criterion. Reduces the psychovisual redundancies of the input
image. This operation is not reversible and must be omitted if lossless compression is
desired.

 Symbol (entropy) encoder: creates a fixed- or variable-length code to represent the
quantizer's output and maps the output in accordance with the code. In most cases, a
variable-length code is used. This operation is reversible.

Error-free compression

Error-free compression techniques usually rely on entropy-based encoding algorithms. The
concept of entropy is mathematically described in equation (1):
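A standard form of equation (1), consistent with the definitions below, is

H(z) = - Σ_{j=1..J} P(a_j) log2 P(a_j)        (1)

measured in bits per symbol when the base-2 logarithm is used.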

where:

a_j is a symbol produced by the information source,

P(a_j) is the probability of that symbol,

J is the total number of different symbols, and

H(z) is the entropy of the source.

The concept of entropy provides an upper bound on how much compression can be achieved,
given the probability distribution of the source. In other words, it establishes a theoretical
limit on the amount of lossless compression that can be achieved using entropy encoding
techniques alone.
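As a quick illustration of this bound, the following Python sketch (illustrative only; the image and all names are assumptions, not from the original text) estimates the first-order entropy of an image from its gray-level histogram:

import numpy as np

def entropy_bits(image):
    # Estimate the first-order entropy H(z) in bits/pixel from the gray-level histogram.
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()          # P(a_j): relative frequency of each gray level
    p = p[p > 0]                   # skip unused levels (0 * log 0 is taken as 0)
    return -np.sum(p * np.log2(p))

# Example with a synthetic 8-bit image: lossless coding needs at least ~H bits/pixel.
img = (np.random.randint(0, 8, size=(64, 64)) * 32).astype(np.uint8)
print(round(entropy_bits(img), 2), "bits/pixel")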

Variable Length Coding (VLC)

Most entropy-based encoding techniques rely on assigning variable-length codewords to each
symbol, where the most likely symbols are assigned shorter codewords. In the case of
image coding, the symbols may be raw pixel values or the numerical values obtained at the
output of the mapper stage (e.g., differences between consecutive pixels, run-lengths, etc.).
The most popular entropy-based encoding technique is the Huffman code. It provides the
least amount of information units (bits) per source symbol. It is described in more detail in a
separate short article.

Run-length encoding (RLE)

RLE is one of the simplest data compression techniques. It consists of replacing a sequence
(run) of identical symbols by a pair containing the symbol and the run length. It is used as the
primary compression technique in the 1-D CCITT Group 3 fax standard and in conjunction
with other techniques in the JPEG image compression standard (described in a separate short
article).
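A minimal sketch of the idea in Python (illustrative only; real standards such as CCITT Group 3 further code the run lengths with variable-length codes):

def rle_encode(pixels):
    # Replace each run of identical symbols by a (symbol, run_length) pair.
    runs = []
    i = 0
    while i < len(pixels):
        j = i
        while j < len(pixels) and pixels[j] == pixels[i]:
            j += 1
        runs.append((pixels[i], j - i))
        i = j
    return runs

def rle_decode(runs):
    # Expand (symbol, run_length) pairs back into the original sequence.
    out = []
    for symbol, length in runs:
        out.extend([symbol] * length)
    return out

row = [0, 0, 0, 0, 255, 255, 0, 0, 0, 255]     # one scan line of a binary image
encoded = rle_encode(row)                       # [(0, 4), (255, 2), (0, 3), (255, 1)]
assert rle_decode(encoded) == row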

Differential coding

Differential coding techniques exploit the interpixel redundancy in digital images. The basic
idea consists of applying a simple difference operator to neighboring pixels to calculate a
difference image, whose values are likely to fall within a much narrower range than the
original gray-level range. As a consequence of this narrower distribution – and consequently
reduced entropy – Huffman coding or other VLC schemes will produce shorter codewords
for the difference image.

Predictive coding

Predictive coding techniques constitute another example of the exploitation of interpixel
redundancy, in which the basic idea is to encode only the new information in each pixel. This
new information is usually defined as the difference between the actual and the predicted
value of that pixel.

Figure 3 shows the main blocks of a lossless predictive encoder. The key component is the
predictor, whose function is to generate an estimated (predicted) value for each pixel from the
input image based on previous pixel values. The predictor’s output is rounded to the nearest
integer and compared with the actual pixel value: the difference between the two –
called prediction error – is then encoded by a VLC encoder. Since prediction errors are likely
to be smaller than the original pixel values, the VLC encoder will likely generate shorter
codewords.

There are several local, global, and adaptive prediction algorithms in the literature. In most
cases, the predicted pixel value is a linear combination of previous pixels.

Dictionary-based coding

Dictionary-based coding techniques are based on the idea of incrementally building a
dictionary (table) while receiving the data. Unlike VLC techniques, dictionary-based
techniques use fixed-length codewords to represent variable-length strings of symbols that
commonly occur together. Consequently, there is no need to calculate, store, or transmit the
probability distribution of the source, which makes these algorithms extremely convenient
and popular. The best-known variant of dictionary-based coding algorithms is
the LZW (Lempel-Ziv-Welch) encoding scheme, used in popular multimedia file formats
such as GIF, TIFF, and PDF.
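A simplified sketch of LZW-style dictionary building in Python (illustrative only; the real GIF/TIFF LZW variants add code-width management, dictionary resets, and bit packing):

def lzw_encode(data):
    # Incrementally build a dictionary of byte strings and emit fixed-size integer codes.
    dictionary = {bytes([i]): i for i in range(256)}   # start with all single bytes
    next_code = 256
    w = b""
    codes = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc                         # extend the current string
        else:
            codes.append(dictionary[w])    # output the code of the longest known string
            dictionary[wc] = next_code     # add the new string to the dictionary
            next_code += 1
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes

print(lzw_encode(b"ABABABABAB"))   # repeated patterns map to fewer, reused codes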

Lossy compression

Lossy compression techniques deliberately introduce a certain amount of distortion into the
encoded image, exploiting the psychovisual redundancies of the original image. These
techniques must find an appropriate balance between the amount of error (loss) and the
resulting bit savings.

Quantization

The quantization stage is at the core of any lossy image encoding algorithm. Quantization, at
the encoder side, means partitioning of the input data range into a smaller set of values.
There are two main types of quantizers: scalar quantizers and vector quantizers. A scalar
quantizer partitions the domain of input values into a smaller number of intervals. If the
output intervals are equally spaced, which is the simplest way to do it, the process is
called uniform scalar quantization; otherwise, for reasons usually related to minimization of
total distortion, it is called nonuniform scalar quantization. One of the most popular
nonuniform quantizers is the Lloyd-Max quantizer. Vector quantization (VQ) techniques
extend the basic principles of scalar quantization to multiple dimensions. Because of its fast
lookup capabilities at the decoder side, VQ-based coding schemes are particularly attractive
to multimedia applications.
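A minimal sketch of uniform scalar quantization in Python (the step size and sample values are illustrative assumptions):

import numpy as np

def uniform_quantize(x, step):
    # Uniform scalar quantization: map each input to the index of its interval.
    return np.round(x / step).astype(np.int32)

def uniform_dequantize(q, step):
    # Reconstruction: map each index back to a representative value of its interval.
    return q * step

x = np.array([3.2, 17.8, 64.0, 200.5])    # e.g. pixel or coefficient values
q = uniform_quantize(x, step=16)           # coarser step -> fewer levels, more distortion
print(q, uniform_dequantize(q, step=16))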

Transform coding

The techniques discussed so far work directly on the pixel values and are usually
called spatial domain techniques. Transform coding techniques use a reversible, linear
mathematical transform to map the pixel values onto a set of coefficients, which are then
quantized and encoded. The key factor behind the success of transform-based coding
schemes is that many of the resulting coefficients for most natural images have small magnitudes
and can be quantized (or discarded altogether) without causing significant distortion in the
decoded image. Different mathematical transforms, such as the Fourier (DFT), Walsh-Hadamard
(WHT), and Karhunen-Loeve (KLT) transforms, have been considered for the task. For
compression purposes, the better a transform compacts the image information into a few
coefficients, the better it is; for that reason, the Discrete Cosine Transform (DCT) has become
the most widely used transform coding technique.

Wavelet coding

Wavelet coding techniques are also based on the idea that the coefficients of a transform that
decorrelates the pixels of an image can be coded more efficiently than the original pixels
themselves. The main difference between wavelet coding and DCT-based coding (Figure 4)
is the omission of the first stage. Because wavelet transforms are capable of representing an
input signal with multiple levels of resolution, and yet maintain the useful compaction
properties of the DCT, the subdivision of the input image into smaller subimages is no longer
necessary. Wavelet coding has been at the core of the latest image compression standards,
most notably JPEG 2000, which is discussed in a separate short article.

Image compression standards

Work on international standards for image compression started in the late 1970s with
the CCITT (currently ITU-T) need to standardize binary image compression algorithms for
Group 3 facsimile communications. Since then, many other committees and standards have
been formed to produce de jure standards (such as JPEG), while several commercially
successful initiatives have effectively become de facto standards (such as GIF). Image
compression standards bring about many benefits, such as: (1) easier exchange of image files
between different devices and applications; (2) reuse of existing hardware and software for a
wider array of products; (3) existence of benchmarks and reference data sets for new and
alternative developments.

Binary image compression standards

Work on binary image compression standards was initially motivated by CCITT Group 3 and
4 facsimile standards. The Group 3 standard uses a non-adaptive, 1-D RLE technique in
which the last K-1 lines of each group of K lines (for K = 2 or 4) are optionally coded in a 2-
D manner, using the Modified Relative Element Address Designate (MREAD) algorithm. The
Group 4 standard uses only the MREAD coding algorithm. Both classes of algorithms are
non-adaptive and were optimized for a set of eight test images, containing a mix of
representative documents, which sometimes resulted in data expansion when applied to
different types of documents (e.g., half-tone images). The Joint Bilevel Image Group
(JBIG) – a joint committee of the ITU-T and ISO – has addressed these limitations and
proposed two new standards (JBIG and JBIG2), which can be used to compress binary and
gray-scale images of up to 6 gray-coded bits/pixel.

Continuous tone still image compression standards


For photograph quality images (both grayscale and color), different standards have
been proposed, mostly based on lossy compression techniques. The most popular standard in
this category, by far, is the JPEG standard, a lossy, DCT-based coding algorithm. Despite its
great popularity and adoption, ranging from digital cameras to the World Wide Web, certain
limitations of the original JPEG algorithm have motivated the recent development of two
alternative standards, JPEG 2000 and JPEG-LS (lossless). JPEG, JPEG 2000, and JPEG-LS
are described in separate short articles.

Some methods encode each pixel ignoring its inter-pixel dependencies. Among these methods are:

1. Entropy Coding: every block of an image is entropy encoded based upon the P_k's
within the block. This produces a variable-length code for each block, depending on the
spatial activity within the block.
2. Run-Length Encoding: scan the image horizontally or vertically and, while scanning,
assign each group of pixels with the same intensity to a pair (g_i, l_i), where g_i is the
intensity and l_i is the length of the "run". This method can also be used for detecting
edges and boundaries of an object. It is mostly used for images with a small number
of gray levels and is not effective for highly textured images.

Example 2: Let the transition probabilities for run-length encoding of a binary image
(0:black and 1:white) be p0 = P(0/1) and p1 = P(1/0). Assuming all runs are independent, find
(a) average run lengths, (b) entropies of white and black runs, and (c) compression ratio.

Solution:

A run of length l ≥ 1 can be represented by a geometric random variable X_i with PMF
P(X_i = l) = p_i (1 - p_i)^(l-1), i = 0, 1, which corresponds to the first occurrence of a 0 or 1
after l independent trials. (Note that 1 - P(0/1) = P(1/1) and 1 - P(1/0) = P(0/0).) Thus, for the
average we have

Using the same series formula, we get
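The intermediate series manipulations are not reproduced here; for the geometric model above, the standard results (stated as a reconstruction consistent with that model) are:

(a) average run lengths: μ_i = 1/p_i, for i = 0, 1;

(b) entropies of the black and white runs: H_i = [-p_i log2 p_i - (1 - p_i) log2(1 - p_i)] / p_i bits per run;

(c) since one black run and one white run together span μ_0 + μ_1 pixels (1 bit each, uncompressed) and can be coded with about H_0 + H_1 bits, the achievable compression ratio is approximately C = (μ_0 + μ_1) / (H_0 + H_1).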

Huffman Encoding Algorithm: It consists of the following steps.


1. Arrange the symbols with probabilities P_k in decreasing order and consider them as
the "leaf nodes" of a tree.
2. Merge the two nodes with the smallest probabilities to form a new node whose
probability is the sum of the two merged nodes. Go to Step 1 and repeat until only two
nodes are left (the "root nodes").
3. Arbitrarily assign 1's and 0's to each pair of branches merging into a node.
4. Read sequentially from the root node to each leaf node to form the associated
codeword for each symbol.
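The steps above can be sketched in Python as follows (a minimal illustration with made-up symbol probabilities; it uses a binary heap instead of explicit re-sorting and merges nodes until a single root remains, then reads the 0/1 branch labels back to the leaves):

import heapq

def huffman_codes(symbol_probs):
    # Build Huffman codewords from a {symbol: probability} table by repeatedly
    # merging the two least probable nodes; branch bits are assigned arbitrarily.
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(symbol_probs.items())]
    codes = {sym: "" for sym in symbol_probs}
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # two smallest-probability nodes
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1:
            codes[s] = "0" + codes[s]        # prepend the branch bit for this merge
        for s in syms2:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return codes

probs = {0: 0.4, 1: 0.3, 2: 0.1, 3: 0.1, 4: 0.06, 5: 0.04}   # hypothetical gray-level probabilities
codes = huffman_codes(probs)
avg_len = sum(probs[s] * len(c) for s, c in codes.items())
print(codes, "average length =", round(avg_len, 2), "bits/pixel")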

Example 3: For the same image as in the previous example, which requires 3 bits/pixel using
standard PCM, we can arrange the corresponding code table.

Fig. Tree structure for Huffman Encoding

Note that in this case, we have

i.e., an average of 2 bits/pixel (instead of 3 bits/pixel using PCM) can be used to code the
image. However, the drawback of the standard Huffman encoding method is that the codes
have variable lengths.

PREDICTIVE ENCODING:

Idea: remove mutual redundancy among successive pixels in a region of support (ROS) or
neighborhood and encode only the new information. This method is based upon linear
prediction. Let us start with 1-D linear predictors. An Nth-order linear prediction of x(n)
based on the N previous samples is generated using a 1-D autoregressive (AR) model,

x̂(n) = a_1 x(n-1) + a_2 x(n-2) + ... + a_N x(n-N),

where the a_i's are model coefficients determined from some sample signals. Now, instead of
encoding x(n), the prediction error

e(n) = x(n) - x̂(n)

is encoded, as it requires a substantially smaller number of bits. Then, at the receiver, we
reconstruct x(n) using the previously decoded values x(n-k) and the encoded error signal, i.e.,

x(n) = x̂(n) + e(n).

This method is also referred to as differential PCM (DPCM).
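A minimal 1-D DPCM sketch in Python along the lines described above (the predictor coefficients and sample row are illustrative assumptions):

import numpy as np

def dpcm_encode(x, a):
    # Predict x(n) from N previously reconstructed samples with coefficients a[k]
    # and output the rounded prediction error e(n).
    N = len(a)
    x_rec = np.zeros(len(x), dtype=np.float64)   # decoder-side reconstruction kept at the encoder
    errors = np.zeros(len(x), dtype=np.int64)
    for n in range(len(x)):
        past = [x_rec[n - k] if n - k >= 0 else 0.0 for k in range(1, N + 1)]
        pred = np.dot(a, past)                    # x̂(n) = sum_k a_k * x_rec(n-k)
        errors[n] = int(round(x[n] - pred))       # e(n) = x(n) - x̂(n)
        x_rec[n] = pred + errors[n]               # the decoder will form the same value
    return errors

def dpcm_decode(errors, a):
    # Rebuild x(n) = x̂(n) + e(n) using the same predictor.
    N = len(a)
    x_rec = np.zeros(len(errors), dtype=np.float64)
    for n in range(len(errors)):
        past = [x_rec[n - k] if n - k >= 0 else 0.0 for k in range(1, N + 1)]
        x_rec[n] = np.dot(a, past) + errors[n]
    return x_rec

row = np.array([100, 102, 104, 103, 101, 99, 98, 100], dtype=np.float64)
err = dpcm_encode(row, a=[1.0])        # first-order predictor: the previous pixel
print(err, dpcm_decode(err, a=[1.0]))  # after the first sample the errors are small; reconstruction matches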

To understand the need for compact image representation, consider the amount of
data required to represent a 2-hour Standard Definition (SD) video using 720 x 480 x 24-bit
pixel arrays.

A video is a sequence of video frames, where each frame is a full-color still image.
Because a video player must display the frames sequentially at rates near 30 fps, standard-
definition data must be accessed at 30 fps x (720 x 480) ppf x 3 bpp = 31,104,000 bps.

fps: frames per second, ppf: pixels per frame, bpp: bytes per pixel, bps: bytes per second.

Thus a 2-hour movie consists of 31,104,000 bps x (60)^2 sph x 2 hrs = 2.24 x 10^11 bytes
= 224 GB of data, where sph is seconds per hour.

Twenty-seven 8.5 GB dual-layer DVDs are needed to store it.

To put a 2-hour movie on a single DVD, each frame must be compressed by a factor of around
26.3.
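The arithmetic above can be checked with a few lines of Python (values as used in the text):

# Rough check of the storage arithmetic for a 2-hour SD movie.
fps, width, height, bytes_per_pixel = 30, 720, 480, 3
bytes_per_second = fps * width * height * bytes_per_pixel   # 31,104,000 bytes/s
movie_bytes = bytes_per_second * 60**2 * 2                   # 2 hours -> about 2.24e11 bytes
dvds = movie_bytes / (8.5 * 10**9)                           # 8.5 GB dual-layer discs
print(bytes_per_second, movie_bytes, round(dvds, 1))         # about 26.3 -> 27 DVDs needed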

The compression must be even higher for HD, where image resolutions reach 1920 x 1080 x
24 bits per image.

Web page images and high-resolution digital camera photos are also compressed to save
storage space and reduce transmission time.

A residential Internet connection delivers data at speeds ranging from 56 kbps (conventional
phone line) to more than 12 Mbps (broadband).

The time required to transmit a small 128 x 128 x 24-bit full-color image over this range of
speeds is from 7.0 to 0.03 sec.

Compression can reduce the transmission time by a factor of around 2 to 10 or more.

Similarly, the number of uncompressed full-color images that an 8-megapixel digital camera
can store on a 1 GB memory card can be increased.

Data compression: It refers to the process of reducing the amount of data required to
represent a given quantity of information.

Data Vs Information:

Data and information are not the same thing; data are the means by which information is
conveyed.

Because various amounts of data can be used to represent the same amount of information,
representations that contain irrelevant or repeated information are said to contain redundant data.

In today's multimedia wireless communication, a major issue is the bandwidth needed to satisfy
real-time transmission of image data. Compression is one of the good solutions to address this
issue. Transform-based compression algorithms are widely used in the field of compression
because of their de-correlation and other properties useful in compression. A comparative study
of compression methods can be made based on their types, addressing the importance of the
transform in image compression and the selection of a particular transform; the performance of
a variety of image transforms can be compared based on compression ratio, entropy, and time factor.
Source: The Role of Transforms in Image Compression. Available from:
https://www.researchgate.net/publication/257251096_The_Role_of_Transforms_in_Image_Compression [accessed Jun 05 2018].

THE FLOW OF IMAGE COMPRESSION CODING:


What is image compression coding? Image compression coding stores the image as a bit-stream
that is as compact as possible and allows the decoded image to be displayed on the monitor as
exactly as possible. Now consider an encoder and a decoder as shown in Fig. 1.3. When the
encoder receives the original image file, the image file is converted into a series of binary data,
which is called the bit-stream. The decoder then receives the encoded bit-stream and decodes it
to form the decoded image. If the total data quantity of the bit-stream is less than the total data
quantity of the original image, then this is called image compression. The full compression flow
is shown in Fig. 1.3.

In order to evaluate the performance of image compression coding, it is necessary to define a
measurement that can estimate the difference between the original image and the decoded
image. Two commonly used measurements are the Mean Square Error (MSE) and the Peak
Signal-to-Noise Ratio (PSNR), which are defined in (1.3) and (1.4), respectively. f(x,y) is the
pixel value of the original image, and f'(x,y) is the pixel value of the decoded image. Most
image compression systems are designed to minimize the MSE and maximize the PSNR.
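Equations (1.3) and (1.4) are not reproduced above; a minimal Python sketch of the usual definitions for 8-bit images (stated as an assumption, with illustrative test data) is:

import numpy as np

def mse(f, f_rec):
    # Mean Square Error between the original image f and the decoded image f_rec.
    f = f.astype(np.float64)
    f_rec = f_rec.astype(np.float64)
    return np.mean((f - f_rec) ** 2)

def psnr(f, f_rec, max_value=255.0):
    # Peak Signal-to-Noise Ratio in dB, with peak value 255 for 8-bit images.
    m = mse(f, f_rec)
    return float("inf") if m == 0 else 10.0 * np.log10(max_value ** 2 / m)

original = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
decoded = np.clip(original + np.random.randint(-3, 4, (64, 64)), 0, 255).astype(np.uint8)
print("MSE =", round(mse(original, decoded), 2), "PSNR =", round(psnr(original, decoded), 2), "dB")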

The general encoding architecture of an image compression system is shown in Fig. 1.4. The
fundamental theory and concept of each functional block will be introduced in the following
sections.

Reduce the Correlation between Pixels

Why can an image be compressed? The reason is that the correlation between one pixel and
its neighboring pixels is very high; in other words, the values of one pixel and its adjacent
pixels are very similar. Once the correlation between the pixels is reduced, we can take
advantage of the statistical characteristics and the variable-length coding theory to reduce the
storage quantity. This is the most important part of the image compression algorithm, and
many relevant processing methods have been proposed. The best-known methods are as
follows:

 Predictive Coding: predictive coding such as DPCM (Differential Pulse Code
Modulation) is a lossless coding method, which means that the decoded image and the
original image have the same value for every corresponding element.
 Orthogonal Transform: the Karhunen-Loeve Transform (KLT) and the Discrete Cosine
Transform (DCT) are the two most well-known orthogonal transforms. DCT-based
image compression standards such as JPEG are lossy coding methods that result in
some loss of detail and unrecoverable distortion.
 Subband Coding: subband coding such as the Discrete Wavelet Transform (DWT) is
also a lossy coding method. The objective of subband coding is to divide the spectrum
of an image into lowpass and highpass components. JPEG 2000 is a 2-dimensional
DWT-based image compression standard.
QUANTIZATION
The objective of quantization is to reduce the precision and to achieve a higher
compression ratio. For instance, the original image uses 8 bits to store one element for
every pixel; if we use fewer bits, such as 6 bits, to save the information of the image,
then the storage quantity will be reduced and the image can be compressed. The
shortcoming of quantization is that it is a lossy operation, which will result in loss of
precision and unrecoverable distortion. Image compression standards such as
JPEG and JPEG 2000 have their own quantization methods, and the details of the relevant
theory will be introduced in Chapter 2.

ENTROPY CODING
The main objective of entropy coding is to achieve a smaller average code length for the
image. Entropy coding assigns codewords to the corresponding symbols according to
the probability of the symbols. In general, entropy encoders compress the data by
replacing symbols represented by equal-length codes with codewords whose length is
inversely proportional to the corresponding probability. The entropy encoders of JPEG
and JPEG 2000 will also be introduced in Chapter 2.
2 AN OVERVIEW OF IMAGE COMPRESSION STANDARDS:
In this chapter, we will introduce the fundamental theory of two well-known
image compression standards – JPEG and JPEG 2000.

JPEG – JOINT PHOTOGRAPHIC EXPERTS GROUP

Figs. 2.1 and 2.2 show the encoder and decoder models of JPEG. We will introduce
the operation and fundamental theory of each block in the following sections.

DISCRETE COSINE TRANSFORM
The next step after color coordinate conversion is to divide the three color
components of the image into many 8×8 blocks. The mathematical definition of the Forward
DCT and the Inverse DCT are as follows:
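A standard form of the 8×8 forward and inverse DCT used by baseline JPEG (stated here as a reconstruction, since the original equations are not reproduced) is

F(u,v) = (1/4) C(u) C(v) Σ_{x=0..7} Σ_{y=0..7} f(x,y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]

f(x,y) = (1/4) Σ_{u=0..7} Σ_{v=0..7} C(u) C(v) F(u,v) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]

where C(u) = 1/√2 for u = 0 and C(u) = 1 otherwise.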

f(x,y) is the value of each pixel in the selected 8×8 block, and F(u,v) is the DCT coefficient
after transformation. The transform of the 8×8 block is also an 8×8 block composed of the
F(u,v) values.

The DCT is closely related to the DFT. Both take a set of points from the spatial domain and
transform them into an equivalent representation in the frequency domain. Why, then, is the
DCT more appropriate for image compression than the DFT? The two main reasons are:

1. The DCT can concentrate the energy of the transformed signal in the low
frequencies, whereas the DFT cannot. According to Parseval's theorem, the
energy is the same in the spatial domain and in the frequency domain.
Because the human eye is less sensitive to the high-frequency components,
we can focus on the low-frequency components and reduce the contribution
of the high-frequency components after taking the DCT.
2. For image compression, the DCT introduces less blocking effect than the
DFT.

After transformation, the element in the upper-left corner, corresponding to zero
frequency in both directions, is the "DC coefficient", and the rest are called "AC
coefficients".

Quantization in JPEG:
Quantization is the step where we actually throw away data. The DCT itself is a lossless
procedure: the data can be precisely recovered through the IDCT (this isn't entirely true,
because in practice no physical implementation can compute with perfect accuracy). During
quantization, every coefficient in the 8×8 DCT matrix is divided by a corresponding
quantization value. The quantized coefficient is defined in (2.3), and the reverse process is
achieved by (2.4).
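In the usual JPEG formulation (stated here as an assumption, since equations (2.3) and (2.4) are not reproduced), they take the form

F_Q(u,v) = round( F(u,v) / Q(u,v) )        (2.3)

F'(u,v) = F_Q(u,v) × Q(u,v)                (2.4)

where Q(u,v) is the corresponding entry of the quantization matrix.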

The goal of quantization is to reduce most of the less important high-frequency DCT
coefficients to zero; the more zeros we generate, the better the image will compress. The
matrix Q generally has lower numbers toward the upper left and larger numbers toward the
lower right. Although the high-frequency components are removed, the IDCT can still obtain
an approximate matrix that is close to the original 8×8 block. The JPEG committee has
recommended certain Q matrices that work well, with performance close to the optimal
condition; the Q matrices for the luminance and chrominance components are defined in
(2.5) and (2.6).

ZIGZAG SCAN:
After quantization, the DC coefficient is treated separately from the 63 AC
coefficients. The DC coefficient is a measure of the average value of the original 64 image
samples. Because there is usually strong correlation between the DC coefficients of adjacent
8×8 blocks, the quantized DC coefficient is encoded as the difference from the DC term of
the previous block. This special treatment is worthwhile, as DC coefficients frequently
contain a significant fraction of the total image energy. The other 63 entries are the AC
components. They are treated separately from the DC coefficients in the entropy coding
process.
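A small Python sketch of the zigzag ordering of an 8×8 block (illustrative only; it follows the usual JPEG zigzag figure, starting at the DC position and sweeping the anti-diagonals):

def zigzag_order(n=8):
    # (row, col) visiting order: anti-diagonals r + c = 0, 1, 2, ...; the traversal
    # direction alternates so the scan matches the usual JPEG zigzag pattern.
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

# Example: a quantized block with only two nonzero (low-frequency) coefficients.
block = [[10, 5, 0, 0, 0, 0, 0, 0]] + [[0] * 8 for _ in range(7)]
scanned = [block[r][c] for r, c in zigzag_order()]
print(scanned[:8])   # nonzero terms cluster at the front; a long tail of zeros follows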

Entropy Coding in JPEG


Differential Coding:
The mathematical representation of the differential coding is:
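Consistent with the description that follows, the differential coding of the DC terms takes the form

Diff_i = DC_i - DC_{i-1}.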

We set DC_0 = 0. The DC of the current block, DC_i, will be equal to DC_{i-1} + Diff_i.
Therefore, in the JPEG file, the first coefficient is actually the difference of the DCs.
The difference is then encoded with the Huffman coding algorithm together with the
encoding of the AC coefficients.

Question: What are the different types of redundancies in a digital image? Explain in detail.

(i) Redundancy can be broadly classified into Statistical redundancy and Psycho visual
redundancy.
(ii) Statistical redundancy can be classified into inter-pixel redundancy and coding
redundancy.

(iii) Inter-pixel redundancy can be further classified into spatial redundancy and temporal redundancy.
(iv) Spatial redundancy or correlation between neighboring pixel values.
(v) Spectral redundancy or correlation between different color planes or spectral bands.
(vi) Temporal redundancy or correlation between adjacent frames in a sequence of images in
video applications.
(vii) Image compression research aims at reducing the number of bits needed to represent an
image by removing the spatial and spectral redundancies as much as possible.
(viii) In digital image compression, three basic data redundancies can be identified and
exploited: Coding redundancy, Inter-pixel redundancy and Psychovisual redundancy.

 Coding Redundancy:
o Coding redundancy is associated with the representation of information.
o The information is represented in the form of codes.
o If the gray levels of an image are coded in a way that uses more code symbols
than absolutely necessary to represent each gray level then the resulting image
is said to contain coding redundancy.
 Inter-pixel Spatial Redundancy:
o Interpixel redundancy is due to the correlation between the neighboring pixels
in an image.
o That means neighboring pixels are not statistically independent, and the gray
levels are not equally probable.
o The value of any given pixel can be predicted from the values of its neighbors;
that is, they are highly correlated.
o The information carried by an individual pixel is relatively small. To reduce the
interpixel redundancy, the difference between adjacent pixels can be used to
represent an image.
 Inter-pixel Temporal Redundancy:
o Interpixel temporal redundancy is the statistical correlation between pixels
from successive frames in video sequence.
o Temporal redundancy is also called interframe redundancy. Temporal
redundancy can be exploited using motion compensated predictive coding.
o Removing a large amount of redundancy leads to efficient video compression.
 Psychovisual Redundancy:
o The Psychovisual redundancies exist because human perception does not
involve quantitative analysis of every pixel or luminance value in the image.
o The elimination of this real visual information is possible only because the
information itself is not essential for normal visual processing.

