
(20EC0441) DIGITAL IMAGE PROCESSING

UNIT I
Introduction To Digital Image Processing: Fundamental steps:
Image processing is often viewed as arbitrarily manipulating an image to achieve an aesthetic
standard or to support a preferred reality. However, image processing is more accurately
defined as a means of translation between the human visual system and digital imaging devices.
The human visual system does not perceive the world in the same manner as digital detectors,
with display devices imposing additional noise and bandwidth restrictions. Salient differences
between the human and digital detectors will be shown, along with some basic processing steps
for achieving translation. Image processing must be approached in a manner consistent with
the scientific method so that others may reproduce, and validate, one's results. This includes
recording and reporting processing actions, and applying similar treatments to adequate control
images.

Figure 1. Fundamental steps in digital image processing

The fundamental steps in digital image processing include:


Image acquisition is the first process, shown in Figure 1. Note that acquisition could be as
simple as being given an image that is already in digital form. Generally, the image acquisition
stage involves preprocessing, such as scaling.
Image enhancement is among the simplest and most appealing areas of digital image
processing. Basically, the idea behind enhancement techniques is to bring out detail that is
obscured, or simply to highlight certain features of interest in an image.
Image restoration is an area that also deals with improving the appearance of an image.
However, unlike enhancement, which is subjective, image restoration is objective, in the sense
that restoration techniques tend to be based on mathematical or probabilistic models of image
degradation.
Color image processing is an area that has been gaining in importance because of the
significant increase in the use of digital images over the Internet.
Wavelets are the foundation for representing images in various degrees of resolution.
Compression, as the name implies, deals with techniques for reducing the storage required to
save an image, or the bandwidth required to transmit it. Although storage technology has
improved significantly over the past decade, the same cannot be said for transmission capacity.
Morphological processing deals with tools for extracting image components that are useful in
the representation and description of shape.
Segmentation procedures partition an image into its constituent parts or objects. In general,
autonomous segmentation is one of the most difficult tasks in digital image processing.
Representation and description almost always follow the output of a segmentation stage,
which usually is raw pixel data, constituting either the boundary of a region (i.e., the set of
pixels separating one image region from another) or all the points in the region itself. Regional
representation is appropriate when the focus is on internal properties, such as texture or skeletal
shape. Description, also called feature selection, deals with extracting attributes that result in
some quantitative information of interest or are basic for differentiating one class of objects
from another.
Recognition is the process that assigns a label (e.g., “vehicle”) to an object based on its
descriptors. We conclude our coverage of digital image processing with the development of
methods for recognition of individual objects.

Example fields of application:


• Gamma-ray imaging
• X-ray imaging
• Imaging in the ultraviolet band
• Imaging in the visible and infrared bands
• Imaging in the microwave band
• Imaging in the radio band
• Ultrasound imaging

Image Sensing and Acquisition:

Most of the images in which we are interested are generated by the combination of an
“illumination” source and the reflection or absorption of energy from that source by the
elements of the “scene” being imaged. There are three principal sensor arrangements used to
transform incident energy into digital images: single sensing elements, sensor strips (line
sensors) and sensor arrays. The idea is simple: incoming energy is transformed into a voltage
by a combination of the input electrical power and sensor material that is responsive to the
type of energy being detected. The output voltage waveform is the response of the sensor, and
a digital quantity is obtained by digitizing that response.

Figure 2. Single Sensing Element

Figure 2. shows the components of a single sensing element. A familiar sensor of this type is
the photodiode, which is constructed of silicon materials and whose output is a voltage
proportional to light intensity. Using a filter in front of a sensor improves its selectivity. For
example, an optical green-transmission filter favors light in the green band of the color
spectrum. As a consequence, the sensor output would be stronger for green light than for other
visible light components.
Figure 3. Line Sensor

In order to generate a 2-D image using a single sensing element, there has to be relative
displacements in both the x- and y-directions between the sensor and the area to be imaged.

Figure 4. Array Sensor

Figure 4. shows individual sensing elements arranged in the form of a 2-D array.
Electromagnetic and ultrasonic sensing devices frequently are arranged in this manner. This is
also the predominant arrangement found in digital cameras. A typical sensor for these cameras
is a CCD (charge-coupled device) array, which can be manufactured with a broad range of
sensing properties and can be packaged in rugged arrays of 4000 × 4000 elements or more.
CCD sensors are used widely in digital cameras and other light-sensing instruments. The
response of each sensor is proportional to the integral of the light energy projected onto the
surface of the sensor, a property that is used in astronomical and other applications requiring
low-noise images.
Figure 5. Combining a single sensing element with mechanical motion to generate a 2-D
image.

Figure 5. shows an arrangement used in high-precision scanning, where a film negative is
mounted onto a drum whose mechanical rotation provides displacement in one dimension. The
sensor is mounted on a lead screw that provides motion in the perpendicular direction. A light
source is contained inside the drum. As the light passes through the film, its intensity is
modified by the film density before it is captured by the sensor. This “modulation” of the light
intensity causes corresponding variations in the sensor voltage, which are ultimately
converted to image intensity levels by digitization.

Figure 6. (a) Image acquisition using a linear sensor strip. (b) Image acquisition using a
circular sensor strip.

Sensor strips in a ring configuration are used in medical and industrial imaging to obtain cross-
sectional (“slice”) images of 3-D objects, as Figure 6(b) shows. A rotating X-ray source provides
illumination, and X-ray sensitive sensors opposite the source collect the energy that passes
through the object. This is the basis for medical and industrial computerized axial tomography
(CAT) imaging. The output of the sensors is processed by reconstruction algorithms whose
objective is to transform the sensed data into meaningful cross-sectional images.

Image Sampling and Quantization:

To create a digital image, we need to convert the continuous sensed data into a digital format.
This requires two processes: sampling and quantization. Figure 7. shows a continuous image f
that we want to convert to digital form. An image may be continuous with respect to the x- and
y-coordinates, and also in amplitude. To digitize it, we have to sample the function in both
coordinates and also in amplitude. Digitizing the coordinate values is called sampling.
Digitizing the amplitude values is called quantization.

The one-dimensional function in Figure 7 (b) is a plot of amplitude (intensity level) values of
the continuous image along the line segment AB in Figure 7 (a). The random variations are due
to image noise. To sample this function, we take equally spaced samples along line AB, as
shown in Figure 7 (c). The samples are shown as small dark squares superimposed on the
function, and their (discrete) spatial locations are indicated by corresponding tick marks in the
bottom of the figure.
The set of dark squares constitute the sampled function. However, the values of the samples
still span (vertically) a continuous range of intensity values. In order to form a digital function,
the intensity values also must be converted (quantized) into discrete quantities. The vertical
gray bar in Figure 7 (c) depicts the intensity scale divided into eight discrete intervals, ranging
from black to white. The vertical tick marks indicate the specific value assigned to each of the
eight intensity intervals. The continuous intensity levels are quantized by assigning one of the
eight values to each sample, depending on the vertical proximity of a sample to a vertical tick
mark. The digital samples resulting from both sampling and quantization are shown as white
squares in Figure 7(d). Starting at the top of the continuous image and carrying out this
procedure downward, line by line, produces a two-dimensional digital image. Figure 7 also
implies that, in addition to the number of discrete levels used, the accuracy achieved in
quantization is highly dependent on the noise content of the sampled signal.

Figure 7. (a) Continuous image. (b) A scan line showing intensity variations along line AB in
the continuous image. (c) Sampling and quantization. (d) Digital scan line. (The black border
in (a) is included for clarity. It is not part of the image.)

Figure 8. (a) Continuous image projected onto a sensor array. (b) Result of image sampling
and quantization.

When a sensing strip is used for image acquisition, the number of sensors in the strip establishes
the samples in the resulting image in one direction, and mechanical motion establishes the
number of samples in the other. Quantization of the sensor outputs completes the process of
generating a digital image. When a sensing array is used for image acquisition, no motion is
required. The number of sensors in the array establishes the limits of sampling in both
directions. Quantization of the sensor outputs is as explained above. Figure 8. illustrates this
concept.
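As a rough illustration of the two steps, the following Python/NumPy sketch (the array names, the synthetic profile and the choice of eight levels are only illustrative assumptions, mirroring Figure 7) samples a continuous-looking 1-D intensity profile and quantizes the samples to eight discrete levels:

import numpy as np

# A continuous-looking intensity profile along a scan line (illustrative only).
x = np.linspace(0.0, 1.0, 1000)
profile = 0.5 + 0.4 * np.sin(2 * np.pi * 3 * x) + 0.02 * np.random.randn(x.size)

# Sampling: keep equally spaced samples along the line (digitizing the coordinate).
num_samples = 64
samples = profile[np.linspace(0, profile.size - 1, num_samples).astype(int)]

# Quantization: map each sample to one of 8 discrete intensity levels
# (digitizing the amplitude), spread over the range [0, 1].
levels = 8
quantized = np.round(np.clip(samples, 0, 1) * (levels - 1)) / (levels - 1)

print(quantized[:10])   # first few digital samples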

Digital image representation:


We will use two principal ways to represent digital images. Assume that an image f(x, y) is
sampled so that the resulting digital image has M rows and N columns. The values of the
coordinates (x, y) now become discrete quantities. For notational clarity and convenience, we
shall use integer values for these discrete coordinates. Thus, the values of the coordinates at the
origin are (x, y) = (0, 0). The next coordinate values along the first row of the image are
represented as (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to
signify the second sample along the first row. It does not mean that these are the actual values of
physical coordinates when the image was sampled. Figure 9. shows the coordinate convention
used.

Figure 9. Coordinate convention used to represent digital images

The notation introduced in the preceding paragraph allows us to write the complete M × N digital
image in the following compact matrix form:

f(x, y) = | f(0, 0)      f(0, 1)      …   f(0, N−1)   |
          | f(1, 0)      f(1, 1)      …   f(1, N−1)   |
          | …            …                 …           |
          | f(M−1, 0)    f(M−1, 1)    …   f(M−1, N−1) |

The right side of this equation is by definition a digital image. Each element of this matrix array
is called an image element, picture element, pixel, or pel.

Classification of digital images and Image types:


Digital images can be broadly classified into two types and they are (i) raster image, and (ii)
vector image.
Raster Image or Bitmap Image:
• A raster image file is generally defined as a rectangular array of regularly sampled
values known as pixels. Scanned graphics and web graphics are the most common
forms of raster images.
• Raster images are mapped to grids which are not easily scalable. A raster image is
resolution dependent because it contains a fixed number of pixels that are used to create
the image.
• Since there are a fixed and limited number of pixels, a raster image will lose its quality
if it is enlarged beyond that number of pixels as the computer will have to ‘make up’
for the missing information.
• Bitmaps are used for photorealistic images, and therefore, involve complex colour
variations. Raster images can show well the gradations of colour and detailed images
such as photographs.
• Also, they can be acquired by optical scanners, digital CCD cameras and other raster-
imaging devices. The spatial resolution of a raster image is determined by the resolution
of the acquisition device and the quality of the original data source.
• Common raster image formats include BMP (Windows Bitmap), PCX (Paintbrush),
TIFF (Tagged Image File Format), JPEG (Joint Photographic Experts Group), GIF
(Graphics Interchange Format), PNG (Portable Network Graphics), PSD (Adobe
Photoshop) and CPT (Corel PhotoPaint).

Vector Image:
• A vector image is defined by objects which are made of lines and curves that are
mathematically defined in the computer. A vector can have various attributes such as
line thickness, length and colour.
• Vector images are mathematically defined and hence, they are easily scalable. This
implies that vectors can be printed at any size, on any output device, at any resolution,
without losing the detail and without altering the resolution of the image.
• Because scaling does not degrade them, vector images can be zoomed by any factor (for
example, three or twenty-four times) without loss of quality. Vector images are thus
suitable for typography, line art and illustrations.

Basic relationships between pixels:

Neighbors of a pixel:

A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates
are given by

(x+1, y), (x-1, y), (x, y+1), (x, y-1)

This set of pixels, called the 4-neighbors of p, is denoted by N4(p).


The four diagonal neighbors of p have coordinates

(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)

and are denoted by ND(p).


Diagonal neighbors together with the 4-neighbors are called the 8-neighbors of p, denoted
by N8(p).
N8(p) = N4(p) ∪ ND(p)
The set of image locations of the neighbors of a point p is called the neighborhood of p. The
neighborhood is said to be closed if it contains p. Otherwise, the neighborhood is said to
be open.
Figure 10. Neighbors of a pixel
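A small Python sketch (the function name and interface are our own) that returns the N4, ND and N8 coordinate sets of a pixel, clipped to the image bounds:

def neighbors(x, y, rows, cols):
    """Return (N4, ND, N8) coordinate lists of pixel p = (x, y) inside an
    image with the given number of rows and columns."""
    n4 = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    nd = [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]
    inside = lambda p: 0 <= p[0] < rows and 0 <= p[1] < cols
    n4 = [p for p in n4 if inside(p)]
    nd = [p for p in nd if inside(p)]
    return n4, nd, n4 + nd        # N8(p) is the union of N4(p) and ND(p)

print(neighbors(0, 0, 5, 5))      # corner pixel: 2 four-neighbors, 1 diagonal neighbor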
Adjacency between pixels:

Let V be the set of intensity values used to define adjacency. In a binary image, V ={1} if we
are referring to the adjacency of pixels with the value 1. In a gray-scale image, the idea is the
same but set V typically contains more elements.
For example, in the adjacency of pixels with intensity values ranging from 0 to 255, set V
might be any subset of these 256 values.
There are three types of adjacency:

• 4-adjacency: Two pixels p and q with values from V are 4-adjacent if q is in the set
N4(p).
• 8-adjacency: Two pixels p and q with values from V are 8-adjacent if q is in the set
N8(p).
• m-adjacency (mixed adjacency): Two pixels p and q with values from V are m-
adjacent if (i) q is in N4(p), or (ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no
pixels whose values are from V.
Mixed adjacency is a modification of 8-adjacency, and is introduced to eliminate the
ambiguities that may result from using 8-adjacency.

Figure 11. (a) An arrangement of pixels. (b) Pixels that are 8-adjacent (adjacency is shown by
dashed lines). (c) m-adjacency.

Connectivity between pixels:


It is used to define the boundaries of objects and region components in an image.
Let S represent a subset of pixels in an image. Two pixels p and q are said to
be connected in S if there exists a path between them consisting entirely of pixels in S.

For any pixel p in S, the set of pixels that are connected to it in S is called a connected
component of S. If S has only one connected component, then S is called a connected set.
On the basis of adjacency, there are three forms of connectivity. They are as follows:

• 4-connectivity: If two or more pixels are 4-adjacent to each other, they are said to be
4-connected.
• 8-connectivity: If two or more pixels are 8-adjacent to each other, they are said to be
8-connected.
• M-connectivity: If two or more pixels are m-adjacent to each other, they are said to be
m-connected.

Figure 12. Connectivity of pixels

Path:
A (digital) path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is
a sequence of distinct pixels with coordinates (x0, y0), (x1, y1), …, (xn, yn), where (x0, y0) = (x, y),
(xn, yn) = (s, t), and pixels (xi, yi) and (xi−1, yi−1) are adjacent for 1 ≤ i ≤ n. Here, n is the length
of the path.

Region:

Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in
S if there exists a path between them consisting entirely of pixels in S. For any pixel p in S, the
set of pixels that are connected to it in S is called a connected component of S. If it only has
one connected component, then set S is called a connected set.
Let R be a subset of pixels in an image. We call R a region of the image if R is a connected set.
The boundary (also called border or contour) of a region R is the set of pixels in the region that
have one or more neighbors that are not in R. If R happens to be an entire image (which we
recall is a rectangular set of pixels), then its boundary is defined as the set of pixels in the first
and last rows and columns of the image. This extra definition is required because an image has
no neighbors beyond its border. Normally, when we refer to a region, we are referring to a
subset of an image, and any pixels in the boundary of the region that happen to coincide with
the border of the image are included implicitly as part of the region boundary.

Distance Measures:
For pixels p and q with coordinates (x, y) and (s, t), respectively, commonly used distance
measures are the Euclidean distance De(p, q) = [(x − s)² + (y − t)²]^(1/2), the city-block (D4)
distance D4(p, q) = |x − s| + |y − t|, and the chessboard (D8) distance
D8(p, q) = max(|x − s|, |y − t|).
Mathematical tools/operations applied on images:

Elementwise versus matrix operations:


An elementwise operation involving one or more images is carried out on a pixel-by pixel
basis. We mentioned earlier in this chapter that images can be viewed equivalently as matrices.
In fact, as you will see later in this section, there are many situations in which operations
between images are carried out using matrix theory. It is for this reason that a clear distinction
must be made between elementwise and matrix operations. For example, consider the
following 2 × 2 images (matrices):

A = | a11  a12 |        B = | b11  b12 |
    | a21  a22 |            | b21  b22 |

Their elementwise product is

| a11 b11   a12 b12 |
| a21 b21   a22 b22 |

That is, the elementwise product is obtained by multiplying pairs of corresponding pixels. On
the other hand, the matrix product of the images is formed using the rules of matrix
multiplication:

| a11 b11 + a12 b21    a11 b12 + a12 b22 |
| a21 b11 + a22 b21    a21 b12 + a22 b22 |

We assume elementwise operations throughout the book, unless stated otherwise. For example,
when we refer to raising an image to a power, we mean that each individual pixel is raised to
that power; when we refer to dividing an image by another, we mean that the division is
between corresponding pixel pairs, and so on. The terms elementwise addition and subtraction
of two images are redundant because these are elementwise operations by definition. However,
you may see them used sometimes to clarify notational ambiguities.
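The distinction is easy to verify numerically; in NumPy (a minimal sketch, using two arbitrary 2 × 2 arrays of our own choosing) the * operator is elementwise while the @ operator performs matrix multiplication:

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print(A * B)   # elementwise product: [[ 5 12], [21 32]]
print(A @ B)   # matrix product:      [[19 22], [43 50]]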

Linear Versus Nonlinear Operations:


One of the most important classifications of an image processing method is whether it is linear
or nonlinear. Consider a general operator H that produces an output image, g(x, y), from a given
input image, f(x, y):

g(x, y) = H[ f(x, y) ]          (1)

H is said to be a linear operator if, for any two images f1 and f2 and any two scalars a and b,

H[ a f1(x, y) + b f2(x, y) ] = a H[ f1(x, y) ] + b H[ f2(x, y) ]          (2)
This equation indicates that the output of a linear operation applied to the sum of two inputs is
the same as performing the operation individually on the inputs and then summing the results.
In addition, the output of a linear operation on a constant multiplied by an input is the same as
the output of the operation due to the original input multiplied by that constant. The first
property is called the property of additivity, and the second is called the property of
homogeneity. By definition, an operator that fails to satisfy Eq. (2) is said to be nonlinear.

Set and Logical Operations:

Basic Set Operations:


Figure 13. Venn diagrams corresponding to the set operations

Logical Operations:
Logical operations deal with TRUE (typically denoted by 1) and FALSE (typically denoted by
0) variables and expressions. For our purposes, this means binary images composed of
foreground (1-valued) pixels, and a background composed of 0-valued pixels.
We work with set and logical operators on binary images using one of two basic approaches:
(1) we can use the coordinates of individual regions of foreground pixels in a single image as
sets, or
(2) we can work with one or more images of the same size and perform logical operations
between corresponding pixels in those arrays.
In the first category, a binary image can be viewed as a Venn diagram in which the coordinates
of individual regions of 1-valued pixels are treated as sets. The union of these sets with the set
composed of 0-valued pixels comprises the set universe, A. In this representation, we work
with single images using all the set operations defined in the previous section. For example,
given a binary image with two 1-valued regions, R1 and R2, we can determine if the regions
overlap (i.e., if they have at least one pair of coordinates in common) by performing the set
intersection operation R1 ∩ R2. In the second approach, we perform logical operations on the
pixels of one binary image, or on the corresponding pixels of two or more binary images of the
same size.

Table 1. Truth table defining the logical operators AND, OR, and NOT.

Logical operators can be defined in terms of truth tables, as Table 1 shows for two logical
variables a and b. The logical AND operation yields a 1 (TRUE) only when both a and b are
1; otherwise, it yields 0 (FALSE). Similarly, the logical OR yields 1 when either a or b, or both,
are 1, and 0 otherwise. The NOT operator is self-explanatory. When applied to two binary
images, AND and OR operate on pairs of corresponding pixels between the images. That is,
they are elementwise operators in this context. The operators AND, OR, and NOT are
functionally complete, in the sense that they can be used as the basis for constructing any other
logical operator.

Figure 14. Logical operations involving foreground (white) pixels. Black represents binary 0s
and white binary 1s. The dashed lines are shown for reference only; they are not part of the
result.
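A minimal NumPy sketch of the second approach, elementwise logical operations between two binary images of the same size (the example arrays are our own illustrative choices):

import numpy as np

A = np.array([[0, 1, 1],
              [0, 1, 0],
              [0, 0, 0]], dtype=bool)
B = np.array([[1, 1, 0],
              [0, 1, 0],
              [1, 0, 0]], dtype=bool)

print((A & B).astype(int))    # AND: foreground common to both images
print((A | B).astype(int))    # OR : foreground present in either image
print((~A).astype(int))       # NOT: complement of A
print((A & ~B).astype(int))   # A AND (NOT B): foreground of A not in B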
UNIT II
Image Transforms:

An image transform is a mathematical tool for representing an image (signal). Working with the
transformed version of an image rather than the image itself may give us more insight into the
image. Image transforms are useful for several reasons:
1. They offer mathematical convenience: every operation on an image in the spatial (time)
domain has a corresponding effect in its frequency-domain representation. In particular,
convolution in the spatial domain corresponds to multiplication in the frequency domain, and
because direct convolution is computationally expensive, working in the transform domain
allows fast computation.
2. They offer better processing and extraction of information. By analogy, the original image is
like white light, and the transform acts like a prism that separates it into its constituent
components (the transformed image). The transform therefore gives better insight into the image
without changing the information content of the signal, making it easy to identify where the
image varies smoothly, moderately, or rapidly. In this way, an image transform sets apart the
relevant parts of the image so that they are readily accessible for analysis.
3. They make the storage and transmission of images efficient.
4. They offer energy compaction, allowing us to keep only a few transform coefficients of an
image, which can be stored or transmitted and from which the image can be recovered with
little or no loss of information.
5. They provide an alternative representation of an image that can be directly sensed and
measured in medical and astrophysics applications.

2D Orthogonal and Unitary transforms:

Properties of Unitary transforms:


The property of energy preservation:
1D and 2D Discrete Fourier Transform:
Discrete Fourier Transform
The Fourier transform was developed by Jean Baptiste Joseph Fourier to explain the distribution
of temperature and heat conduction. Fourier transform is widely used in the field of image
processing. An image is a spatially varying function. One way to analyse spatial variations is to
decompose an image into a set of orthogonal functions, one such set being the Fourier functions.
A Fourier transform is used to transform an intensity image into the domain of spatial frequency.
1D-DFT:
For a 1-D sequence f(n), n = 0, 1, …, N−1, the discrete Fourier transform is

F(k) = Σ n=0…N−1  f(n) e^(−j2πkn/N),   k = 0, 1, …, N−1

2D-DFT:
For an M × N image f(m, n), the 2-D discrete Fourier transform is

F(k, l) = Σ m=0…M−1 Σ n=0…N−1  f(m, n) e^(−j2π(km/M + ln/N)),   k = 0, …, M−1,  l = 0, …, N−1
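In practice the 1-D and 2-D DFTs are computed with the fast Fourier transform; a minimal NumPy sketch (the random test image is an illustrative assumption) shows the forward transform, the centred log-magnitude spectrum and the perfect reversibility mentioned later in these notes:

import numpy as np

f = np.random.rand(64, 64)               # any grayscale image f(m, n)

F = np.fft.fft2(f)                       # 2-D DFT, F(k, l)
F_shifted = np.fft.fftshift(F)           # move the zero-frequency term to the centre
spectrum = np.log1p(np.abs(F_shifted))   # log magnitude, convenient for display

f_back = np.real(np.fft.ifft2(F))        # inverse 2-D DFT recovers the image
print(np.allclose(f, f_back))            # True: the transform is perfectly reversible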

Discrete Cosine transforms:


Hadamard Transforms:
The Hadamard transform is basically the same as the Walsh transform except the rows of the
transform matrix are re-ordered. The elements of the mutually orthogonal basis vectors of a
Hadamard transform are either +1 or –1, which results in very low computational complexity
in the calculation of the transform coefficients.
Hadamard matrices are easily constructed for N = 2^n by the following procedure.
The order N = 2 Hadamard matrix is given as

H2 = (1/√2) |  1   1 |
            |  1  −1 |

and higher-order matrices follow from the recursion

H2N = (1/√2) |  HN   HN |
             |  HN  −HN |

Substituting N = 2 in the above recursion gives the order-4 Hadamard matrix.
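Because of this recursive structure, a Hadamard matrix can be generated with a Kronecker product; the sketch below (using unnormalized ±1 entries, which is one common convention) builds the order-8 matrix and checks that its rows are mutually orthogonal:

import numpy as np

def hadamard(N):
    """Return the N x N Hadamard matrix (N must be a power of 2), entries +1/-1."""
    H = np.array([[1]])
    while H.shape[0] < N:
        H = np.kron(np.array([[1, 1], [1, -1]]), H)   # build H_2N from H_N
    return H

H8 = hadamard(8)
print(H8)
print(H8 @ H8.T)   # equals 8 * I, confirming the rows are mutually orthogonal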


Walsh Transform:
• Fourier analysis is basically the representation of a signal by a set of orthogonal
sinusoidal waveforms. The coefficients of this representation are called frequency
components and the waveforms are ordered by frequency.
• Walsh in 1923 introduced a complete set of orthonormal square-wave functions that can
be used in place of sinusoids for such a representation.
• The computational simplicity of the Walsh function is due to the fact that Walsh
functions are real and they take only two values which are either +1 or –1.

• In the defining equation of the Walsh functions, n represents the time index, k represents
the frequency index and N represents the order. Also, m represents the number of bits
needed to represent N, and bi(n) represents the i-th bit (counting from the LSB) of the
binary representation of the decimal number n.

Haar transforms:
The Haar transform is based on a class of orthogonal matrices whose elements are either 1,
−1, or 0, multiplied by powers of √2. The Haar transform is a computationally efficient
transform: the transform of an N-point vector requires only 2(N − 1) additions and N
multiplications.

Flow chart for Haar Transform:

Comparison of image transforms.

UNIT III

Image Enhancement:
• The goal of image enhancement is to improve the interpretability of the information
present in images for human viewers. An enhancement algorithm is one that yields a
better-quality image for a particular application, which can be achieved by either
suppressing the noise or increasing the image contrast.
• Image enhancement algorithms are employed to emphasise, sharpen or smoothen image
features for display and analysis. Enhancement methods are application specific and
are often developed empirically. Image enhancement techniques emphasise specific
image features to improve the visual perception of an image.
• Image-enhancement techniques can be classified into two broad categories: (1) spatial
domain methods, and (2) transform domain methods.

Background and basic intensity transformation:


Point Operations:
• In a point operation, each pixel is modified by an equation that does not depend on
other pixel values. It is defined by the equation

g(m, n) = T[ f(m, n) ]

where T operates on a single pixel; that is, there exists a one-to-one mapping between the
input image f (m, n) and the output image g (m, n).
• In point operation, each pixel value is mapped to a new pixel value. Point operations
are basically memoryless operations. In a point operation, the enhancement at any point
depends only on the image value at that point.
• The point operation maps the input image f (m, n) to the output image g (m, n) which
is illustrated in Figure 1.
• From the figure, it is obvious that every pixel of f (m, n) with the same gray level maps
to a single gray value in the output image.

Figure 1. Illustration of point operation


• Some of the examples of point operation include (i) brightness modification, (ii)
contrast manipulation, and (iii) histogram manipulation.
• The brightness of an image depends on the value associated with the pixel of the image.
When changing the brightness of an image, a constant is added or subtracted from the
luminance of all sample values. The brightness of the image can be increased by adding
a constant value to each and every pixel of the image. Similarly, the brightness can be
decreased by subtracting a constant value from each and every pixel of the image.
• Contrast adjustment is done by scaling all the pixels of the image by a constant k; it is
given by

g(m, n) = k · f(m, n)

• Changing the contrast of an image changes the range of luminance values present in
the image.
• Histogram manipulation basically modifies the histogram of an input image so as to
improve the visual quality of the image. In order to understand histogram
manipulation, it is necessary that one should have some basic knowledge about the
histogram of the image.
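A minimal sketch of the first two point operations in NumPy for an 8-bit image (the constants 40 and 1.5 below are arbitrary illustrative choices):

import numpy as np

def adjust_brightness(img, offset):
    """Add a constant to every pixel (positive brightens, negative darkens)."""
    return np.clip(img.astype(np.int32) + offset, 0, 255).astype(np.uint8)

def adjust_contrast(img, k):
    """Scale every pixel by a constant k (k > 1 increases contrast)."""
    return np.clip(img.astype(np.float64) * k, 0, 255).astype(np.uint8)

img = (np.random.rand(4, 4) * 255).astype(np.uint8)   # stand-in 8-bit image
print(adjust_brightness(img, 40))
print(adjust_contrast(img, 1.5))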

Contrast stretching:
• Sometimes during image acquisition low contrast images may result due to one of the
following reasons:
- Poor illumination
- Lack of dynamic range in the image sensor and
- Wrong setting of the lens aperture.
• Contrast stretching expands the range of intensity levels in an image so that it spans
the ideal full intensity range of the recording medium or display device. An intensity
(also called a gray-level or mapping) transformation function has the form
s = T(r)
where s and r denote, respectively, the intensities of g and f at any point (x, y).

Figure 2. Intensity transformation functions. (a) Contrast stretching function.


• If T(r) has the form shown in Figure 2(a), the result of applying the transformation to
every pixel in f to generate the corresponding pixels in g is an image of higher contrast
than the original, produced by darkening the intensity levels below k and brightening
the levels above k.
• In this technique, values of r lower than k reduce (darken) the values of s, toward
black. The opposite is true for values of r higher than k.
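A common piecewise-linear realisation of s = T(r) is min–max (full-range) stretching; the NumPy sketch below is one such mapping, not the only possible choice of T(r):

import numpy as np

def stretch_contrast(img, L=256):
    """Linearly map [r_min, r_max] of the input onto the full range [0, L-1]."""
    r = img.astype(np.float64)
    r_min, r_max = r.min(), r.max()
    if r_max == r_min:                       # flat image: nothing to stretch
        return img.copy()
    s = (r - r_min) / (r_max - r_min) * (L - 1)
    return s.astype(np.uint8)

low_contrast = (np.random.rand(4, 4) * 60 + 100).astype(np.uint8)  # values roughly 100-160
print(stretch_contrast(low_contrast))        # values now span roughly 0-255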

Negative image transformation:


• The negative of an image with intensity levels in the range [0, L−1] is obtained by
using the negative transformation function shown in Figure 3, which has the form

s = L − 1 − r
Figure 3. Graphical representation of negation

• Reversing the intensity levels of a digital image in this manner produces the
equivalent of a photographic negative. This type of processing is used, for example, in
enhancing white or gray detail embedded in dark regions of an image, especially
when the black areas are dominant in size.

Figure 4. (a) A digital mammogram. (b) Negative image

• Negative images are useful in the display of medical images and in producing negative
prints of images. Figure 4 shows an example. The original image is a digital
mammogram showing a small lesion. Despite the fact that the visual content is the
same in both images, some viewers find it easier to analyze the fine details of the
breast tissue using the negative image.
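The negative transformation is a single vectorised expression; a sketch for an 8-bit image (L = 256):

import numpy as np

def negative(img, L=256):
    """Apply s = L - 1 - r to every pixel of an unsigned-integer image."""
    return (L - 1 - img.astype(np.int32)).astype(np.uint8)

img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
print(negative(img))   # [[255 191], [127   0]]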

Intensity level slicing operation:


• There are applications in which it is of interest to highlight a specific range of intensities
in an image. Some of these applications include enhancing features in satellite imagery,
such as masses of water, and enhancing flaws in X-ray images.
• The method, called intensity-level slicing, can be implemented in several ways, but
most are variations of two basic themes.
• One approach is to display in one value (say, white) all the values in the range of interest
and in another (say, black) all other intensities. This transformation, shown in Figure
5(a), produces a binary image.
• The second approach, based on the transformation in Figure 5(b), brightens (or darkens)
the desired range of intensities, but leaves all other intensity levels in the image
unchanged.
Figure 5. (a) This transformation function highlights range [A, B] and reduces all other intensities to a
lower level. (b) This function highlights range [A, B] and leaves other intensities unchanged.
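Both variations are simple masking operations; a NumPy sketch (the range [A, B] and the output values are illustrative assumptions):

import numpy as np

def slice_binary(img, A, B):
    """Variant 1: white inside [A, B], black everywhere else (binary output)."""
    return np.where((img >= A) & (img <= B), 255, 0).astype(np.uint8)

def slice_preserve(img, A, B, highlight=255):
    """Variant 2: brighten the range [A, B], leave other intensities unchanged."""
    out = img.copy()
    out[(img >= A) & (img <= B)] = highlight
    return out

img = (np.random.rand(4, 4) * 255).astype(np.uint8)
print(slice_binary(img, 100, 180))
print(slice_preserve(img, 100, 180))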

Bit extraction operation:


• Pixel values are integers composed of bits. For example, values in a 256-level grayscale
image are composed of 8 bits (one byte). Instead of highlighting intensity-level ranges,
we could highlight the contribution made to total image appearance by specific bits.
• As Figure 6 illustrates, an 8-bit image may be considered as being composed of eight
one-bit planes, with plane 1 containing the lowest-order bit of all pixels in the image,
and plane 8 all the highest-order bits.

Figure 6. Bit-planes of an 8-bit image.


• Observe that the four higher-order bit planes, especially the first two, contain a
significant amount of the visually-significant data.
• The lower-order planes contribute to more subtle intensity details in the image. The
original image has a gray border whose intensity is 194. Notice that the corresponding
borders of some of the bit planes are black (0), while others are white (1).
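Bit planes are obtained by masking and shifting; a minimal sketch that extracts all eight planes of an 8-bit image and rebuilds the image from them:

import numpy as np

def bit_planes(img):
    """Return a list of 8 binary arrays; index 0 = plane 1 (LSB), index 7 = plane 8 (MSB)."""
    return [(img >> b) & 1 for b in range(8)]

img = np.array([[194, 7], [128, 255]], dtype=np.uint8)
planes = bit_planes(img)
print(planes[7])   # highest-order bit plane (plane 8)
print(planes[0])   # lowest-order bit plane (plane 1)

# The image can be rebuilt exactly by weighting each plane by 2^b and summing.
rebuilt = sum((p.astype(np.uint8) << b) for b, p in enumerate(planes))
print(np.array_equal(rebuilt, img))   # True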

Histogram Processing:
• Histogram manipulation basically modifies the histogram of an input image so as to
improve the visual quality of the image. In order to understand histogram manipulation,
it is necessary that one should have some basic knowledge about the histogram of the
image.
• The histogram of an image is a plot of the number of occurrences of gray levels in the
image against the gray-level values. The histogram provides a convenient summary of
the intensities in an image, but it is unable to convey any information regarding spatial
relationships between pixels. The histogram provides more insight about image contrast
and brightness.
• Histogram shape is related to image appearance. For example, Figure 7 shows images
with four basic intensity characteristics: dark, light, low contrast, and high contrast; the
image histograms are also shown.
• We note in the dark image that the most populated histogram bins are concentrated on
the lower (dark) end of the intensity scale. Similarly, the most populated bins of the
light image are biased toward the higher end of the scale.
• An image with low contrast has a narrow histogram located typically toward the middle
of the intensity scale. Finally, we see that the components of the histogram of the high-
contrast image cover a wide range of the intensity scale, and the distribution of pixels
is not too far from uniform, with few bins being much higher than the others.
• Intuitively, it is reasonable to conclude that an image whose pixels tend to occupy the
entire range of possible intensity levels and, tend to be distributed uniformly, will have
an appearance of high contrast and will exhibit a large variety of gray tones.
• The net effect will be an image that shows a great deal of gray-level detail and has a
high dynamic range.

Figure 7. Four image types and their corresponding histograms: (a) dark; (b) light; (c) low contrast; (d) high
contrast. The horizontal axis of each histogram corresponds to values of rk and the vertical axis to values of p(rk).
Procedure for histogram processing and uses of the histogram:

Uses of Histogram:
• Statistics obtained directly from an image histogram can be used for image
enhancement. Let r denote a discrete random variable representing intensity values in
the range [0,L - 1], and let p(ri) denote the normalized histogram component
corresponding to intensity value ri.
• For an image with intensity levels in the range [0, L−1], the nth moment of r about its
mean, m, is defined as

μn(r) = Σ i=0…L−1  (ri − m)^n p(ri)

where m is given by

m = Σ i=0…L−1  ri p(ri)

The mean is a measure of average intensity, and the variance (or equivalently its square root,
the standard deviation σ),

σ² = Σ i=0…L−1  (ri − m)² p(ri)

is a measure of image contrast.

• There are two uses of the mean and variance for enhancement purposes. The global
mean and variance are computed over an entire image and are useful for gross
adjustments in overall intensity and contrast.
• A more powerful use of these parameters is in local enhancement, where the local mean
and variance are used as the basis for making changes that depend on image
characteristics in a neighborhood about each pixel in an image.
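The normalised histogram, global mean and variance can be computed directly; a NumPy sketch for an 8-bit image (the random test image is only illustrative):

import numpy as np

def histogram_stats(img, L=256):
    """Return (normalised histogram p(r_i), mean, variance) of a grayscale image."""
    hist, _ = np.histogram(img, bins=L, range=(0, L))
    p = hist / img.size                     # p(r_i): fraction of pixels at level r_i
    r = np.arange(L)
    mean = np.sum(r * p)                    # m = sum r_i p(r_i)
    var = np.sum((r - mean) ** 2 * p)       # sigma^2 = sum (r_i - m)^2 p(r_i)
    return p, mean, var

img = (np.random.rand(64, 64) * 255).astype(np.uint8)
p, m, v = histogram_stats(img)
print(m, np.sqrt(v))                        # global mean and standard deviation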
Spatial filtering:

• A linear spatial filter performs a sum-of-products operation between an image f and a


filter kernel, w. The kernel is an array whose size defines the neighborhood of operation,
and whose coefficients determine the nature of the filter. Other terms used to refer to a
spatial filter kernel are mask, template, and window. We use the term filter kernel or
simply kernel.
• Figure 8 illustrates the mechanics of linear spatial filtering using a 3 × 3 kernel. At any
point (x, y) in the image, the response, g(x, y), of the filter is the sum of products of the
kernel coefficients and the image pixels encompassed by the kernel:

g(x, y) = w(−1,−1) f(x−1, y−1) + w(−1, 0) f(x−1, y) + … + w(0, 0) f(x, y) + … + w(1, 1) f(x+1, y+1)        (1)
• As coordinates x and y are varied, the center of the kernel moves from pixel to pixel,
generating the filtered image, g, in the process.

Figure 8. Mechanics of Spatial Filtering

• Observe that the center coefficient of the kernel, w(0, 0), aligns with the pixel at location
(x, y). For a kernel of size m × n, we assume that m = 2a + 1 and n = 2b + 1, where a
and b are nonnegative integers. This means that our focus is on kernels of odd size in
both coordinate directions. In general, linear spatial filtering of an image of size M × N
with a kernel of size m × n is given by the expression

g(x, y) = Σ s=−a…a Σ t=−b…b  w(s, t) f(x + s, y + t)        (2)
where x and y are varied so that the center (origin) of the kernel visits every pixel in f
once. For a fixed value of (x, y), Eq. (2) implements the sum of products of the form
shown in Eq. (1), but for a kernel of arbitrary odd size.
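Equation (2) translates directly into a loop over the kernel coefficients; the sketch below implements the sum of products with zero padding at the border (one common, though not the only, padding choice):

import numpy as np

def linear_spatial_filter(f, w):
    """Correlate image f with kernel w (odd size m x n), zero-padding the border."""
    m, n = w.shape
    a, b = m // 2, n // 2                       # m = 2a + 1, n = 2b + 1
    padded = np.pad(f.astype(np.float64), ((a, a), (b, b)))
    g = np.zeros(f.shape, dtype=np.float64)
    for s in range(-a, a + 1):                  # visit every kernel coefficient
        for t in range(-b, b + 1):
            g += w[s + a, t + b] * padded[a + s : a + s + f.shape[0],
                                          b + t : b + t + f.shape[1]]
    return g

box = np.ones((3, 3)) / 9.0                     # 3 x 3 box (averaging) kernel
f = (np.random.rand(6, 6) * 255).astype(np.uint8)
print(linear_spatial_filter(f, box))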

Image smoothing spatial filters:


• Smoothing (also called averaging) spatial filters are used to reduce sharp transitions in
intensity. Because random noise typically consists of sharp transitions in intensity, an
obvious application of smoothing is noise reduction.
• Smoothing is used to reduce irrelevant detail in an image, where “irrelevant” refers to
pixel regions that are small with respect to the size of the filter kernel. Another
application is for smoothing the false contours that result from using an insufficient
number of intensity levels in an image.
• Both linear and non-linear smoothing filters are available in the spatial domain.
Box Filter:
• The box filter is a linear spatial-domain filter whose kernel is the box kernel, in which
all coefficients have the same value (typically 1). The name “box kernel” comes from a
constant kernel resembling a box when viewed in 3-D.

• A 3 × 3 box filter was shown in the figure above. An m × n box filter is an m × n array
of 1’s, with a normalizing constant in front, whose value is 1 divided by the sum of the
values of the coefficients (i.e., 1/mn when all the coefficients are 1’s).
• This normalization, which we apply to all lowpass kernels, has two purposes. First, the
average value of an area of constant intensity would equal that intensity in the filtered
image, as it should.
• Second, normalizing the kernel in this way prevents introducing a bias during filtering;
that is, the sum of the pixels in the original and filtered images will be the same. Because
in a box kernel all rows and columns are identical, the rank of these kernels is 1, which,
as we discussed earlier, means that they are separable.
Weighted Average Filter:
• The mask of a typical 3 × 3 weighted average filter is given by

        1    | 1  2  1 |
       ---   | 2  4  2 |
        16   | 1  2  1 |

• From the mask, it is obvious that the pixels nearest to the centre are weighted more than
the distant pixels; hence the name weighted average filter. The pixel to be updated is
replaced by the sum of the nearby pixel values times the weights given in the matrix,
divided by the sum of the coefficients in the matrix.

Order Statistics (Non-Linear) Filters:


• Order-statistic filters are nonlinear spatial filters whose response is based on ordering
(ranking) the pixels contained in the region encompassed by the filter. Smoothing is
achieved by replacing the value of the center pixel with the value determined by the
ranking result.
• There are three basic types of filters available: Median filter, Min and Max filters.
Median Filter:
• The median filter, as its name implies, replaces the value of the center pixel by the
median of the intensity values in the neighborhood of that pixel (the value of the
center pixel is included in computing the median).
• Median filters provide excellent noise reduction capabilities for certain types of random
noise, with considerably less blurring than linear smoothing filters of similar size.
Median filters are particularly effective in the presence of impulse noise (sometimes
called salt-and-pepper noise, when it manifests itself as white and black dots
superimposed on an image).
• In the same way, taking the maximum or the minimum of the sorted pixel values in the
neighborhood gives the max filter or the min filter, respectively.
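A direct (unoptimised) median-filter sketch; production code would normally use a library routine such as scipy.ndimage.median_filter, but the explicit loop makes the ranking step visible:

import numpy as np

def median_filter(img, size=3):
    """Replace each pixel by the median of its size x size neighbourhood
    (edge-replicated padding keeps the output the same size as the input)."""
    k = size // 2
    padded = np.pad(img, k, mode="edge")
    out = np.empty_like(img)
    rows, cols = img.shape
    for x in range(rows):
        for y in range(cols):
            window = padded[x : x + size, y : y + size]
            out[x, y] = np.median(window)       # ranking step: pick the middle value
    return out

noisy = np.full((5, 5), 100, dtype=np.uint8)
noisy[2, 2], noisy[1, 3] = 255, 0               # isolated salt and pepper pixels
print(median_filter(noisy))                     # impulses removed, edges preserved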

Image sharpening spatial filters:


Sharpening highlights transitions in intensity. Uses of image sharpening range from electronic
printing and medical imaging to industrial inspection and autonomous guidance in military
systems.
• Image blurring could be accomplished in the spatial domain by pixel averaging
(smoothing) in a neighborhood. Because averaging is analogous to integration, it is
logical to conclude that sharpening can be accomplished by spatial differentiation.
• Sharpening is often referred to as highpass filtering. In this case, high frequencies
(which are responsible for fine details) are passed, while low frequencies are attenuated
or rejected.
The second-order derivative, or Laplacian:
• The approach consists of defining a discrete formulation of the second-order derivative
and then constructing a filter kernel based on that formulation.
• The simplest isotropic derivative operator (kernel) is the Laplacian, which, for a function
(image) f (x, y) of two variables, is defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²

Because derivatives of any order are linear operations, the Laplacian is a linear operator.
• In the x-direction, the discrete second derivative is

∂²f/∂x² = f(x+1, y) + f(x−1, y) − 2 f(x, y)

and similarly in the y-direction.
• It follows from the preceding equations that the discrete Laplacian of two variables is

∇²f(x, y) = f(x+1, y) + f(x−1, y) + f(x, y+1) + f(x, y−1) − 4 f(x, y)

• This equation can be implemented using convolution with the kernel shown below.

Figure 9. (a) Laplacian kernel. (b) Kernel including diagonal terms. (c) and (d) Two other Laplacian kernels.
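Sharpening with the Laplacian adds or subtracts the Laplacian from the original image depending on the sign of the kernel centre; a sketch using the kernel with a −4 centre and scipy.ndimage for the convolution (scipy is assumed to be available):

import numpy as np
from scipy.ndimage import convolve

laplacian_kernel = np.array([[0,  1, 0],
                             [1, -4, 1],
                             [0,  1, 0]], dtype=np.float64)

def laplacian_sharpen(img):
    """g(x, y) = f(x, y) - lap(x, y); we subtract because this kernel's
    centre coefficient is negative."""
    f = img.astype(np.float64)
    lap = convolve(f, laplacian_kernel, mode="nearest")
    return np.clip(f - lap, 0, 255).astype(np.uint8)

img = (np.random.rand(8, 8) * 255).astype(np.uint8)
print(laplacian_sharpen(img))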

Unsharp Masking and High Boost Filtering:

• Subtracting an unsharp (smoothed) version of an image from the original image is a
process that has been used since the 1930s by the printing and publishing industry to
sharpen images. This process, called unsharp masking, consists of the following steps:
1. Blur the original image.
2. Subtract the blurred image from the original (the resulting difference is called the mask):

g_mask(x, y) = f(x, y) − f_blurred(x, y)

3. Add a weighted portion of the mask back to the original:

g(x, y) = f(x, y) + k · g_mask(x, y)

where we included a weight, k (k ≥ 0), for generality.


• When k = 1 we have unsharp masking, as defined above. When k > 1, the process is
referred to as high boost filtering. Choosing k < 1 reduces the contribution of the
unsharp mask.
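The three steps translate directly into code; the sketch below uses a Gaussian blur from scipy.ndimage (assumed available) and a weight k that switches between unsharp masking (k = 1) and high-boost filtering (k > 1):

import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(img, sigma=2.0, k=1.0):
    """Unsharp masking (k = 1) / high-boost filtering (k > 1)."""
    f = img.astype(np.float64)
    blurred = gaussian_filter(f, sigma)     # step 1: blur the original image
    mask = f - blurred                      # step 2: mask = original - blurred
    g = f + k * mask                        # step 3: add the weighted mask back
    return np.clip(g, 0, 255).astype(np.uint8)

img = (np.random.rand(8, 8) * 255).astype(np.uint8)
print(unsharp_mask(img, sigma=1.0, k=1.5))  # mild high-boost sharpening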
First Order Derivatives-The Gradient Operator:
• First derivatives in image processing are implemented using the magnitude of the
gradient. The gradient of an image f at coordinates (x, y) is defined as the two-
dimensional column vector

∇f = [ gx, gy ]ᵀ = [ ∂f/∂x, ∂f/∂y ]ᵀ

• This vector has the important geometrical property that it points in the direction of the
greatest rate of change of f at location (x, y).
• The magnitude of the gradient,

M(x, y) = sqrt( gx² + gy² )

is the value at (x, y) of the rate of change in the direction of the gradient vector. It is
more suitable computationally to approximate the squares and square-root operations
by absolute values:

M(x, y) ≈ |gx| + |gy|

There are some different implementations:


• Roberts cross-gradient operators: using the 3 × 3 image region of Figure 10(a), with
pixels labelled z1 through z9, the Roberts operators are based on the cross differences

gx = z9 − z5   and   gy = z8 − z6

and we compute the gradient image as

M(x, y) = sqrt( (z9 − z5)² + (z8 − z6)² ) ≈ |z9 − z5| + |z8 − z6|

where it is understood that x and y vary over the dimensions of the image.
The corresponding 2 × 2 kernels, shown in Figures 10(b) and (c), are referred to as the
Roberts cross-gradient operators.

Figure 10. (a) A 3×3 region of an image. (b) and (c) Roberts cross-gradient operators. (d) and (e)
Sobel operators.
Sobel:
• We mostly prefer to use kernels of odd sizes because they have a unique (integer) center
of spatial symmetry. The smallest kernels in which we are interested are of size 3 × 3.
Approximations to gx and gy using a 3 × 3 neighborhood centered on z5 are as follows:

gx = ∂f/∂x = (z7 + 2 z8 + z9) − (z1 + 2 z2 + z3)
gy = ∂f/∂y = (z3 + 2 z6 + z9) − (z1 + 2 z4 + z7)

• These equations can be implemented using the kernels shown in Figures 10(d) and (e).
The difference between the third and first rows of the 3 × 3 image region approximates
the partial derivative in the x-direction, and the difference between the third and first
columns approximates the partial derivative in the y-direction. We then obtain the
magnitude of the gradient as

M(x, y) ≈ |gx| + |gy|
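A sketch of the Sobel gradient magnitude using the two 3 × 3 kernels above and the absolute-value approximation (scipy.ndimage is again assumed for the convolutions; the test image is a synthetic vertical edge):

import numpy as np
from scipy.ndimage import convolve

sobel_x = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]], dtype=np.float64)   # third row minus first row: d/dx
sobel_y = sobel_x.T                                    # third column minus first column: d/dy

def sobel_magnitude(img):
    """Approximate gradient magnitude M(x, y) ~ |gx| + |gy|."""
    f = img.astype(np.float64)
    gx = convolve(f, sobel_x, mode="nearest")
    gy = convolve(f, sobel_y, mode="nearest")
    return np.clip(np.abs(gx) + np.abs(gy), 0, 255).astype(np.uint8)

img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 255                       # vertical edge
print(sobel_magnitude(img))            # strong response along the edge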

Basics of frequency domain filtering:


• Frequency refers to the rate of repetition of some periodic event. In image processing,
spatial frequency refers to the variation of image brightness with its position in space.
• A varying signal can be transformed into a series of simple periodic variations. The
Fourier transform decomposes a signal into a set of sine waves of different
characteristics like frequency and phase.
• If we compute the Fourier Transform of an image, and then immediately inverse
transform the result, we can regain the same image because a Fourier transform is
perfectly reversible.
• On the other hand, if we multiply each element of the Fourier coefficient by a suitably
chosen weighting function then we can accentuate certain frequency components and
attenuate others.
• The corresponding changes in the spatial form can be seen after an inverse transform
is computed. This selective enhancement or suppression of frequency components is
termed Fourier filtering or frequency domain filtering.
• The spatial representation of image data describes the adjacency relationship between
the pixels. On the other hand, the frequency domain representation clusters the image
data according to their frequency distribution.
• In frequency domain filtering, the image data is dissected into various spectral bands,
where each band depicts a specific range of details within the image. The process of
selective frequency inclusion or exclusion is termed frequency domain filtering.
• It is possible to perform filtering in the frequency domain by specifying the
frequencies that should be kept and the frequencies that should be discarded. Spatial
domain filtering is accomplished by convolving the image with a filter kernel.
• We know that
Convolution in spatial domain = Multiplication in the frequency domain.
If filtering in spatial domain is done by convolving the input image f (m, n) with the
filter kernel h(m, n),
Filtering in spatial domain = f (m, n)∗ h(m, n)
In the frequency domain, filtering corresponds to the multiplication of the image
spectrum by the Fourier transform of the filter kernel, which is referred to as the
frequency response of the filter.
Filtering in the frequency domain = F(k, l)×H(k, l)
Here, F (k, l ) is the spectrum of the input image and H(k, l ) is the spectrum of the
filter kernel.
• Thus, frequency domain filtering is accomplished by taking the Fourier transform of
the image and the Fourier transform of the kernel, multiplying the two Fourier
transforms point by point, and taking the inverse Fourier transform of the result.
• This point-by-point multiplication requires that the Fourier transforms of the image
and the kernel have the same dimensions. As convolution kernels are commonly much
smaller than the images they are used to filter, the kernel must be zero-padded to the
size of the image to accomplish this process.
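The whole pipeline is only a few lines with the FFT; a sketch using a centred Gaussian lowpass transfer function as the example H(u, v) (any of the transfer functions in the next section could be substituted; P, Q and D0 are illustrative values):

import numpy as np

def filter_in_frequency_domain(f, H):
    """g = IDFT( DFT(f) x H ), with H defined on the centred frequency rectangle."""
    F = np.fft.fftshift(np.fft.fft2(f))      # spectrum with zero frequency at the centre
    G = F * H                                # point-by-point multiplication
    g = np.real(np.fft.ifft2(np.fft.ifftshift(G)))
    return g

P, Q = 128, 128
u, v = np.meshgrid(np.arange(P), np.arange(Q), indexing="ij")
D = np.sqrt((u - P / 2) ** 2 + (v - Q / 2) ** 2)   # distance from the centre
D0 = 20.0
H = np.exp(-(D ** 2) / (2 * D0 ** 2))              # Gaussian lowpass transfer function

f = np.random.rand(P, Q) * 255
g = filter_in_frequency_domain(f, H)               # blurred (lowpass-filtered) image
print(g.shape, g.min(), g.max())

Note that this sketch defines H directly in the frequency domain and ignores the zero-padding discussed above, so the underlying spatial operation is circular convolution; padding would be needed to avoid wraparound error in a careful implementation.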

Smoothing filters in frequency domain:


Ideal Lowpass Filters:
• Edges and other sharp intensity transitions (such as noise) in an image contribute
significantly to the high frequency content of its Fourier transform. Hence, smoothing
(blurring) is achieved in the frequency domain by high-frequency attenuation; that is,
by lowpass filtering. In this section, we consider three types of lowpass filters: ideal,
Butterworth, and Gaussian. These three categories cover the range from very sharp
(ideal) to very smooth (Gaussian) filtering.
• A 2-D lowpass filter that passes without attenuation all frequencies within a circle of
radius D0 from the origin, and “cuts off” all frequencies outside this circle, is called an
ideal lowpass filter (ILPF); it is specified by the transfer function

H(u, v) = 1   if D(u, v) ≤ D0
H(u, v) = 0   if D(u, v) > D0

where D0 is a positive constant, and D(u, v) is the distance between a point (u, v) in the
frequency domain and the center of the P × Q frequency rectangle; that is,

D(u, v) = [ (u − P/2)² + (v − Q/2)² ]^(1/2)
• The name ideal indicates that all frequencies on or inside a circle of radius D0 are
passed without attenuation, whereas all frequencies outside the circle are completely
attenuated (filtered out).

Figure 11. (a) Perspective plot of an ideal lowpass-filter transfer function. (b) Function displayed as an image. (c) Radial
cross section.
• For an ILPF cross section, the point of transition between the values H(u,v) = 1 and
H(u,v) = 0 is called the cutoff frequency. In Figure 11, the cutoff frequency is D0. The
sharp cutoff of an ILPF cannot be realized with electronic components, although it
certainly can be simulated in a computer (subject to the constraint that the fastest
possible transition is limited by the distance between pixels).

Gaussian Lowpass Filters:

• Gaussian lowpass filter (GLPF) transfer functions have the form

H(u, v) = e^( −D²(u, v) / 2σ² )

where D(u, v) is the distance from the center of the P × Q frequency rectangle to any point
(u, v) contained in the rectangle.
• As before, σ is a measure of spread about the center. By letting σ = D0, we can
express the Gaussian transfer function in the same notation as the other functions in
this section:

H(u, v) = e^( −D²(u, v) / 2D0² )
• Narrow Gaussian transfer functions in the frequency domain imply broader kernel
functions in the spatial domain, and vice versa. we know that the inverse Fourier
transform of a frequency domain Gaussian function is Gaussian also. This means that
a spatial Gaussian filter kernel, obtained by computing the IDFT, will have no ringing.

Figure 12. (a) Perspective plot of a GLPF transfer function. (b) Function displayed as an
image. (c) Radial cross sections for various values of D0 .

Butterworth Lowpass filter:

• The transfer function of a Butterworth lowpass filter (BLPF) of order n, with cutoff
frequency at a distance D0 from the center of the frequency rectangle, is defined as

H(u, v) = 1 / [ 1 + ( D(u, v) / D0 )^(2n) ]

• The BLPF transfer function can be controlled to approach the characteristics of the
ILPF using higher values of n, and those of the GLPF using lower values of n, while
providing a smooth transition from low to high frequencies. Thus, we can use a BLPF
to approach the sharpness of an ILPF function with considerably less ringing.
Figure 13. (a) Perspective plot of a Butterworth lowpass-filter transfer function. (b) Function
displayed as an image. (c) Radial cross sections of BLPFs of orders 1 through 4.

• The spatial domain kernel obtainable from a BLPF of order 1 has no ringing.
Generally, ringing is imperceptible in filters of order 2 or 3, but can become
significant in filters of higher orders.
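The three lowpass transfer functions differ only in how H falls off with D(u, v); a side-by-side NumPy sketch (P, Q, D0 and the Butterworth order n are illustrative choices):

import numpy as np

def distance_grid(P, Q):
    """D(u, v): distance of each frequency sample from the centre of the P x Q rectangle."""
    u, v = np.meshgrid(np.arange(P), np.arange(Q), indexing="ij")
    return np.sqrt((u - P / 2) ** 2 + (v - Q / 2) ** 2)

def ideal_lp(D, D0):              # H = 1 inside the circle of radius D0, 0 outside
    return (D <= D0).astype(np.float64)

def gaussian_lp(D, D0):           # H = exp(-D^2 / 2 D0^2)
    return np.exp(-(D ** 2) / (2 * D0 ** 2))

def butterworth_lp(D, D0, n=2):   # H = 1 / (1 + (D/D0)^(2n))
    return 1.0 / (1.0 + (D / D0) ** (2 * n))

D = distance_grid(256, 256)
for H in (ideal_lp(D, 60), gaussian_lp(D, 60), butterworth_lp(D, 60, n=2)):
    print(H[128, 128], H[128, 188])   # value at the centre and at the cutoff distance D0

As the next subsection explains, the corresponding highpass transfer functions are obtained simply as 1 − H for each of these arrays.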

Sharpening filters in frequency:


• Subtracting a lowpass filter transfer function from 1 yields the corresponding highpass
filter transfer function in the frequency domain:

HHP(u, v) = 1 − HLP(u, v)

where HLP(u, v) is the transfer function of a lowpass filter.


• Thus, it follows that an ideal highpass filter (IHPF) transfer function is given by

H(u, v) = 0   if D(u, v) ≤ D0
H(u, v) = 1   if D(u, v) > D0

where, as before, D(u, v) is the distance from the center of the P × Q frequency rectangle.
• Similarly, the transfer function of a Gaussian highpass filter (GHPF) is given by

H(u, v) = 1 − e^( −D²(u, v) / 2D0² )

• The transfer function of a Butterworth highpass filter (BHPF) is

H(u, v) = 1 / [ 1 + ( D0 / D(u, v) )^(2n) ]

Figure 14. Top row: Perspective plot, image, and radial cross section of an IHPF transfer
function. Middle and bottom rows: the same sequence for GHPF and BHPF transfer
functions.
Color Fundamentals:
Saturation:
Saturation refers to the relative purity or the amount of white light mixed with a hue. The pure
spectrum colors are fully saturated. Colors such as pink (red and white) and lavender (violet
and white) are less saturated, with the degree of saturation being inversely proportional to the
amount of white light added.
Hue:
Hue is an attribute associated with the dominant wavelength in a mixture of light waves. Hue
represents dominant color as perceived by an observer. Thus, when we call an object red,
orange, or yellow, we are
referring to its hue.
Brightness:
Brightness is a subjective descriptor that is practically impossible to measure. It embodies the
achromatic notion of intensity and is one of the key factors in describing color sensation.

CIE chromaticity diagram and mention its significance:


• The CIE chromaticity diagram is an international standard for primary colours
established in 1931. It allows all other colours to be defined as weighted sums of the
three primary colours.
• In the CIE system, the intensities of red, green and blue are transformed into tristimulus
values which are represented by X, Y and Z.
• The relative amounts of the three primary colours of light required to produce a colour
of a given wavelength are called tristimulus values.
• These values represent three relative quantities of the primary colours. The coordinate
axes of the CIE chromaticity diagram are derived from the tristimulus values as
x = X/(X + Y + Z ) = red/(red + green + blue), y = Y/(X + Y + Z ) = green/(red + green
+ blue) and z = Z/(X + Y + Z ) = blue/(red + green + blue).
• The coordinates x, y and z are called chromaticity coordinates and always add up to 1.
As a result, z can always be expressed in terms of x and y, which means that only x and
y are required to specify any colour, and the diagram can be two-dimensional.
• The chromaticity diagram can be used to compare the gamuts of various possible output
devices like printers and monitors.

• The CIE chromaticity diagram as shown in Figure is useful in defining the colour
gamut, the range of colours that can be produced on a device.

Figure 15. CIE 1931 Chromaticity diagram showing various colour regions

Radiance:
Radiance is the total amount of energy that flows from the light source, and it is usually
measured in watts (W).
Luminance:
Luminance, measured in lumens (lm), is a measure of the amount of energy that an observer
perceives from a light source. For example, light emitted from a source operating in the far
infrared region of the spectrum could have significant energy (radiance), but an observer
would hardly perceive it; its luminance would be almost
zero.
Brightness:
brightness is a subjective descriptor that is practically impossible to measure. It embodies the
achromatic notion of intensity, and is one of the key factors in describing color sensation.

Color Models:
Importance of Color Models:
• The purpose of a color model (also called a color space or color system) is to facilitate
the specification of colors in some standard way. In essence, a color model is a
specification of (1) a coordinate system, and (2) a subspace within that system, such
that each color in the model is represented by a single point contained in that subspace.
• Colour models provide a standard way to specify a particular colour by defining a 3D
coordinate system, and a subspace that contains all constructible colours within a
particular model. Any colour can be specified using a model.
• Each colour model is oriented towards a specific software or image-processing
application. Most color models in use today are oriented either toward hardware (such
as for color monitors and printers) or toward applications, where color manipulation is
a goal (the creation of color graphics for animation is an example of the latter).
• In terms of digital image processing, the hardware-oriented models most commonly
used in practice are the RGB (red, green, blue) model for color monitors and a broad
class of color video cameras; the CMY (cyan, magenta, yellow) and CMYK (cyan,
magenta, yellow, black) models for color printing; and the HSI (hue, saturation,
intensity) model, which corresponds closely with the way humans describe and
interpret color.
RGB Color Model:
• In the RGB colour model, the three primary colours red, green and blue form the axes
of the cube illustrated in Figure 16. Each point in the cube represents a specific
colour. This model is good for setting the electron gun of a CRT.

Figure 16. RGB Color Cube


• RGB is an additive colour model. From Figure 17, it is obvious that
Magenta = Red + Blue,
Yellow = Red + Green,
Cyan = Blue + Green

Figure 17. RGB Color Model
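As a quick numerical check of the additive relations above (an illustrative sketch, not part of the original notes), the secondary colours can be obtained by channel-wise addition of the primaries:

import numpy as np

# 1x1 RGB "images" for the pure primaries, with 8-bit channel values.
red   = np.array([[[255, 0, 0]]], dtype=np.uint16)
green = np.array([[[0, 255, 0]]], dtype=np.uint16)
blue  = np.array([[[0, 0, 255]]], dtype=np.uint16)

# Additive mixing: channel-wise addition, clipped to the displayable range.
magenta = np.clip(red + blue,   0, 255).astype(np.uint8)   # [255,   0, 255]
yellow  = np.clip(red + green,  0, 255).astype(np.uint8)   # [255, 255,   0]
cyan    = np.clip(blue + green, 0, 255).astype(np.uint8)   # [  0, 255, 255]
print(magenta[0, 0], yellow[0, 0], cyan[0, 0])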

Limitations of the RGB Model:


The limitations of the RGB model are summarised below:
(i) The RGB colour coordinates are device dependent. This implies that the RGB model will
not in general reproduce the same colour from one display to another.
(ii) The RGB model is not perceptually uniform. The meaning is that one unit of coordinate
distance does not correspond to the same perceived colour difference in all regions of the colour
space.
(iii) It is difficult to relate this model to colour appearance because it is based on device signals
rather than on display luminance values.

Converting colours from RGB to HSI:
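The conversion equations themselves are not reproduced in these notes; the following is a minimal sketch of the standard RGB-to-HSI conversion (following Gonzalez & Woods), assuming R, G and B are already normalised to the range [0, 1]:

import numpy as np

def rgb_to_hsi(r, g, b, eps=1e-8):
    """Convert normalised RGB values in [0, 1] to (H, S, I).

    Hue is returned in degrees [0, 360); saturation and intensity lie in [0, 1].
    """
    # Hue: the angle of the colour point measured from the red axis.
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta

    # Saturation: 1 minus the fraction contributed by the smallest component.
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + eps)

    # Intensity: the average of the three components.
    i = (r + g + b) / 3.0
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red -> H ~ 0 deg, S = 1, I = 1/3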


Converting colors from HSI to RGB:
Pseudo colour processing:
• Basically, the idea of this approach is to perform three independent transformations on
the intensity of input pixels. The three results are then fed separately into the red, green,
and blue channels of a color monitor.
• This method produces a composite image whose color content is modulated by the
nature of the transformation functions.

Figure 18. Functional block diagram for pseudocolor image processing. Images fR, fG, and fB are fed into the
corresponding red, green, and blue inputs of an RGB color monitor.
• The method for intensity slicing is a special case of this technique. There, piecewise
linear functions of the intensity levels are used to generate colors.
• Like intensity slicing, this approach is based on a single grayscale image. Often,
however, it is of interest to combine several grayscale images into a single color
composite.
• A frequent use of this approach is in multispectral image processing, where different
sensors produce individual grayscale images, each in a different spectral band.
• When coupled with background knowledge about the physical characteristics of each
band, color-coding in the manner just explained is a powerful aid for human visual
analysis of complex multispectral images.
Figure 19. Transformation functions used to obtain the pseudocolor images
• Figure 19 shows the transformation functions used. These sinusoidal functions contain
regions of relatively constant value around the peaks as well as regions that change
rapidly near the valleys.
• Changing the phase and frequency of each sinusoid can emphasize (in color) ranges in
the grayscale. For instance, if all three transformations have the same phase and
frequency, the output will be a grayscale image.
• A small change in the phase between the three transformations produces little change
in pixels whose intensities correspond to peaks in the sinusoids, especially if the
sinusoids have broad profiles (low frequencies).
• Pixels with intensity values in the steep section of the sinusoids are assigned a much
stronger color content as a result of significant differences between the amplitudes of
the three sinusoids caused by the phase displacement between them.
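A minimal sketch of this intensity-to-colour mapping (an illustrative implementation, not taken from the original notes; the frequencies and phase offsets below are arbitrary choices) is:

import numpy as np

def pseudocolor(gray, freq=(1.0, 1.0, 1.0), phase=(0.0, 2.1, 4.2)):
    """Map a grayscale image (uint8) to RGB with three sinusoidal transformations.

    Channel k is 0.5 * (1 + sin(2*pi*freq[k]*g + phase[k])), where g is the
    normalised intensity; equal frequencies and phases give a grayscale result.
    """
    g = gray.astype(np.float64) / 255.0
    channels = []
    for f, p in zip(freq, phase):
        c = 0.5 * (1.0 + np.sin(2.0 * np.pi * f * g + p))
        channels.append((255.0 * c).astype(np.uint8))
    return np.dstack(channels)          # H x W x 3 colour composite

# Example on a synthetic intensity ramp:
ramp = np.tile(np.arange(256, dtype=np.uint8), (64, 1))
rgb = pseudocolor(ramp)
print(rgb.shape)                        # (64, 256, 3)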

Unit IV
Image degradation Model:
• Restoration attempts to recover an image that has been degraded by using a priori
knowledge of the degradation phenomenon. Thus, restoration techniques are oriented
toward modeling the degradation and applying the inverse process in order to recover
the original image.

Figure 1. A model of the image degradation/restoration process.
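The process in Figure 1 is usually written (as in Gonzalez & Woods) as g(x, y) = h(x, y) * f(x, y) + η(x, y), where f(x, y) is the original image, h(x, y) is the degradation function, * denotes convolution and η(x, y) is additive noise; in the frequency domain this becomes G(u, v) = H(u, v)F(u, v) + N(u, v). Restoration seeks an estimate fˆ(x, y) of f(x, y) given g(x, y), some knowledge of h, and the statistics of the noise.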


Noise Models:

Gaussian Noise:

Rayleigh Noise:

Gamma/Erlang Noise:
Exponential Noise:

Uniform Noise:

Salt and Pepper Noise:


Figure 2. Some important probability density functions.
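The individual density functions are not reproduced here. As an illustrative sketch (not part of the original notes), two of the most commonly used models, Gaussian and salt-and-pepper noise, can be simulated with NumPy as follows:

import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(image, mean=0.0, sigma=20.0):
    """Add zero-mean Gaussian noise (sigma in gray levels) to a uint8 image."""
    noisy = image.astype(np.float64) + rng.normal(mean, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_and_pepper(image, p_salt=0.02, p_pepper=0.02):
    """Set a random fraction of pixels to 255 (salt) and 0 (pepper)."""
    noisy = image.copy()
    u = rng.random(image.shape)
    noisy[u < p_pepper] = 0
    noisy[u > 1.0 - p_salt] = 255
    return noisy

test = np.full((128, 128), 128, dtype=np.uint8)   # flat mid-gray test image
print(add_gaussian_noise(test).std(), (add_salt_and_pepper(test) == 255).mean())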

Spatial domain restoration in presence of noise:


Inverse filtering:
Least mean square filters:
Constrained least square restoration:
Image Segmentation:

Region based Approach:


Region based segmentation:

Region Growing:
Region splitting and merging:

Figure 3. Region based segmentation

Clustering techniques:
Thresholding and its types:
Consider an image, f(x, y), composed of light objects on a dark background, such that object
and background pixels have intensity values grouped into two dominant modes. One obvious
way to extract the objects from the background is to select a threshold, T, that separates these
modes. Then, any point (x, y) in the image at which f(x, y) > T is called an object point;
otherwise, the point is called a background point. In other words, the segmented image,
denoted by g(x, y), is given by
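g(x, y) = 1 if f(x, y) > T
g(x, y) = 0 if f(x, y) ≤ T
where the label 1 marks object points and the label 0 marks background points (any two distinct labels may be used).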

When T is a constant applicable over an entire image, the process given in this equation is
referred to as global thresholding. When the value of T changes over an image, we use the term
variable thresholding. The terms local or regional thresholding are used sometimes to denote
variable thresholding in which the value of T at any point (x, y) in an image depends on
properties of a neighborhood of (x, y) (for example, the average intensity of the pixels in the
neighborhood). If T depends on the spatial coordinates (x, y) themselves, then variable
thresholding is often referred to as dynamic or adaptive thresholding.
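A minimal sketch of the basic iterative procedure for choosing a single global threshold (as described in Gonzalez & Woods), assuming the image is available as a NumPy grayscale array:

import numpy as np

def basic_global_threshold(image, delta=0.5):
    """Iteratively estimate a global threshold T for a grayscale image.

    Start from the mean intensity, split the pixels into two groups,
    and move T to the midpoint of the two group means until it settles.
    """
    f = image.astype(np.float64)
    t = f.mean()                          # initial estimate of T
    while True:
        g1 = f[f > t]                     # candidate object pixels
        g2 = f[f <= t]                    # candidate background pixels
        t_new = 0.5 * (g1.mean() + g2.mean())
        if abs(t_new - t) < delta:
            return t_new
        t = t_new

# Usage on a synthetic bimodal image (half dark, half bright pixels):
img = np.concatenate([np.full(1000, 40.0), np.full(1000, 200.0)]).reshape(40, 50)
t = basic_global_threshold(img)
segmented = (img > t).astype(np.uint8)    # 1 = object, 0 = background
print(round(t, 1))                        # T settles near 120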

Figure 4. shows a more difficult thresholding problem involving a histogram with three
dominant modes corresponding, for example, to two types of light objects on a dark
background. Here, multiple thresholding classifies a point (x, y) as belonging to the background
if f(x, y) ≤ T1, to one object class if T1 < f(x, y) ≤ T2, and to the other object class if f(x, y) > T2.
That is, the segmented image is given by
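g(x, y) = a if f(x, y) > T2
g(x, y) = b if T1 < f(x, y) ≤ T2
g(x, y) = c if f(x, y) ≤ T1
where a, b and c are any three distinct intensity (label) values.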
Figure 4. Intensity histograms that can be partitioned (a) by a single threshold, and (b) by
dual thresholds.

Edge detection:
Laplacian of Gaussian (LoG) operator for edge detection:
Unit V
Image Compression:
Image compression, the art and science of reducing the amount of data required to represent
an image, is one of the most useful and commercially successful technologies in the field of
digital image processing. The term data compression refers to the process of reducing the
amount of data required to represent a given quantity of information.
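Two quantities make this notion precise (as defined in Gonzalez & Woods). If b and b′ denote the number of bits needed to represent the same information before and after compression, the compression ratio is C = b/b′ and the relative data redundancy of the original representation is R = 1 − 1/C. For example, a 10:1 compression ratio gives R = 0.9, i.e. 90% of the data in the original representation is redundant.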

Redundancies in Images:
For the given table,

Temporal Redundancy:

Figure 1. Computer-generated 256 × 256 × 8-bit images with (a) coding redundancy, (b) spatial
redundancy, and (c) irrelevant information. (Each was designed to demonstrate one principal
redundancy, but may exhibit others as well.)
Coding Redundancy:

Coding redundancy is present when the intensity values of an image are represented using a
natural m-bit fixed-length code, because such a code assigns the same number of bits to every
intensity value regardless of how frequently it occurs; more frequent values could instead be
assigned shorter code words.

Image Compression model:


An image compression system is composed of two distinct functional components: an encoder
and a decoder. The encoder performs compression, and the decoder performs the
complementary operation of decompression. Both operations can be performed in software, as
is the case in Web browsers and many commercial image-editing applications, or in a
combination of hardware and firmware, as in commercial DVD players. A codec is a device or
program that is capable of both encoding and decoding.

Figure 2. Functional block diagram of a general image compression system.

Input image f(x, …) is fed into the encoder, which creates a compressed representation of the
input. This representation is stored for later use, or transmitted for storage and use at a remote
location. When the compressed representation is presented to its complementary decoder, a
reconstructed output image fˆ(x, …) is generated. In still-image applications, the encoded input
and decoder output are f(x, y) and fˆ(x, y), respectively. In video applications, they are f(x, y, t)
and fˆ(x, y, t), where the discrete parameter t specifies time. In general, fˆ(x, …) may or may
not be an exact replica of f(x, …). If it is, the compression system is called error free, lossless,
or information preserving. If not, the reconstructed output image is distorted, and the
compression system is referred to as lossy.
Fundamentals of information theory:
Huffman Coding:
• The first step in Huffman’s approach is to create a series of source reductions by
ordering the probabilities of the symbols under consideration, then combining the
lowest probability symbols into a single symbol that replaces them in the next source
reduction.
• In the first step, all of the source symbols and their probabilities are ordered from top to
bottom in order of decreasing probability.
• To form the first source reduction, the two symbols with the lowest probabilities are
combined into a "compound symbol" whose probability is the sum of the two.
• This compound symbol and its associated probability are placed in the first source
reduction column so that the probabilities of the reduced source also are ordered from
the most to the least probable. This process is then repeated until a reduced source with
two symbols (at the far right) is reached.
• The second step in Huffman’s procedure is to code each reduced source, starting with
the smallest source and working back to the original source.
• The minimal-length binary code for a two-symbol source consists, of course, of the symbols 0
and 1.
• These symbols are assigned to the two symbols on the right. (The assignment is
arbitrary; reversing the order of the 0 and 1 would work just as well.)
• If, for example, the reduced source symbol with probability 0.6 was generated by combining
two symbols in the reduced source to its left, the 0 used to code it is assigned to both of
these symbols, and a 0 and a 1 are arbitrarily appended to each to distinguish them from
each other.
• This operation is then repeated for each reduced source until the original source is
reached. The final code appears at the far left. Then the average length of this code and
entropy of the source can be calculated using the following equations respectively.
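In standard form, the average code length is Lavg = Σk l(rk) p(rk), where l(rk) is the number of bits assigned to symbol rk and p(rk) is its probability, and the source entropy is H = −Σk p(rk) log2 p(rk) bits/symbol; for a good variable-length code Lavg is close to H.

The repeated source reduction described above is equivalent to the priority-queue formulation sketched below (an illustrative implementation, not from the original notes; the symbol probabilities are example values):

import heapq
import itertools

def huffman_code(probabilities):
    """Build a Huffman code for a {symbol: probability} source by repeated reduction."""
    counter = itertools.count()            # tie-breaker so code dictionaries are never compared
    heap = [(p, next(counter), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)    # the two least probable (compound) symbols
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

probs = {"a1": 0.4, "a2": 0.3, "a3": 0.1, "a4": 0.1, "a5": 0.06, "a6": 0.04}
codes = huffman_code(probs)
l_avg = sum(len(codes[s]) * p for s, p in probs.items())
print(codes, round(l_avg, 2))              # Lavg = 2.2 bits/symbol for this source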
Arithmetic coding:
Bit plane coding:

Run length coding:


Transform coding:

Figure 4. Transform Coding Block diagram


Figure 4. shows a typical block transform coding system. The decoder implements the inverse
sequence of steps (with the exception of the quantization function) of the encoder, which
performs four relatively straightforward operations: subimage decomposition, transformation,
quantization, and coding. An M × N input image is first subdivided into subimages of size
n × n, which are then transformed to generate MN/n² subimage transform arrays, each of size
n × n. The goal of the transformation process is to decorrelate the pixels of each subimage, or to
pack as much information as possible into the smallest number of transform coefficients. The
quantization stage then selectively eliminates, or more coarsely quantizes, the coefficients that
carry the least amount of information in a predefined sense. These coefficients have the
smallest impact on reconstructed subimage
quality. The encoding process terminates by coding (normally using a variable-length code)
the quantized coefficients.
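A minimal sketch of this pipeline (an illustrative implementation, not from the original notes), using an 8 × 8 DCT built directly with NumPy and a simple zonal mask that retains only the lowest-frequency coefficients of each subimage (the number retained is an arbitrary choice):

import numpy as np

N = 8  # subimage (block) size

def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C, so that T = C @ block @ C.T."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def block_transform_code(image, keep=10):
    """Forward DCT per 8x8 block, keep the 'keep' lowest-frequency coefficients, invert."""
    C = dct_matrix()
    u, v = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    order = np.argsort((u + v).ravel(), kind="stable")
    mask = np.zeros(N * N)
    mask[order[:keep]] = 1.0               # zonal mask: 1 = coefficient retained
    mask = mask.reshape(N, N)

    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(0, h - h % N, N):
        for j in range(0, w - w % N, N):
            block = image[i:i + N, j:j + N].astype(np.float64)
            coeffs = C @ block @ C.T       # forward 2-D DCT of the subimage
            coeffs *= mask                 # discard the masked (high-frequency) coefficients
            out[i:i + N, j:j + N] = C.T @ coeffs @ C   # inverse 2-D DCT
    return np.clip(out, 0, 255).astype(np.uint8)

# Example on a smooth synthetic image (a horizontal intensity ramp):
img = np.tile(np.linspace(0, 255, 64), (64, 1)).astype(np.uint8)
rec = block_transform_code(img)
print(np.abs(rec.astype(float) - img.astype(float)).mean())   # small reconstruction error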
Image Formats and compression standards:

References:
1. R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd ed., Addison-Wesley/Pearson Education, 2010.
2. A. K. Jain, Fundamentals of Digital Image Processing, PHI.
3. S. Jayaraman, S. Esakkirajan and T. Veerakumar, Digital Image Processing, 1st ed., Tata McGraw-Hill, 2009.
