
CSE 4207: Digital Image Processing
Nabilah Anzoom Nishu
Dept. of CSE
Session: 2017-18

CONTENTS

 Introduction & Fundamentals of Digital Image Processing
 Image Enhancement in the Spatial Domain & Frequency Domain
 Image Restoration
 Color Image Processing
 Wavelets and Other Image Transformations
 Image Compression
 Morphological Image Processing

Introduction & Fundamentals of Digital Image Processing

Q1. What is an image? How are images represented in memory? Or, Explain the technique of
representing an image in the memory.

Answer:

An image may be defined as a two-dimensional function f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.

Images are typically represented in computer memory as a collection of pixels. Each pixel is a tiny unit that stores information about the color of a specific location in the image. The most common representation uses a combination of red, green and blue (RGB) channels, where each channel stores the intensity of its respective color.

For example, in a standard 24-bit color image, each pixel is represented by 8 bits for red, 8 bits for green and 8 bits for blue, totaling 24 bits (or 3 bytes) per pixel. This allows for over 16 million possible colors (2^24).
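
To make this concrete, the sketch below (a minimal example assuming NumPy is available; the pixel values are illustrative) builds a tiny 24-bit RGB image as a rows x columns x channels array of one-byte values:

import numpy as np

# A 2 x 3 pixel, 24-bit RGB image: one byte (0-255) per channel per pixel.
# The array shape is (rows, columns, channels).
image = np.array([
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],       # red, green and blue pixels
    [[0, 0, 0], [128, 128, 128], [255, 255, 255]]  # black, gray and white pixels
], dtype=np.uint8)

print(image.shape)   # (2, 3, 3): 2 rows, 3 columns, 3 channels
print(image.nbytes)  # 18 bytes = 6 pixels * 3 bytes per pixel
print(image[0, 2])   # [  0   0 255], the blue pixel at row 0, column 2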

Technique of representing an image in the memory:

Representing an image in memory typically involves using a data structure to store the pixel values
that make up the image. The most common technique is to use a grid-like structure, with each
element of the grid representing a pixel.

Here are the key components of this technique:

1. Pixel Values:
Each pixel in the image is represented by a set of values that define its color and intensity.
For colored images, these values often include Red, Green, and Blue (RGB) components.
Grayscale images have a single intensity value per pixel.

2. Grid Structure:
Images are two-dimensional, so a grid-like structure is used to organize the pixel values. Each
cell in the grid corresponds to a pixel, and the rows and columns define the image's
dimensions.

3. Resolution:
The number of rows and columns in the grid determines the image's resolution. A higher-resolution image has more pixels and can display more detail.

4. Color Depth:
The number of bits used to represent each pixel's color values determines the image's color
depth. For example, 8-bit color depth allows 256 colors, while 24-bit color depth provides
millions of colors.

5. Data Storage:
The pixel values are stored in memory as a sequence of binary data. The specific data structure
used (e.g., arrays, matrices, or other specialized formats) depends on the programming
language and application.

6. Compression:
Images may be compressed to reduce memory usage. Common image compression formats
include JPEG and PNG. These formats use various algorithms to store images more efficiently.

Q2. Write the differences among pictures, images and photographs.

Answer:
The following are the differences among pictures, images and photographs:

1. Definition:
   Picture: Any representation of an object or scene created by some means of creation.
   Image: A generic term that refers to any visual representation.
   Photograph: A specific type of image created by capturing light on a photosensitive surface.

2. Form:
   Picture: It can be physical or digital.
   Image: It can be physical or digital.
   Photograph: It can be physical (traditional) or digital.

3. Creation:
   Picture: Includes drawing, photography, painting, etc.
   Image: Includes drawing, photography, painting, sculpture, holograms and imagined scenes.
   Photograph: Taken by a camera.

4. Realism:
   Picture: It can be abstract or realistic.
   Image: It can be abstract or realistic.
   Photograph: It is more realistic, capturing an actual moment or scene.

5. Emotion:
   Picture: Can convey emotion, but not always.
   Image: Can convey emotion, but not always.
   Photograph: Often conveys emotion and tells stories.

6. Usage:
   Picture: Widely used in various contexts, including visual art, journalism, advertising, etc.
   Image: Widely used in various contexts, including visual art and graphic design.
   Photograph: Often used in photography and journalism contexts.

Q3. Define digital image and digital image processing (DIP).

Answer:
Digital image:

When x, y, and the amplitude values of f are all finite, discrete quantities, we call the image a digital image.

Digital image processing (DIP):

Digital image processing deals with the manipulation of images by means of a digital computer. It is a subfield of signals and systems but focuses particularly on images. DIP is concerned with developing a computer system that is able to perform processing on an image. The input of the system is a digital image; the system processes that image using efficient algorithms and gives an image as output.
Example: Adobe Photoshop, one of the most widely used applications for processing digital images.

Q4. What are the components of digital image processing system?


Answer:

The following are the components of a digital image processing system:

1. Image acquisition:

Two elements are required to acquire digital images. The first is a physical device that is sensitive to a band in the electromagnetic energy spectrum and that produces an electrical signal output proportional to the level of energy sensed. The second, called a digitizer, is a device for converting the electrical output of the physical sensing device into digital form.
2. Storage:

Providing adequate storage is usually a challenge in the design of image processing systems. Digital storage for image processing applications falls into three principal categories: i) short-term storage for use during processing, ii) on-line storage for relatively fast recall, and iii) archival storage characterized by infrequent access.

3. Processing:

Processing of digital images involves procedures that are usually expressed in algorithmic form, and most image processing functions can be implemented in software. The only reason for specialized image processing hardware is the need for speed in some applications or to overcome some fundamental computer limitations.
4. Communication:

Communication in DIP primarily involves local communication between processing systems and remote communication from one site to another, typically in connection with the transmission of image data.

5. Display:

Monochrome and color TV monitors are the principal display devices used in modern image processing systems. Monitors are driven by the outputs of a hardware image display module in the backplane of the host computer or as part of the hardware associated with an image processor.

Fig: Fundamental components of a digital image processing system.

Q5. Discuss the fields that use DIP. / Write the applications of digital image processing
(DIP).
Answer:

Digital image processing affects almost every field and continues to grow over time with new technologies.

1. Image sharpening and restoration:

It refers to the process in which we can modify the look and feel of an image. It basically manipulates the image to achieve the desired output. It includes conversion, sharpening, blurring, edge detection, retrieval, and recognition of images.

2. Medical fields:
There are several applications in the medical field that depend on digital image processing, such as:
i. Gamma ray imaging
ii. PET scan
iii. X-ray imaging
iv. Medical CT scan
v. UV imaging

3. Robot vision:
Several robotic machines work using digital image processing. Through image processing techniques, robots find their way; examples include hurdle-detection robots and line-follower robots.

4. Pattern recognition:
It involves the study of image processing combined with artificial intelligence, so that computer-aided diagnosis, handwriting recognition, and image recognition can be implemented. Nowadays image processing is widely used for pattern recognition.

5. Video processing:
It is also one of the applications of digital image processing. A collection of frames or pictures is arranged in such a way that it creates the illusion of moving pictures. It involves frame rate conversion, motion detection, noise reduction, color space conversion, etc.

Q6. Explain the fundamental steps of digital image processing.

Answer:

The following are the fundamental steps of digital image processing:

1. Image acquisition:
Image acquisition is the first of the fundamental steps in DIP. In this stage, an image is given in digital form. Generally, pre-processing such as scaling is done in this stage.
2. Image enhancement:
Image enhancement is the simplest and most attractive area of DIP. In this stage, details that are not clearly visible, or interesting features of an image, are highlighted, such as brightness and contrast.
3. Image restoration:
Image restoration is the stage in which the appearance of an image is improved.
4. Color image processing:
Color image processing is a well-known area because of the increased use of digital images on the internet. It includes color modeling, processing in the digital domain, etc.
5. Wavelets and multi-resolution processing:
In this stage, an image is represented in various degrees of resolution. The image is divided into smaller regions for data compression and for pyramidal representation.
6. Compression:
Compression is a technique used to reduce the storage required for an image. It is a very important stage because it is often necessary to compress data for internet use.
7. Morphological processing:
This deals with tools for extracting image components that are useful in the representation and description of shape.
8. Segmentation:
In this stage, an image is partitioned into its constituent objects. Segmentation is one of the most difficult tasks in DIP, and it often requires considerable effort to reach a successful solution of an imaging problem.
9. Representation and description:
Representation and description follow the output of the segmentation stage. That output is raw pixel data, constituting either the boundary of a region or all the points in the region itself. Representation transforms the raw data into a form suitable for further processing, while description extracts attributes that distinguish one class of objects from another.
10. Object recognition:
In this stage, a label is assigned to an object based on its descriptors.
11. Knowledge base:
The knowledge base is the last component of DIP. It stores important information about the image, which limits the search process.

Q7. Define pixel, contrast, brightness and intensity.

Answer:

A pixel, short for "picture element," is the smallest unit of a digital image or display. It is a tiny
square or dot that represents a single point in an image and contains information about its color
and brightness.

Contrast in an image refers to the difference in brightness and color between various parts of the
image. High contrast means there are significant differences between light and dark areas, while low
contrast indicates that these differences are minimal.

Brightness refers to the overall lightness or darkness of an image or a specific area within an image.
It represents the amount of light emitted or reflected by a pixel or a region in an image. In simpler
terms, it describes how "bright" or "dark" an image or part of an image appears.

Intensity in the context of images typically refers to the brightness or luminance of a pixel or a
region of an image. It quantifies how much light or color is present at a specific point, often
measured in terms of grayscale values for black and white images or the red, green, and blue (RGB)
components for color images.

Q8. Define gray level and gray level scale of an image.


Answer:

Gray level, also known as gray value or intensity level, is a measure of the brightness or luminance
of a specific pixel in a digital image. It represents the intensity of light at that pixel and is typically
quantified on a scale from 0 (black) to 255 (white) in an 8-bit grayscale image, where intermediate
values represent various shades of gray.

A grayscale image, also known as a black-and-white image, is an image in which each pixel is
represented by a single gray level, as described above. It lacks color information and contains only
shades of gray, ranging from black (minimum intensity) to white (maximum intensity). Gray scale
images are commonly used in applications where color information is not required or to simplify
image processing and analysis.

Q9. Define color depth and bit planes.

Answer:

Color depth, also known as bit depth or pixel depth, refers to the number of bits used to represent
the color of each pixel in a digital image. It determines the range and variety of colors that can be
displayed. Common color depths include 8-bit (256 colors), 24-bit (true color with 16.7 million
colors), and 32-bit (true color with additional transparency information). Higher color depths allow
for more accurate and detailed color representation.

Bit planes are a way to represent an image by breaking it down into separate bit-level layers, with
each layer (bit plane) storing information about whether a pixel is on or off (1 or 0) for a specific bit
position. By combining these bit planes, you can reconstruct the original image. Bit planes are often
used in image processing and manipulation for tasks like image compression, encryption, and
certain visual effects. Each bit plane reveals different levels of image detail, with the most significant
bit plane containing the coarsest information and the least significant bit plane containing the finest
details.

Q10. What are the types of neighborhoods between pixels? / Define 4-connected and 8-
connected neighbors of pixel p(x,y).

Answer:

Types of neighborhoods between pixels:

1. 4-connected neighbors:
A pixel P at (x, y) has 4 horizontal/vertical neighbors at (x+1, y), (x-1, y), (x, y+1) and (x, y-1). This set is denoted N4(P).
2. 8-connected neighbors:
A pixel P at (x, y) also has 4 diagonal neighbors at (x+1, y+1), (x+1, y-1), (x-1, y+1) and (x-1, y-1). These are called the diagonal neighbors of P: ND(P).

The 4-neighbors together with the diagonal neighbors of P are called the 8-neighbors of P: N8(P).
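
These definitions can be written out as a small illustrative sketch (the helper functions are not part of the note; image-border clipping is ignored for simplicity):

def n4(x, y):
    # 4-neighbors of p(x, y): horizontal and vertical neighbors.
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    # Diagonal neighbors ND(p).
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(x, y):
    # 8-neighbors N8(p) = N4(p) together with ND(p).
    return n4(x, y) + nd(x, y)

print(n4(2, 2))       # [(3, 2), (1, 2), (2, 3), (2, 1)]
print(len(n8(2, 2)))  # 8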

Q11. What do you mean by neighbors of a pixel? Explain different connectivity between pixels
in an image.

Answer:

The neighbors of a pixel are the pixels that surround a specific pixel in an image. The neighborhood of a pixel is defined based on its spatial location relative to the pixel of interest.

Pixel connectivity is a central concept of both edge and region based approaches to segmentation.
Connectivity is adapted from neighborhood relation. Two pixels are connected if they are in the
same class (i.e. the same color or the same range of intensity) and they are neighbors of one another.

Connectivity between pixels is an important concept in digital image processing. It is used for
establishing boundaries of objects and components of regions in an image.

Two pixels are said to be connected:

i. if they are adjacent in some sense (neighboring pixels; 4-, 8- or m-adjacency), and
ii. if their gray levels satisfy a specified criterion of similarity (e.g., equal intensity level).

There are three types of connectivity on the basis of adjacency. They are:
i. 4-connectivity:
Two or more pixels are said to be 4-connected if they are 4-adjacent to each other.
ii. 8-connectivity:
Two or more pixels are said to be 8-connected if they are 8-adjacent to each other.
iii. m-connectivity:
Two or more pixels are said to be m-connected if they are m-adjacent to each other.

Q12. Explain the connectivity between pixels of gray level image.

Answer:

Let V be the set of gray level values used to define connectivity; for example, in a binary image, V = {1} for the connectivity of pixels with value 1. In a grayscale image, for the connectivity of pixels with a range of intensity values of, say, 32 to 64, it follows that V = {32, 33, ..., 63, 64}. We consider three types of adjacency:

i. 4-adjacency:
Two pixels p and q with values from V are 4-adjacent if q is in the set N4 (p).

ii. 8-adjacency:
Two pixels p and q with values from V are 8-adjacent if q is in the set N8 (p).

iii. m-adjacency(mixed adjacency):


Two pixels p and q with values from V are m-adjacent if

 q is in N4(p), or
 q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.

Mixed adjacency is a modification of 8-adjacency, introduced to eliminate the multiple path connections that often arise when 8-adjacency is used. For V = {1}, multiple paths can exist between the 8-neighbors of a center pixel; this ambiguity in path connections is removed by using m-connectivity.

Q13. Why studying connectivity is important for image processing?

Answer:

Connectivity between pixels is an important concept used in establishing boundaries of objects and components of regions in an image. To establish whether two pixels are connected, it must be determined if they are adjacent in some sense and if their gray levels satisfy a specified criterion of similarity. For example, in a binary image with values 0 and 1, two pixels may be 4-neighbors, but they are not said to be connected unless they have the same value.

Q14. Define adjacent, connectivity, region and boundaries.


Answer:

Adjacent refers to things that are next to or adjoining each other. Two elements are considered
adjacent if they share a common boundary or are in close proximity to each other.

Connectivity refers to the extent to which different components or elements within a system are
connected or linked. Connectivity can describe the relationships, interactions, or pathways that exist
between individual entities.

A region is an area or space that is defined by certain characteristics, features, or boundaries.


Regions can be geographical, cultural, administrative, or defined by other criteria. They are often
used to group and study areas that share common attributes.
Boundaries are lines or limits that define the extent of an area or separate one area from another.
Boundaries can be physical, such as geographic features (rivers, mountains), or they can be artificial,

such as political borders or property lines. Boundaries provide a demarcation between different
regions or entities.

Q15. Explain sampling and gray level quantizing of an image.

Answer:

Sampling in image processing involves capturing and representing a continuous image as a discrete
set of pixels. Continuous-tone images are essentially analog signals. To work with them digitally, we
need to convert this continuous data into discrete data, which involves taking samples at regular
intervals.

Imagine an image as a continuous surface, like a landscape. Sampling is akin to placing a grid over
this landscape and recording the color or intensity value at each grid intersection. These
intersections become the pixels in the digital representation.

The sampling rate determines how frequently we take these samples. Higher sampling rates capture
more details but also require more data storage.

Gray level quantization, also known as intensity quantization, involves reducing the number of
possible intensity levels in an image.

Consider an 8-bit grayscale image, where each pixel can have 256 different intensity values ranging
from 0 (black) to 255 (white). Quantization might involve reducing this to, say, 64 levels. This means
mapping several original intensity values to a single level in the quantized image.

The process often involves grouping nearby intensity values together. For example, when quantizing to 64 levels, original values 0-3 map to new level 0, values 4-7 to level 1, and so on. This reduces the number of distinct intensity levels in the image but also reduces the amount of data needed to represent it.
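
The grouping described above can be written compactly; a minimal sketch assuming NumPy, with 64 target levels as in the example:

import numpy as np

def quantize(image, levels=64):
    # Reduce an 8-bit grayscale image (values 0-255) to `levels` gray levels.
    step = 256 // levels                      # with 64 levels, every 4 consecutive values share one level
    indices = image // step                   # 0-3 -> 0, 4-7 -> 1, ...
    return (indices * step).astype(np.uint8)  # map each group back to a representative value

img = np.arange(256, dtype=np.uint8).reshape(16, 16)   # gradient test image
print(len(np.unique(quantize(img, 64))))               # 64 distinct gray levels remain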

Q16. What do you mean by image sampling and quantization?


Answer:

Sampling in image processing involves capturing and representing a continuous image as a discrete
set of pixels. Continuous-tone images are essentially analog signals. To work with them digitally, we
need to convert this continuous data into discrete data, which involves taking samples at regular
intervals.

Imagine an image as a continuous surface, like a landscape. Sampling is akin to placing a grid over
this landscape and recording the color or intensity value at each grid intersection. These
intersections become the pixels in the digital representation.

The sampling rate determines how frequently we take these samples. Higher sampling rates capture
more details but also require more data storage.

Quantization is the process of assigning a finite number of levels to the sampled points' values.

Its purpose is to convert the continuous range of intensities into a set of discrete levels suitable for
digital representation.

It introduces some loss of information, as multiple original values may be mapped to the same
quantized value.

Example: If the original image has intensity values ranging from 0 to 255 (8-bit grayscale),
quantization may involve reducing these values to, say, 16 levels (0-15).

Q17. What do you mean by image sensing?

Answer:
Image sensing involves the use of specialized devices called image sensors to capture visual
information from the environment or objects. These sensors, such as charge-coupled devices (CCDs)
or complementary metal-oxide-semiconductor (CMOS) sensors, convert light patterns into electrical
signals. The process relies on the detection of photons, where the intensity and color of light at
different points on the sensor's surface create a representation of the visual scene. The sensors
consist of an array of pixels, with each pixel representing a small area on the sensor. The collective
information from all pixels forms the complete image. Image sensing is pivotal in numerous
applications, including digital cameras, smartphones, surveillance cameras, medical imaging, and
various other technologies that rely on the ability to capture, process, and interpret visual data.

Q18. What is quantization error?

Answer:

Quantization error, also known as quantization noise, is the difference between the actual
continuous signal and its quantized representation. When an analog signal is converted to a digital
representation through quantization (assigning discrete values), some information is lost due to the
finite precision of the representation.

The quantization error is the discrepancy between the original analog signal and the digitized
version. It arises because the continuous range of values in the analog signal cannot be perfectly
represented by the limited set of discrete values used in quantization. This error introduces
inaccuracies, and the greater the number of quantization levels, the smaller the quantization error
tends to be.

Q19. Describe a simple image formation/representation model.

Answer:

An image is represented by a two-dimensional function of the form f(x, y). The value or amplitude of f at spatial coordinates (x, y) is a positive scalar quantity whose physical meaning is determined by the source of the image.

When an image is generated from a physical process, its values are proportional to energy radiated
by a physical source. As a consequence, f(x, y) must be non-zero and finite, that is,

0 < f(x, y) < ∞……………………… (1)


The function f(x, y) may be characterized by two components:

1. The amount of source illumination incident on the scene being viewed.

2. The amount of illumination reflected by the objects in the scene.

Appropriately, these are called the illumination and reflectance components and are denoted by i(x,
y) and r(x, y) respectively.
The two functions combine as a product to form f(x, y)

f (x, y) = i (x, y) r(x, y)…………………… (2)

Where,
0 < i(x, y) < ∞………………………….. (3)

And

0 < r(x, y) < 1 …………………………. (4)


Equation (4) indicates that reflectance is bounded by 0 (total absorption) and 1 (total reflectance).
The nature of i (x, y) is determined by the illumination source, and r (x, y) is determined by the
characteristics of the imaged objects.

Q20. Briefly explain relations, equivalence and transitive closure.

Answer:

A binary relation defines a relationship between two elements, usually from different sets. It
consists of ordered pairs where the first element is related to the second. For example, if we have a
set A = {1, 2, 3}, a binary relation R on A could be {(1, 2), (2, 3)}, indicating that 1 is related to 2 and
2 is related to 3. Binary relations are fundamental in various mathematical fields and are used to
represent connections, dependencies, or comparisons between elements.
Equivalence Relation:

An equivalence relation is a special kind of relation that satisfies three properties: reflexivity,
symmetry, and transitivity.
 Reflexivity: Every element is related to itself.
 Symmetry: If a is related to b, then b is related to a.
 Transitivity: If a is related to b and b is related to c, then a is related to c.
 Example: The relation "is congruent modulo n" is an equivalence relation.

Transitive Closure:

 The transitive closure of a relation is the smallest transitive relation that contains the original
relation.
 It involves adding pairs to the relation to make it transitive while keeping it as small as
possible.
 For example, if R = {(1, 2), (2, 3)}, the transitive closure might be {(1, 2), (2, 3), (1, 3)}.

Q21. Discuss various distance measurement techniques.

Answer:

Distance measurement techniques are crucial in various fields, including mathematics, physics,
computer science, and engineering.
Here are some common distance measurement techniques:

1. Euclidean Distance:

Measures the straight-line distance between two points in Euclidean space. In two dimensions, it's
calculated as the square root of the sum of squared differences in coordinates.

Formula for 2D: √((x2 − x1)² + (y2 − y1)²)

General formula in n dimensions: √(Σ_{i=1}^{n} (xi − yi)²)

2. City Block Distance (Manhattan Distance):

Also known as the taxicab or L1 distance, measures the sum of absolute differences between the
coordinates of two points.

Formula for 2D: |x2 − x1| + |y2 − y1|

General formula in n dimensions: Σ_{i=1}^{n} |xi − yi|

3. Minkowski Distance:

A generalization of both Euclidean and Manhattan distances. It includes both as special cases.
Formula: (Σ_{i=1}^{n} |xi − yi|^p)^(1/p)

When p=2, it is the Euclidean distance; when p=1, it is the Manhattan distance.

4. Chessboard Distance (Chebyshev Distance):

Measures the maximum absolute difference between the coordinates of corresponding elements in
two vectors.

Formula in 2D: max (|x2-x1|, |y2-y1|)

General formula in n dimensions: max_i |xi − yi|
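
A minimal sketch of these four distance measures (assuming NumPy; the function names and sample points are illustrative):

import numpy as np

def euclidean(p, q):
    return np.sqrt(np.sum((np.asarray(p) - np.asarray(q)) ** 2))

def city_block(p, q):         # Manhattan / L1 distance
    return np.sum(np.abs(np.asarray(p) - np.asarray(q)))

def minkowski(p, q, order):   # order=2 gives Euclidean, order=1 gives city block
    return np.sum(np.abs(np.asarray(p) - np.asarray(q)) ** order) ** (1.0 / order)

def chessboard(p, q):         # Chebyshev distance
    return np.max(np.abs(np.asarray(p) - np.asarray(q)))

p, q = (2, 3), (5, 7)
print(euclidean(p, q))        # 5.0
print(city_block(p, q))       # 7
print(minkowski(p, q, 2))     # 5.0
print(chessboard(p, q))       # 4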

Q22. Describe arithmetic/logic operation in image processing.

Answer:
In image processing, arithmetic and logic operations are fundamental techniques used to
manipulate and enhance images.

Arithmetic Operations:

1. Addition: Pixel values of corresponding locations in two images are added together. This can
be used for tasks like image blending or intensity adjustments.
2. Subtraction: Pixel values of one image are subtracted from the corresponding values of
another. Useful for tasks like image differencing or background subtraction.
3. Multiplication: Pixel values at corresponding locations in two images are multiplied. This is
often used for contrast adjustments.
4. Division: Pixel values in one image are divided by the corresponding values in another. It can
be employed for tasks like normalization.
Logic Operations:

1. AND Operation: Performs a logical AND operation between corresponding pixels of two
binary images. The result is a binary image where a pixel is set to 1 if both input pixels are 1.
2. OR Operation: Performs a logical OR operation between corresponding pixels of two binary
images. The result is a binary image where a pixel is set to 1 if at least one of the input pixels
is 1.
3. NOT Operation: Inverts the pixel values of a binary image. Pixels that were 1 become 0, and
vice versa.
4. XOR Operation (Exclusive OR): Performs a logical XOR operation between corresponding
pixels of two binary images. The result is a binary image where a pixel is set to 1 if only one
of the input pixels is 1.
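
The sketch below (assuming NumPy; the array values are illustrative) demonstrates a few of these operations on small grayscale and binary images:

import numpy as np

a = np.array([[100, 150], [200, 250]], dtype=np.uint8)
b = np.array([[ 50,  50], [ 50,  50]], dtype=np.uint8)

added      = np.clip(a.astype(np.int32) + b, 0, 255).astype(np.uint8)  # addition with saturation
difference = np.abs(a.astype(np.int32) - b).astype(np.uint8)           # subtraction (image differencing)

m1 = np.array([[1, 0], [1, 1]], dtype=bool)
m2 = np.array([[1, 1], [0, 1]], dtype=bool)

print(added)                                # [[150 200] [250 255]]; 250 + 50 saturates at 255
print(difference)                           # [[ 50 100] [150 200]]
print(np.logical_and(m1, m2).astype(int))   # [[1 0] [0 1]]
print(np.logical_xor(m1, m2).astype(int))   # [[0 1] [1 0]]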

Q23. Define camera calibration.

Answer:
Camera calibration is the process of determining the intrinsic and extrinsic parameters of a camera
system. These parameters are essential for accurately mapping points in a three-dimensional world
to their corresponding two-dimensional image coordinates. Calibration corrects distortions and

imperfections in camera systems, allowing for precise measurements and accurate representation
of objects in images.

Q24. What is stereo imaging?

Answer:

Stereo imaging refers to the technique of creating a three-dimensional (3D) perception of the
environment or objects by capturing and processing images from two or more cameras, simulating
the way human vision works. In stereo imaging, the cameras are typically arranged in a way that
replicates the separation between human eyes, allowing the system to perceive depth and create a
sense of three-dimensionality. The process involves capturing images simultaneously or near-
simultaneously and then using computational techniques to analyze the disparities between the
images.

Q25. Suppose a 24 bit color image of size 1024*786*3 is stored in memory. Determine the
space in memory occupied by it in Byte.

Answer:
To determine the space occupied by the 24-bit color image in memory, we can use the formula:

Memory size (in bytes) = (Width × Height × Number of channels × Bits per channel) / 8

Here, Width = 1024 pixels

Height = 786 pixels

No. of channels = 3 (red, green, blue)

Bits per channel = 8 (a 24-bit color image uses 24 bits per pixel, i.e., 8 bits for each of the three channels)

Memory size = (1024 × 786 × 3 × 8) / 8 bytes

= 2,414,592 bytes (Ans)
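
The same calculation can be scripted; a minimal sketch in plain Python using the values from the question:

width, height, channels = 1024, 786, 3
bits_per_channel = 24 // channels            # a 24-bit pixel splits into 8 bits per channel

total_bits  = width * height * channels * bits_per_channel
total_bytes = total_bits // 8
print(total_bytes)                           # 2414592 bytes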

Q26. Consider a color image of size 640*480. If it is true color system, calculate the number
of bytes needed to store the image.
Answer:

The number of bits = 640 × 480 × 24

= 7372800 [ For true color system 24 bits is used ]


The required bytes = 7372800 / 8 = 921600 bytes (Ans)

Q27. Consider a color image of size 1024*1024 and each primary color has 256 gray level.
Calculate the number of bytes needed to store the image.
Answer:

The number of gray level, G = 2m

→256 = 2m → 28 = 2m → m = 8

The number of required bits, b = N2m = (1024)2 × 8 = 8388608 / 8 bits = 1048576 bytes (Ans)

Q28. Calculate the number of bytes required to store a monochromatic image with 24 gray
levels.

Answer:

The number of bytes required to store a monochromatic image depends on the bit depth, i.e., the number of bits per pixel used to represent each pixel's intensity. If a monochromatic image has 24 gray levels, we need enough bits to distinguish 24 different intensity levels.
To calculate the bit depth needed to represent each pixel, we can use the formula:

Bit depth = ⌈log2 (Number of intensity levels)⌉

For 24 gray levels:

Bit depth = ⌈log2 (24)⌉ = ⌈4.58⌉ = 5

Now, to calculate the number of bytes per pixel, divide the bit depth by 8 (since there are 8 bits in a byte):

Bytes per pixel = Bit depth / 8 = 5 / 8

However, the number of bytes per pixel must be rounded up to the nearest whole number because
you cannot have a fraction of a byte. Therefore:
Bytes per Pixel=1

So, for a monochromatic image with 24 gray levels, each pixel requires 1 byte to represent its
intensity.

Q29. Describe the origin of Digital Image Processing

Answer:
The origin of digital image processing can be traced back to the mid-20th century, when computers started to become more widely available and accessible. As computing technology advanced, researchers began exploring ways to analyze and enhance digital images. The field gained prominence in the 1960s and saw significant growth in the following decades, with applications in medicine, remote sensing, and industry. The 21st century witnessed widespread integration into everyday technologies, and advances in artificial intelligence and machine learning further shaped the field, making it essential in various domains.

Image Enhancement in the Spatial Domain & Frequency Domain

Q1. What do you mean by image enhancement?

Answer:
Image enhancement refers to the process of improving the visual appearance of an image to make
it more suitable for analysis or display. The goal is to highlight certain features of interest or to
improve the overall quality of the image. Image enhancement techniques can be applied to both
digital and analog images and are commonly used in various fields, including medical imaging,
satellite imaging, surveillance, photography, and computer vision.

There are different methods for image enhancement, and they can be broadly categorized into two
types:
Spatial Domain Techniques:

i. Point Processing: Involves operations on individual pixels based on their intensity values.
Common point processing techniques include contrast stretching, histogram equalization,
and intensity transformations.
ii. Neighborhood Processing: Involves operations that consider the relationship between a pixel and its neighboring pixels. Examples include smoothing filters (e.g., averaging) and sharpening filters (e.g., Laplacian or high-pass filters).

Frequency Domain Techniques:

i. Fourier Transform: Images can be transformed from the spatial domain to the frequency
domain using techniques like Fourier transform. Enhancements can be applied in the
frequency domain, and the inverse transform is then used to bring the image back to the
spatial domain.

Q2. Describe various types of image enhancement.

Answer:

Image enhancement techniques are diverse and cater to different aspects of improving the visual
quality of images. Here are various types of image enhancement methods:

1. Contrast Enhancement:

Description: Improves the distinction between light and dark areas in an image, making details more
visible.
Methods: Contrast stretching, histogram equalization, and adaptive histogram equalization.

2. Color Enhancement:

Description: Adjusts the color balance and saturation to enhance the overall color appearance of an
image.

Methods: Color correction, histogram equalization in color channels, and color space
transformations.

3. Sharpness Enhancement:

Description: Emphasizes the edges and fine details in an image to make it appear sharper and
clearer.
Methods: Unsharp masking, high-pass filtering, and edge enhancement filters.

4. Noise Reduction:

Description: Minimizes unwanted variations or artifacts, such as random noise, to improve image
quality.
Methods: Smoothing filters (e.g., Gaussian smoothing), median filtering, and wavelet denoising.

5. Resolution Enhancement:

Description: Increases the spatial resolution of an image to reveal finer details.


Methods: Interpolation techniques (e.g., bilinear, bicubic), super-resolution, and deblurring.

6. Saturation Adjustment:

Description: Modifies the intensity of colors to enhance or reduce their vividness.

Methods: Saturation adjustments, color space transformations, and histogram equalization in color
channels.

7. Dynamic Range Adjustment:

Description: Expands or compresses the range of pixel intensities to better represent the full dynamic
range of a scene.

Methods: Tone mapping, dynamic range compression, and exposure adjustment.


8. Spatial Domain Transformations:

Description: Applies geometric transformations to modify the geometry or appearance of an image.

Methods: Rotation, scaling, cropping, and geometric transformations.

Q3. Explain the principle of spatial filtering.

Answer:

Spatial filtering is a technique used in image processing to enhance or suppress certain features
within an image based on their characteristics. The basic idea behind spatial filtering is to perform
operations on the pixel values of an image by considering the relationship between a pixel and its
neighboring pixels.

Here are some key principles of spatial filtering:

1. Convolution Operation:

Spatial filtering is often achieved through convolution, where a small matrix called a kernel or filter
is applied to the pixels of the image. The kernel is moved across the image, and at each position,
the pixel values in the neighborhood of the current position are multiplied by the corresponding
values in the kernel. The result is then summed to produce a new value for the central pixel. This
process is repeated for every pixel in the image.

Output(x, y) = Σ_{i=−a}^{a} Σ_{j=−b}^{b} Kernel(i, j) × Image(x + i, y + j)

Here, a and b define the half-sizes of the kernel (a (2a+1) × (2b+1) kernel), and (x, y) represents the coordinates of the pixel in the output image. (A code sketch of this operation is given after this list.)
2. Kernel Design:

The choice of the kernel determines the effect of the spatial filtering. Different kernels are designed
to achieve specific image processing tasks, such as blurring, sharpening, edge detection, noise
reduction, etc.

3. Filtering Effects:
i. Smoothing/Blurring Filters: These filters are designed to reduce noise and decrease image
detail. They are often used to create a more visually appealing or simplified representation
of an image.
ii. Sharpening Filters: These filters enhance the edges and fine details in an image, making it
appear more defined and crisp. They are often used to improve the visual clarity of an image.
iii. Edge Detection Filters: These filters highlight the boundaries between different regions in
an image, making edges more prominent. They are commonly used in applications such as
object detection.

4. Spatial Domain:

Spatial filtering operates in the spatial domain, meaning that it processes pixel values directly in
the image. This is in contrast to frequency domain filtering, which involves transforming the
image into the frequency domain using techniques like Fourier transform.
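
A minimal sketch of this sliding-kernel filtering (assuming NumPy; a 3x3 averaging kernel is used for illustration, and border pixels are handled by zero padding):

import numpy as np

def spatial_filter(image, kernel):
    # Apply a (2a+1) x (2b+1) kernel to a grayscale image by sliding it over every pixel.
    a, b = kernel.shape[0] // 2, kernel.shape[1] // 2
    padded = np.pad(image.astype(float), ((a, a), (b, b)), mode="constant")
    out = np.zeros(image.shape, dtype=float)
    for x in range(image.shape[0]):
        for y in range(image.shape[1]):
            neighborhood = padded[x:x + 2 * a + 1, y:y + 2 * b + 1]
            out[x, y] = np.sum(kernel * neighborhood)   # multiply-and-sum at each position
    return out

smoothing_kernel = np.ones((3, 3)) / 9.0   # 3x3 averaging (blurring) kernel
image = np.random.randint(0, 256, (5, 5))
print(spatial_filter(image, smoothing_kernel))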

Q4. Write the basic steps of filtering in the frequency domain.

Answer:
Filtering in the frequency domain involves transforming an image from the spatial domain to the
frequency domain, applying filtering operations, and then transforming it back to the spatial
domain. Here are the basic steps:

1. Image Transformation to Frequency Domain:


i. Fast Fourier Transform (FFT): Convert the spatial domain image to the frequency domain
using FFT or other Fourier transform techniques. This step decomposes the image into its
frequency components.

2. Filtering in Frequency Domain:


i. Design the Frequency Domain Filter: Create a filter in the frequency domain that represents
the desired frequency response. This filter could be designed to pass, block, or modify specific
frequency components of the image. Common filters include low-pass, high-pass, and band-
pass filters.
ii. Multiply the Frequency Domain Image by the Filter: Point-wise multiply the image's
frequency domain representation by the frequency domain filter. This process modulates the
image's frequency components based on the characteristics of the filter.

3. Inverse Transformation to Spatial Domain:


i. Inverse Fourier Transform: Apply the inverse Fourier transform (usually FFT's inverse, such
as IFFT) to convert the filtered image back to the spatial domain. This step reconstructs the
image using the modified frequency components.

The filtering in the frequency domain allows for efficient processing of specific frequency
information, making it useful for tasks like image enhancement, noise reduction, and feature
extraction.
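
A minimal sketch of these three steps (assuming NumPy; an ideal circular low-pass mask with an illustrative cutoff radius serves as the frequency-domain filter):

import numpy as np

def ideal_lowpass_filter(image, cutoff=30):
    # 1. Transform to the frequency domain (shift so zero frequency is centered).
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # 2. Build and apply the frequency-domain filter (1 inside the cutoff radius, 0 outside).
    rows, cols = image.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    distance = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    mask = (distance <= cutoff).astype(float)
    filtered_spectrum = spectrum * mask                 # point-wise multiplication

    # 3. Inverse transform back to the spatial domain.
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered_spectrum)))

image = np.random.rand(128, 128)
print(ideal_lowpass_filter(image, cutoff=20).shape)     # (128, 128), smoothed version of the input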

Q5. What are the principal objectives of image enhancement?


Answer:

The following are the principal objectives of image enhancement:

i. Improved Visual Quality:


Enhance the overall appearance of the image to make it visually more appealing and clearer.
ii. Increased Interpretability:
Highlight relevant details and reduce visual noise to improve the interpretability of the image.
iii. Feature Enhancement:
Emphasize specific features or patterns in the image to facilitate easier analysis and
understanding.

iv. Better Information Extraction:
Facilitate the extraction of meaningful information by optimizing contrast, brightness, and
overall image quality.
v. Adaptation to Human Perception:
Tailor the image to better match human perceptual characteristics for improved human-
computer interaction and interpretation.

Q6. Describe image enhancement method for spatial domain.

Answer:

Image enhancement refers to the process of improving the visual appearance of an image to make
it more suitable for analysis or display. The goal is to highlight certain features of interest or to
improve the overall quality of the image. Image enhancement techniques can be applied to both
digital and analog images and are commonly used in various fields. There are different methods for
image enhancement, and they can be broadly categorized into two types: spatial domain method
and frequency domain method.
In the spatial domain, image enhancement involves direct manipulation of pixel values to improve
visual quality. Contrast stretching expands the range of pixel intensities, particularly useful for low-
contrast images. Histogram equalization redistributes pixel intensities to achieve a uniform
histogram, enhancing overall contrast. Intensity transformation functions, like gamma correction,
apply mathematical functions to pixel values for desired modifications. Spatial filtering, using
convolution with filters like smoothing or sharpening kernels, considers pixel relationships for
enhancement. Edge enhancement methods highlight edges through filters like Laplacian or Sobel.
Additionally, spatial domain interpolation techniques increase image resolution, crucial for
applications such as zooming.

Q7. Describe frequency domain method.

Answer:
The frequency domain method in image processing involves a different way of looking at and
enhancing images. Instead of dealing directly with pixels and their values, this method transforms
the image into its frequency components using a mathematical technique called Fourier Transform.
In the frequency domain, an image is represented by its various frequencies, distinguishing between
slow changes (low frequencies) and rapid details like edges and textures (high frequencies). By
manipulating these frequency components, such as applying filters to enhance or suppress specific
details, we can modify the image in meaningful ways. This is done through operations like
convolution or multiplication. Once the desired changes are made, the Inverse Fourier Transform is
applied to convert the image back to its original pixel-based form. Frequency domain methods find
applications in tasks like image filtering for sharpening or blurring and are fundamental in image
compression techniques. The Fast Fourier Transform (FFT) is often employed to efficiently perform
these frequency transformations. In essence, the frequency domain method provides a unique
perspective on images, allowing for targeted enhancements by working directly with their
underlying frequency characteristics.

Q8. What is point processing? Discuss contrast stretching. / Discuss the contrast stretching
technique of image enhancement.

Answer:
Point processing, also known as pixel-wise processing or point-wise processing, is a fundamental
concept in image processing. It involves the modification of individual pixels in an image
independently based on their original intensity values. Each pixel's new value is determined solely
by its corresponding original value and a predefined mathematical operation or function.

In point processing, the transformation applied to each pixel is the same and does not depend on
the values of neighboring pixels. The basic form of point processing is often represented by the
equation:

Output pixel value = f (Input pixel value)

Here, f is a function or operation applied to the intensity of each pixel. Common point processing
operations include contrast adjustments, brightness corrections, gamma correction, and
thresholding.

Contrast stretching, also known as contrast enhancement or normalization, is a point processing


technique used in image processing to expand the range of pixel intensities in an image. The goal
is to improve the visibility of details by spreading the original intensity values across a wider range.
This process is particularly useful when an image's pixel values are concentrated in a narrow range,
resulting in poor contrast and limited differentiation between objects or features.

The basic idea behind contrast stretching involves linearly scaling the pixel values in the image to a
new range. The process can be mathematically described using the following formula:

Output pixel value = ((Input pixel value − Min_Intensity_in_Image) × (New_Max_Intensity − New_Min_Intensity)) / (Max_Intensity_in_Image − Min_Intensity_in_Image) + New_Min_Intensity

Here, the input pixel value is the original intensity, Min_Intensity_in_Image and
Max_Intensity_in_Image are the minimum and maximum pixel values in the original image, and
New_Min_Intensity and New_Max_Intensity define the desired range for the output pixel values.
A typical transformation function used for contrast stretching is a piecewise-linear function controlled by two points, (r1, s1) and (r2, s2).

By changing the location of points (r1, s1) and (r2, s2), we can control the shape of the transformation
function. For example,

When r1 =s1 and r2=s2, transformation becomes a Linear function.


When r1=r2, s1=0 and s2=L-1, transformation becomes a thresholding function.

When (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L-1), this is known as Min-Max Stretching.

When (r1, s1) = (rmin + c, 0) and (r2, s2) = (rmax – c, L-1), this is known as Percentile Stretching.
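
A minimal sketch of min-max contrast stretching using the formula above (assuming NumPy; the target range [0, 255] is illustrative):

import numpy as np

def contrast_stretch(image, new_min=0, new_max=255):
    # Linearly rescale pixel intensities from [image.min(), image.max()] to [new_min, new_max].
    image = image.astype(float)
    old_min, old_max = image.min(), image.max()
    stretched = (image - old_min) * (new_max - new_min) / (old_max - old_min) + new_min
    return stretched.astype(np.uint8)

low_contrast = np.random.randint(90, 140, (4, 4), dtype=np.uint8)   # values crowded into 90-139
result = contrast_stretch(low_contrast)
print(low_contrast.min(), low_contrast.max())   # narrow original range
print(result.min(), result.max())               # 0 255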

Q9. Why image subtraction and averaging are done?

Answer:

Image subtraction and averaging are common techniques in image processing with distinct
purposes:

 Image Subtraction:
i. Purpose: Image subtraction is performed to highlight the differences between two images.
It is commonly used for tasks such as object detection, motion analysis, and change detection.
ii. Procedure: For each pixel, subtract the corresponding pixel value in one image from the pixel
value in another image. The result is a new image that emphasizes areas where the original
images differ.

Applications:

i. Motion Detection: Subtracting consecutive frames in a video sequence can highlight


moving objects by revealing changes in pixel values.
ii. Medical Imaging: Subtracting pre- and post-contrast images can enhance the visibility of
contrast-enhanced structures in medical imaging.
iii. Quality Control: Detecting defects or changes in industrial or manufacturing processes by
comparing images taken at different times.

 Image Averaging:
i. Purpose: Image averaging is performed to reduce noise and enhance the signal-to-noise
ratio in an image. By combining multiple images, random noise is averaged out while the
underlying signal is reinforced.
ii. Procedure: For each pixel, calculate the average of the corresponding pixel values across
multiple images.

Applications:

i. Low-light Imaging: In scenarios with low light, averaging multiple frames can help improve
image quality by reducing the impact of random noise.
ii. Medical Imaging: Averaging multiple scans can enhance the clarity of medical images,
especially in situations with limited data or high noise levels.
iii. Remote Sensing: In satellite or aerial imaging, averaging multiple images acquired over time
can provide a clearer view of the Earth's surface by minimizing atmospheric effects and sensor
noise.

Image subtraction is employed to highlight differences between images, while image averaging is
used to reduce noise and enhance the overall quality of an image.

Q10. What is gray level? Mentions its significance in image processing.


Answer:

A gray level (or intensity level) refers to the brightness or darkness of a pixel in a grayscale image.
Grayscale images are composed of pixels, each assigned a specific gray level value that represents
its intensity. The gray level is typically represented by an integer ranging from 0 to 255 in an 8-bit
image, where 0 corresponds to black, 255 corresponds to white, and the values in between represent
various shades of gray.

The following points show the significance of gray levels in image processing:

i. Image Representation: Gray levels form the basis for representing images in grayscale. Each
pixel's gray level contributes to the overall appearance of the image, allowing us to visualize
and interpret the content.

ii. Contrast and Brightness: Gray levels play a crucial role in defining the contrast and
brightness of an image. Adjusting the distribution of gray levels can enhance or reduce image
contrast, making objects more distinguishable or blending them together.

iii. Image Enhancement: Techniques such as contrast stretching, histogram equalization, and
gamma correction involve manipulating gray levels to improve the visual quality of images.
These methods aim to enhance specific features by adjusting the distribution of intensity
values.

iv. Thresholding: Thresholding is a process where pixels with gray levels above or below a
certain threshold are classified as foreground or background. This technique is commonly
used in image segmentation and object recognition.

v. Image Analysis: In computer vision and pattern recognition, the analysis of gray levels is essential for tasks such as edge detection, texture analysis, and feature extraction.

vi. Quantization: Reducing the number of gray levels in an image (quantization) is a process
that can be used for compression or simplifying image data while preserving important visual
information.

Q11. Discuss gray level slicing.

Answer:
Gray level slicing, also known as intensity-level slicing, is an image processing technique used to highlight a specific range of gray levels in an image while compressing or disregarding the others. This technique is particularly useful when the information of interest lies within a specific intensity range and other intensity values are less relevant. Gray level slicing is a form of point processing, where each pixel's intensity is individually modified based on its original value.

It can be implemented in several ways, but the two basic themes are:

i. One approach is to display a high value (e.g., white) for all gray levels in the range of interest and a low value (e.g., black) for all other gray levels.

ii. The second approach, based on a transformation that brightens (or darkens) the desired range of gray levels, leaves all other gray levels in the image unchanged.

Q12. Define histogram. Discuss histogram processing.


Answer:

A histogram is a graphical representation of a digital image: a plot of the number of pixels at each tonal (intensity) value. Nowadays, image histograms are built into digital cameras, and photographers use them to see the distribution of tones captured.
Histogram processing:

i. Histogram Analysis:

Histogram processing starts with analyzing the histogram of the input image, a graphical
representation of pixel intensity distribution, providing insights into the image's characteristics.
ii. Contrast Stretching:

Contrast stretching involves expanding the range of pixel intensities in an image through linear
scaling, enhancing overall contrast. This process makes image details more visible and improves
visual quality.

iii. Histogram Equalization:

Histogram equalization aims to create a more uniform histogram by mapping pixel values based on
the cumulative distribution function (CDF). It enhances visibility and is particularly useful for images
with limited contrast.

iv. Histogram Matching/Specification:

Histogram matching modifies an image's histogram to match a specified target histogram, adjusting
image characteristics accordingly. This is valuable for color correction, style transfer, and contrast
enhancement.
v. Adaptive Histogram Equalization:

Adaptive histogram equalization adapts to local variations in an image, enhancing both global and
local contrast. It divides the image into smaller regions, applying histogram equalization
independently to each region.

vi. Histogram Thresholding:

Histogram thresholding segments an image by setting intensity thresholds to classify pixels as


foreground or background. Pixels above or below the threshold are assigned to specific regions,
aiding image segmentation.

vii. Applications:

Histogram processing is applied in medical imaging, satellite imaging, computer vision, and
photography. It is crucial for improving image quality, enhancing visibility, and facilitating
subsequent image analysis tasks in various domains.

Q13. Discuss the process of histogram equalization.

Answer:

Histogram equalization is a technique used in image processing to enhance the contrast of an


image by adjusting the distribution of pixel intensities across the entire range. The primary goal is
to make the histogram of the image as uniform as possible, thereby spreading the pixel values and
enhancing the visibility of details.
The process of histogram equalization involves the following steps:

i. Compute the Histogram:

Calculate the histogram of the input image. The histogram represents the frequency of occurrence
of each pixel intensity level.

ii. Calculate the Cumulative Distribution Function (CDF):

Compute the cumulative distribution function (CDF) from the histogram. The CDF is obtained by
summing up the histogram values, providing information about the cumulative distribution of pixel
intensities.
iii. Normalize the CDF:

Normalize the CDF to the range [0, 1]. This step ensures that the transformation will map the original
pixel values to a new range without exceeding the valid intensity limits.

iv. Map Original Pixel Values to New Values:

For each pixel in the input image, replace the original intensity value with its corresponding value in
the normalized CDF. This mapping redistributes the pixel values, effectively equalizing the
histogram.

Output Pixel Value = CDF (Input Pixel Value)

v. Scale to Desire Intensity Range:

If necessary, scale the transformed pixel values to cover the desired intensity range. For an 8-bit
image, this range is typically [0, 255].

Final Output Pixel Value = Scale × Output Pixel Value

vi. Generate the Equalized Image:

The resulting pixel values form the histogram-equalized image. This image exhibits enhanced
contrast, and details that were previously obscured may become more visible.
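
A minimal sketch of these steps for an 8-bit grayscale image (assuming NumPy; the lookup-table form used here is one common way to implement the mapping):

import numpy as np

def equalize_histogram(image):
    # i.   Compute the histogram (256 bins, one per intensity 0..255).
    hist, _ = np.histogram(image.flatten(), bins=256, range=(0, 256))
    # ii.  Cumulative distribution function.
    cdf = hist.cumsum()
    # iii. Normalize the CDF to [0, 1].
    cdf_normalized = cdf / cdf[-1]
    # iv/v. Map each original value through the CDF and scale to [0, 255].
    lookup = np.round(cdf_normalized * 255).astype(np.uint8)
    # vi.  Generate the equalized image.
    return lookup[image]

img = np.random.randint(50, 100, (64, 64), dtype=np.uint8)   # low-contrast test image
eq = equalize_histogram(img)
print(img.min(), img.max())   # roughly 50 and 99
print(eq.min(), eq.max())     # values spread over (nearly) the full 0-255 range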

Q14. Discuss histogram specification.

Answer:
Histogram specification, also known as histogram matching or histogram equalization with a
specified distribution, is an image processing technique used to modify the histogram of an image
to match a predefined target histogram. The goal is to transform the pixel intensities in the original
image so that its histogram closely resembles the specified histogram, effectively adjusting the
image to exhibit specific desired characteristics.
The process of histogram specification involves the following steps:

i. Compute the Histograms:

Calculate the histogram of the input image and the target histogram that represents the desired
intensity distribution.

ii. Compute the Cumulative Distribution Functions (CDFs):

Calculate the cumulative distribution functions (CDFs) for both the input image and the target
histogram. The CDF represents the cumulative distribution of pixel intensities.
iii. Normalize the CDFs:

Normalize the CDFs to the range [0, 1] to ensure that the transformation does not exceed valid
intensity limits.

iv. Map Original Pixel Values to New Values:

For each pixel in the input image, compute its value under the normalized CDF of the input image and then map that value through the inverse of the normalized CDF of the target histogram. This mapping transforms the pixel values to match the desired histogram.

Output Pixel Value = CDF⁻¹Target (CDFInput (Input pixel value))

Here, CDF⁻¹Target is the inverse function of the normalized CDF of the target histogram.
v. Scale to Desired Intensity Range:

If necessary, scale the transformed pixel values to cover the desired intensity range, such as [0, 255]
for an 8-bit image.

Final Output Pixel Value = Scale × Output Pixel Value


vi. Generate the Specified Histogram Image:

The resulting pixel values form the image with the specified histogram. This image should exhibit
characteristics similar to the desired target distribution.
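
A compact NumPy sketch of this mapping, assuming two 8-bit grayscale arrays named source and reference (illustrative names); the inverse target CDF is approximated by interpolation between the two CDFs:

    import numpy as np

    def match_histogram(source, reference):
        # CDFs of the input image and of the target (reference) distribution
        src_cdf = np.cumsum(np.bincount(source.ravel(), minlength=256))
        ref_cdf = np.cumsum(np.bincount(reference.ravel(), minlength=256))
        src_cdf = src_cdf / src_cdf[-1]
        ref_cdf = ref_cdf / ref_cdf[-1]
        # For every input gray level, find the reference level whose CDF value
        # is closest; this approximates CDF_target^(-1)(CDF_input(r))
        lut = np.round(np.interp(src_cdf, ref_cdf, np.arange(256))).astype(np.uint8)
        return lut[source]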

Histogram specification is useful in various applications, including:

i. Color Correction: Matching the histogram of an image to a reference histogram can be used
for color correction, ensuring consistency across different images or scenes.
ii. Style Transfer: Adjusting the histogram of an image to match the histogram of an artistic
reference can be used for style transfer, giving an image a specific artistic or visual style.
iii. Contrast Enhancement: By specifying a target histogram, one can tailor the intensity
distribution to enhance specific features or improve visual quality according to specific
requirements.

Q15. Define tristimulus and trichromatic coefficient.

Answer:

Tristimulus refers to a set of three numerical values (X, Y, Z) used to represent the color of a stimulus
in a color space, often associated with the three primary color components of the human visual
system.

Trichromatic coefficients are the weights assigned to primary colors in a trichromatic system (e.g.,
RGB or XYZ), determining their contribution to the representation of a specific color. They play a
crucial role in color reproduction and analysis.

Q16. Show that histogram equalization transformation has a uniform probability density
function (pdf). [Incomplete]

Answer:
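
A standard sketch of the continuous-variable argument (following the usual textbook treatment) is as follows. Let r be the input intensity with probability density function p_r(r), 0 ≤ r ≤ L − 1, and consider the equalization transformation

s = T(r) = (L − 1) ∫ from 0 to r of p_r(w) dw

that is, s is (L − 1) times the CDF of r. T(r) is single-valued and monotonically increasing, so by the change-of-variable rule for densities,

p_s(s) = p_r(r) |dr/ds|

Differentiating the transformation gives ds/dr = (L − 1) p_r(r), so

p_s(s) = p_r(r) × 1 / [(L − 1) p_r(r)] = 1 / (L − 1), for 0 ≤ s ≤ L − 1.

Since p_s(s) is constant over the output range, the equalized intensity s has a uniform probability density function, which is exactly what histogram equalization sets out to achieve.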

Q17. Describe High-boost filter.

Answer:
A high-boost filter is an image enhancement filter used in image processing to sharpen or enhance
the details of an image. It amplifies high-frequency components, such as edges and fine details,
while still retaining a scaled version of the low-frequency content of the original image; this
distinguishes it from a pure high-pass filter, which discards the low-frequency background entirely.
The high-boost filter is particularly effective in enhancing the clarity and fine details of an image
without losing its overall appearance.
Formula:

HPF = Original image - Low frequency components

LPF = Original image - High frequency components


HBF = A * Original image - Low frequency components

= (A - 1) * Original image + [Original image - Low frequency components]

= (A - 1) * Original image + HPF


Here,

 HPF = High pass filtering, which means the higher frequency components are allowed to pass
while low-frequency components are discarded from the original image.
 LPF = Low pass filtering, which means the lower frequency components are allowed to pass
while high-frequency components are discarded from the original image.
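
A minimal sketch of high-boost filtering with NumPy and SciPy, assuming a grayscale array img and using a 3x3 averaging filter as the low-pass step (the names and the value A = 1.5 are illustrative assumptions):

    import numpy as np
    from scipy.ndimage import uniform_filter

    def high_boost(img, A=1.5):
        img = img.astype(float)
        low_pass = uniform_filter(img, size=3)   # 3x3 averaging (LPF)
        high_pass = img - low_pass               # HPF = original - low frequency components
        boosted = (A - 1) * img + high_pass      # HBF = (A - 1) * original + HPF
        return np.clip(boosted, 0, 255)

With A = 1 this reduces to ordinary high-pass filtering; larger values of A keep more of the original image in the result.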

Q18. Show that the high pass filter image can be obtained in the spatial domain as:

High pass = original-low pass. For simplicity, assume 3*3 filters.

Answer:

In this case, we'll assume that the filter is a simple averaging (low-pass) filter, denoted as L. The high-
pass filter can then be expressed as H=I−L, where I is the original image.

Original Image (I):

Let I be the original image.


Low-pass Filtering (L):

Apply a 3x3 low-pass filter (L) to the original image. The low-pass filter is typically a simple averaging
filter.
L(x, y) = (1/9) Σ_{i=−1}^{1} Σ_{j=−1}^{1} I(x + i, y + j)

High-pass Filtering (H):

Subtract the low-pass filtered image from the original image to obtain the high-pass filtered image.
H(x,y)=I(x,y)−L(x,y)

Now, let's expand this expression using the low-pass filter definition:
H(x, y) = I(x, y) − (1/9) Σ_{i=−1}^{1} Σ_{j=−1}^{1} I(x + i, y + j)

This demonstrates that the high-pass filter image (H) can indeed be obtained in the spatial domain
as the difference between the original image (I) and the low-pass filtered image (L).

Q19. Define an edge of a digital image. Discuss how gradient operators are used to detect
edge of an image.

Answer:

Edges are significant local changes of intensity in a digital image. An edge can be defined as a set
of connected pixels that forms a boundary between two disjoint regions. There are three types of
edges:

i. Horizontal edges
ii. Vertical edges
iii. Diagonal edges

Gradient operators are used for edge detection in images by calculating the rate of intensity
change at each pixel. Common operators like Sobel compute the gradient magnitude, highlighting
regions with rapid intensity transitions. The process involves thresholding and non-maximum
suppression to refine edge maps, providing valuable information for tasks like object recognition
and segmentation in computer vision.
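
As a brief illustration, gradient-based edge detection with the Sobel operator might look like the following sketch using SciPy (the threshold value is an arbitrary assumption and would normally be tuned per image):

    import numpy as np
    from scipy.ndimage import sobel

    def sobel_edges(img, threshold=100):
        img = img.astype(float)
        gx = sobel(img, axis=1)        # rate of change along x (vertical edges)
        gy = sobel(img, axis=0)        # rate of change along y (horizontal edges)
        magnitude = np.hypot(gx, gy)   # gradient magnitude
        return magnitude > threshold   # binary edge map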

Q20. Explain different derivative filters. [Incomplete]

Answer:

Derivative filters provide a quantitative measurement of the rate of change in pixel brightness
across a digital image. When a derivative filter is applied to a digital image, the resulting
information about brightness change rates can be used to enhance contrast, detect edges and
boundaries, and measure feature orientation. Commonly used derivative filters include the
first-order gradient operators (Roberts, Prewitt, and Sobel) and the second-order Laplacian operator.

Q21. What is filtering?

Answer:

Filtering techniques are used to enhance and modify digital images. Image filters are used for
blurring and noise reduction as well as for sharpening and edge detection. Filters mainly work by
suppressing either high frequencies (smoothing techniques) or low frequencies (sharpening and
edge detection).

Image filters can be classified as follows: linear filters (e.g., mean and Gaussian filters) and
non-linear filters (e.g., the median filter), which may further be grouped into low-pass (smoothing)
and high-pass (sharpening/edge detection) filters, applied either in the spatial domain or in the
frequency domain.

Q22. Describe discrete cosine transform.


Answer:

In Digital Image Processing (DIP), the cosine transform is often used as a tool for image compression.
The specific form of cosine transform that finds applications in DIP is the Discrete Cosine Transform
(DCT). It is particularly popular in applications such as JPEG (Joint Photographic Experts Group)
compression for still images and MPEG (Moving Picture Experts Group) compression for videos. The
primary purpose of the DCT in these contexts is to efficiently represent image and video data in a
way that allows for compression while preserving visual quality to an acceptable degree.
Mathematical Definition of 1D DCT:

For a sequence of N values x0, x1, x2, …, x(N−1), the 1D DCT is given by the formula:

X_k = Σ_{n=0}^{N−1} x_n cos[ (π/N) k (n + 1/2) ],   for k = 0, 1, …, N−1

Where:

Xk is the DCT coefficient at frequency k.


xn is the input sequence element at index n.

The summation is over all elements of the input sequence.

Mathematical Definition of 2D DCT:

For a given N×N block of pixel values in an image, the 2D DCT is computed using the following
formula:
F(u, v) = (2/N) C(u) C(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x, y) cos[ (2x + 1)uπ / 2N ] cos[ (2y + 1)vπ / 2N ]

Where,
f (x,y) is the pixel value at coordinates (x, y) in the block.
F(u, v) is the DCT coefficient at spatial frequencies u and v.

C(u) and C(v) are normalization factors:


C(u) = 1/√2 if u = 0, and C(u) = 1 otherwise (C(v) is defined in the same way).
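
A direct (unoptimized) NumPy implementation of this 2D DCT formula, written as a sketch for a small square block; in practice an optimized library routine such as scipy.fft.dctn would be used instead of the double loop:

    import numpy as np

    def dct2(block):
        N = block.shape[0]                      # assumes a square N x N block
        F = np.zeros((N, N))
        x = np.arange(N)
        for u in range(N):
            for v in range(N):
                cu = 1 / np.sqrt(2) if u == 0 else 1.0
                cv = 1 / np.sqrt(2) if v == 0 else 1.0
                # basis[x, y] = cos[(2x+1)u*pi/2N] * cos[(2y+1)v*pi/2N]
                basis = np.outer(np.cos((2 * x + 1) * u * np.pi / (2 * N)),
                                 np.cos((2 * x + 1) * v * np.pi / (2 * N)))
                F[u, v] = (2 / N) * cu * cv * np.sum(block * basis)
        return F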

Image Restoration
Q1. Discuss the model of the image Degradation /Restoration process.

Answer:
Image Degradation Model:

1. Blur:
 Images may suffer from blurring due to factors such as motion, defocus, or atmospheric
conditions.
 Blur can be modeled using a convolution operation with a point spread function (PSF),
representing how a point in the image spreads.
2. Noise:
 Image degradation often involves the addition of noise, which can be modeled as an additive
or multiplicative process.
 Common types of noise include Gaussian, salt-and-pepper, and Poisson noise.
3. Compression:
 During storage or transmission, images can be compressed, leading to loss of information.
 Compression artifacts, such as blockiness or ringing, are common in degraded images.
4. Resolution Reduction:
 Decreasing the resolution of an image can occur during various processes, impacting the
clarity and detail.

Image Restoration Model:

1. Inverse Filtering:
 In an ideal scenario, one could apply the inverse of the degradation process to restore the
image.
 However, this can amplify noise and is often impractical.
2. Wiener Filtering:
 Wiener filtering is a frequency-domain approach that balances noise reduction and
preservation of important image features.
 It minimizes the mean square error between the estimated and original images.
3. Iterative Restoration:
 Iterative methods involve refining an estimate of the original image through multiple
iterations.
 Algorithms like Richardson-Lucy are used for tasks like image deblurring.

4. Regularization Techniques:
 Regularization methods are applied to stabilize the restoration process and prevent over
fitting.
 Total Variation regularization, for instance, helps preserve edges while removing noise.
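
Tying the degradation and restoration models together, a rough frequency-domain Wiener filter sketch is shown below. It assumes the PSF has already been padded to the image size and centred at the origin, and the constant K is a hand-chosen stand-in for the noise-to-signal power ratio (both assumptions, not part of any standard API):

    import numpy as np

    def wiener_restore(degraded, psf, K=0.01):
        G = np.fft.fft2(degraded)                    # degraded image spectrum
        H = np.fft.fft2(np.fft.ifftshift(psf))       # OTF of the blur (from the PSF)
        W = np.conj(H) / (np.abs(H) ** 2 + K)        # Wiener transfer function
        return np.real(np.fft.ifft2(W * G))          # restored estimate of the original

Setting K = 0 reduces this to plain inverse filtering, which amplifies noise wherever H is small; the K term is what gives Wiener filtering its balance between deblurring and noise suppression.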

Color Image Processing
Q1. Discuss various color models.

Answer:
Color models are mathematical models describing the way colors can be represented as tuples of
numbers. These models provide a systematic way to express and manipulate colors in various
applications such as computer graphics, image processing, and design. Here are some commonly
used color models:

1. RGB (Red, Green, Blue):


i. Description: RGB is an additive color model where colors are represented as combinations
of red, green, and blue light.
ii. Representation: Colors are specified by intensity values for each of the three primary colors
(0-255 in an 8-bit per channel system).
iii. Applications: Used in digital displays, monitors, cameras, and web design.

2. CMY and CMYK (Cyan, Magenta, Yellow, Black):


i. Description: CMY is a subtractive color model used in color printing. CMYK is an extension
that includes a black key channel.
ii. Representation: Colors are represented by subtracting percentages of light-absorbing inks.
iii. Applications: Commonly used in color printing and design for producing a wide range of
colors.

3. HSV (Hue, Saturation, Value):


i. Description: Represents colors using their hue, saturation, and value/brightness.
ii. Representation: Hue is the type of color (e.g., red, green), saturation is the intensity or
vividness, and value is the brightness.
iii. Applications: Used in graphic design, image editing, and color adjustment tools.

4. HSL (Hue, Saturation, Lightness):


i. Description: Similar to HSV but uses lightness instead of value, where lightness represents
the perceived brightness.
ii. Representation: Hue, saturation, and lightness values determine the color.
iii. Applications: Graphic design, web development, and image editing.

5. YUV:
i. Description: YUV separates image luminance (Y) from chrominance (U and V), facilitating
image compression.
ii. Representation: Y represents brightness, while U and V represent color information.

Q2. Briefly describe RGB color model.

Answer:

The RGB color model is one of the most widely used color representation methods in computer
graphics. It uses a color coordinate system with three primary colors: R (red), G (green), B (blue).

Each primary color can take an intensity value ranging from 0(lowest) to 1(highest). Mixing these
three primary colors at different intensity levels produces a variety of colors. The collection of all the
colors obtained by such a linear combination of red, green and blue forms the cube shaped RGB
color space.

The corner of RGB color cube that is at the origin of the coordinate system corresponds to black,
whereas the corner of the cube that is diagonally opposite to the origin represents white. The
diagonal line connecting black and white corresponds to all the gray colors between black and white,
which is also known as gray axis.

In the RGB color model, an arbitrary color within the cubic color space can be specified by its color
coordinates: (r, g, b).

Example:
(0, 0, 0) for black, (1, 1, 1) for white,

(1, 1, 0) for yellow, (0.7, 0.7, 0.7) for gray

Color specification using the RGB model is an additive process. We begin with black and add the
appropriate primary components to yield a desired color. The RGB color model is used in display
monitors.

Q3. What are the purpose of a color model in image processing? Derive the conversion
formula to go from RGB to HSI color model.

Answer:
Following are the purpose of a color model in image processing:

1. Color models quantify and express colors numerically, facilitating storage and processing of
color information in images.
2. They allow for precise color adjustments, corrections, and enhancements in image processing
applications.
3. Color models provide standardized frameworks for consistent color representation across
devices and platforms.
4. These models support the analysis of image content by breaking down color information into
components for tasks like object detection.
5. Color models enable conversion between different color spaces, ensuring interoperability
among devices and applications.
6. Some models, like CIELAB, aim for perceptual uniformity, aligning numerical color differences
with perceived differences for accurate color reproduction.
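
The answer above covers the purposes; for the second part of the question, the standard RGB-to-HSI conversion formulas (as given in common DIP textbooks, with the full geometric derivation omitted here) are, for R, G, B normalized to [0, 1]: I = (R + G + B)/3, S = 1 − 3·min(R, G, B)/(R + G + B), and H = θ if B ≤ G, otherwise 360° − θ, where θ = cos⁻¹{ ½[(R − G) + (R − B)] / √[(R − G)² + (R − B)(G − B)] }. A per-pixel sketch of this conversion (the small eps added to avoid division by zero is an assumption):

    import numpy as np

    def rgb_to_hsi(r, g, b, eps=1e-8):
        # r, g, b are assumed normalized to [0, 1]
        i = (r + g + b) / 3.0                                    # intensity
        s = 1 - 3 * min(r, g, b) / (r + g + b + eps)             # saturation
        num = 0.5 * ((r - g) + (r - b))
        den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
        theta = np.degrees(np.arccos(np.clip(num / den, -1, 1)))
        h = theta if b <= g else 360 - theta                     # hue in degrees
        return h, s, i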

Wavelet and other image transformation

Q1. Define Haar transform and Haar basis function.


Answer:

The Haar transform is an orthogonal wavelet transform used in signal and image processing,
decomposing a signal or image into a set of basis functions to capture details at different scales and
locations.

Haar basis functions are the elementary components of the Haar transform, characterized by
piecewise constant patterns. The father wavelet (scaling function) is constant over its support and
captures local averages, while the mother wavelet takes the value +1 on the first half of its support
and −1 on the second half, capturing local differences; together they form the basis for signal
decomposition and analysis.
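
As a small illustration, one level of the orthonormal Haar transform on a 1-D signal simply computes scaled pairwise averages and differences; a sketch assuming the signal length is even:

    import numpy as np

    def haar_level(signal):
        signal = np.asarray(signal, dtype=float)
        approx = (signal[0::2] + signal[1::2]) / np.sqrt(2)   # scaling (father) part: local averages
        detail = (signal[0::2] - signal[1::2]) / np.sqrt(2)   # wavelet (mother) part: local differences
        return approx, detail

Applying haar_level repeatedly to the approximation output yields the multi-level Haar decomposition.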

Q2. What is wavelet?

Answer:

A wavelet is a mathematical function used in image compression and digital signal processing.
It is a basis function that is localized with respect to both frequency/wavenumber and
time/spatial location.

Q3. Explain discrete wavelet transform for image.

Answer:
The Discrete Wavelet Transform (DWT) is a mathematical transformation widely used in image
processing for analyzing and representing images in a hierarchical and multi-resolution manner.
DWT decomposes an image into approximation and detail coefficients, allowing the extraction of
information at different scales.

1. The DWT starts by decomposing an image into approximation (low-frequency) and detail (high-
frequency) components along both the horizontal and vertical directions.
2. The image is convolved with a low-pass filter (also called the scaling or approximation filter)
and a high-pass filter (detail filter).
3. After filtering, the resulting signals are down sampled by keeping only every second sample.
This reduces the resolution of the signal but preserves its essential information.
4. The process is then repeated on the approximation signal to obtain further levels of
approximation and detail coefficients. Each level of decomposition results in a set of
approximation and detail coefficients at a different scale.
5. The output of the DWT can be represented as an image pyramid, with the top level containing
the coarsest approximation and each subsequent level containing finer details.

6. DWT is used in image compression algorithms, such as JPEG2000, where it provides a multi-
resolution representation that facilitates efficient coding. It is also used in image denoising,
where high-frequency noise can be separated from low-frequency signal components.
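
A minimal sketch of a single-level 2-D DWT, assuming the PyWavelets package is available (the random array simply stands in for a grayscale image):

    import numpy as np
    import pywt

    img = np.random.rand(256, 256)              # stand-in for a grayscale image
    cA, (cH, cV, cD) = pywt.dwt2(img, 'haar')   # one decomposition level
    # cA: approximation (low-frequency); cH, cV, cD: horizontal, vertical, diagonal details
    # Repeating pywt.dwt2 on cA produces further, coarser levels of the pyramid.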

Q4. Discuss application of wavelet analysis in image processing.

Answer:

Wavelet transforms are widely used for image compression. The ability to represent images in both
low and high-frequency components allows for more efficient compression compared to traditional
methods like JPEG.
Followings are the applications of wavelet analysis in image processing:

1. Wavelet analysis provides a multi-resolution representation of images, allowing for the analysis of features at different scales.
2. Wavelet transforms are powerful tools for edge detection in images. High-frequency
components obtained from wavelet analysis highlight edges in the image.
3. Wavelet analysis can be employed for feature extraction in images.
4. Wavelet-based techniques are employed in image watermarking to embed and retrieve
hidden information within images while maintaining perceptual quality.
5. Wavelet analysis is used in biomedical image processing for tasks such as image enhancement,
feature extraction, and classification.

Q5. What do you mean by multi-resolution image?

Answer:
A multi-resolution image refers to an image that has been represented or decomposed into
multiple levels or scales of detail. In the context of image processing and analysis, a multi-resolution
representation is achieved through techniques such as pyramid structures or wavelet transforms.
Each level of the multi-resolution image contains information about the image content at a specific
scale.

Image Compression
Q1. What is an image compression? Why it is performed?

Answer:

Image compression is the process of reducing the size of an image file while retaining as much of
its visual quality as possible. The goal is to represent the image in a more compact form, which is
especially crucial when dealing with large image files that occupy significant storage space or need
to be transmitted over networks with limited bandwidth. Compression techniques aim to eliminate
redundant or irrelevant information in the image while preserving essential details.

Image compression is performed, because:

1. Compressing images reduces the amount of storage space required. This is particularly
important in applications where a large number of images need to be stored, such as in
databases, websites, and archival systems.
2. Smaller image files can be transmitted more quickly over networks, which is essential for
applications like web browsing, streaming, and mobile communication where fast data
transfer is crucial.
3. Image compression helps conserve bandwidth, making it more efficient to transmit images
over the internet or other communication channels.
4. Reduced storage requirements and faster transmission lead to cost savings in terms of
storage infrastructure and network bandwidth.
5. Compressed images can lead to improved performance in applications that involve image
processing, as the reduced data size allows for faster processing and manipulation of images.
6. Compressed images are more suitable for long-term data archiving.

Q2. Write the applications of image compression.

Answer:
1. Faster loading times for websites and mobile applications.
2. Quicker uploads and downloads on social platforms and messaging apps.
3. Efficient storage and transmission of medical images in telemedicine.
4. Transmission of large datasets in satellite imagery and remote sensing.
5. Storage and transmission efficiency in video surveillance systems.
6. Space-saving and improved efficiency in archiving and databases.
7. Efficient transmission of video content in broadcasting and streaming.
8. Reduced file sizes in digital cameras for increased storage.
9. Quick rendering of graphics in video games and virtual reality.
10. Optimized product images for better website performance.

Q3. Draw and explain the functional block diagram of a general image compression system.

Answer:

Fig: Functional block diagram of a general image compression system

1. Source image:
The “Source image” represents the input image data.

2. Forward Transform:

The "Forward Transform" block represents the transformation of the preprocessed image data using
a chosen transform (e.g., Discrete Cosine Transform or Discrete Wavelet Transform).
3. Quantization:

The "Quantization" block reduces the precision of transformed coefficients to achieve compression.

4. Entropy Encoding:

The "Entropy Encoding" block encodes the quantized coefficients to further reduce data size.
5. Storage/ Transmission:

The "Storage/Transmission" block represents the storage of the compressed bit stream or its
transmission over a communication channel.

6. Entropy Decoding:
The "Entropy Decoding" block decodes the entropy-encoded data.

7. Inverse Transform:

The "Inverse Transform" block applies the inverse of the forward transform to reconstruct the
coefficients.
8. Reconstructed image:
The "Reconstructed Image" blocks represent the final reconstructed image data.

Q4. Classify image compression techniques.

Answer:
The image compression techniques are broadly classified into two categories. These are:

1. Lossless compression techniques 2. Lossy compression techniques

Lossless compression technique reduces the size of an image without discarding any information,
so the original image can be reconstructed exactly and the image quality is not affected. Lossless
compression is particularly important when compressing text or data, because even a small change
in the original content can dramatically change its meaning.

For example, it might convert an image of 15 MB to 10 MB; however, the result may still be too
large to display comfortably on a webpage.

Lossy compression technique reduces the image size by permanently discarding some of the image
information that is least essential to perception. Lossy compression works with a quality parameter
that controls how much information is removed; in most cases you have to set this parameter, and
if it is lower than about 90 the loss may become visible to the human eye.

For example, we can convert an image of 15 MB into 2200 Kb as well as 400 Kb.

Q5. List some image compression techniques.

Answer:

There are various image compression techniques, each with its own characteristics and applications.
Here is a list of some commonly used image compression techniques:

1. JPEG (Joint Photographic Experts Group):

Type: Lossy Compression

Description: Uses the Discrete Cosine Transform (DCT) to transform the image data. It allows
adjusting the trade-off between compression ratio and image quality.
2. JPEG2000:

Type: Wavelet-based Compression

Description: Utilizes the Discrete Wavelet Transform (DWT) and offers both lossless and lossy
compression. Known for its ability to provide high compression ratios with minimal loss.
3. PNG (Portable Network Graphics):

Type: Lossless Compression

Description: Commonly used for lossless compression, especially for images with transparency. It
employs a non-patented compression algorithm.

4. GIF (Graphics Interchange Format):

Type: Lossless Compression

Description: Primarily used for simple graphics and animations. Supports a limited color palette
and uses LZW compression.

5. WebP:

Type: Lossy and Lossless Compression

Description: Developed by Google, WebP is designed for both lossy and lossless compression. It
often provides smaller file sizes than JPEG and PNG.

6. BPG (Better Portable Graphics):

Type: Lossy and Lossless Compression

Description: Utilizes the High-Efficiency Video Coding (HEVC) standard for both lossy and lossless
compression. Offers high compression efficiency.

Q6. What is lossy compression?


Answer:

Lossy compression reduces the image size by permanently discarding some of the image information
that is least essential to perception. It works with a quality parameter that controls how much
information is removed; in most cases you have to set this parameter, and if it is lower than about
90 the loss may become visible to the human eye.

For example, we can convert an image of 15 MB into 2200 Kb as well as 400 Kb.

Q7. Write the advantages of lossy compression over lossless compression.

Answer:

Followings are the advantages of lossy compression over lossless compression:

1. Lossy compression typically achieves higher compression ratios compared to lossless compression.
2. Lossy compression results in significantly smaller file sizes, leading to reduced storage
requirements.
3. Smaller file sizes in lossy compression translate to faster transmission and download speeds.
4. Lossy compression is more bandwidth-efficient.
5. Lossy compression is well-suited for streaming applications, such as video streaming services.
6. Lossy compression algorithms often provide flexibility in adjusting compression levels.
7. Lossy compression is widely used in multimedia applications such as digital photography,
video streaming, and audio encoding.
Q8. Write the full form of items: i) JPEG ii) PNG iii) BMP

Answer:
Here are the full forms of the mentioned items:

i) JPEG: Joint Photographic Experts Group

ii) PNG: Portable Network Graphics


iii) BMP: Bitmap

Q9. Write the algorithm of JPEG compression.


Answer:

Algorithm of JPEG Data Compression:

1. Splitting
We split the image into 8×8 blocks; each block contains 64 pixels and is processed independently
in the following steps.

2. Color Space Transform


In this phase, we convert the R, G, B values to the Y, Cb, Cr model. Here Y represents brightness
(luminance), Cb the blue-difference chrominance and Cr the red-difference chrominance. The
chrominance components are less sensitive to the human eye and can therefore be reduced
(subsampled) with little visible loss.

3. Apply DCT
We apply the discrete cosine transform (DCT) on each block. The DCT represents an image block
as a sum of sinusoids of varying magnitudes and frequencies.

4. Quantization
In the Quantization process, we quantize our data using the quantization table.

5. Serialization
In serialization, we perform the zig-zag scanning pattern to exploit redundancy.

6. Vectoring
We apply DPCM (differential pulse code modulation) on the DC coefficients. The DC coefficient of
each block represents its average intensity, so only the differences between neighbouring blocks
need to be stored.

7. Encoding
In the last stage, we apply entropy coding, such as run-length encoding followed by Huffman
encoding. The aim is to convert the quantized data into a compact binary form (0s and 1s),
completing the compression.
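
A toy sketch of steps 3-4 for a single 8x8 block, using SciPy's DCT routine; the flat quantization step of 16 is a placeholder assumption, not the standard JPEG quantization table:

    import numpy as np
    from scipy.fft import dctn, idctn

    def compress_block(block, q_step=16):
        shifted = block.astype(float) - 128          # level shift to centre values around 0
        coeffs = dctn(shifted, norm='ortho')         # step 3: 2-D DCT of the block
        return np.round(coeffs / q_step)             # step 4: quantization

    def decompress_block(quantized, q_step=16):
        coeffs = quantized * q_step                  # dequantize
        return np.clip(idctn(coeffs, norm='ortho') + 128, 0, 255)

Most of the quantized coefficients in a typical block come out zero, which is what the later zig-zag scanning and entropy coding steps exploit.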

Q10. Define data redundancy and relative data redundancy.

Answer:

Data redundancy refers to the existence of duplicated or unnecessary data within a dataset, system,
or storage medium. It occurs when the same information is stored in multiple locations or in multiple
forms, leading to inefficiencies in terms of storage space and data processing. Redundancy can arise
due to various reasons, including poor database design, data entry errors, or the use of multiple
data sources.

Relative data redundancy is a measure of the extent to which data is duplicated or repeated within
a dataset compared to the minimum amount required for proper functioning. It is often expressed
as a ratio or percentage, indicating the proportion of redundant data in relation to the total amount
of data. The goal is to minimize relative data redundancy to optimize storage and improve data
processing efficiency.
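
For example, following the usual definition, if one representation of the information uses n1 bits and a compressed representation uses n2 bits, the compression ratio is C = n1 / n2 and the relative data redundancy is R = 1 − 1/C. So if an image is compressed from 10 MB to 1 MB, C = 10 and R = 1 − 1/10 = 0.9, meaning about 90% of the data in the original representation was redundant.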

Q11. Explain the coding redundancy.


Answer:

Coding redundancy, also known as coding inefficiency, occurs when the representation of data in
a given coding scheme or format contains more bits than necessary. It represents a form of
redundancy in the encoding of information and can be minimized through more efficient coding
techniques.

Coding redundancy is a crucial consideration in data compression. By minimizing coding redundancy, compression algorithms can achieve higher compression ratios, reducing the size of encoded data without loss of information. Efficient coding techniques are fundamental to various compression standards, such as those used in image, audio, and video compression.

Minimizing Coding Redundancy:


1. Variable-Length Coding:

Implementing variable-length coding, where different symbols are represented using varying
numbers of bits based on their frequency of occurrence, can significantly reduce coding redundancy.

2. Huffman Coding:

Huffman coding is a widely used variable-length coding technique that assigns shorter codes to
more frequent symbols and longer codes to less frequent symbols. It helps minimize the average
code length, reducing coding redundancy.

3. Arithmetic Coding:

Arithmetic coding is another technique that represents entire messages with a single code. It assigns
shorter codes to more probable sequences, effectively reducing redundancy.

4. Entropy Coding:

Entropy coding methods, such as Huffman coding and arithmetic coding, exploit the statistical
properties of the data to create more efficient representations, minimizing redundancy.
5. Run-Length Coding:

In cases where consecutive identical symbols occur, run-length coding can be employed to
represent the repeated symbols with a count, reducing redundancy.

Q12. Describe psychovisual redundancy.

Answer:

Psychovisual redundancy refers to the concept in image and video compression where certain
details in visual data are deemed redundant from a human visual perception standpoint.
Psychovisual redundancy include:

1. Visual Masking:

The human visual system tends to focus on certain regions of an image, and details in less attended
areas may be less perceptible. Psychovisual redundancy takes advantage of this by allocating more
bits to important regions and fewer bits to less critical areas.
2. Color Perception:

The sensitivity of the human eye varies across different colors. Psychovisual redundancy
considerations take into account that certain color information may be less crucial than others,
allowing for more efficient encoding.

3. Spatial and Temporal Sensitivity:

The human eye is more sensitive to changes in luminance and detail in certain spatial and temporal
frequencies. Compression algorithms exploit this sensitivity to allocate more bits to critical frequency
components and fewer bits to less critical ones.
4. Texture and Edge Perception:

Psychovisual redundancy accounts for the fact that human vision is highly sensitive to texture and
edges. Compression techniques prioritize preserving these features while reducing information in
less perceptually significant areas.

5. Adaptive Quantization:

Adaptive quantization is a technique used to allocate more bits to regions with high visual
complexity and fewer bits to simpler areas. This dynamic allocation is based on psychovisual models
to optimize compression efficiency.

6. Perceptual Coding Models:

Advanced compression methods incorporate perceptual coding models that simulate aspects of
human vision. These models guide the compression process by identifying areas where details can
be reduced without a significant impact on perceived quality.

Q13. What is variable length coding? /Explain variable length coding for image
compression.

Answer:

Variable Length Coding is a technique used in image compression to represent symbols with
variable-length codes, where more frequently occurring symbols are assigned shorter codes, and
less frequent symbols are assigned longer codes. This contrasts with fixed-length coding, where
each symbol is represented by a fixed number of bits.

Steps in Variable Length Coding for Image Compression:

1. Frequency Analysis:

Determine the frequency of occurrence of each symbol (e.g., pixel intensity values) in the image.
More frequent symbols will be assigned shorter codes.

2. Code Assignment:

Assign variable-length codes to each symbol based on its frequency. The goal is to assign shorter
codes to more probable symbols, reducing the average code length.
3. Creation of Variable-Length Code Table:

Create a table that maps each symbol to its corresponding variable-length code. This table is crucial
for encoding and decoding processes.

4. Image Encoding:

Replace each symbol in the image with its corresponding variable-length code based on the code
table. This process generates a compressed representation of the image.
5. Decoding:

During decoding, the variable-length codes are translated back into the original symbols using the
same code table. This allows for the reconstruction of the compressed image.

Q14. Explain Huffman coding technique.

Answer:

Huffman coding is a variable-length coding technique widely used for data compression, including
image compression. It was developed by David A. Huffman in 1952 and is particularly effective in
situations where certain symbols or elements occur more frequently than others.

Steps in Huffman Coding:

1. Frequency Analysis:

Determine the frequency of occurrence of each symbol (e.g., pixel intensity values in image
compression). Symbols with higher frequencies will be assigned shorter codes.

2. Node Creation:

Create a leaf node for each symbol and assign a weight to each node based on its frequency.
3. Node Merging:

Merge the two nodes with the lowest weights to create a new internal node. Update the weight of
the new node to be the sum of the weights of the merged nodes.

4. Repeat Node Merging:

Repeat the merging process until only one node (the root node) remains. This final node represents
the entire set of symbols.

5. Code Assignment:

Assign binary codes to each symbol based on the path from the root to the leaf node. Assign shorter
codes to symbols with higher frequencies.
6. Creation of Huffman Tree:

The resulting structure is a binary tree known as the Huffman tree, where each leaf node corresponds
to a symbol and the binary code is obtained by traversing the tree from the root to the leaf.
Advantages of Huffman Coding:

1. Huffman coding achieves efficient compression, with more frequent symbols represented by
shorter codes.
2. Huffman codes are prefix-free, meaning no code is a prefix of another. This property simplifies
decoding as there is no ambiguity.
3. Variable-length coding allows for optimal utilization of bits, reducing the overall size of the
encoded data.
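
A compact Python sketch of the steps above, using a heap to repeatedly merge the two lowest-weight nodes (the input string is just a toy example; the exact codes produced can vary with tie-breaking, but more frequent symbols always receive shorter codes):

    import heapq
    from collections import Counter

    def huffman_codes(symbols):
        # Steps 1-2: frequency analysis and leaf node creation
        heap = [[freq, [sym, ""]] for sym, freq in Counter(symbols).items()]
        heapq.heapify(heap)
        # Steps 3-4: repeatedly merge the two lowest-weight nodes
        while len(heap) > 1:
            lo = heapq.heappop(heap)
            hi = heapq.heappop(heap)
            for pair in lo[1:]:
                pair[1] = '0' + pair[1]      # left branch
            for pair in hi[1:]:
                pair[1] = '1' + pair[1]      # right branch
            heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
        # Steps 5-6: read off the code assigned to each symbol
        return dict(heapq.heappop(heap)[1:])

    print(huffman_codes("AAAABBBCCD"))   # 'A', the most frequent symbol, gets the shortest code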

Morphological image processing
Q1. Define the terms Dilation and Erosion.
Answer:

Dilation is a morphological operation in image processing that involves adding pixels to the
boundaries of objects in an image. The process enlarges or thickens the shapes within an image. It
is commonly used to bridge small gaps in objects, connect disjointed parts, and make objects more
visible.

Erosion is another morphological operation in image processing that involves removing pixels from
the boundaries of objects in an image. The process erodes or shrinks the shapes within an image.
Erosion is useful for separating objects, removing small protrusions, and simplifying the shapes in
an image.
Key points:

1. Dilation adds pixels to the object boundaries.


2. Erosion removes pixels from the object boundaries.
3. Both operations use a structuring element, which determines the neighborhood considered
during the operation.
4. Dilation and erosion are fundamental operations in morphological image processing and are
often combined in various ways for tasks like opening and closing operations.
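
A small binary example using SciPy's morphology routines with a 3x3 square structuring element (the array itself is only an illustration):

    import numpy as np
    from scipy.ndimage import binary_dilation, binary_erosion

    image = np.zeros((7, 7), dtype=bool)
    image[2:5, 2:5] = True                        # a 3x3 square object

    selem = np.ones((3, 3), dtype=bool)           # structuring element
    dilated = binary_dilation(image, structure=selem)   # object grows to a 5x5 square
    eroded = binary_erosion(image, structure=selem)     # object shrinks to a single pixel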

Q2. List some properties of dilation and erosion.


Answer:

Properties of Dilation:

1. Expansion: Dilation expands or thickens objects in an image, making them larger.


2. Connectivity: Dilation can connect disjointed parts of objects in an image, closing small gaps
and creating a more connected structure.
3. Shape Changes: Dilation can cause changes in the shape of objects, especially when the
structuring element used is not symmetrical.
4. Brightness Increase: In grayscale images, dilation can result in an increase in pixel intensities,
making objects appear brighter.
5. Binary Image Operation: In binary images, dilation causes regions with pixel value 1 to
expand, and regions with pixel value 0 to remain unchanged.

Properties of Erosion:

1. Contraction: Erosion shrinks or thins objects in an image, making them smaller.


2. Separation: Erosion can separate adjacent objects by removing pixels from their shared
boundaries.
3. Shape Preservation: Erosion tends to preserve the essential shape of objects, especially when
the structuring element used is small.
4. Brightness Decrease: In grayscale images, erosion can result in a decrease in pixel intensities,
making objects appear darker.
5. Binary Image Operation: In binary images, erosion causes regions with pixel value 1 to shrink,
and regions with pixel value 0 to remain unchanged.
6. Boundary Effect: Erosion may lead to the disappearance of small details and fine structures
at the boundaries of objects.

Q3. Explain morphological reconstruction by dilation and erosion.

Answer:

Morphological reconstruction is a technique used in image processing to extract or enhance certain features in an image. It is commonly employed for tasks like region filling, image segmentation, and feature extraction. Two main operations involved in morphological reconstruction are dilation and erosion.
Reconstruction by Dilation:

1. Marker Image (M):

Begin with a marker image (M) that represents the initial feature or structure of interest. This marker
image is typically a subset of the original image.

2. Dilation (⊕):

Perform dilation (⊕) on the marker image (M) using a structuring element. The result is dilated, and
the structuring element is adjusted based on the characteristics of the feature being reconstructed.

3. Pointwise Minimum with Mask (∧):

Take the pointwise minimum (set intersection in the binary case) of the dilated marker and a mask
image derived from the original image. The mask restricts the growth of the marker to certain
regions of the original image.
4. Update Marker Image (M):

The result of this minimum becomes the updated marker image (M).

5. Repeat:
Repeat the process iteratively until the marker image stabilizes, i.e., no further changes occur.

Reconstruction by Erosion:

1. Marker Image (M):

Begin with a marker image (M) that represents the initial feature or structure of interest. This marker
image is typically a subset of the original image.

2. Erosion (⊖):

Perform erosion (⊖) on the marker image (M) using a structuring element. The result is eroded, and
the structuring element is adjusted based on the characteristics of the feature being reconstructed.
3. Pointwise Maximum with Mask (∨):

Take the pointwise maximum (set union in the binary case) of the eroded marker and a mask image
derived from the original image. The mask prevents the marker from shrinking below it.

4. Update Marker Image (M):

The result of this maximum becomes the updated marker image (M).

5. Repeat:

Repeat the process iteratively until the marker image stabilizes, i.e., no further changes occur.
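
A compact sketch of reconstruction by dilation using SciPy for the grayscale case, with a flat 3x3 structuring element; the loop simply repeats "dilate, then clip to the mask" until nothing changes:

    import numpy as np
    from scipy.ndimage import grey_dilation

    def reconstruct_by_dilation(marker, mask):
        marker = np.minimum(marker, mask)                 # the marker must lie under the mask
        while True:
            dilated = grey_dilation(marker, size=(3, 3))  # geodesic dilation step
            updated = np.minimum(dilated, mask)           # restrict growth by the mask
            if np.array_equal(updated, marker):           # stop when the marker is stable
                return updated
            marker = updated

Reconstruction by erosion is the dual sketch: replace the dilation with a grayscale erosion and the pointwise minimum with a pointwise maximum.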
