Dip R20 Unit-5 Notes
UNIT-6
Morphological Image Processing
Introduction
The word morphology commonly denotes a branch of biology that deals with the form
and structure of animals and plants. Morphology in image processing is a tool for extracting
image components that are useful in the representation and description of region shape, such
as boundaries and skeletons. Furthermore, the morphological operations can be used for
filtering, thinning and pruning. The language of morphology comes from set theory,
where image objects can be represented by sets.
Some Basic Concepts from Set Theory:
If every element of a set A is also an element of another set B, then A is said to be a
subset of B, denoted as A ⊆ B
The union of two sets A and B, denoted by C = A∪B
The intersection of two sets A and B, denoted by D = A∩B
Dilation
Dilation is used for expanding an element A by using a structuring element B. The dilation
of A by B is defined by the following equation:
A ⊕ B = { z | (B̂)_z ∩ A ≠ ∅ }
This equation is based on obtaining the reflection B̂ of B about its origin and shifting
this reflection by z. The dilation of A by B is then the set of all displacements z such that B̂ and A
overlap by at least one element. Based on this interpretation, the equation can also be
rewritten as:
A ⊕ B = { z | [(B̂)_z ∩ A] ⊆ A }
Dilation is typically applied to binary images, but there are versions that work on gray-scale
images. The basic effect of the operator on a binary image is to gradually enlarge the
boundaries of regions of foreground pixels (i.e. white pixels, typically). Thus areas of
foreground pixels grow in size while holes within those regions become smaller.
Any pixel in the output image touched by the dot (origin) in the structuring element is set to
ON when any point of the structuring element touches an ON pixel in the original image. This
tends to close up holes in an image by expanding the ON regions. It also makes objects
larger. Note that the result depends upon both the shape of the structuring element and the
location of its origin.
Summary effects of dilation:
Expand/enlarge objects in the image
Fill gaps or bays of insufficient width
Fill sufficiently small holes
Connects objects separated by a distance less than the size of the window
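As a rough illustration, the following Python sketch performs binary dilation with OpenCV; the file name binary_shapes.png and the 3×3 structuring element are assumptions made here for illustration, not part of the notes.

import cv2
import numpy as np

# Assumed input: a binary image with white (255) foreground on a black background.
img = cv2.imread("binary_shapes.png", cv2.IMREAD_GRAYSCALE)
_, A = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# Structuring element B: a 3x3 square with its origin at the centre.
B = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

# Dilation: a pixel is set ON wherever the (reflected) structuring element,
# shifted to that pixel, overlaps the foreground by at least one element.
dilated = cv2.dilate(A, B, iterations=1)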
Erosion
Erosion is used for shrinking an element A by using a structuring element B. The erosion of
sets A and B in Z², is defined by the following equation:
A ⊖ B = { z | (B)_z ⊆ A }
This equation indicates that the erosion of A by B is the set of all points z such that B,
translated by z, is contained in A.
Any pixel in the output image touched by the dot (origin) in the structuring element is set to ON
when every point of the structuring element touches an ON pixel in the original image. This
tends to make objects smaller by removing pixels.
Duality between dilation and erosion:
Dilation and erosion are duals of each other with respect to set
complementation and reflection. That is,
(A ⊖ B)ᶜ = Aᶜ ⊕ B̂  and  (A ⊕ B)ᶜ = Aᶜ ⊖ B̂
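A minimal sketch of erosion and a numerical check of the duality relation, again using OpenCV; the synthetic test image is an assumption, and the equality is expected to hold up to how border pixels are padded.

import cv2
import numpy as np

# Synthetic binary test image: a filled white rectangle on a black background.
A = np.zeros((64, 64), np.uint8)
cv2.rectangle(A, (20, 20), (44, 44), 255, -1)

B = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

eroded = cv2.erode(A, B)   # shrinks the rectangle by one pixel on each side

# Duality check: complement(A erode B) == complement(A) dilate B_hat,
# where B_hat is B reflected about its origin (a symmetric B equals its reflection).
B_hat = cv2.flip(B, -1)
lhs = cv2.bitwise_not(cv2.erode(A, B))
rhs = cv2.dilate(cv2.bitwise_not(A), B_hat)
print(np.array_equal(lhs, rhs))   # expected True, up to border handling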
Opening:
An erosion followed by a dilation using the same structuring element for both
operations:
A ∘ B = (A ⊖ B) ⊕ B
Smooth contour
Break narrow isthmuses
Remove thin protrusion
Closing:
A dilation followed by an erosion using the same structuring element for both
operations:
A • B = (A ⊕ B) ⊖ B
Smooth contour
Fuse narrow breaks, and long thin gulfs.
Remove small holes, and fill gaps.
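A minimal OpenCV sketch of both compound operations; the input file name and structuring-element size are assumptions.

import cv2

# Assumed input: a binary image, white objects on a black background.
A = cv2.imread("binary_shapes.png", cv2.IMREAD_GRAYSCALE)
B = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

opened = cv2.morphologyEx(A, cv2.MORPH_OPEN, B)    # erosion followed by dilation
closed = cv2.morphologyEx(A, cv2.MORPH_CLOSE, B)   # dilation followed by erosion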
Hit-or-Miss Transform:
The hit-and-miss transform is a basic tool for shape detection. The hit-or-miss
transform is a general binary morphological operation that can be used to look for particular
patterns of foreground and background pixels in an image.
Concept: To detect a shape:
Hit object
Miss background
Let the origin of each shape be located at its center of gravity.
If we want to find the location of a shape X in a (larger) image A:
Let X be enclosed by a small window, say W.
The local background of X with respect to W is defined as the set difference (W − X).
Applying the erosion of A by X gives the set of locations of the origin of X such that X is
completely contained in A:
A ⊖ X
It may also be viewed geometrically as the set of all locations of the origin of X at
which X found a match (hit) in A.
Applying the erosion of the complement of A by the local background set (W − X) gives
Aᶜ ⊖ (W − X)
Notice that the set of locations for which X exactly fits inside A is the intersection of
the two erosions above.
If B denotes the set composed of X and its background, B = (B1, B2) with B1 = X and
B2 = (W − X), then the match (or set of matches) of B in A, denoted A ⊛ B, is
A ⊛ B = (A ⊖ B1) ∩ (Aᶜ ⊖ B2) = (A ⊖ X) ∩ [Aᶜ ⊖ (W − X)]
The structuring elements used for hit-or-miss transforms are an extension of the ones
used with dilation, erosion, etc. They can contain both foreground and
background pixels, rather than just foreground pixels, i.e. both ones and zeros. The
structuring element is superimposed over each pixel in the input image, and if an exact match
is found between the foreground and background pixels in the structuring element and the
image, the pixel lying below the origin of the structuring element is set to the
foreground value in the output. If it does not match, the output pixel is set to the background
value.
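The sketch below uses OpenCV's hit-or-miss operation (cv2.MORPH_HITMISS, available in recent OpenCV versions) with a small hypothetical kernel that detects the top-left corner of a square; in the kernel, 1 means the pixel must hit the object, -1 means it must miss (background), and 0 means don't care.

import cv2
import numpy as np

# A tiny binary image containing a 3x3 white square.
A = np.array([[0,   0,   0,   0, 0],
              [0, 255, 255, 255, 0],
              [0, 255, 255, 255, 0],
              [0, 255, 255, 255, 0],
              [0,   0,   0,   0, 0]], dtype=np.uint8)

# Hypothetical structuring element pair encoded in one kernel:
#  1 -> must be foreground, -1 -> must be background, 0 -> don't care.
kernel = np.array([[-1, -1,  0],
                   [-1,  1,  1],
                   [ 0,  1,  0]], dtype=int)

hit = cv2.morphologyEx(A, cv2.MORPH_HITMISS, kernel)
print(np.argwhere(hit == 255))   # expected: the top-left corner of the square, at (1, 1)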
Basic Morphological Algorithms
Boundary Extraction:
The boundary of a set A is obtained by first eroding A by a structuring element B and
then taking the set difference of A and its erosion. The resultant image, obtained after
subtracting the eroded image from the original image, contains the boundary of the objects.
The thickness of the boundary depends on the size of the structuring element. The boundary
β(A) of a set A is
β(A) = A − (A ⊖ B)
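A minimal sketch of boundary extraction following the equation above (OpenCV; the input file name is an assumption).

import cv2

A = cv2.imread("binary_shapes.png", cv2.IMREAD_GRAYSCALE)   # assumed binary input
B = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

# beta(A) = A - (A erode B); with a 3x3 B this gives a one-pixel-thick boundary.
boundary = cv2.subtract(A, cv2.erode(A, B))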
Convex Hull:
A is said to be convex if a straight line segment joining any two points in A lies
entirely within A.
The convex hull H of set S is the smallest convex set containing S
The set difference H-S is called the convex deficiency of S
The convex hull and the convex deficiency are useful for object description. The algorithm
iteratively applies the hit-or-miss transform to A with each of four structuring elements,
unions the result with A, and repeats until convergence.
Let B^i, i = 1, 2, 3, 4, represent the four structuring elements. Then we implement
X_k^i = (X_{k-1}^i ⊛ B^i) ∪ A,  i = 1, 2, 3, 4 and k = 1, 2, 3, ...
with X_0^i = A. When X_k^i = X_{k-1}^i the procedure converges; let D^i denote that result.
The convex hull of A is then
C(A) = D^1 ∪ D^2 ∪ D^3 ∪ D^4
Thinning:
The thinning of a set A by a structuring element B can be defined in terms of the hit-
or-miss transform:
A ⊗ B = A − (A ⊛ B) = A ∩ (A ⊛ B)ᶜ
A more useful expression thins A symmetrically using a sequence of structuring elements
{B} = {B1, B2, ..., Bn}, where each Bi is a rotated version of Bi−1:
A ⊗ {B} = ((...((A ⊗ B1) ⊗ B2)...) ⊗ Bn)
The process is to thin A by one pass with B1, then thin the result with one pass with B2,
and so on until A is thinned with one pass with Bn. The entire process is repeated until no
further changes occur. Each pass is performed using the first equation above.
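scikit-image provides a ready-made thinning routine; the sketch below is only illustrative and does not necessarily use the exact sequence of rotated structuring elements described above.

import numpy as np
from skimage.morphology import thin

# Synthetic binary blob (True = foreground).
A = np.zeros((40, 40), dtype=bool)
A[10:30, 5:35] = True

# Iterative thinning; by default it repeats passes until no further changes occur.
thinned = thin(A)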
Thickening:
Thickening is the morphological dual of thinning and is defined as
A ⊙ B = A ∪ (A ⊛ B)
The structuring elements used for thickening have the same form as in thinning, but with all
1's and 0's interchanged.
Skeletons:
The skeleton of A is defined in terms of erosions and openings:
S(A) = ∪_{k=0}^{K} S_k(A)
with
S_k(A) = (A ⊖ kB) − (A ⊖ kB) ∘ B
where (A ⊖ kB) indicates k successive erosions of A by B, and K is the last iterative step
before A erodes to an empty set; in other words,
K = max{ k | (A ⊖ kB) ≠ ∅ }
S(A) can be obtained as the union of the skeleton subsets S_k(A). A can also be
reconstructed from the subsets S_k(A) by using the equation
A = ∪_{k=0}^{K} ( S_k(A) ⊕ kB )
where (S_k(A) ⊕ kB) denotes k successive dilations of S_k(A) by B.
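The skeleton equations translate almost directly into code. The sketch below (OpenCV; the synthetic rectangle is an assumption) accumulates the subsets S_k(A) until A erodes to the empty set.

import cv2
import numpy as np

def morphological_skeleton(A, B):
    # S(A) = union over k of S_k(A), with S_k(A) = (A erode kB) - (A erode kB) open B.
    skel = np.zeros_like(A)
    eroded = A.copy()
    while cv2.countNonZero(eroded) > 0:            # K = last step before the empty set
        opened = cv2.morphologyEx(eroded, cv2.MORPH_OPEN, B)
        skel = cv2.bitwise_or(skel, cv2.subtract(eroded, opened))
        eroded = cv2.erode(eroded, B)              # next erosion (k -> k + 1)
    return skel

A = np.zeros((64, 64), np.uint8)
cv2.rectangle(A, (10, 20), (52, 44), 255, -1)      # synthetic test object
B = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
S = morphological_skeleton(A, B)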
Gray-Scale Morphology
Dilation:
The equation for gray-scale dilation is
(f ⊕ b)(s, t) = max{ f(s − x, t − y) + b(x, y) | (s − x, t − y) ∈ D_f and (x, y) ∈ D_b }
D_f and D_b are the domains of f and b. The condition that (s − x, t − y) be in the
domain of f and (x, y) in the domain of b is analogous to the condition in the binary definition
of dilation, where the two sets need to overlap by at least one element.
Illustrating the previous equation in 1-D gives an equation in one variable:
(f ⊕ b)(s) = max{ f(s − x) + b(x) | (s − x) ∈ D_f and x ∈ D_b }
The requirement that (s − x) be in the domain of f and x be in the domain of b implies
that f and b overlap by at least one element. Unlike the binary case, f, rather than the
structuring element b, is shifted. Conceptually, f sliding past b is no different from b
sliding past f. The general effect of performing dilation on a gray-scale image is twofold:
If all the values of the structuring element are positive, then the output image tends to
be brighter than the input, and dark details are either reduced or eliminated, depending on how
their values and shape relate to the structuring element used for dilation.
Erosion:
Gray-scale erosion is defined as:
(f ⊖ b)(s, t) = min{ f(s + x, t + y) − b(x, y) | (s + x, t + y) ∈ D_f and (x, y) ∈ D_b }
The condition that (s + x, t + y) be in the domain of f and (x, y) in the domain of b is
completely analogous to the condition in the binary definition of erosion, where the
structuring element has to be completely contained in the set being eroded. As with dilation,
we illustrate with a 1-D function:
(f ⊖ b)(s) = min{ f(s + x) − b(x) | (s + x) ∈ D_f and x ∈ D_b }
Opening:
A gray-scale opening is an erosion followed by a dilation with the same structuring element.
The structuring element is rolled along the underside of the surface of f. All the peaks that are
narrow with respect to the diameter of the structuring element are reduced in amplitude
and sharpness. The initial erosion removes the small bright details but also darkens the image;
the subsequent dilation increases the overall intensity again without reintroducing
the details totally removed by the erosion.
Opening a gray-scale picture can be described as pushing object B up against the underside of
the scan-line graph, while traversing the graph according to the curvature of B.
Closing:
In the closing of a gray-scale image, we remove small dark details while leaving the overall
gray levels and the larger dark features relatively undisturbed.
The structuring element is rolled on top of the surface of f. Peaks essentially are left
in their original form (assuming that their separation at the narrowest points exceeds the
diameter of the structuring element). The initial dilation removes the dark details and
brightens the image; the subsequent erosion darkens the image again without reintroducing the
details totally removed by the dilation.
Closing a gray-scale picture can be described as pushing object B down on top of the scan-line
graph, while traversing the graph according to the curvature of B. The peaks usually remain in
their original form.
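With a flat structuring element (b(x, y) = 0 over its support), gray-scale dilation and erosion reduce to local maximum and minimum filters, which is what OpenCV computes. A minimal sketch; the input file name and element size are assumptions.

import cv2

f = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)            # assumed gray-scale input
b = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))   # flat structuring element

dilated = cv2.dilate(f, b)                         # local max: brightens, shrinks dark details
eroded  = cv2.erode(f, b)                          # local min: darkens, shrinks bright details
opened  = cv2.morphologyEx(f, cv2.MORPH_OPEN, b)   # removes bright details smaller than b
closed  = cv2.morphologyEx(f, cv2.MORPH_CLOSE, b)  # removes dark details smaller than b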
Morphological smoothing
Perform an opening followed by a closing. The net result of these two operations is to
remove or attenuate both bright and dark artifacts and noise.
Morphological gradient
Dilation and erosion are used to compute the morphological gradient of an image,
denoted g:
g = (f ⊕ b) − (f ⊖ b)
Top-hat transform
The top-hat transform, h = f − (f ∘ b), is used for light objects on a dark background, and the
bottom-hat transform, h = (f • b) − f, is used for the converse.
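A minimal sketch of morphological smoothing, the morphological gradient, and the top-hat/bottom-hat transforms (OpenCV; the file name and structuring-element size are assumptions).

import cv2

f = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
b = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

# Smoothing: opening followed by closing attenuates both bright and dark noise.
smoothed = cv2.morphologyEx(cv2.morphologyEx(f, cv2.MORPH_OPEN, b), cv2.MORPH_CLOSE, b)

gradient  = cv2.morphologyEx(f, cv2.MORPH_GRADIENT, b)   # (f dilate b) - (f erode b)
tophat    = cv2.morphologyEx(f, cv2.MORPH_TOPHAT, b)     # f - (f open b): light objects on dark background
bottomhat = cv2.morphologyEx(f, cv2.MORPH_BLACKHAT, b)   # (f close b) - f: dark objects on light background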
Textural segmentation:
The objective is to find the boundary between different image regions based on
their textural content.
Close the input image by using successively larger structuring elements.
Then a single opening is performed, and finally a simple threshold yields the
boundary between the textural regions.
Granulometry:
Granulometry is a field that deals principally with determining the size distribution
of particles in an image.
Because the particles are lighter than the background, a morphological approach can be
used to determine the size distribution and, at the end, to construct a histogram of particle
sizes.
The approach is based on the idea that opening operations of a particular size have the most
effect on regions of the input image that contain particles of similar size.
This type of processing is useful for describing regions with a predominant
particle-like character.
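A rough granulometry sketch: open the image with structuring elements of increasing size, record the remaining total intensity (surface area), and take differences between successive sizes to obtain the size distribution. The file name and size range are assumptions.

import cv2
import numpy as np

f = cv2.imread("particles.png", cv2.IMREAD_GRAYSCALE)   # assumed: light particles on a dark background

surface_area = []
for r in range(1, 30):
    b = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * r + 1, 2 * r + 1))
    opened = cv2.morphologyEx(f, cv2.MORPH_OPEN, b)
    surface_area.append(int(opened.sum()))               # intensity surviving an opening of size r

# Pattern spectrum: the drop between successive openings peaks at the dominant particle sizes.
size_distribution = -np.diff(surface_area)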
Image Segmentation
Image segmentation divides an image into regions that are connected and have some
similarity within the region and some difference between adjacent regions. The goal is
usually to find individual objects in an image. For the most part there are fundamentally two
kinds of approaches to segmentation: discontinuity and similarity.
Detection of Discontinuities:
There are three kinds of discontinuities of intensity: points, lines and edges. The most
common way to look for discontinuities is to scan a small mask over the image. The mask
determines which kind of discontinuity to look for.
R = w1 z1 + w2 z2 + ... + w9 z9 = Σ_{i=1}^{9} w_i z_i
where z_i is the gray level of the pixel associated with mask coefficient w_i.
Point Detection: an isolated point is one whose gray value is significantly different from its
background. A point is detected at the location on which the mask is centered if
|R| ≥ T
where T is a nonnegative threshold.
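A minimal point-detection sketch using the standard 3×3 Laplacian-type mask; the input file name and the threshold choice are assumptions.

import cv2
import numpy as np

f = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Point-detection mask: responds strongly where a pixel differs from its neighbourhood.
w = np.array([[-1, -1, -1],
              [-1,  8, -1],
              [-1, -1, -1]], dtype=np.float32)

R = cv2.filter2D(f, -1, w)                 # mask response at every pixel
T = 0.9 * np.abs(R).max()                  # a nonnegative threshold, here chosen relative to the maximum response
points = (np.abs(R) >= T).astype(np.uint8) * 255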
Line Detection:
Only slightly more complex than point detection is the detection of one-pixel-wide lines in an
image.
For a 3×3 mask, the only straight lines that can be detected are horizontal, vertical, and
diagonal (+45° or −45°); a separate mask is used for each of these four directions.
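A sketch of the four standard 3×3 line-detection masks; a pixel is assigned to the direction whose mask gives the largest absolute response (the input file name is an assumption).

import cv2
import numpy as np

f = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

masks = {
    "horizontal": np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]], np.float32),
    "+45":        np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]], np.float32),
    "vertical":   np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]], np.float32),
    "-45":        np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]], np.float32),
}

# Response of each direction-selective mask; compare |response| across masks per pixel.
responses = {name: cv2.filter2D(f, -1, m) for name, m in masks.items()}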
Edge Detection:
Edge is a set of connected pixels that lie on the boundary between two regions
• ’Local’ concept in contrast to ’more global’ boundary concept
• To be measured by grey-level transitions
• Ideal and blurred edges
The first derivative can be used to detect the presence of an edge (whether a point is on a ramp).
The sign of the second derivative can be used to determine whether an edge pixel lies
on the dark or light side of an edge.
The second derivative produces two values per edge, with a
zero crossing near the edge midpoint.
For non-horizontal edges, define a profile perpendicular to the edge direction.
Gradient
– Vector pointing to the direction of maximum rate of change of f at coordinates (x,y)
∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
– Magnitude: gives the quantity of the increase (sometimes referred to as the gradient too)
∇f = mag(∇f) = [Gx² + Gy²]^(1/2)
Partial derivatives are computed through 2×2 or 3×3 masks. The Sobel operators introduce
some smoothing and give more importance to the center point.
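A minimal gradient sketch with the Sobel operators (the input file name is an assumption).

import cv2
import numpy as np

f = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

Gx = cv2.Sobel(f, cv2.CV_64F, 1, 0, ksize=3)   # partial derivative df/dx
Gy = cv2.Sobel(f, cv2.CV_64F, 0, 1, ksize=3)   # partial derivative df/dy

magnitude = np.sqrt(Gx ** 2 + Gy ** 2)         # gradient magnitude
direction = np.arctan2(Gy, Gx)                 # direction of maximum rate of change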
Laplacian
– Second-order derivative of a 2-D function
∇²f = ∂²f/∂x² + ∂²f/∂y²
– Digital approximations by proper masks
For a Gaussian function h(r) = e^(−r²/2σ²), where r² = x² + y², the Laplacian of h is
∇²h(r) = −[(r² − σ²)/σ⁴] e^(−r²/2σ²)
The Laplacian of a Gaussian (LoG) is sometimes called the Mexican hat function. It can also
be computed by smoothing the image with a Gaussian smoothing mask, followed by
application of the Laplacian mask.
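A minimal Laplacian-of-Gaussian sketch that follows the smoothing-then-Laplacian route described above (file name, kernel size and sigma are assumptions).

import cv2

f = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

# Gaussian smoothing followed by the Laplacian mask approximates the LoG ("Mexican hat") operator.
smoothed = cv2.GaussianBlur(f, (5, 5), sigmaX=1.0)
log_response = cv2.Laplacian(smoothed, cv2.CV_64F, ksize=3)
# Edge locations correspond to zero crossings of log_response.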
The Hough transform consists of finding, for each edge point (x, y), all pairs of values ρ and θ
which satisfy the normal line equation ρ = x cos θ + y sin θ. These are accumulated in what is
basically a 2-dimensional histogram (the accumulator). When plotted, the (ρ, θ) pairs for a
single point trace a sinusoidal curve. The process is repeated for all appropriate (x, y)
locations, and peaks in the accumulator correspond to collinear points, i.e. lines in the image.
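A minimal Hough line detection sketch, using the normal representation ρ = x cos θ + y sin θ; the Canny edge detector, file name and accumulator threshold are illustrative assumptions.

import cv2
import numpy as np

f = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(f, 50, 150)                   # edge pixels that will vote in the accumulator

# Each edge pixel votes for all (rho, theta) pairs satisfying rho = x*cos(theta) + y*sin(theta).
lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=100)

if lines is not None:
    for rho, theta in lines[:, 0]:
        print(f"line: rho = {rho:.1f}, theta = {np.degrees(theta):.1f} degrees")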
Thresholding:
The range of intensity levels covered by objects of interest is different from that of the
background, so a threshold T separates them:
g(x, y) = 1 if f(x, y) > T
g(x, y) = 0 if f(x, y) ≤ T
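A minimal thresholding sketch; the fixed T and the use of Otsu's method for an automatic threshold are illustrative choices, not part of the notes above.

import cv2

f = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)   # assumed input

# Fixed threshold: g = 255 where f > T, else 0.
T = 128
_, g = cv2.threshold(f, T, 255, cv2.THRESH_BINARY)

# Otsu's method chooses T automatically from the histogram (illustrative extra).
T_otsu, g_otsu = cv2.threshold(f, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)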
Region-Based Segmentation:
There are two main approaches to region-based segmentation: region growing and
region splitting.
Region Growing:
Region growing groups pixels or subregions into larger regions, starting from seed points and
appending neighbouring pixels that satisfy a similarity predicate P.
For example: P(Rk) = TRUE if all pixels in Rk have the same gray level.
Region Splitting:
Region splitting subdivides the image (for example, into quadrants) until the predicate P is
TRUE for every region; adjacent regions for which P also holds jointly can then be merged
(split and merge).
PREVIOUS QUESTIONS
1. With necessary figures, explain the opening and closing operations.
2. Explain the following morphological algorithms i) Boundary extraction ii) Hole
filling.
3. Explain the following morphological algorithms i) Thinning ii) Thickening
4. What is Hit-or-Miss transformation? Explain.
5. Discuss about Grey-scale morphology.
6. Write short notes on geometric transformations.
7. Explain about edge detection using gradient operator.
8. What is meant by edge linking? Explain edge linking using local processing
9. Explain edge linking using Hough transform.
10. Describe Watershed segmentation Algorithm
11. Discuss about region based segmentation.
12. Explain the concept of Thresholding in image segmentation and discuss its
merits and limitations.
UNIT-4
Color Image Processing
Introduction
The use of color in image processing is motivated by two principal factors. First, color is a
powerful descriptor that often simplifies object identification and extraction from a scene.
Second, humans can discern thousands of color shades and intensities, compared to only about
two dozen shades of gray. Color image processing is divided into two major areas,
Full-color processing: images acquired with a full-color sensor, such as a color TV camera or
color scanner.
Pseudo-color processing: assigning a color to a particular monochrome intensity or range of
intensities.
Fig. Absorption of light by the red, green and blue cones in the human eye as a function
of wavelength
Cones are the sensors in the eye responsible for color vision. Detailed experimental
evidence has established that the 6 to 7 million cones in the human eye can be divided into
three principal sensing categories, corresponding roughly to red, green, and blue.
Approximately 65% of all cones are sensitive to red light, 33% are sensitive to green
light, and only about 2% are sensitive to blue (but the blue cones are the most sensitive). The
above figure shows average experimental curves detailing the absorption of light by the red,
green, and blue cones in the eye. Due to these absorption characteristics of the human eye,
colors are seen as variable combinations of the so-called primary colors red (R), green (G),
and blue (B).
The primary colors can be added to produce the secondary colors of light: magenta
(red plus blue), cyan (green plus blue), and yellow (red plus green). Mixing the three
primaries, or a secondary with its opposite primary color, in the right intensities produces
white light.
Images represented in the RGB color model consist of three component images, one
for each primary color. When fed into an RGB monitor, these three images combine on the
phosphor screen to produce a composite color image.
Fig. Generating the RGB image of the cross-sectional color plane
The number of bits used to represent each pixel in RGB space is called the pixel
depth. Consider an RGB image in which each of the red, green, and blue images is an 8-bit
image. Under these conditions each RGB color pixel [that is, a triplet of values (R, G, B)] is
said to have a depth of 24 bits (3 image planes times the number of bits per plane). The term
full-color image is often used to denote a 24-bit RGB color image. The total number of
colors in a 24-bit RGB image is (2⁸)³ = 16,777,216.
The CMY model is obtained from RGB by
[C, M, Y]ᵀ = [1, 1, 1]ᵀ − [R, G, B]ᵀ
where, again, the assumption is that all color values have been normalized to the
range [0, 1]. The equation demonstrates that light reflected from a surface coated with
pure cyan does not contain red (that is, C = 1 − R in the equation). Similarly, pure magenta
does not reflect green, and pure yellow does not reflect blue. So, the RGB values can be
obtained easily from a set of CMY values by subtracting the individual CMY values from 1.
Equal amounts of the pigment primaries, cyan, magenta, and yellow should produce black. In
practice, combining these colors for printing produces a muddy-looking black. So, in order to
produce true black, a fourth color, black, is added, giving rise to the CMYK color model.
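A minimal sketch of the RGB-to-CMY relation and one common CMY-to-CMYK conversion (the CMYK formula shown is a standard convention, not given in the notes above); all values are assumed normalized to [0, 1].

import numpy as np

def rgb_to_cmy(rgb):
    # [C, M, Y] = [1, 1, 1] - [R, G, B], with values normalized to [0, 1].
    return 1.0 - np.asarray(rgb, dtype=float)

def cmy_to_cmyk(cmy):
    # One common convention: pull out the largest common amount of black K.
    c, m, y = cmy
    k = min(c, m, y)
    if k >= 1.0:                       # pure black
        return 0.0, 0.0, 0.0, 1.0
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k

print(rgb_to_cmy([1.0, 1.0, 0.0]))                  # pure yellow -> C = 0, M = 0, Y = 1
print(cmy_to_cmyk(rgb_to_cmy([0.0, 0.0, 0.0])))     # black -> K = 1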
In the following figure the primary colors are separated by 120° and the secondary
colors are 60° from the primaries, which means that the angle between secondaries is also
120°.
The hue of the point is determined by an angle from some reference point. Usually
(but not always) an angle of 0° from the red axis designates 0 hue, and the hue increases
counterclockwise from there. The saturation (distance from the vertical axis) is the length of
the vector from the origin to the point. The origin is defined by the intersection of the color
plane with the vertical intensity axis. The important components of the HSI color space are
the vertical intensity axis, the length of the vector to a color point, and the angle this vector
makes with the red axis.
It is assumed that the RGB values have been normalized to the range [0, 1] and that
angle θ is measured with respect to the red axis of the HSI space. The S and I values are in
[0, 1], and the H value can be divided by 360° to bring it into the same range.
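A minimal sketch of the RGB-to-HSI conversion for a single normalized pixel, following the standard formulas; the small epsilon guard against division by zero is an implementation detail added here.

import numpy as np

def rgb_to_hsi(r, g, b):
    # r, g, b in [0, 1]; returns H in degrees, S and I in [0, 1].
    eps = 1e-8                                        # guards against division by zero
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = theta if b <= g else 360.0 - theta            # angle measured from the red axis
    i = (r + g + b) / 3.0
    s = 1.0 - 3.0 * min(r, g, b) / (r + g + b + eps)
    return h, s, i

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red -> H close to 0 degrees, S = 1, I = 1/3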
4.2.5. Conversion from HSI color model to RGB color model
Given values of HSI in the interval [0,1 ], one can find the corresponding RGB values
in the same range. The applicable equations depend on the values of H. There are three
sectors of interest, corresponding to the 120° intervals in the separation of primaries.
RG sector (0° ≤ H < 120°):
When H is in this sector, the RGB components are given by the equations
B = I(1 − S)
R = I[1 + S cos H / cos(60° − H)]
G = 3I − (R + B)
Analogous equations (with H first reduced by 120° or 240°) apply in the GB sector
(120° ≤ H < 240°) and the BR sector (240° ≤ H ≤ 360°).
The components of c are simply the RGB components of a color
image at a point. If the color components are a function of coordinates (x, y), we use the
notation
c(x, y) = [c_R(x, y), c_G(x, y), c_B(x, y)]ᵀ = [R(x, y), G(x, y), B(x, y)]ᵀ
For an image of size M × N, there are MN such vectors, c(x, y), for x = 0,1, 2,...,M- l;
y = 0,1,2,...,N- 1. In order for per-color-component and vector-based processing to be
equivalent, two conditions have to be satisfied: First, the process has to be applicable to both
vectors and scalars. Second, the operation on each component of a vector must be
independent of the other components.
Fig. Spatial masks for (a)gray-scale and (b) RGB color images.
The above figure shows neighborhood spatial processing of gray-scale and full-color
images. Suppose that the process is neighborhood averaging. In Fig. (a), averaging would be
accomplished by summing the gray levels of all the pixels in the neighborhood and dividing
by the total number of pixels in the neighborhood. In Fig. (b), averaging would be done by
summing all the vectors in the neighborhood and dividing each component by the total
number of vectors in the neighborhood. Each component of the resulting average vector is the
average of the corresponding component over the pixels in the neighborhood, which is the same
as the result that would be obtained if the averaging were done on a per-color-component basis
and the vector then formed.
4.5.1. Formulation
We can model color transformations using the expression
g(x, y) = T[f(x, y)]
Where f(x, y) is color input image, g(x, y) is the transformed color output image and
T is the operator over a spatial neighborhood of (x, y). Each f(x, y) component is a triplet in
the chosen color space. For a given transformation, the cost of converting from one color
space to another is also a factor in deciding where to implement it. As a simple example, we
may wish to modify the intensity of an image, in any of several color spaces, using the
transform
g(x, y) = k f(x, y)
When only data at one pixel is used in the transformation, we can express the transformation
as:
si = Ti( r1, r2,……,rn) i= 1, 2, …, n
Where ri = color component of f(x, y)
si = color component of g(x, y)
The model of choice for many color management systems (CMS) is the CIE
L*a*b* model.
Like the HSI system, the L*a*b* system is an excellent decoupler of intensity
(represented by lightness L*) and color (represented by a* for red minus green and b* for green
minus blue). The tonal range of an image, also called its key type, refers to its general
distribution of color intensities. Most of the information in high-key images is concentrated
at high (light) intensities, in low-key images at low (dark) intensities, and middle-key images
lie in between.
4.5.5. Histogram Processing
Histogram processing transformations can be applied to color images in an automated
way. As might be expected, it is generally unwise to histogram-equalize the components of a
color image independently, since this results in erroneous colors. A more logical approach is to
spread the intensity values uniformly, leaving the colors themselves (e.g., the hues) unchanged.
The HSI color space is ideally suited to this type of approach.
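OpenCV has no built-in HSI conversion, so the sketch below uses HSV as a stand-in and equalizes only the intensity-like V channel, leaving hue and saturation untouched; the file name is an assumption.

import cv2

bgr = cv2.imread("color_scene.png")                 # assumed color input (OpenCV loads BGR)
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)          # HSV used here as a stand-in for HSI

h, s, v = cv2.split(hsv)
v_eq = cv2.equalizeHist(v)                          # equalize intensity only; hues are unchanged

equalized = cv2.cvtColor(cv2.merge([h, s, v_eq]), cv2.COLOR_HSV2BGR)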
For segmentation, a color of interest is specified by a prototype (average) vector a, and a pixel
with color z is classified by its Euclidean distance from a:
D(z, a) = [ (z_R − a_R)² + (z_G − a_G)² + (z_B − a_B)² ]^(1/2)
where the subscripts R, G, and B denote the RGB components of vectors a and z.
The locus of points such that D(z, a) ≤ D₀ is a solid sphere of radius D₀. Points contained
within or on the surface of the sphere satisfy the specified color criterion; points outside the
sphere do not. Coding these two sets of points in the image with, say, black and white,
produces a binary segmented image.
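A minimal sketch of segmentation by Euclidean distance in RGB space; the prototype color a and radius D0 are arbitrary illustrative values.

import cv2
import numpy as np

bgr = cv2.imread("color_scene.png").astype(np.float64)   # assumed color input
a = np.array([40.0, 60.0, 180.0])                        # prototype (average) color, in BGR order
D0 = 60.0                                                # radius of the enclosing sphere

D = np.linalg.norm(bgr - a, axis=2)                      # Euclidean distance D(z, a) per pixel
segmented = np.where(D <= D0, 255, 0).astype(np.uint8)   # white inside the sphere, black outside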
PREVIOUS QUESTIONS
1. Explain about RGB, CMY and CMYK color models?
2. What is Pseudocolor image processing? Explain.
3. Explain about color image smoothing and sharpening.
4. Explain about histogram processing of color images.
5. Explain the procedure of converting colors from RGB to HSI and HSI to RGB.
6. Discuss about noise in color images.
7. Explain about HSI colour model.
8. Consider the following RGB triplets. Convert each triplet to CMY and YIQ
i) (1 1 0) ii) (1 1 1) iii). ( 1 0 1 )
9. Explain in detail about how the color models are converted to each other.
10. Discuss about color quantization and explain about its various types.
11. What are color complements? How are they useful in image processing?
12. What is meant by luminance, brightness, radiance and trichromatic coefficients?