21CSE252T- BIOMETRICS

Unit – 1 (Part – 2)

Session – 5 & 6
• Image Processing Basics
– What is an Image?
– Image Acquisition
– Type, Point Operations
– Geometric Transformations
– First and Second order derivatives
• Steps in Edge Detection, Smoothing,
Enhancement, Thresholding & Localization
Image Processing - Basics

Image
• An image may be defined as a two-dimensional function, f(x, y), where x and y
are spatial (plane) coordinates, and the amplitude of f at any pair of
coordinates (x, y) is called the intensity or gray level of the image at that point.
• When x, y, and the amplitude values of f are all finite, discrete quantities, we
call the image a digital image.
• A digital image is composed of a finite number of elements, each with a particular location and value. These elements are referred to as picture elements, image elements, pels, or pixels. A digital image can therefore be represented by a two-dimensional array of values arranged in rows and columns.
Types of Images

Binary Images
• It is the simplest type of image, also called a silhouette image.
• Each pixel takes only one of two values, i.e., black or white (0 or 1).
• A binary image is a 1-bit image: only one binary digit is needed to represent each pixel.
• Binary images are mostly used to represent general shape or outline.
• For Example: Optical Character Recognition (OCR).
Pros & Cons of Binary Images

Advantages
• Easy to acquire: simple digital cameras or low-cost scanners can be used, or thresholding may be applied to grey-level images.
• Low storage: no more than 1 bit/pixel.
• Simple processing: the algorithms are much simpler than those
applied to grey-level images.
Disadvantages
• Limited application: Application is restricted to tasks where
internal detail is not required as a distinguishing characteristic.
• Does not extend to 3D.
• Specialised lighting is required: it is difficult to obtain reliable
binary images without restricting the environment. The simplest
example is an overhead projector or light box.
Thresholding

• Binary images are generated using Threshold Operation.


• An image may consist of a single object or several
separated objects of relatively high intensity, viewed
against a background of relatively low intensity. This
allows figure/ground separation by thresholding.
• In order to create the two-valued binary image, a simple
threshold may be applied so that all the pixels in the
image plane are classified into object and background
pixels.
• A binary image function can then be constructed such that pixels above the threshold are foreground ("1") and pixels below the threshold are background ("0").
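As a minimal sketch of this idea (assuming NumPy and an 8-bit grey-level input array; the threshold value 128 is an arbitrary choice), a global threshold can be applied in one vectorised comparison:

```python
import numpy as np

def binarize(gray: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Return a binary image: 1 where the pixel is above the threshold, else 0."""
    # Boolean comparison gives True/False; cast to uint8 to obtain 0/1 values.
    return (gray > threshold).astype(np.uint8)

# Example: a toy 3x3 grey-level image.
img = np.array([[ 10, 200,  90],
                [250,  30, 140],
                [  5, 180,  60]], dtype=np.uint8)
print(binarize(img))
# [[0 1 0]
#  [1 0 1]
#  [0 1 0]]
```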
Binary Image
Grey-Scale Images
Grey-scale images
• Greyscale images are monochrome images, meaning they have only one channel.
• Greyscale images do not contain any colour information.
• Each pixel stores one of the available grey levels.
• A typical greyscale image uses 8 bits/pixel, giving 256 different grey levels.
• In medical imaging and astronomy, 12 or 16 bits/pixel images are used.
Grey-Scale Image
Grey-scale images
Color Images
Color images
• Colour images are three-band monochrome images in which each band carries the information for a different colour.
• Colour images contain grey-level information in each spectral band.
• The bands are typically red, green and blue (RGB images). Each colour image uses 24 bits/pixel, i.e., 8 bits for each of the three colour bands (R, G, B).
Color Image
Color images
8-Bit Color Format
8-bit color format
• The 8-bit format is used for storing image information in a computer's memory or in an image file.
• In this format, each pixel is represented by 8 bits.
• It has a 0-255 range of values, in which 0 is used for black, 255 for white and 127 for mid-grey.
• The 8-bit format is also commonly used for grayscale images. It was used early on by the UNIX operating system.
16-Bit Color Format (High Color)
16-bit color format
• The 16-bit color format is also known as high color
format.
• It provides 65,536 different colour shades and is used in systems developed by Microsoft.
• The 16-bit colour format is divided into three channels: Red, Green, and Blue (the RGB format).
• In this format there are 5 bits for R, 6 bits for G, and 5 bits for B. The additional bit is given to green because, of the three colours, the eye is most sensitive to green.
16-bit color format
Color images
16-bit color format
24-Bit Color Format (True Color)
24-bit color format
• The 24-bit color format is also known as the true color
format.
• The 24-bit color format is also distributed in Red, Green,
and Blue.
• As 24 divides evenly into three groups of 8, the bits are distributed equally among the three colours: 8 bits for R, 8 bits for G and 8 bits for B.
Types of Images
Color images
24-bit color format
Types of Images
Image Processing - Basics
Image Processing - Basics
Fundamental Steps in Image Processing
Image Processing - Basics

1. ACQUISITION – It could be as simple as being given an image that is already in digital form. The main work involves:
   a) Scaling
   b) Color conversion (RGB to Gray or vice versa)
2. IMAGE ENHANCEMENT – It is among the simplest and most appealing areas of image processing. It is used to bring out hidden details in an image, and it is subjective.
3. IMAGE RESTORATION – It also deals with improving the appearance of an image, but it is objective (restoration is based on a mathematical or probabilistic model of image degradation).
Image Processing - Basics
4. COLOR IMAGE PROCESSING – It deals with pseudocolor and full-color image processing; color models applicable to digital image processing are used here.
5. WAVELETS AND MULTI-RESOLUTION PROCESSING – It is the foundation for representing images at various degrees of resolution.
6. IMAGE COMPRESSION – It involves developing functions to perform compression. It mainly deals with image size or resolution.
7. MORPHOLOGICAL PROCESSING – It deals with tools for extracting image components that are useful in the representation and description of shape.
Image Processing - Basics

8. SEGMENTATION PROCEDURE – It involves partitioning an image into its constituent parts or objects. Autonomous segmentation is one of the most difficult tasks in image processing.
9. REPRESENTATION & DESCRIPTION – It follows the output of the segmentation stage; choosing a representation is only part of the solution for transforming raw data into a form suitable for processing.
10. OBJECT DETECTION AND RECOGNITION – It is the process that assigns a label to an object based on its descriptors.
Image Processing - Basics

Benefits of Image Processing


• The digital image can be made available in any desired
format (improved image, X-Ray, photo negative, etc)
• It helps to improve images for human interpretation
• Information can be processed and extracted from
images for machine interpretation
• The pixels in the image can be manipulated to any
desired density and contrast
• Images can be stored and retrieved easily
• It allows for easy electronic transmission of images to
third-party providers
Image Pre-processing Techniques

• Sometimes, pictures taken by a camera may be of poor quality.
• This may be the result of the lighting conditions at the moment of capture, so image processing techniques are needed to enhance the image quality.
Some of the image processing techniques are
– Point Operation
– Auto-contrast Adjustment
– Modified Auto-Contrast
– Histogram Equalization
– Histogram Specification
– Etc..
Point Operations

• A point operation modifies pixel values without changing the size, geometry or local structure of the image.
• The new pixel value depends only on the previous value at the same location.
• Pixel values are mapped by a function f(a); if the function f() does not depend on the pixel coordinates, the operation is called a "global" or "homogeneous" operation.
• If the mapping depends on the coordinates, it is called a "non-homogeneous" point operation.
• Non-homogeneous point operations are used, for example, to compensate for uneven lighting during image acquisition.
Point Operations

The common examples of homogeneous


operation include:
• Modifying contrast and brightness
• Limiting the Result by Clamping
• Inverting Image
• Threshold Operation
Point Operations
• Here, each pixel value is replaced with a new value obtained from the old one.
• If we want to stretch the contrast and increase the brightness range, we can simply multiply all pixel values by a scalar, say 2, to double the range.
• Conversely, to reduce the contrast, we can divide all pixel values by a scalar.
• If the overall brightness is controlled by a level, l (e.g., the brightness of the global illumination), and the range is controlled by a gain, k, the brightness of the points in the new picture, N, is related to the brightness in the old picture, O, by
  N(x, y) = k · O(x, y) + l
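As a minimal sketch (assuming NumPy; the gain and level values here are arbitrary), this relationship can be applied directly to an image array, with clamping to keep the result in the valid 8-bit range:

```python
import numpy as np

def point_op(old: np.ndarray, gain: float = 1.5, level: float = 10.0) -> np.ndarray:
    """Homogeneous point operation: N(x, y) = gain * O(x, y) + level."""
    new = gain * old.astype(np.float64) + level
    # Clamp the result to the valid 8-bit range before converting back.
    return np.clip(new, 0, 255).astype(np.uint8)

img = np.array([[0, 50, 100], [150, 200, 250]], dtype=np.uint8)
print(point_op(img, gain=2.0, level=0.0))   # doubles the range; values above 255 are clamped
```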
Point Operations
Question:
In the two 32-bit images shown here, white pixels have values of
one and black pixels have values of zero (gray pixels have values
somewhere in between).

What would be the result of multiplying the images together? And


what would be the result of dividing the left image by the right
image?
Point Operations
Answer:
Multiplying the images effectively results in everything outside the
white region in the right image being removed from the left image
(i.e. set to zero).

Dividing has a similar effect, except that instead of becoming zero


the masked-out pixels will take one of three results, depending
upon the original pixel’s value in the left image:
• if it was positive, the result is + infinity (shown here as yellow)
• if it was negative, the result is - infinity
• if it was zero, the result is NaN (‘not a number’, indicating that 0/0 is undefined; shown here as red)
Point Operations

• Point Operation can be defined as follows:


• g(x, y) = T(f(x, y)) where
– g (x, y) is the output image
– T is an operator of intensity transformation
– f (x, y) is the input image
Point Operations
For example, the basic intensity transformation functions are point operations.
• The simplest image enhancement method uses a 1 x 1 neighbourhood size.
• In this case, the output pixel (‘s’) depends only on the input pixel (‘r’), and the point operation can be simplified as follows:
  s = T(r), where T is the point operator defining a grey-level mapping between the original image and the output image, and r and s denote the grey levels of the input pixel and the output pixel, respectively.
Different Types of Point Operations
Thresholding

• In many vision applications, it is useful to separate out


the regions of the image corresponding to objects in
which we are interested, from the background.
• Thresholding provides an easy way to perform this
segmentation on the basis of the different intensities or
colors in the foreground and background regions of an
image.
• It is useful to see what areas of an image consist of
pixels whose values lie within a specified range, or band
of intensities (or colors).
Thresholding

• It selects pixels that have a particular value or lie within a specified range.
• It can be used to find objects within a picture if their brightness level (or range) is known.
• This implies that the object's brightness (or its range) must be known in advance.
• There are two main forms: uniform and adaptive thresholding.
Point Operation - Thresholding

• In uniform thresholding, pixels above a specified


level are set to white, those below the specified
level are set to black
• It provides a way of isolating points of interest.
• It requires knowledge of the grey level, or the
target features might not be selected in the
thresholding process.
• If the level is not known, histogram equalization
or intensity normalization can be used
Point Operation - Thresholding

Optimal or Adaptive Thresholding


• It seeks to select a value for the threshold that
separates an object from its background.
• This suggests that the object has a different
range of intensities to the background, in order
that an appropriate threshold can be chosen
Interpolation
• Image interpolation occurs in all digital photos at
some stage
• Interpolation is the process of using known data
to estimate values at unknown locations
• It happens anytime you resize or remap (distort)
your image from one pixel grid to another.
• Image resizing is necessary when you need to
increase or decrease the total number of pixels
• This works in two directions and tries to achieve
the best approximation of a pixel’s intensity based
on the values of surrounding pixels
Interpolation

• Even if the same image resize or remap is


performed, the results can vary significantly
depending on the interpolation algorithm.
• It is only an approximation, therefore an image
will always lose some quality each time
interpolation is performed.
Interpolation

• The basic idea of interpolation is quite simple: first, reconstruct a "continuous" image from the discrete input image, then sample this continuous image onto the grid of the output image. Interpolation methods differ in the way the "continuous" image is reconstructed.

• Principle of interpolation (figure): the green image is the continuous image; the blue and red grids represent the input and output grids, respectively. The question is how to compute the intensity of the pixels of the red grid from those of the blue grid.
Interpolation

• Interpolation is needed to find the value of the


image at the grid points in the target coordinate
system.
• The mapping T locates the grid points of A in the
coordinate system of B, but those grid points are
not on the grid of B.
Interpolation

• To find the values on the grid points of B we


need to interpolate from the values at the
projected locations.
• Finding the closest projected points to a given
grid point can be computationally expensive.
• Projecting the grid of B into the coordinate
system of A maintains the known image values
on a regular grid. This makes it simple to find the
nearest points for each interpolation calculation
Interpolation

• Let Qg be the homogeneous grid coordinates of B and let H be the transformation from A to B. Then H⁻¹Qg represents the projection from B to A.
• We want to find the value at each projected point from the values on Pg, the homogeneous grid coordinates of A.
Interpolation

Types of Interpolation Algorithms


• Common interpolation algorithms can be grouped into two
categories: adaptive and non-adaptive.
• Adaptive methods change depending on what they are
interpolating (sharp edges vs. smooth texture), whereas non-
adaptive methods treat all pixels equally.
• Non-adaptive algorithms include: nearest neighbor, bilinear, bicubic, spline, sinc, Lanczos and others. Depending on their complexity, these use anywhere from 0 to 256 (or more) adjacent pixels when interpolating. The more adjacent pixels they include, the more accurate they can become, but this comes at the expense of much longer processing time. These algorithms can be used to both distort and resize a photo.
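As a minimal sketch of the simplest non-adaptive method (nearest neighbour; assuming NumPy and a single-channel image), each output pixel simply copies the closest pixel on the input grid:

```python
import numpy as np

def resize_nearest(img: np.ndarray, new_h: int, new_w: int) -> np.ndarray:
    """Resize a 2-D image with nearest-neighbour interpolation."""
    h, w = img.shape
    # For every output pixel, find the nearest source pixel on the input grid.
    rows = np.clip(np.round(np.arange(new_h) * h / new_h).astype(int), 0, h - 1)
    cols = np.clip(np.round(np.arange(new_w) * w / new_w).astype(int), 0, w - 1)
    return img[np.ix_(rows, cols)]

img = np.arange(16, dtype=np.uint8).reshape(4, 4)
print(resize_nearest(img, 8, 8).shape)   # (8, 8)
```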
Histogram
Histogram
Histogram
Histogram Equalization

• Histogram equalization is used to redistribute the pixel values of an image.
• The transformation is done in such a way that an approximately uniform (flattened) histogram is produced.
• Histogram equalization increases the dynamic range of pixel values and spreads the pixel counts more evenly across the levels, producing a flatter histogram and a higher-contrast image.
• The histogram of an image provides a global description of the appearance of the image.
• The amount of information that can be obtained from a histogram is large.
Histogram Equalization

• Histogram of an image represents the relative


frequency of occurrence of various gray levels
in an image
• Let’s assume that an Image matrix is given as:

• This image matrix contains the pixel value at each (i, j) position in the x-y plane, i.e., the 2-D image with its grey levels.
Histogram Equalization
There are two ways to plot a Histogram of an image:
• Method 1: Here, the x-axis has grey levels/
Intensity values and the y-axis has the number of
pixels in each grey level. The Histogram value
representation of the above image is:
Histogram Equalization
• Method 2: Here, the x-axis represents the grey
level, while the y-axis represents the probability of
occurrence of that grey level.

• The table below shows the probability of occurrence of each intensity level of a pixel.

• Now we can create a histogram graph for each


pixel and corresponding occurrence probability.
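A minimal sketch of histogram equalization (assuming NumPy and an 8-bit image): each grey level is mapped through the normalised cumulative histogram, which spreads the values across the full range.

```python
import numpy as np

def equalize_hist(gray: np.ndarray) -> np.ndarray:
    """Histogram equalization for an 8-bit grey-level image."""
    # Histogram: count of pixels at each of the 256 grey levels.
    hist = np.bincount(gray.ravel(), minlength=256)
    # Cumulative distribution function, normalised to [0, 1].
    cdf = hist.cumsum() / gray.size
    # Map each grey level r to s = 255 * CDF(r) and look up every pixel.
    lut = np.round(255 * cdf).astype(np.uint8)
    return lut[gray]

# Low-contrast example: values concentrated in a narrow band.
img = np.random.randint(100, 130, size=(64, 64), dtype=np.uint8)
out = equalize_hist(img)
print(img.min(), img.max(), "->", out.min(), out.max())  # range expands toward 0..255
```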
Applications of Histogram

• In digital image processing, histograms are used for simple


calculations in software.
• It is used to analyze an image. Properties of an image can be
predicted by the detailed study of the histogram.
• The brightness of the image can be adjusted by having the details
of its histogram.
• The contrast of the image can be adjusted according to the need by
having details of the x-axis of a histogram.
• It is used for image equalization. Gray level intensities are expanded
along the x-axis to produce a high contrast image.
• Histograms are used in thresholding as it improves the appearance
of the image
Sampling & Quantization
Types of Neighbourhood
Neighbourhood Types
Types of Neighbourhood
4 – Connected Vs 8 - Connected
Geometric Transformations
Geometric Transformations
Geometric Transformations
Geometric Transformations
Major Geometric Transformations
Translation
• Let us now look at the transformations in the figure and define their concrete mapping equations. Translation is simply a matter of shifting the image horizontally and vertically by a given offset (measured in pixels), denoted Ax and Ay. For translation the mapping is thus defined as
  x' = x + Ax,  y' = y + Ay
• So if Ax = 100 and Ay = 100, then each pixel is shifted 100 pixels in both the x- and y-directions.
Major Geometric Transformations
Translation
Major Geometric Transformations
Scaling
• When scaling an image, it is made smaller or bigger in the x- and/or y-direction.
• Say we have an image of size 300 x 200 and we wish to transform it into a 600 x 100 image. The x-direction is then scaled by 600/300 = 2.
• We denote this the x-scale factor and write it as Sx = 2. Similarly, Sy = 100/200 = 1/2. Together this means that the pixel in the image f(x, y) at position (x, y) = (100, 100) is mapped to a new position in the image g(x', y'), namely (x', y') = (100 · 2, 100 · 1/2) = (200, 50).
• In general, scaling is expressed as
  x' = Sx · x,  y' = Sy · y
Major Geometric Transformations
Scaling
Major Geometric Transformations
Rotation
• When rotating an image, as illustrated in the figure, we need to define the amount of rotation in terms of an angle.
• We denote this angle θ, meaning that each pixel in f(x, y) is rotated by θ degrees.
• The transformation is defined as
  x' = x · cos θ − y · sin θ,  y' = x · sin θ + y · cos θ
Major Geometric Transformations
Rotation
Major Geometric Transformations
Shearing
• To shear an image means to shift pixels either horizontally, by Bx, or vertically, by By.
• The difference from translation is that the shift is not the same for every pixel, but depends on where in the image the pixel is.
• The transformation is defined as
Major Geometric Transformations
Shearing
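As a minimal sketch of applying one of these mappings in practice (assuming NumPy; the rotation equations above are applied backwards, so that every output pixel is filled, with nearest-neighbour sampling):

```python
import numpy as np

def rotate_nearest(img: np.ndarray, theta_deg: float) -> np.ndarray:
    """Rotate an image about its centre using inverse mapping and
    nearest-neighbour sampling (so every output pixel gets a value)."""
    h, w = img.shape
    t = np.deg2rad(theta_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros_like(img)
    ys, xs = np.mgrid[0:h, 0:w]                     # output grid coordinates
    # Inverse rotation: find where each output pixel comes from in the input.
    src_x =  np.cos(t) * (xs - cx) + np.sin(t) * (ys - cy) + cx
    src_y = -np.sin(t) * (xs - cx) + np.cos(t) * (ys - cy) + cy
    sx = np.round(src_x).astype(int)
    sy = np.round(src_y).astype(int)
    inside = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out[inside] = img[sy[inside], sx[inside]]
    return out

img = np.eye(5, dtype=np.uint8) * 255               # bright main diagonal
print(rotate_nearest(img, 90))                       # becomes the anti-diagonal
```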
Edge Detection

• Edges are abrupt changes in intensity, i.e., discontinuities in image brightness or contrast; edges usually occur on the boundary between two regions.
• Edge detection: is an image processing technique for
finding the boundaries of objects within images.
• It works by detecting discontinuities in brightness.
• Edge detection is used for image segmentation and data
extraction in areas such as image processing, computer
vision, and machine vision.
Edge Detection

Why do we use edge detection?


• Reduce unnecessary information in the image while
preserving the structure of the image.
• Extract important features of an image such as corners,
lines, and curves.
• Edges provide strong visual clues that can help the
recognition process.
• Type of edges
Edge Detection Methods

There are various methods, and the following


are some of the most commonly used methods-
– Roberts edge detection
– Prewitt edge detection
– Sobel edge detection
– Laplacian edge detection
– Canny edge detection
First Order & Second Order Derivatives
• For the 1st order derivative, take the intensity values across the image and find points where the derivative is maximum; the edge can then be located at those points.
• The gradient is a vector whose components measure how rapidly the pixel values are changing with distance in the x and y directions.
• The magnitude of the gradient vector is used for implementing first-order derivative edge detection in image processing.
• The Laplacian is used for second-order derivative implementation in image processing.
Edge Detection using Derivatives
1st & 2nd Order Derivatives
Edge Detection Using First Derivative
(Gradient)

The gradient is a vector which has magnitude and


direction:
Edge Detection Using First Derivative
(Gradient)
Magnitude: indicates edge strength.
Direction: indicates the gradient direction, i.e., perpendicular to the edge direction.
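The gradient figure is not reproduced here. For reference (standard definitions, assuming Gx and Gy denote the derivative estimates in the x and y directions):

  magnitude: |∇f| = sqrt(Gx² + Gy²)   (often approximated by |Gx| + |Gy|)
  direction: θ = arctan(Gy / Gx)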
Session 7

• Roberts method
• Sobel method
• Prewitt method
Roberts Edge Detection

• The Roberts Cross operator performs a simple,


quick to compute, 2-D spatial gradient
measurement on an image.
• It highlights regions of high spatial frequency
which often correspond to edges.
• The input to the operator is a grayscale image,
as is the output.
• Pixel values at each point in the output
represent the estimated absolute magnitude
of the spatial gradient of the input image at
that point
Roberts Edge Detection
• Roberts operator: it is a gradient-based operator that computes, through discrete differentiation, the differences between diagonally adjacent pixels in an image.
• The gradient is then approximated from the sum of the squares of these differences.
• It uses the following 2 x 2 kernels or masks:
The Roberts cross operator
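The mask figure is not reproduced here; the standard Roberts cross kernels are:

  Gx = [ +1   0 ]      Gy = [  0  +1 ]
       [  0  −1 ]           [ −1   0 ]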
Roberts Edge Detection
1: Input – read an image
2: Convert the true-colour RGB image to a greyscale image
3: Pre-allocate the filtered_image matrix with zeros
4: Define the Roberts operator masks
5: Edge detection process (compute the gradient approximation and the magnitude of the vector)
6: Display the filtered image
7: Apply thresholding to the filtered image
8: Display the edge-detected image
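A minimal sketch of these steps (assuming NumPy, a greyscale input, and an arbitrary threshold of 30):

```python
import numpy as np

def roberts_edges(gray: np.ndarray, threshold: float = 30.0) -> np.ndarray:
    """Roberts cross edge detection on a greyscale image (thresholded to 0/255)."""
    g = gray.astype(np.float64)
    # Differences between diagonally adjacent pixels (the two 2x2 masks).
    gx = g[:-1, :-1] - g[1:, 1:]     # [[+1, 0], [0, -1]]
    gy = g[:-1, 1:] - g[1:, :-1]     # [[0, +1], [-1, 0]]
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    edges = np.zeros_like(g, dtype=np.uint8)
    edges[:-1, :-1] = np.where(magnitude > threshold, 255, 0)
    return edges

img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 255                     # vertical step edge
print(roberts_edges(img)[0])         # edge response next to the step
```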
Roberts Edge Detection
Roberts Edge Detection

Advantages:
1. Detection of edges and their orientation is very easy
2. Diagonal direction points are preserved

Limitations:
1. Very sensitive to noise
2. Not very accurate in edge detection
Sobel Edge Detection Method
• The Sobel operator performs a 2-D spatial
gradient measurement on an image and so
emphasizes regions of high spatial frequency that
correspond to edges.
• It is used to find the approximate absolute
gradient magnitude at each point in an input
grayscale image.
• The operator consists of a pair of 3×3 convolution
kernels.
• One kernel is simply the other rotated by 90°.
• This is very similar to the Roberts Cross operator.
Sobel Edge Detection Method
• The Sobel Operator
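The kernel figure is not reproduced here; the standard Sobel masks (one common sign convention) are:

  Gx = [ −1  0  +1 ]      Gy = [ +1  +2  +1 ]
       [ −2  0  +2 ]           [  0   0   0 ]
       [ −1  0  +1 ]           [ −1  −2  −1 ]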
Sobel Edge Detection Method
Sobel Edge Detection Method
Sobel Edge Detection Method
• It works by calculating the gradient of image intensity at
each pixel within the image.
• It finds the direction of the largest increase from light to
dark and the rate of change in that direction.
• The result shows how abruptly or smoothly the image
changes at each pixel, and therefore how likely it is that
that pixel represents an edge. It also shows how that
edge is likely to be oriented.
• The result of applying the filter to a pixel in a region of
constant intensity is a zero vector.
• The result of applying it to a pixel on an edge is a vector
that points across the edge from darker to brighter
values.
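A minimal sketch (assuming NumPy and SciPy's ndimage module) of computing the Sobel gradient magnitude described above:

```python
import numpy as np
from scipy import ndimage

def sobel_magnitude(gray: np.ndarray) -> np.ndarray:
    """Gradient magnitude of a greyscale image using the Sobel operator."""
    g = gray.astype(np.float64)
    gx = ndimage.sobel(g, axis=1)    # horizontal derivative (changes along columns)
    gy = ndimage.sobel(g, axis=0)    # vertical derivative (changes along rows)
    return np.hypot(gx, gy)          # edge strength at every pixel

img = np.zeros((8, 8))
img[:, 4:] = 255                     # vertical step edge
mag = sobel_magnitude(img)
print(mag[4].astype(int))            # strongest response around the step
```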
Sobel Edge Detection Method
The Prewitt operator

• The Prewitt operator is a differentiation operator.
• It is used for calculating an approximation of the gradient of the image intensity function.
• At each point in the image, the Prewitt operator produces a gradient vector (or its normal vector).
• With the Prewitt operator, the image is convolved in the horizontal and vertical directions with small, separable, integer-valued filters. It is inexpensive in terms of computation.
Templates of Prewitt operator
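The template figure is not reproduced here; the standard Prewitt masks (same sign convention as the Sobel masks above) are:

  Gx = [ −1  0  +1 ]      Gy = [ +1  +1  +1 ]
       [ −1  0  +1 ]           [  0   0   0 ]
       [ −1  0  +1 ]           [ −1  −1  −1 ]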
Summary
Second Order Derivatives

• Laplacian of Gaussian,
• Zero crossing
Laplacian of Gaussian
• The Laplacian is a 2-D isotropic measure of
the 2nd spatial derivative of an image.
• The Laplacian of an image highlights
regions of rapid intensity change and is
often used for edge detection.
• The Laplacian is often applied to an image
that has first been smoothed with
something approximating a Gaussian
smoothing filter in order to reduce its
sensitivity to noise, and hence the two
variants will be described together here
• The operator normally takes a single grey-level image as input and produces another grey-level image as output.
Second Order Derivatives

• A very popular second-order operator is the Laplacian operator.
• The Laplacian of a function f(x, y) is defined by
  ∇²f = ∂²f/∂x² + ∂²f/∂y²
• The template for the Laplacian (LT) operator is shown below.
• In practice, the input image is represented as a set of discrete pixels, so a discrete convolution kernel that approximates the second derivatives in this definition must be found.
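The template figure is not reproduced here; one common discrete approximation of the Laplacian is the 3 x 3 kernel

  [ 0   1   0 ]
  [ 1  −4   1 ]
  [ 0   1   0 ]

(a variant that also includes the diagonal neighbours, with −8 at the centre, is widely used as well).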
Second Order Derivatives
Templates for LT Operator

There are disadvantages to the use of second-order derivatives:
• Second derivatives exaggerate noise roughly twice as much as first derivatives.
• No directional information about the edge is given.
• The problems that the presence of noise causes when using edge detectors mean that we should try to reduce the noise in an image prior to, or in conjunction with, the edge detection process.
Gaussian Smoothing

• Gaussian smoothing performs a weighted average of the surrounding pixels based on the Gaussian distribution.
• Gaussian smoothing is performed by convolving the image with a Gaussian operator, defined below.
• By using Gaussian smoothing in conjunction with the Laplacian operator, or another Gaussian operator, it is possible to detect edges.
• Let's look at the Gaussian smoothing process first.
• The Gaussian distribution function in two variables, g(x, y), is defined by
  g(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
• where sigma (σ) is the standard deviation, representing the width of the Gaussian distribution.
Gaussian Smoothing
LOG
The LOG operator
• This method of edge detection was first proposed by Marr and Hildreth at MIT, who introduced the principle of the zero-crossing method.
• The basic principle of this method is to find the positions in an image where the second derivative becomes zero.
• These positions correspond to edge positions, as shown in the figure.
LOG
Zero Crossing
The LOG operator

• Zero Crossing based Edge Detection

• The Gaussian function firstly smooths or blurs


any step edges.
• The second derivative of the blurred image is
taken; it has a zero-crossing at the edge.
NOTE: Blurring is advantageous here:
- Laplacian would be infinity at (unsmoothed) step edge.
- Edge position still preserved
The LOG based Edge Detection
The LOG based Edge Detection

• LOG operator is still susceptible to noise, but the effects


of noise can be reduced by ignoring zero-crossings
produced by small changes in image intensity.
• LOG operator gives edge direction information as well
as edge points - determined from the direction of the
zero-crossing
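A minimal sketch of this idea (assuming NumPy and SciPy's ndimage module; sigma and the small-change threshold are arbitrary choices used to suppress weak zero-crossings):

```python
import numpy as np
from scipy import ndimage

def log_zero_crossings(gray: np.ndarray, sigma: float = 2.0, min_change: float = 1.0) -> np.ndarray:
    """Edge map from zero-crossings of the Laplacian of Gaussian (LoG)."""
    log = ndimage.gaussian_laplace(gray.astype(np.float64), sigma=sigma)
    edges = np.zeros(gray.shape, dtype=np.uint8)
    # A zero-crossing exists where the LoG changes sign between horizontal
    # or vertical neighbours; small changes are ignored to suppress noise.
    sign_x = (log[:, :-1] * log[:, 1:] < 0) & (np.abs(log[:, :-1] - log[:, 1:]) > min_change)
    sign_y = (log[:-1, :] * log[1:, :] < 0) & (np.abs(log[:-1, :] - log[1:, :]) > min_change)
    edges[:, :-1] |= sign_x.astype(np.uint8)
    edges[:-1, :] |= sign_y.astype(np.uint8)
    return edges * 255

img = np.zeros((32, 32))
img[:, 16:] = 255                    # vertical step edge
print(np.unique(np.nonzero(log_zero_crossings(img))[1]))   # columns near the step
```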
The LOG based Edge Detection
Session 8 and 9

• Low Level Features Extraction


• Describing Image Motion
• High Level Features Extraction
• Template Matching

• Hough Transform for Lines


• Hough Transforms for Circles and Ellipses
Low Level Feature Extraction


Low Level Feature Extraction


Low Level Feature Extraction
• Difference Image
Approaches to Detect Motions
• There are many approaches for detecting the motion in
an image
• Area Based Approach
• Differential Approach
Area Based Approach
Area Based Approach
Area Based Approach
Differential Approach
Differential Approach
Differential Approach
Difference between AB and DB
Approaches
Difference between AB and DB
Approaches
High Level Features
Template Matching
Template Matching
Template Matching
Hough Transform
• The Hough transform is a technique which can be used
to isolate features of a particular shape within an image.
• Because it requires that the desired features be specified
in some parametric form, the classical Hough transform
is most commonly used for the detection of regular
curves such as lines, circles, ellipses, etc.
• The idea behind the Hough technique for line detection
is that each input measurement (e.g. coordinate point)
indicates its contribution to a globally consistent
solution (e.g. the physical line which gave rise to that
image point).
Hough Transform
• As an example, consider the common problem of fitting
a set of line segments to a set of discrete image points
(e.g. pixel locations output from an edge detector).
• Figure shows some possible solutions to this problem.
• Here the lack of a priori knowledge about the number of desired line segments renders this problem under-constrained.
Hough Transform
• The Hough Space is a 2D plane that has a horizontal
axis representing the slope and the vertical axis
representing the intercept of a line on the edge image.
• A line on an edge image is represented in the form of
y = ax + b
• One line on the edge image produces a point on the
Hough Space since a line is characterized by its
slope a and intercept b.
• On the other hand, an edge point (xᵢ, yᵢ) on the edge image can have an infinite number of lines passing through it.
• Therefore, an edge point produces a line in the Hough Space of the form b = −xᵢ·a + yᵢ.
Hough Transform
• In the Hough Transform algorithm, the Hough Space is
used to determine whether a line exists in the edge
image.
• There is one flaw with representing lines in the form
of y = ax + b and the Hough Space with the slope and
intercept.
• In this form, the algorithm won’t be able to detect
vertical lines because the slope a is undefined/ infinity
for vertical lines
Hough Transform
• This means that a computer would need an
infinite amount of memory to represent all
possible values of a.
• To avoid this issue, a straight line is instead represented via its normal line: the line through the origin that is perpendicular to the straight line.
• The equation of the line in this normal form is ρ = x·cos(θ) + y·sin(θ), where ρ is the length of the normal line and θ is the angle between the normal line and the x axis.
Hough Transform
• So, a convenient equation for describing a set of lines uses the parametric or normal notation: ρ = x·cos(θ) + y·sin(θ)
• where ρ is the length of a normal from the origin to the line and theta (θ) is the orientation of ρ with respect to the x-axis.
• For any point (x, y) on this line, ρ and θ are constant.
• Instead of representing the Hough Space with the slope a and intercept b, it is now represented with ρ and θ, where the horizontal axis is for the θ values and the vertical axis is for the ρ values.
• The mapping of edge points onto the Hough Space works in a similar manner, except that an edge point (xᵢ, yᵢ) now generates a cosine curve in the Hough Space instead of a straight line.
Hough Transform
Hough Transform
• As mentioned, an edge point produces a cosine curve
in the Hough Space.
• From this, if we were to map all the edge points from
an edge image onto the Hough Space, it will generate a
lot of cosine curves.
• If two edge points lie on the same line, their corresponding cosine curves will intersect each other at a specific (ρ, θ) pair.
• Thus, the Hough Transform algorithm detects lines by
finding the (ρ, θ) pairs that have a number of
intersections larger than a certain threshold.
Hough Transform – Algorithm
1. Determine the range of ρ and θ. Typically, the range of θ is [0, 180] degrees and ρ is [-d, d], where d is the diagonal length of the edge image. It is important to quantize the ranges of ρ and θ, which means there should only be a finite number of possible values.

2. Create a 2D array called the “accumulator” with dimensions (num_rhos, num_thetas) to represent the Hough Space, and set all its values to zero.

3. Perform edge detection on the original image. You can do this with any edge detection technique you like.
Hough Transform – Algorithm
4. For every pixel in the edge image, check whether the pixel is an edge pixel. If it is, loop over all possible values of θ, compute the corresponding ρ, find the θ and ρ indices in the accumulator, then increment the accumulator at those index pairs.

5. Iterate over the accumulator's values. If a value is larger than a certain threshold, get the ρ and θ indices, recover the values of ρ and θ from the index pair, and convert the line back to the form y = ax + b.
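A minimal sketch of this accumulator-voting algorithm (assuming NumPy, a binary edge image as input, and arbitrary quantization and threshold choices):

```python
import numpy as np

def hough_lines(edges: np.ndarray, num_thetas: int = 180, threshold: int = 50):
    """Return (rho, theta) pairs whose accumulator count exceeds the threshold."""
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.deg2rad(np.arange(0, 180, 180 / num_thetas))
    rhos = np.arange(-diag, diag + 1)                       # quantized rho range [-d, d]
    accumulator = np.zeros((len(rhos), len(thetas)), dtype=np.int64)

    ys, xs = np.nonzero(edges)                              # coordinates of edge pixels
    for x, y in zip(xs, ys):
        # Each edge point votes along its curve rho = x*cos(theta) + y*sin(theta).
        rho_vals = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rho_vals + diag, np.arange(len(thetas))] += 1

    peaks = np.argwhere(accumulator > threshold)
    return [(rhos[r], thetas[t]) for r, t in peaks]

# Synthetic edge image containing a single horizontal line at y = 20.
edges = np.zeros((100, 100), dtype=np.uint8)
edges[20, :] = 1
print(hough_lines(edges, threshold=50))                     # expect rho ~ 20, theta ~ 90 degrees
```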
Hough Transform – Algorithm
Advantages
• The HT benefits from not requiring all pixels on a single
line to be contiguous. As a result, it can be quite
effective when identifying lines with small gaps due to
noise or when objects are partially occluded.
Disadvantages
• It can produce deceptive results when objects align by
accident;
• Rather than finite lines with definite ends, detected lines
are infinite lines defined by their (m,c) values.
Circles Detection using HT
Circles
Circles
Circles
Ellipse Detection using HT
Ellipse Detection using HT
Ellipses Axes
HT for Ellipses
HT for Ellipses
End of UNIT 1 (Part 2)
