MODULE-2 updated
Image processing
• Image processing is the technique of performing operations on an
image to enhance it, extract useful information, or transform it for a
specific purpose.
• More neighborhood operators
• Linear filters can perform a wide variety of image transformations.
• Non-linear filters, such as edge-preserving median or bilateral filters, can
sometimes perform even better.
• Examples of neighborhood operators include
• Morphological operators that operate on binary images,
• Semi-global operators that compute distance transforms and find connected
components in binary images
Non-linear filtering
• Non-linear filtering is a technique where the output pixel value is
determined by a non-linear function of the neighborhood pixel
values.
• Non-linear filters can achieve results that linear filters cannot, e.g.
• Suppression of non-Gaussian noise, e.g. spikes
• Edge preserving properties
• The image showcases the effects of different filtering techniques on images corrupted
with Gaussian noise and shot noise (impulse noise). The figure consists of two rows:
• The first row (a–d) demonstrates filtering on an image with Gaussian noise.
• The second row (e–h) demonstrates filtering on an image with shot noise
(impulse noise).
(a) Original image with Gaussian noise:
• The image has been corrupted with Gaussian noise, which affects all pixel values to varying degrees.
(b) Gaussian filtered:
• A Gaussian filter is applied to smooth the noise.
• It reduces noise but also blurs edges due to its averaging nature.
(c) Median filtered:
• A median filter replaces each pixel with the median of its neighborhood.
• It is effective in reducing noise while preserving edges better than the Gaussian filter.
(d) Bilaterally filtered:
• A bilateral filter is applied, which smooths while preserving edges.
• It is better at edge preservation than the Gaussian filter.
(e) Original image with shot noise:
• The image contains impulse noise (salt-and-pepper noise or shot noise), where some pixels are
randomly bright or dark.
(f) Gaussian filtered:
• The Gaussian filter blurs the noise but does not effectively remove shot noise, since every
outlier pixel still contributes to the weighted average of its neighbors.
(g) Median filtered:
• The median filter effectively removes shot noise because it replaces each pixel with the median of
its neighborhood, which an isolated outlier cannot shift.
• It preserves edges well.
(h) Bilaterally filtered:
• The bilateral filter is not very effective for shot noise.
• Since shot noise pixels are highly different from their neighbors, the neighbors receive almost no
range weight, so the bilateral filter fails to remove them effectively.
• Gaussian filtering is good for reducing Gaussian noise but blurs the
image.
• Median filtering is highly effective against shot noise and preserves
edges well.
• Bilateral filtering is effective in preserving edges while reducing
Gaussian noise but fails for extreme noise variations like shot noise.
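The contrast between averaging and median filtering can be sketched on a one-dimensional signal. This is a minimal illustration, not the code behind the figure: a box (mean) filter stands in for the Gaussian filter, and the signal values, window radius, and function names are assumptions chosen for the example.

```python
from statistics import median

def filter_1d(signal, radius, reduce_fn):
    """Apply reduce_fn to each sliding window; border samples are replicated."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(i + k, 0), n - 1)]
                  for k in range(-radius, radius + 1)]
        out.append(reduce_fn(window))
    return out

def mean(values):
    return sum(values) / len(values)

# Hypothetical step edge (10 -> 50) with one shot-noise spike (255) at index 2.
noisy = [10, 10, 255, 10, 10, 50, 50, 50, 50, 50]

box = filter_1d(noisy, 1, mean)    # averaging smears the spike into its neighbors
med = filter_1d(noisy, 1, median)  # the median discards the spike and keeps the edge
```

With these values the median output is the clean step edge, while the averaged output still carries the spike spread over three samples and a softened edge.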
Median and bilateral Filtering
• Median Filter (Median = 4)
• The median filter replaces a pixel's value with the median of its
neighborhood.
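• The highlighted green pixel (4) is the median of the neighborhood shown in the figure, i.e. the
middle value once the neighborhood values are sorted. For a smaller 9-value example, the sorted
neighborhood [1,1,2,2,3,4,5,6,7] has median 3 (the 5th value).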
• The median filter is effective for removing salt-and-pepper noise while
preserving edges better than a mean filter.
• α-Trimmed Mean Filter (α-mean = 4.6)
• The α-trimmed mean filter removes the highest and lowest intensity values
before computing the mean.
• The highlighted green pixels are the selected ones for averaging.
• This method reduces the influence of extreme values (outliers), providing a
balance between mean and median filtering.
• Domain Filter (Spatial Weights)
• The domain filter assigns weights based on spatial distance in a Gaussian
manner.
• The numbers in this grid (e.g., 0.1, 0.3, 0.4, etc.) represent the influence of
neighboring pixels.
• Closer pixels (center of kernel) have higher weight (e.g., 0.4), while distant
pixels have lower influence.
• This is the spatial component of a bilateral filter, smoothing nearby pixels
while preserving structure.
• Range Filter (Intensity Weights)
• The range filter assigns weights based on intensity similarity.
• The numbers represent how much a neighboring pixel contributes to the
filtering.
• Pixels with similar intensity get higher weights (e.g., 1.0 for identical pixels).
• This is the intensity component of a bilateral filter, ensuring that only similar
pixels are averaged, preserving edges.
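The domain and range weights described above can be sketched as two Gaussian functions whose product gives the bilateral weight. The Gaussian form of the range weight and the parameter names sigma_d and sigma_r are assumptions for illustration; the figure's actual numbers are not reproduced here.

```python
import math

def domain_weight(dy, dx, sigma_d):
    """Gaussian weight that falls off with spatial distance from the center pixel."""
    return math.exp(-(dy * dy + dx * dx) / (2.0 * sigma_d ** 2))

def range_weight(center, neighbor, sigma_r):
    """Gaussian weight that falls off with intensity difference from the center pixel."""
    return math.exp(-((neighbor - center) ** 2) / (2.0 * sigma_r ** 2))

def bilateral_weight(dy, dx, center, neighbor, sigma_d, sigma_r):
    """Combined weight: a neighbor must be both close and similar to count."""
    return domain_weight(dy, dx, sigma_d) * range_weight(center, neighbor, sigma_r)
```

An identical pixel gets a range weight of 1.0, and a nearby pixel across a strong edge gets a combined weight close to zero.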
Median Filter (Best for Spike/Impulse
Noise)
• Replaces each pixel with the median of its neighborhood.
• Great for removing salt-and-pepper noise without blurring edges.
• Works better than the Gaussian filter for images with impulsive noise.
• Median values can be computed in expected linear time using a randomized
select algorithm
• Because the shot noise value usually lies well outside the true values in the
neighborhood, the median filter is able to filter away such bad pixels.
• Median filter
• Moderate computational cost
• Because it selects only one input pixel value to replace each output pixel, it is not as efficient at
averaging away regular Gaussian noise.
• A better choice may be the α-Trimmed Mean which averages
together all of the pixels except for the fraction that are the smallest
and the largest
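• Another option is a weighted median, in which each pixel is used a number of times
depending on its distance from the center.
• This turns out to be equivalent to minimizing the weighted objective function
Σ_(k,l) w(k,l) |f(i+k, j+l) − g(i,j)|^p, where g(i,j) is the desired output value and p = 1 for
the weighted median (p = 2 gives the usual weighted mean).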
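The two alternatives to the plain median can be sketched as follows. This is an illustrative sketch under assumptions: alpha is taken as the fraction trimmed at each end, and the weighted median returns the lower weighted median; the function names are hypothetical.

```python
def alpha_trimmed_mean(values, alpha):
    """Mean after discarding the fraction alpha of smallest and of largest values."""
    assert 0 <= alpha < 0.5, "alpha is the fraction trimmed at each end"
    ordered = sorted(values)
    trim = int(alpha * len(ordered))  # number of values dropped at each end
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value

window = [1, 1, 2, 2, 3, 4, 5, 6, 7]
trimmed = alpha_trimmed_mean(window, 0.25)     # drops 2 values at each end
plain = weighted_median(window, [1] * 9)       # unit weights give the ordinary median
```

With unit weights the weighted median reduces to the ordinary median, and with alpha = 0 the trimmed mean reduces to the ordinary mean.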
Bilateral filtering
• Bilateral filtering is a non-linear, edge-preserving, and noise-
reducing smoothing technique.
• Unlike Gaussian filtering, which blurs edges, bilateral filtering selectively
smooths an image while preserving important edges by considering both
spatial proximity and intensity similarity.
• It is commonly used in denoising, cartoonization, and HDR imaging.
Bilateral Filtering
• This figure visually explains the bilateral filtering process using a 3D surface
representation of an image. Each subfigure represents a different stage in
the filtering process.
(a) Noisy Step Edge Input
• The input image has a sharp edge with noise (random variations).
• The noise appears as small spikes on the surface.
• The goal is to smooth the noise while preserving the sharp edge.
(b) Domain Filter (Gaussian Spatial Filter)
• A Gaussian filter is applied based on pixel spatial distance.
• This blurs the image, including edges.
(c) Range Filter (Intensity Similarity Filter)
• A filter is applied based on intensity similarity to the center pixel.
• Pixels with similar intensities receive higher weights, while dissimilar ones receive
lower weights.
• This helps preserve edges while smoothing.
(d) Bilateral Filter (Combination of Spatial + Range Filtering)
• Combines both the domain (spatial) filter and the range (intensity) filter.
• Noise is reduced, but the step edge is preserved.
(e) Filtered Step Edge Output
• The final result: a cleaned-up step edge without excessive smoothing.
• Unlike a simple Gaussian blur, the edge remains sharp.
(f) 3D Distance Between Pixels
• A visualization of how bilateral filtering considers both spatial distance and
intensity similarity.
• The red arrow represents spatial distance.
• The green arrow represents intensity difference.
• The total distance (blue) determines the final weight.
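The stages in the figure can be sketched in one dimension: each output sample is a normalized average of its neighbors, weighted by the product of a spatial Gaussian and an intensity Gaussian. The signal values and the parameters radius, sigma_d, and sigma_r are assumptions picked for illustration.

```python
import math

def bilateral_1d(signal, radius, sigma_d, sigma_r):
    """Bilateral filter on a 1D signal; the window is truncated at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        total = 0.0
        norm = 0.0
        for k in range(max(0, i - radius), min(n, i + radius + 1)):
            domain = math.exp(-((k - i) ** 2) / (2.0 * sigma_d ** 2))
            rng = math.exp(-((signal[k] - signal[i]) ** 2) / (2.0 * sigma_r ** 2))
            total += domain * rng * signal[k]
            norm += domain * rng
        out.append(total / norm)  # norm >= 1 because the center pixel has weight 1
    return out

# Hypothetical noisy step edge: mild noise around 10, then around 50.
noisy_step = [10, 12, 9, 11, 10, 50, 49, 52, 50, 51]
smoothed = bilateral_1d(noisy_step, radius=2, sigma_d=2.0, sigma_r=5.0)

# A single shot-noise spike: its neighbors get almost no range weight.
spiky = bilateral_1d([10, 10, 255, 10, 10], radius=2, sigma_d=2.0, sigma_r=5.0)
```

On the step edge the two plateaus are smoothed separately and the jump survives. On the spiky signal the outlier is left almost untouched, which is the failure mode noted for shot noise.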
Iterated adaptive smoothing
• IAS applies multiple iterations of smoothing while adapting the filter
strength based on local image characteristics.
• The goal is to reduce noise without excessively blurring edges
• How It Works:
• A Gaussian or bilateral filter is applied iteratively.
• At each iteration, the filter adapts based on local image structure.
• Pixels near edges are smoothed less, while flat regions are smoothed more.
Anisotropic Diffusion (Perona-Malik
Filter)
• Anisotropic Diffusion (AD) is a partial differential equation (PDE)-
based technique that smooths an image while respecting edge
boundaries.
• It mimics heat diffusion but stops diffusion at edges to avoid blurring
them.
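A one-dimensional sketch of this idea is given below. It assumes the exponential conductance function g(d) = exp(−(d/κ)²), one of the commonly used choices, and an explicit update with a small step size; the names kappa and step and the sample values are assumptions for illustration.

```python
import math

def perona_malik_1d(signal, iterations, kappa, step=0.2):
    """Explicit anisotropic diffusion on a 1D signal with replicated borders."""
    values = [float(v) for v in signal]
    n = len(values)
    for _ in range(iterations):
        updated = values[:]
        for i in range(n):
            left = values[max(i - 1, 0)] - values[i]
            right = values[min(i + 1, n - 1)] - values[i]
            # Conductance is near 1 for small differences and near 0 across edges.
            flux = (math.exp(-(left / kappa) ** 2) * left
                    + math.exp(-(right / kappa) ** 2) * right)
            updated[i] = values[i] + step * flux
        values = updated
    return values

noisy_step = [10, 12, 9, 11, 10, 50, 49, 52, 50, 51]
diffused = perona_malik_1d(noisy_step, iterations=20, kappa=5.0)
```

The small fluctuations on each side of the step diffuse away, while the large difference across the step yields a conductance near zero, so the edge is not blurred.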
Morphology
• While non-linear filters are often used to enhance grayscale and color
images, they are also used extensively to process binary images.
• Such images often occur after a thresholding operation
Morphological Operations in Binary
Image Processing
• Morphological operations are fundamental techniques in image processing,
used primarily for shape manipulation in binary images.
• These operations help analyze and process structures in an image based on
their shape.
• Morphological operations work by probing an image with a structuring
element (SE), which defines how the operation interacts with objects in the
image.
• A binary image consists of pixels that are either 0 (black) or 1 (white).
• A structuring element (SE) is a small shape (e.g., square, circle) that slides over the
image and modifies the pixels based on a predefined rule.
• The process is similar to convolution, but instead of multiplying and summing
values, we apply logical rules (like min, max, or conditional checks).
Binary image morphology
• The figure demonstrates the effects of different morphological operations on
a binary image, where white pixels represent the foreground and black pixels
represent the background.
• A 5 × 5 square structuring element was used for all operations.
(a) Original Image
• This is the input binary image before any morphological operations are applied.
• It contains irregular edges, a small detached dot, and some noise.
(b) Dilation
• Expands the white regions (foreground).
• Effect: The shapes become thicker, and small gaps are filled.
• The detached dot at the top increases in size.
(c) Erosion
• Shrinks the white regions (foreground).
• Effect: The object loses thin structures, and small white regions disappear.
• The detached dot becomes smaller, and thin parts of the main shape erode away.
(d) Majority
• Applies a majority filter, which means a pixel is turned white if the majority
of its neighbors are white.
• Effect: It smooths sharp corners and reduces small noise.
• Unlike erosion or dilation, this operation is more subtle
(e) Opening (Erosion + Dilation)
• Erosion first removes small noise, and then dilation restores the shape.
• Effect: Small white regions disappear if the structuring element does not fit inside them, and
the main object remains mostly unchanged.
• In this figure the dot is too large, so opening fails to remove it completely.
(f) Closing (Dilation + Erosion)
• Dilation first expands the object, followed by erosion to restore its size.
• Effect: Fills small gaps and smooths edges.
• The main object becomes more connected, but no significant change to the
dot.
Operations - Fundamental binary image
morphology operations using
mathematical notation
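One common way to write these operations is as a count of foreground pixels under the structuring element followed by a threshold. The sketch below follows that formulation as an assumption about the notation on this slide, and uses a 3 × 3 square (radius 1) on a small hypothetical image rather than the 5 × 5 element of the figure.

```python
def neighborhood_count(image, r, c, radius):
    """Number of foreground (1) pixels in the square window; outside counts as 0."""
    rows, cols = len(image), len(image[0])
    total = 0
    for rr in range(r - radius, r + radius + 1):
        for cc in range(c - radius, c + radius + 1):
            if 0 <= rr < rows and 0 <= cc < cols:
                total += image[rr][cc]
    return total

def threshold_count(image, radius, threshold):
    """Set a pixel to 1 when its neighborhood count reaches the threshold."""
    return [[1 if neighborhood_count(image, r, c, radius) >= threshold else 0
             for c in range(len(image[0]))] for r in range(len(image))]

def dilate(image, radius=1):
    return threshold_count(image, radius, 1)            # any neighbor set

def erode(image, radius=1):
    return threshold_count(image, radius, (2 * radius + 1) ** 2)  # all neighbors set

def majority(image, radius=1):
    return threshold_count(image, radius, (2 * radius + 1) ** 2 // 2 + 1)

def opening(image, radius=1):
    return dilate(erode(image, radius), radius)

def closing(image, radius=1):
    return erode(dilate(image, radius), radius)

# Hypothetical 9 x 9 image: a 3 x 3 block plus one isolated dot at (1, 1).
image = [[0] * 9 for _ in range(9)]
for r in range(3, 6):
    for c in range(3, 6):
        image[r][c] = 1
image[1][1] = 1

def area(img):
    return sum(sum(row) for row in img)
```

On this image, opening removes the isolated dot and restores the block, closing leaves the image unchanged, and the majority filter rounds the block's corners.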
Distance transforms
• The Distance Transform (DT) is a fundamental operation in image
processing that computes the distance from every pixel in a binary
image to the nearest object (or background) pixel.
• It is widely used in various applications, including image
segmentation, shape analysis, and feature extraction.
Example
• This image illustrates the City Block Distance Transform using a step-
by-step process.
• The City Block Distance (or Manhattan Distance) transform measures the length of
the shortest path between pixels, considering only horizontal and
vertical movements.
• The D1 city block distance transform can be efficiently computed
using a forward and backward pass of a simple raster-scan algorithm
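The two-pass raster-scan idea can be sketched as follows. The sketch assumes distances are measured from each foreground (1) pixel to the nearest background (0) pixel inside the image, and that the image contains at least one background pixel; the function name is hypothetical.

```python
def city_block_distance_transform(binary):
    """City block distance from every pixel to the nearest 0 pixel."""
    rows, cols = len(binary), len(binary[0])
    big = rows + cols  # larger than any city block distance inside the image
    dist = [[0 if value == 0 else big for value in row] for row in binary]

    # Forward pass (top-left to bottom-right): use north and west neighbors.
    for r in range(rows):
        for c in range(cols):
            if r > 0:
                dist[r][c] = min(dist[r][c], dist[r - 1][c] + 1)
            if c > 0:
                dist[r][c] = min(dist[r][c], dist[r][c - 1] + 1)

    # Backward pass (bottom-right to top-left): use south and east neighbors.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            if r < rows - 1:
                dist[r][c] = min(dist[r][c], dist[r + 1][c] + 1)
            if c < cols - 1:
                dist[r][c] = min(dist[r][c], dist[r][c + 1] + 1)
    return dist

square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
distances = city_block_distance_transform(square)
```

For the 3 × 3 square above, the ring of boundary pixels gets distance 1 and the center pixel gets distance 2.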
Connected components
• A useful semi-global image operation is finding connected components,
which are defined as regions of adjacent pixels that have the same input
value (or label).
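A minimal sketch of component labeling is shown below. It uses a breadth-first flood fill with 4-connected adjacency; both choices are assumptions for illustration, since the slide does not specify an algorithm or a neighborhood.

```python
from collections import deque

def label_components(image):
    """Label 4-connected regions of equal value; returns (labels, component count)."""
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # already part of an earlier component
            value = image[r][c]
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1 and image[ny][nx] == value):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label

image = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]
labels, count = label_components(image)
```

Note that the background (value 0) is labeled as a component too, since the definition only requires adjacent pixels with the same value.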
Fourier transforms
• Fourier analysis can be used to analyze the frequency characteristics
of various filters.
• The Fourier Transform (FT) is a fundamental tool in image processing
that helps analyze the frequency content of an image.
• It is widely used for image filtering, compression, and enhancement.
• Consider an input sinusoid s(x) = sin(2πfx + ϕi) = sin(ωx + ϕi), whose frequency is f, angular
frequency is ω = 2πf, and phase is ϕi.
• If we convolve the sinusoidal signal s(x) with a filter whose impulse response
is h(x), we get another sinusoid of the same frequency but different
magnitude A and phase ϕo: o(x) = h(x) * s(x) = A sin(ωx + ϕo).
• This is because a convolution can be expressed as a weighted summation of shifted input
signals, and the summation of a bunch of shifted sinusoids of the
same frequency is just a single sinusoid at that frequency.
• The new magnitude A is called the gain or magnitude of the filter, while the
phase difference Δϕ = ϕo − ϕi is called the shift or phase.
• The discrete form of the Fourier transform is known as the Discrete Fourier
Transform (DFT)
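The gain and phase shift of a filter can be checked numerically. The sketch below assumes a discrete filter given as a mapping from tap offset k to h(k), evaluates H(ω) = Σ h(k) e^(−jωk), and compares the convolved sinusoid against A sin(ωx + ϕi + Δϕ). The two-tap averaging filter, the frequency, and the input phase are hypothetical choices.

```python
import cmath
import math

def frequency_response(taps, omega):
    """H(omega) = sum over k of h(k) * exp(-j * omega * k)."""
    return sum(h * cmath.exp(-1j * omega * k) for k, h in taps.items())

def convolve_at(taps, signal_fn, x):
    """o(x) = sum over k of h(k) * s(x - k): a weighted sum of shifted inputs."""
    return sum(h * signal_fn(x - k) for k, h in taps.items())

taps = {0: 0.5, 1: 0.5}        # hypothetical two-tap averaging filter
omega = 2 * math.pi * 0.1      # angular frequency for f = 0.1
phase_in = 0.3                 # input phase

def sinusoid(x):
    return math.sin(omega * x + phase_in)

response = frequency_response(taps, omega)
gain = abs(response)             # magnitude A of the output sinusoid
shift = cmath.phase(response)    # phase difference between output and input

outputs = [convolve_at(taps, sinusoid, x) for x in range(10)]
predicted = [gain * math.sin(omega * x + phase_in + shift) for x in range(10)]
```

For this filter the gain works out to cos(ω/2) and the phase shift to −ω/2, so the output is a slightly attenuated, half-sample-delayed copy of the input sinusoid.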
Fourier transform pairs
Two-dimensional Fourier transforms
Wiener filtering
Discrete cosine transform
• The discrete cosine transform (DCT) is a variant of the Fourier
transform particularly well suited to compressing images in a block-
wise fashion.
• The one-dimensional DCT is computed by taking the dot product of
each N-wide block of pixels with a set of cosines of different
frequencies
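The dot-product description can be sketched directly. The code assumes the unnormalized form F(k) = Σ_i cos(π/N · (i + 1/2) · k) · f(i); scaling conventions differ between references, so the absolute values are an assumption.

```python
import math

def dct_1d(block):
    """Dot product of the block with N cosines of increasing frequency."""
    n = len(block)
    return [sum(math.cos(math.pi / n * (i + 0.5) * k) * block[i] for i in range(n))
            for k in range(n)]

flat = dct_1d([3.0] * 8)                              # constant block
ramp = dct_1d([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])  # smooth ramp
```

A constant block puts all of its energy into the k = 0 coefficient, which is why smooth image blocks compress well under the DCT.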
Application: Sharpening, blur, and
noise removal
• A common application of image processing is the enhancement of
images through the use of sharpening and noise removal operations,
which require some kind of neighborhood processing.
PYRAMIDS AND WAVELETS
• All of the image transformations studied so far produce output images of the same size as the
inputs.
• An image pyramid is a multi-scale representation of an image, where multiple
copies of the image are created at different resolutions.
• This is useful for various applications like image blending, object detection, and efficient
image processing.
• We may need to interpolate a small image to make its resolution match that of the
output printer or computer screen.
• Interpolation is used when enlarging an image, filling in missing pixels by estimating their
values from surrounding pixels
• We may want to reduce the size of an image to speed up the execution of an
algorithm or to save on storage space or transmission time
• Downscaling (or subsampling) is used when reducing an image's size, often applying a low-
pass filter to avoid aliasing (jagged edges or moiré patterns).
Wavelet
Interpolation Equation
• This equation essentially spreads or distributes each pixel of the original image
across a higher-resolution grid using the kernel h(x,y).
• The kernel h(x,y) defines how neighboring pixels contribute
to the new pixel values.
• Nearest-neighbor interpolation uses a simple step
function.
• Bilinear interpolation uses a linear function.
• Bicubic interpolation uses a cubic function for smoother
results.
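A one-dimensional sketch of kernel-based interpolation is given below. It assumes the interpolation equation has the standard form g(i) = Σ_k f(k) h(i − rk) with upsampling rate r, and implements only the step (nearest-neighbor) and linear kernels; the function names are hypothetical.

```python
def upsample_1d(samples, rate, kernel):
    """g(i) = sum over k of f(k) * h(i - rate * k): each sample is spread by the kernel."""
    length = (len(samples) - 1) * rate + 1
    return [sum(f * kernel(i - rate * k, rate) for k, f in enumerate(samples))
            for i in range(length)]

def nearest_kernel(x, rate):
    """Step function: each output copies the closest input sample."""
    return 1.0 if -rate / 2 <= x < rate / 2 else 0.0

def linear_kernel(x, rate):
    """Tent function: the 1D counterpart of the bilinear kernel."""
    return max(0.0, 1.0 - abs(x) / rate)

coarse = [0, 4, 8]
stepped = upsample_1d(coarse, 2, nearest_kernel)
smooth = upsample_1d(coarse, 2, linear_kernel)
```

The step kernel repeats samples and produces a blocky result, while the tent kernel fills the new positions with averages of the two nearest inputs.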
Signal Interpolation
• Bicubic interpolation is a higher-quality interpolation method
compared to bilinear interpolation.
Visually compares different image interpolation techniques for enlarging an
image.
Comparison
Decimation
• While interpolation can be used to increase the resolution of an
image, decimation (downsampling) is required to reduce the
resolution.
• To perform decimation, we first (conceptually) convolve the image
with a low-pass filter (to avoid aliasing) and then keep every rth
sample.
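The two steps of decimation, and what goes wrong without the low-pass filter, can be sketched as follows. The [1, 2, 1]/4 binomial filter and the alternating test signal are assumptions chosen for illustration.

```python
def decimate(signal, rate, taps=(0.25, 0.5, 0.25)):
    """Low-pass filter with the given taps (borders replicated), then keep every rate-th sample."""
    n = len(signal)
    radius = len(taps) // 2
    blurred = [sum(t * signal[min(max(i + k - radius, 0), n - 1)]
                   for k, t in enumerate(taps))
               for i in range(n)]
    return blurred[::rate]

# Highest-frequency signal: alternating dark and bright samples.
stripes = [0, 1, 0, 1, 0, 1, 0, 1]

naive = stripes[::2]             # subsampling alone sees only the dark samples
filtered = decimate(stripes, 2)  # pre-filtering keeps the average brightness
```

Plain subsampling turns the stripes into a uniformly dark signal, which is an aliasing artifact, whereas the filtered version settles at the mid-gray average away from the border.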
Signal decimation process, which is a key
concept in downsampling an image or a signal.
Why use a low-pass filter
Multi-resolution representations
• Pyramids can be used to accelerate coarse-to-fine search algorithms,
to look for objects or patterns at different scales, and to perform
multi-resolution blending operations.
• The Laplacian pyramid is a multi-scale representation of an image
used in computer vision and image processing.
• It is constructed by recursively blurring and downsampling an image, then
storing at each level the difference between that level and the upsampled
(interpolated) version of the next coarser level.
• This method is particularly useful for image compression, texture synthesis,
and blending.
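The construction and its inverse can be sketched in one dimension. The [1, 2, 1]/4 blur, linear upsampling, and all function names are assumptions for illustration; the key property shown is that the stored differences plus the coarsest level reconstruct the input.

```python
def blur(signal):
    """[1, 2, 1] / 4 low-pass filter with replicated borders."""
    n = len(signal)
    return [0.25 * signal[max(i - 1, 0)] + 0.5 * signal[i]
            + 0.25 * signal[min(i + 1, n - 1)] for i in range(n)]

def downsample(signal):
    return blur(signal)[::2]

def upsample(coarse, length):
    """Linear interpolation of the coarse level back to the finer length."""
    out = []
    for i in range(length):
        lo = i // 2
        hi = min(lo + 1, len(coarse) - 1)
        t = (i % 2) / 2.0
        out.append((1.0 - t) * coarse[lo] + t * coarse[hi])
    return out

def laplacian_pyramid(signal, levels):
    """Band-pass difference levels followed by the coarsest low-pass level."""
    pyramid = []
    current = [float(v) for v in signal]
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse, len(current))
        pyramid.append([a - b for a, b in zip(current, predicted)])
        current = coarse
    pyramid.append(current)
    return pyramid

def reconstruct(pyramid):
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        current = [p + d for p, d in zip(upsample(current, len(band)), band)]
    return current

signal = [1, 3, 2, 5, 4, 6, 8, 7]
pyramid = laplacian_pyramid(signal, levels=2)
restored = reconstruct(pyramid)
```

Because each level stores exactly what the upsampled coarser level fails to predict, reconstruction is exact up to floating-point rounding regardless of the blur used.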
Traditional pyramid
The Gaussian pyramid shown as a
signal processing diagram
The Laplacian pyramid: a multi-resolution image representation
commonly used in image processing,
compression, and blending.
Wavelets
• Wavelets are filters that localize a signal in both space and frequency and are
defined over a hierarchy of scales.
• Wavelets provide a smooth way to decompose a signal into frequency
components without blocking and are closely related to pyramids.
• Wavelets are mathematical functions that decompose signals (such as
images) into different frequency components, while preserving spatial
information.
• They are commonly used in multi-resolution analysis (MRA), where an image is
broken down into different scales.
• A wavelet transform decomposes an image into:
• Low-frequency components (approximation): Represents the general structure of the
image.
• High-frequency components (details): Captures edges, textures, and fine details.
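The split into approximation and detail can be sketched with one level of the Haar wavelet, the simplest wavelet. The slides do not name a specific wavelet here, so Haar and the averaging normalization are assumptions; the input length is assumed to be even.

```python
def haar_forward(signal):
    """One level: pairwise averages (approximation) and half-differences (detail)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild each pair from its average and half-difference."""
    out = []
    for s, d in zip(approx, detail):
        out.extend([s + d, s - d])
    return out

approx, detail = haar_forward([4, 6, 10, 12])
restored = haar_inverse(approx, detail)
```

The approximation is a half-resolution version of the signal, and applying haar_forward to it again yields the next coarser scale, which is how the multi-resolution hierarchy is built.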
Comparison
Types of Wavelet Transforms
Two-dimensional wavelet
decomposition
One dimensional wavelet transform
(b) Lifted Wavelet Transform (LWT)
Lifted transform shown as a signal
processing diagram
• Point-wise transformations (like the one in the equation) adjust intensity values while
keeping pixel locations fixed.
• Geometric transformations modify the domain of the image, changing pixel positions while
maintaining intensity values.
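The distinction can be sketched with two small helpers: one applies a function to each intensity, the other looks up each output pixel at a remapped location. The helper names, the brightness offset, and the horizontal flip are hypothetical examples.

```python
def pointwise(image, func):
    """Range transform: g(y, x) = func(f(y, x)); pixel locations stay fixed."""
    return [[func(value) for value in row] for row in image]

def warp(image, mapping, fill=0):
    """Domain transform: g(y, x) = f(mapping(y, x)); intensities are only moved."""
    rows, cols = len(image), len(image[0])
    out = []
    for y in range(rows):
        row = []
        for x in range(cols):
            sy, sx = mapping(y, x)
            inside = 0 <= sy < rows and 0 <= sx < cols
            row.append(image[sy][sx] if inside else fill)
        out.append(row)
    return out

image = [[1, 2, 3],
         [4, 5, 6]]
brighter = pointwise(image, lambda v: v + 10)      # changes values, not positions
flipped = warp(image, lambda y, x: (y, 2 - x))     # changes positions, not values
```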
Image Warping Concept
"Image warping involves modifying the domain of an image function
rather than its range."