Lecture 8 Image Segmentation in Computer Vision 2025
CPCS432 AI
Lecture 8
Image Segmentation
Computer Vision
Applying Deep Learning Algorithms
for Image Segmentation
Image Segmentation
CPCS432 Lecture 8
Dr. Arwa Basbrain
Image segmentation methods
Semantic Segmentation
In semantic segmentation, each pixel is classified into a class label, so the entire image is
segmented into categories like "car," "road," "sky," etc. This method does not distinguish
between separate instances of the same object.
Applications: Used in autonomous driving to understand road scenes, in AR for scene
understanding, and in satellite imaging for land classification.
Example: Labelling each pixel in a street image as road, sidewalk, building, or vehicle.
Instance Segmentation
Instance segmentation is similar to semantic segmentation, but it also distinguishes between
different instances of the same class. For example, it would label two cars as separate objects
rather than combining them into one "car" category.
Applications: Instance segmentation is critical in robotics and autonomous vehicles for
object-level understanding and interaction, and in retail (e.g., counting items on a shelf).
Example: In a street scene, marking each car as a distinct entity, rather than a general “car”
label for all cars.
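The difference between a semantic mask and instance labels can be sketched with a toy connected-components pass: given a binary semantic mask for one class (e.g. "car"), each 4-connected blob of foreground pixels is relabelled as a separate instance. This is only an illustrative stand-in — real instance segmentation models do not separate instances this way.

```python
import numpy as np

def label_instances(mask):
    """Assign a distinct integer label to each 4-connected foreground
    region (a toy stand-in for instance separation)."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    h, w = mask.shape
    for y in range(h):
        for x in range(w):
            if mask[y, x] and labels[y, x] == 0:
                current += 1                 # start a new instance id
                stack = [(y, x)]
                while stack:                 # flood-fill this blob
                    cy, cx = stack.pop()
                    if 0 <= cy < h and 0 <= cx < w and mask[cy, cx] and labels[cy, cx] == 0:
                        labels[cy, cx] = current
                        stack += [(cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)]
    return labels

# Two separate "car" blobs in one semantic mask ...
semantic = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
], dtype=bool)
instances = label_instances(semantic)
# ... receive two distinct instance ids (1 and 2).
```

Semantic segmentation would give every car pixel the same label; the instance map keeps the two blobs apart.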
Instance Segmentation
Panoptic Segmentation
Panoptic segmentation combines semantic and
instance segmentation. Each pixel is assigned a
semantic label, and each object instance is
uniquely labelled. This approach is holistic,
covering both things (objects with instances,
like cars) and stuff (regions without distinct
instances, like roads).
Applications: Used in complex scenes where
object interactions need to be fully understood,
such as in autonomous driving and augmented
reality.
Example: In a driving scene, labeling roads,
cars, buildings, and trees with distinct
boundaries and semantic meaning.
CPCS432 Lecture 8 5/11/2024 11
Use cases for image segmentation
Image segmentation has become an essential tool in a variety of fields.
1. Medical imaging: Image segmentation aids tasks like tumor detection, brain
segmentation, disease diagnosis and surgical planning.
2. Autonomous vehicles: Image segmentation allows self-driving cars to avoid
obstacles like pedestrians and other cars, as well as identify lanes and traffic signs. It
is similarly used to inform navigation in robotics.
3. Satellite imaging: Semantic and instance segmentation automate the identification of
different terrain and topographical features.
4. Smart cities: Image segmentation powers tasks like real-time traffic monitoring and
surveillance.
5. Manufacturing: In addition to powering robotics tasks, image segmentation powers
product sorting and the detection of defects.
6. Agriculture: Image segmentation helps farmers estimate crop yields and detect
weeds for removal.
Image segmentation techniques
Image Segmentation Techniques are methods used to divide an image into multiple parts or
regions, each representing meaningful sections or objects. These techniques are essential in
computer vision tasks where the goal is to isolate specific objects or regions of interest within an
image, enabling more focused analysis and interpretation. Here’s a comprehensive overview of the
main types of image segmentation techniques, along with examples and applications for each.
Traditional image segmentation techniques use information from a pixel’s colour values (and
related characteristics like brightness, contrast or intensity) for feature extraction, and can be
quickly trained with simple machine learning algorithms for tasks like semantic classification.
• Edge detection: Edge detection methods identify the boundaries of objects or classes by
detecting discontinuities in brightness or contrast.
• Edge-based segmentation detects boundaries between objects in an image by identifying
changes in pixel intensity, often representing the edges of objects. This method is commonly
used when objects have well-defined boundaries.
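Edge-based segmentation can be sketched with hand-rolled Sobel gradients in NumPy. The kernel values are the standard Sobel filters; the helper name and threshold are illustrative choices.

```python
import numpy as np

def sobel_edges(img, threshold=1.0):
    """Mark pixels where the Sobel gradient magnitude exceeds a
    threshold, i.e. where intensity changes sharply (an edge)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T                                  # vertical-gradient kernel
    h, w = img.shape
    mag = np.zeros((h, w))
    for y in range(1, h - 1):                  # skip the 1-pixel border
        for x in range(1, w - 1):
            patch = img[y - 1:y + 2, x - 1:x + 2]
            gx = np.sum(kx * patch)            # horizontal gradient
            gy = np.sum(ky * patch)            # vertical gradient
            mag[y, x] = np.hypot(gx, gy)       # gradient magnitude
    return mag > threshold

# A dark/bright half image: the only discontinuity is the vertical boundary.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
edges = sobel_edges(img)
```

Only the pixels straddling the dark/bright boundary are marked, which is exactly the discontinuity-in-intensity behaviour described above.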
Prominent deep learning models used in
image segmentation include:
Fully Convolutional Networks (FCNs):
FCNs, often used for semantic segmentation, are a type of convolutional
neural network (CNN) with no fully connected layers. An encoder network
passes visual input data through convolutional layers to extract features
relevant to segmentation or classification, and compresses (or downsamples)
this feature data to remove non-essential information. This compressed data
is then fed into decoder layers, upsampling the extracted feature data to
reconstruct the input image with segmentation masks.
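The downsample-then-upsample flow can be illustrated at the level of array shapes in NumPy, using max pooling for the encoder step and nearest-neighbour repetition for the decoder step. Real FCNs learn these operations with strided and transposed convolutions; this is only a shape sketch.

```python
import numpy as np

def downsample(x):
    """Encoder step: 2x2 max pooling halves the spatial resolution."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample(x):
    """Decoder step: nearest-neighbour upsampling doubles the resolution
    (an FCN would use a learned transposed convolution instead)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

feature_map = np.arange(16, dtype=float).reshape(4, 4)
encoded = downsample(feature_map)   # (4, 4) -> (2, 2): compressed features
decoded = upsample(encoded)         # (2, 2) -> (4, 4): back to input size
```

The round trip restores the input resolution, which is what lets the decoder emit a per-pixel segmentation mask.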
Image segmentation techniques
Deep learning models
Fully Convolutional Networks (FCNs):
Pixel-wise Prediction
The pixel-wise prediction block at the end of the
network represents the output of the model. This
output is a segmentation mask where each pixel is
assigned a class label based on the highest
prediction confidence.
For example, pixels that belong to the dog, cat, and
background regions will each have a unique label in
the output mask.
Output Resolution: The output segmentation map
has the same spatial dimensions as the input image,
allowing each pixel to have a class label. The
segmentation mask (e.g., dog in purple, cat in
brown, background in green) shows the model’s
predictions for each pixel, effectively identifying
the regions corresponding to each object class.
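Pixel-wise prediction is an arg-max over per-class score maps: each pixel takes the class whose map scores highest there. The class names and score values below are made up for illustration.

```python
import numpy as np

# Toy per-class score maps for a 2x3 image and three classes
# (0 = background, 1 = dog, 2 = cat); values are illustrative only.
scores = np.zeros((3, 2, 3))
scores[0] = [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1]]    # background
scores[1] = [[0.05, 0.8, 0.1], [0.1, 0.7, 0.2]]   # dog
scores[2] = [[0.05, 0.1, 0.7], [0.1, 0.1, 0.7]]   # cat

# Pixel-wise prediction: each pixel takes the class with the highest score,
# producing a mask with the same spatial size as the input image.
mask = scores.argmax(axis=0)
```

The resulting mask assigns labels 0, 1 and 2 to the background, dog and cat columns respectively, one label per pixel.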
Image segmentation techniques
Deep learning models
Prominent deep learning models used in image segmentation include:
U-Nets: U-Nets modify the FCN architecture to reduce data loss during downsampling with skip
connections, preserving greater detail by selectively bypassing some convolutional layers as
information and gradients move through the neural network. The name derives from the U shape of
diagrams depicting the arrangement of its layers.
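At the shape level, a skip connection concatenates high-resolution encoder features with the upsampled decoder features, so fine detail bypasses the bottleneck. The channel-first arrays below are a sketch of the tensor shapes, not learned features.

```python
import numpy as np

def upsample(x):
    """Nearest-neighbour upsampling over the spatial axes
    (a stand-in for a learned transposed convolution)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

# Channel-first toy tensors: (channels, height, width).
encoder_features = np.random.rand(8, 16, 16)   # high-resolution detail
bottleneck = np.random.rand(8, 8, 8)           # compressed representation

# Skip connection: stack the upsampled decoder features with the matching
# encoder features along the channel axis.
decoder_features = np.concatenate([upsample(bottleneck), encoder_features], axis=0)
```

The decoder then convolves over the combined channels, mixing coarse context with the detail the encoder preserved.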
DeepLab: Like U-Nets, DeepLab is a modified FCN architecture. In addition to skip connections, it
uses dilated (or “atrous”) convolution to yield larger output maps without necessitating additional
computational power.
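The effect of atrous convolution is easiest to see in one dimension: spacing the kernel taps `dilation` samples apart widens the receptive field without adding any weights. The helper name and toy signal here are illustrative.

```python
import numpy as np

def dilated_conv1d(signal, kernel, dilation=1):
    """1-D convolution whose kernel taps are spaced `dilation` samples
    apart, enlarging the receptive field at no extra parameter cost."""
    k = len(kernel)
    span = (k - 1) * dilation + 1              # effective receptive field
    out = np.empty(len(signal) - span + 1)
    for i in range(len(out)):
        taps = signal[i:i + span:dilation]     # every `dilation`-th sample
        out[i] = np.dot(taps, kernel)
    return out

signal = np.arange(10, dtype=float)
kernel = np.array([1.0, 1.0, 1.0])
dense = dilated_conv1d(signal, kernel, dilation=1)   # receptive field 3
atrous = dilated_conv1d(signal, kernel, dilation=2)  # receptive field 5, same 3 weights
```

Both calls use the same three weights, but the dilated version covers five input samples per output, which is why DeepLab gets larger-context output maps cheaply.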
Mask R-CNNs: Mask R-CNNs are a leading model for instance segmentation. Mask R-CNNs
combine a region proposal network (RPN) that generates bounding boxes for each potential
instance with an FCN-based “mask head” that generates segmentation masks within each
confirmed bounding box.
Transformers: Inspired by the success of transformer models like GPT and BLOOM in natural
language processing, new models like the Vision Transformer (ViT), which use attention mechanisms
in place of convolutional layers, have matched or exceeded CNN performance for computer vision
tasks.