
Advancing Object Detection: Unveiling the Potential for Precision and Efficiency

The document discusses various techniques for object detection in images and videos. It covers traditional approaches like Haar Cascades and HOG, as well as modern deep learning-based approaches like YOLO and Faster R-CNN. Haar Cascades compare brightness between rectangular image regions and HOG builds histograms of gradient orientations, while YOLO and Faster R-CNN are neural network-based methods that detect objects with high accuracy, with YOLO designed to run in real time.

Uploaded by

Zee Foxer
Copyright
© All Rights Reserved

Object Detection

Outline
• Introduction
• Traditional approaches of Object Detection
• Haar Cascades
• HOG
• Modern approaches of Object Detection
• YOLO
• Faster R-CNN
Introduction

• Object detection is the ability to find and identify objects in
images or videos. It is a crucial skill for computers to have, as
it enables many applications that benefit from visual
understanding, such as self-driving cars, security cameras, and
medical imaging.
Cont…

• The main goal of object detection is to enable
computers to see and understand the visual world
as humans do, by finding and identifying objects
of interest in images or videos.
• Typical applications:
• Image annotation
• Vehicle counting
• Activity recognition
• Face detection
• Text detection
• Pose detection
Techniques of Object Detection

• Traditional Techniques
• Haar Cascades
• HOG
• Modern (Advanced) Techniques
• YOLO
• Faster R-CNN
Haar Cascades Technique

• A way of finding objects in images or videos
• Uses simple shapes, like rectangles, to measure the difference
in brightness between different parts of the image
• Combines multiple shapes and rules in a hierarchy to build a
classifier
How does it work?
• Slides a window over the image and partitions it into two or
more rectangular areas
• Calculates the difference in the sum of pixel intensities
between these areas
• Compares the difference with a threshold to decide if there
is an object or not
• Repeats the process with different window sizes and shapes
until the whole image is scanned
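The steps above can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not a trained detector: it uses a single two-rectangle feature (left half minus right half) and a hand-picked threshold, whereas a real cascade combines many learned features. The integral image (summed-area table) is the usual trick for making rectangle sums cheap; the function names and the toy image are made up for this example.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column at the top/left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixel intensities in the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half of a w-by-h window (w is assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


def scan(img, win_w, win_h, threshold):
    """Slide the window over the image and keep positions above the threshold."""
    ii = integral_image(img)
    hits = []
    for y in range(len(img) - win_h + 1):
        for x in range(len(img[0]) - win_w + 1):
            if two_rect_feature(ii, x, y, win_w, win_h) > threshold:
                hits.append((x, y))
    return hits


# Toy 4x6 image: a bright vertical stripe on a dark background.
image = [
    [0, 0, 9, 9, 0, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 0, 9, 9, 0, 0],
    [0, 0, 9, 9, 0, 0],
]
# The threshold of 50 is an arbitrary value chosen for this toy image.
hits = scan(image, win_w=4, win_h=4, threshold=50)
print(hits)  # window positions where the left half is much brighter than the right
```

Only the window whose left half covers the bright stripe passes the threshold, which mirrors the "compare the difference with a threshold" step.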
Advantages
• Fast and easy to use
• Can detect different kinds of objects, such as
faces, cars, animals, or text
• Works well for simple and clear objects
• Does not require complex features or deep
learning models
Limitations

• May not be very accurate or reliable
• Sensitive to noise, illumination, and occlusion
• Requires a lot of positive and negative images to train
the classifier
• May not work well for complex or tilted objects
Histogram of Oriented Gradients (HOG) Technique

• A feature descriptor used in computer vision and image
processing for object detection and image classification
• Counts occurrences of gradient orientation in localized portions
of an image
How does it work?
• Divides the image into small connected regions called cells
• Computes a histogram of gradient directions for the pixels within
each cell
• Concatenates the histograms of all cells to form the descriptor
• Optionally, normalizes the histograms by a measure of contrast in
a larger region called a block
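A simplified sketch of these steps is shown below, assuming 9 unsigned orientation bins (0 to 180 degrees), central-difference gradients on interior pixels only, and L2 normalization of one block made of two cells. Full HOG implementations also interpolate votes between neighbouring bins and use overlapping blocks, which is left out here. The function names and toy cells are invented for the example.

```python
import math


def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of gradient orientations for one cell."""
    h, w = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


def l2_normalize(vec, eps=1e-6):
    """Contrast-normalize a block descriptor to (approximately) unit length."""
    norm = math.sqrt(sum(v * v for v in vec) + eps * eps)
    return [v / norm for v in vec]


# Two toy 4x4 cells: one with a vertical edge, one with a horizontal edge.
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
horizontal_edge = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]

hist_v = cell_histogram(vertical_edge)    # all votes land in the 0-degree bin
hist_h = cell_histogram(horizontal_edge)  # all votes land in the 90-degree bin

# Concatenate the cell histograms and normalize them as one block.
descriptor = l2_normalize(hist_v + hist_h)
print(len(descriptor))
```

The two cells produce histograms that peak in different bins, which is what lets a classifier trained on the descriptor distinguish edge directions.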
Advantages
• Because it operates on local cells, it is largely robust to small
geometric and photometric changes, but not to changes in object
orientation
• Can detect different kinds of objects, such as faces, cars,
animals, or text
• Fast and easy to use, does not require complex features or
deep learning models
Limitations
• May not be very accurate or reliable
• Sensitive to noise, illumination, and occlusion
• Requires a lot of positive and negative images to train the
classifier
• May not work well for complex or tilted objects
YOLO (You Only Look Once) Technique

• YOLO is a fast and accurate approach to object detection. It uses a
single convolutional neural network that looks at the whole image
(or video frame) once and predicts all the objects in it. This differs
from other detection approaches, which examine parts of the image
many times.
How does YOLO work?

• YOLO divides the image (or video frame) into a grid of cells. For each
cell, the network predicts bounding boxes (position and size), the class
of the object, and a confidence score for each prediction. It then
discards predictions with low confidence and those that overlap heavily
with a higher-scoring prediction, a step known as non-maximum
suppression. This keeps the best prediction for each object in the image.
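The last step, dropping uncertain and overlapping guesses, can be sketched as a confidence filter followed by greedy non-maximum suppression. This is a minimal illustration, not YOLO's own code: the box format (x1, y1, x2, y2), both thresholds of 0.5, and the sample detections are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, score_threshold=0.5, iou_threshold=0.5):
    """Keep confident boxes, then suppress boxes overlapping a better one.

    detections is a list of (box, score) pairs; the result is sorted by score.
    """
    confident = [d for d in detections if d[1] >= score_threshold]
    confident.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in confident:
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((10, 10, 50, 50), 0.90),
    ((12, 12, 52, 52), 0.80),       # overlaps the first box heavily
    ((100, 100, 140, 140), 0.70),   # a separate object
    ((200, 200, 220, 220), 0.20),   # low confidence
]
kept = filter_detections(detections)
for box, score in kept:
    print(box, score)
```

In this toy input the second box is suppressed by the first and the fourth is dropped for low confidence, leaving one box per object.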
Advantages
• Fast: The original YOLO model was reported to process images at
about 45 FPS on a GPU, much faster than other object detection
systems of its time
• Accurate: It has high detection accuracy and few background
errors
• Generalizable: It can learn general representations of objects and
can detect different kinds of objects, such as faces, cars, animals,
or text
Limitations
• Low resolution: It resizes the input image to 448×448, which may lose
some details and affect the detection quality
• Poor localization: It divides the image into a 7×7 grid, which limits the
number of objects it can detect per cell and the accuracy of the bounding
boxes
• Sensitive to orientation: It is not invariant to object orientation, which
means it may fail to detect objects that are rotated or tilted
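The grid limitation can be made concrete with a little arithmetic. With a 448-pixel input and a 7-cell grid, each cell covers 64 pixels, and an object is assigned to the cell containing its centre. The sketch below assumes exactly that assignment rule; the function name and the sample coordinates are hypothetical.

```python
def responsible_cell(cx, cy, image_size=448, grid=7):
    """Return the (row, col) of the grid cell containing the point (cx, cy)."""
    cell_size = image_size / grid                     # 64.0 pixels per cell
    col = min(int(cx // cell_size), grid - 1)         # clamp the right border
    row = min(int(cy // cell_size), grid - 1)         # clamp the bottom border
    return row, col


# Two small objects whose centres are only about 22 pixels apart.
first = responsible_cell(100, 300)
second = responsible_cell(120, 310)
print(first, second, first == second)
```

Both centres fall into the same cell, so the two objects compete for that cell's limited predictions, which is why small objects in groups are hard for this design.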
Faster R-CNN

• A deep learning method for object detection that uses a Region
Proposal Network (RPN) and a Region-based Convolutional Neural
Network (R-CNN) in a single network
• The RPN generates region proposals, or possible locations of objects,
by sliding a small network over the image features
• The R-CNN takes the region proposals and the image features as input
and outputs the object class and the bounding box coordinates
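To give a feel for what the RPN slides over, the sketch below generates the reference boxes ("anchors") that an RPN scores and refines at every feature-map position. The slides do not mention anchors, so treat the details as assumptions: a feature stride of 16 pixels, three scales and three aspect ratios (9 anchors per position) are the commonly cited defaults, and the function name is invented here. The learned part of the RPN, which assigns an objectness score and box offsets to each anchor, is not shown.

```python
import math


def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) in image coordinates.

    ratio is height / width; each anchor has an area of about scale ** 2.
    """
    anchors = []
    for fy in range(feat_h):
        for fx in range(feat_w):
            # Centre of this feature-map position, mapped back to the image.
            cx = (fx + 0.5) * stride
            cy = (fy + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


anchors = generate_anchors(feat_h=2, feat_w=3)
print(len(anchors))  # positions * scales * ratios
```

Even a tiny 2×3 feature map yields dozens of candidate boxes, which is why the RPN's scoring and later filtering matter.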
How does it work?

• It takes an image as input and passes it through a base network, such
as VGG-16 or ResNet-50, to extract feature maps
• It applies the RPN on the feature maps to generate region proposals,
each with an objectness score
• It applies a Region of Interest (ROI) pooling layer on the feature maps
and the region proposals to produce fixed-length feature vectors
• It feeds the feature vectors to a fully connected layer and two output
layers: one for the object class and one for the bounding box
coordinates
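The RoI pooling step is what turns proposals of different sizes into fixed-length vectors. A minimal sketch follows, assuming a single-channel feature map stored as a list of lists, RoIs given in feature-map coordinates as (x1, y1, x2, y2) with exclusive right and bottom edges, and a 2×2 output grid; the function names and numbers are illustrative only.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool the region roi into an out_h x out_w grid."""
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Bin boundaries use floor for the start and ceil for the end,
        # so every bin contains at least one feature-map cell.
        ys = y1 + (i * roi_h) // out_h
        ye = y1 + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            xs = x1 + (j * roi_w) // out_w
            xe = x1 + ceil_div((j + 1) * roi_w, out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 3, 4, 5, 6],
    [7, 8, 9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
]
large = roi_max_pool(feature_map, (1, 0, 5, 4))   # a 4x4 region
small = roi_max_pool(feature_map, (0, 0, 3, 3))   # a 3x3 region
print(large, small)
```

Both regions come out as 2×2 grids despite their different sizes, so the fully connected layer that follows always receives an input of the same length.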
Advantages
• Fast: Sharing convolutional features between the RPN and the detector
makes it much faster than earlier R-CNN methods (the original paper
reports about 5 FPS on a GPU with VGG-16)
• Accurate: It has high detection accuracy and few background errors
• End-to-end: It can be trained with a single optimization objective
Limitations
• Complex: It requires a lot of parameters and computations, which
may cause overfitting or vanishing gradients
• Fixed: It uses a fixed number of region proposals, which may not
adapt well to different scales or shapes of objects
• Sensitive: It is not invariant to object orientation, which means it
may fail to detect objects that are rotated or tilted
Thank you!