RO47002 - Lecture 2A - Case Study Visual Object Detection
Case study:
Visual object detection
Course: RO47002
Lecturer: Julian Kooij
• Measures of success:
– Don’t miss any pedestrians (classification)
– Don’t report any pedestrians where there are none (classification)
– The bounding box should tightly fit around the pedestrian (regression)
Is this a pedestrian?
Yes / No
False positive: incorrectly classified as pedestrian
False negative: incorrectly classified as not pedestrian
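
The two error types can be counted directly from labelled examples. The following is a minimal sketch, assuming binary labels where True means "pedestrian"; the function name `count_errors` and the toy labels are illustrative and not from the lecture.

```python
def count_errors(y_true, y_pred):
    """Count false positives and false negatives for binary labels (True = pedestrian)."""
    # False positive: predicted pedestrian, but there is none.
    fp = sum(1 for t, p in zip(y_true, y_pred) if p and not t)
    # False negative: a pedestrian that was missed.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return fp, fn

# Toy ground truth and predictions (made up for illustration).
y_true = [True, True, False, False, True]
y_pred = [True, False, True, False, True]
fp, fn = count_errors(y_true, y_pred)
print("false positives:", fp, "false negatives:", fn)
```

A false negative corresponds to a missed pedestrian, and a false positive to a reported pedestrian where there is none.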
• But almost all of the possible images look like ‘noise’; only a relatively small number can occur
• Visualization by projecting images to a 3-dimensional space using Principal Component Analysis (PCA)
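
Such a projection could be computed roughly as follows. This is a sketch only, assuming each image is flattened into one row of a matrix; the helper name `pca_project`, the patch size and the random data are assumptions for illustration, not the data used in the lecture.

```python
import numpy as np

def pca_project(X, n_components=3):
    """Project the rows of X onto their first n_components principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

rng = np.random.default_rng(0)
images = rng.random((50, 16 * 8))   # 50 hypothetical 16x8 patches, flattened to 128 values
points_3d = pca_project(images)
print(points_3d.shape)              # one 3D point per image
```

Each image then becomes a single point that can be shown in a 3D scatter plot.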
Image Features
Function x = f(I) extracts features from image I and represents its content by the feature vector x = (x1, x2, …, xM).
The feature vector captures the image content.
[Figure: dense features vs. sparse features]
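
As an illustration of what a function f might look like, the sketch below computes a histogram of gradient orientations over an image. This particular choice of f, the name `extract_features`, and the bin count are assumptions made for the example; the lecture does not prescribe a specific feature here.

```python
import numpy as np

def extract_features(image, n_bins=8):
    """Hypothetical f(I): histogram of gradient orientations, weighted by gradient magnitude."""
    gy, gx = np.gradient(image.astype(float))    # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % np.pi     # unsigned orientation in [0, pi)
    hist, _ = np.histogram(orientation, bins=n_bins,
                           range=(0.0, np.pi), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist   # normalise so the entries sum to 1

rng = np.random.default_rng(0)
I = rng.random((64, 32))     # stand-in for a 64x32 grayscale image region
x = extract_features(I)
print(x.shape)               # M = 8 features instead of 64 * 32 pixel values
```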
Classifying Objects
1. Feature extraction: turn image region into a D-dimensional feature vector x
2. Classification: apply classifier h, test if h(x) is above a threshold 𝜏
h(x) ≥ 𝜏: classify as “Pedestrian”
h(x) < 𝜏: classify as “Not Pedestrian”
[Figure: feature space with axes x1 and x2]
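
The thresholding step can be sketched in a few lines. The scoring function `h` below is a made-up weighted sum used only to make the example runnable; any classifier that returns a score could take its place.

```python
def classify(x, h, tau):
    """Apply classifier h to feature vector x and compare the score against threshold tau."""
    return "Pedestrian" if h(x) >= tau else "Not Pedestrian"

# Hypothetical scoring function: a weighted sum of two features.
def h(x):
    return 0.8 * x[0] + 0.2 * x[1]

print(classify([0.9, 0.5], h, tau=0.5))   # score 0.82, above the threshold
print(classify([0.1, 0.3], h, tau=0.5))   # score 0.14, below the threshold
```

Raising 𝜏 makes the classifier more conservative about reporting a pedestrian, and lowering it makes it more permissive.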
How to obtain a useful classifier h(x)?
• Use representative training data to optimize its decision boundary
[Figure: training data in feature space (axes x1, x2) with the positive class “Pedestrian”, the negative class “Not pedestrian”, the decision boundary, and a test data sample to be classified]
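
To make the idea of fitting a decision boundary to training data concrete, here is a minimal sketch using a nearest-class-mean rule. This technique is swapped in for illustration because it is short; it is not claimed to be the method used in the lecture, and the data points and function names are invented.

```python
import numpy as np

def fit_class_means(X, y):
    """Return the mean feature vector of the negative (y == 0) and positive (y == 1) class."""
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict(x, mean_neg, mean_pos):
    """Assign x to the class with the closest mean (1 = pedestrian, 0 = not pedestrian)."""
    return int(np.linalg.norm(x - mean_pos) < np.linalg.norm(x - mean_neg))

# Toy training data: two features per sample.
X_train = np.array([[1.0, 1.0], [1.5, 0.5], [4.0, 4.0], [4.5, 3.5]])
y_train = np.array([0, 0, 1, 1])
mean_neg, mean_pos = fit_class_means(X_train, y_train)

x_test = np.array([3.8, 3.0])     # test sample to be classified
print(predict(x_test, mean_neg, mean_pos))
```

The resulting decision boundary is the set of points equally far from both class means, which in 2D is a straight line.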
• 𝒘⊤x is positive if α < 90°
• 𝒘⊤x is negative if α > 90°
• Here α is the angle between w and x, since 𝐚⊤𝐛 = ‖𝐚‖ ‖𝐛‖ cos α
• Classification rule: ŷ = sign(𝒘⊤x + b)
• Easy to compute, only dot product!
• Parameter wi weights contribution of xi
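
The rule and the angle interpretation can be checked numerically. The sketch below assumes NumPy; the weights, bias and sample points are arbitrary values chosen for illustration.

```python
import numpy as np

def linear_classify(x, w, b):
    """Classification rule y_hat = sign(w^T x + b); returns +1 or -1 (0 exactly on the boundary)."""
    return int(np.sign(w @ x + b))

def angle_deg(a, b):
    """Angle between vectors a and b, from a^T b = ||a|| ||b|| cos(alpha)."""
    cos_alpha = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos_alpha, -1.0, 1.0))))

w = np.array([1.0, 1.0])
b = -1.0
x_a = np.array([2.0, 1.0])     # w @ x_a = 3.0, so the angle to w is below 90 degrees
x_b = np.array([-1.0, 0.5])    # w @ x_b = -0.5, so the angle to w is above 90 degrees

for x in (x_a, x_b):
    print(w @ x, angle_deg(w, x), linear_classify(x, w, b))
```

The bias b shifts the decision boundary away from the origin; without it the boundary would pass through the origin, exactly where the angle to w is 90°.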
Problem 1: dimensionality
• Amount of training data needed grows exponentially with dimensionality
• 1D case, 2 samples: few decision boundaries possible
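
One common way to see the exponential growth is to split each feature axis into a fixed number of bins and ask for at least one training sample per cell. The sketch below follows that assumption; the bin count of 10 is arbitrary, and the function name is illustrative.

```python
def cells_needed(bins_per_dim, n_dims):
    """Number of cells when each of n_dims feature axes is split into bins_per_dim bins."""
    return bins_per_dim ** n_dims

# With 10 bins per axis, the number of cells to cover grows as 10^D.
for d in (1, 2, 3, 10):
    print(d, cells_needed(10, d))
```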
Problems
• More complicated to train
• More parameters to optimize
• More expensive to evaluate
Remember: many proposals to evaluate!
Apply transformation f: x' = f(x)
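
A small example may help show why a transformation can be useful. The sketch below assumes 1D data in which the positive class lies on both sides of the negative class, so no single threshold on x separates them; the mapping x' = (x, x²) and all values are hypothetical choices, not the transformation from the lecture.

```python
import numpy as np

def f(x):
    """Hypothetical transformation: map each scalar x to the 2D point (x, x^2)."""
    return np.stack([x, x ** 2], axis=-1)

x = np.array([-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0])
y = np.array([1, 1, -1, -1, -1, 1, 1])    # positives on both sides of the negatives

x_prime = f(x)
# In the transformed space a linear rule sign(w^T x' + b) is enough:
w = np.array([0.0, 1.0])
b = -1.0
y_hat = np.sign(x_prime @ w + b).astype(int)
print(np.array_equal(y_hat, y))
```

The classifier stays linear and cheap to evaluate; the extra expressiveness comes from the transformation.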
Intuition: a good feature representation
• Captures relevant object information: shape, color distribution, …
• Is not affected by irrelevant photometric and geometric changes, or by noise
• Encodes image content locally (robust to partial occlusion)
• Is efficient: feature vector dimensionality << number of image pixels
Example
Summary
• We have seen how ML can be applied to a task in Computer Vision
– Classify image patches, not images
– One image does not equal one classification problem
• Not all CV tasks necessarily require ML, but it is often required for challenging real-world conditions
• Example of linear classification
• Role of image feature extraction