Computer_vision_part1
Sources:
http://www.cs.cmu.edu/~16385/
http://cs231n.stanford.edu/2018/syllabus.html
http://www.cse.psu.edu/~rtc12/CSE486/
What is Computer vision?
“Vision is the act of knowing what is where by looking.” --Aristotle
Computer vision vs. Image processing
• Computer vision is distinct from image processing.
CV tasks (4 Rs)
1. Reconstruction
2. Registration
3. Reorganization
4. Recognition
Why study Computer Vision?
• movie post-processing
• face finding
Earth view (3D modelling)
Optical character recognition (OCR)
Face and smile detection
Who is she?
“How the Afghan Girl was Identified by Her Iris Patterns” Read the story
Sports and games
Robotics
Medical imaging
CV Challenges
How do machines see an image?
• Machines see and process everything, including images and text, as numbers. How do you convert images to numbers?
Image
• Every number represents the pixel intensity at that location. In a grayscale image, for example, each pixel holds a single value: the brightness at that location, typically from 0 (black) to 255 (white) for 8-bit images.
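To make the idea concrete, here is a minimal sketch of a grayscale image stored as a grid of numbers. It uses plain Python lists so it runs without dependencies; the 8-bit convention (0 = black, 255 = white) and the variable names are assumptions for illustration, and real code would typically hold the grid in a NumPy array.

```python
# A tiny 3x4 grayscale "image" as a nested list (rows of pixels).
# Assumed convention: 8-bit values, 0 = black, 255 = white.
gray = [
    [0, 64, 128, 255],
    [0, 64, 128, 255],
    [0, 64, 128, 255],
]

height = len(gray)      # number of rows
width = len(gray[0])    # number of columns

# Pixel access is row first (y), then column (x).
y, x = 1, 2
print(f"size: {width}x{height}, intensity at (x={x}, y={y}): {gray[y][x]}")
```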
What is an image?
• Color images have multiple values for a single pixel. Each value is the intensity of one channel: Red, Green and Blue for RGB images, for instance.
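A small sketch of the same idea for color, assuming RGB channel order and 8-bit values. The helper name `split_channels` is hypothetical, and nested lists stand in for the array type a real library would use.

```python
# A 2x2 RGB image: each pixel holds three channel intensities (R, G, B).
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def split_channels(image):
    """Return three single-channel images (R, G, B) from an RGB image."""
    return tuple(
        [[pixel[c] for pixel in row] for row in image]
        for c in range(3)
    )

red, green, blue = split_channels(rgb)
print("red channel:  ", red)
print("green channel:", green)
print("blue channel: ", blue)
```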
Challenges of recognition
Challenge: Viewpoint variation
Challenge: Illumination
What is Color?
Surface
Reflection
Color
• Color percepts are a composition of three factors (illumination, surface
reflectance, sensor response)
• We can’t easily factor the color we see in the image to infer illumination and
material (even if sensor properties are fixed and known).
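A simplified sketch of why this factoring is ambiguous. It assumes a per-channel product model (observed value = illumination × reflectance × sensor gain), which is a common simplification rather than something stated in the slides; the function name and all numbers are made up for illustration.

```python
def observed_color(illumination, reflectance, sensor_gain=(1.0, 1.0, 1.0)):
    """Per-channel product of light, surface, and sensor terms (simplified model)."""
    return tuple(l * r * g for l, r, g in zip(illumination, reflectance, sensor_gain))

# White light on a bluish surface ...
a = observed_color(illumination=(1.0, 1.0, 1.0), reflectance=(0.4, 0.4, 0.8))
# ... versus bluish light on a gray surface.
b = observed_color(illumination=(0.5, 0.5, 1.0), reflectance=(0.8, 0.8, 0.8))

print(a, b)
```

Under this model both scenes produce the same pixel values, so the image alone cannot tell the two explanations apart.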
Is The Dress Blue and Black or White and Gold?
This dress managed to simultaneously draw more than 670,000 people on BuzzFeed and convince 900,000 visitors to take a poll.
Challenge: Color Constancy
Challenge: Deformation
Challenge: Occlusion
Challenge: Variation
Distance Metric to compare images
Distance metrics on pixels
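One common way to compare two images directly on their pixels is to accumulate per-pixel differences. The sketch below shows the L1 (Manhattan) and L2 (Euclidean) distances; the slides do not say which metrics they cover, so treat this choice as an assumption. Both images are assumed to be grayscale and of equal size.

```python
import math

def l1_distance(img_a, img_b):
    """Sum of absolute pixel differences (Manhattan distance)."""
    return sum(abs(p - q)
               for row_a, row_b in zip(img_a, img_b)
               for p, q in zip(row_a, row_b))

def l2_distance(img_a, img_b):
    """Square root of the sum of squared pixel differences (Euclidean distance)."""
    return math.sqrt(sum((p - q) ** 2
                         for row_a, row_b in zip(img_a, img_b)
                         for p, q in zip(row_a, row_b)))

a = [[10, 20], [30, 40]]
b = [[12, 20], [27, 44]]
print(l1_distance(a, b))   # 2 + 0 + 3 + 4 = 9
print(l2_distance(a, b))   # sqrt(4 + 0 + 9 + 16) = sqrt(29)
```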
CV main Operations
CV pipeline
Images as functions
Input Image
• By default, OpenCV's imread function reads images in BGR (Blue-Green-Red) channel order. Passing an extra flag to imread reads the image in a different format, for example as grayscale.
Image Augmentation
Image transformations
Transformation: Warping
Forward Warping (resizing)
Backward Warping
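The difference between the two warping directions can be sketched with a simple upscaling by a factor `s`. This is an illustrative toy under stated assumptions, not the slides' implementation: function names are hypothetical, the image is a nested list, and the backward lookup simply truncates to an integer index instead of interpolating.

```python
def forward_warp_scale(src, s):
    """Send every source pixel to (round(s*x), round(s*y)); unvisited targets stay None."""
    h, w = len(src), len(src[0])
    out_h, out_w = round(h * s), round(w * s)
    dst = [[None] * out_w for _ in range(out_h)]
    for y in range(h):
        for x in range(w):
            tx, ty = round(x * s), round(y * s)
            if tx < out_w and ty < out_h:
                dst[ty][tx] = src[y][x]
    return dst

def backward_warp_scale(src, s):
    """For every target pixel, look up the source pixel at the inverse-mapped location."""
    h, w = len(src), len(src[0])
    out_h, out_w = round(h * s), round(w * s)
    dst = [[None] * out_w for _ in range(out_h)]
    for ty in range(out_h):
        for tx in range(out_w):
            x = min(w - 1, int(tx / s))   # inverse transform, truncated to an index
            y = min(h - 1, int(ty / s))
            dst[ty][tx] = src[y][x]
    return dst

src = [[1, 2], [3, 4]]
fwd = forward_warp_scale(src, 2)
bwd = backward_warp_scale(src, 2)
holes = sum(v is None for row in fwd for v in row)
print("holes after forward warp:", holes)
print("backward warp:", bwd)
```

Forward warping leaves target pixels that no source pixel lands on, while backward warping fills every target pixel by construction.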
Where do the pixels go?
p = (x, y) → p’ = (x’, y’)
Linear Transformation
• Uniform scaling by s (the origin (0,0) stays fixed)
• Rotation by θ about the origin (0,0)
Transformation with 2x2 Matrices
• What types of transformations can be represented with a 2x2 matrix?
• 2D mirror about the Y axis?
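A short sketch of three transformations that do fit in a 2x2 matrix: uniform scaling, rotation, and a mirror about the Y axis. The helper names are hypothetical, and the rotation is assumed to be counter-clockwise in a standard x-right, y-up frame.

```python
import math

def apply_2x2(m, p):
    """Multiply a 2x2 matrix (nested lists) by a point (x, y)."""
    (a, b), (c, d) = m
    x, y = p
    return (a * x + b * y, c * x + d * y)

def scaling(s):
    return [[s, 0], [0, s]]

def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

mirror_about_y = [[-1, 0], [0, 1]]   # flips the sign of x, keeps y

p = (2, 1)
print(apply_2x2(scaling(3), p))              # (6, 3)
print(apply_2x2(mirror_about_y, p))          # (-2, 1)
print(apply_2x2(rotation(math.pi / 2), p))   # approximately (-1, 2)
```

Every 2x2 matrix maps the origin to itself, which is why translation cannot be expressed this way.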
Affine Transformation
Basic transformation
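Translate:
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Scale:
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$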
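With homogeneous coordinates, translation and scaling become 3x3 matrices that can be chained by multiplication. The sketch below uses hypothetical helper names and plain nested lists, and shows that the order of composition matters.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translate(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def scale(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def apply_affine(m, p):
    """Apply a 3x3 affine matrix to a 2D point via homogeneous coordinates (x, y, 1)."""
    v = (p[0], p[1], 1)
    x, y, _ = (sum(m[i][k] * v[k] for k in range(3)) for i in range(3))
    return (x, y)

p = (1, 2)
# In a matrix product, the rightmost matrix acts on the point first.
scale_then_translate = matmul3(translate(10, 20), scale(2, 3))
translate_then_scale = matmul3(scale(2, 3), translate(10, 20))

print(apply_affine(scale_then_translate, p))   # (12, 26)
print(apply_affine(translate_then_scale, p))   # (22, 66)
```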
Projective transformation
• Called a homography (or planar perspective map)
• Projective transformations …
– Affine transformations, and
– Projective warps
$$\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$$
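Applying a homography to a point means multiplying by the 3x3 matrix and then dividing by the resulting w'. A minimal sketch, with a hypothetical function name and made-up matrices; the input point is assumed to have w = 1.

```python
def apply_homography(h, p):
    """Map a 2D point through a 3x3 homography and divide by w'."""
    x, y = p
    xp = h[0][0] * x + h[0][1] * y + h[0][2]
    yp = h[1][0] * x + h[1][1] * y + h[1][2]
    wp = h[2][0] * x + h[2][1] * y + h[2][2]
    if wp == 0:
        raise ValueError("point maps to infinity (w' = 0)")
    return (xp / wp, yp / wp)

# Last row (0, 0, 1): an affine transformation, so w' is always 1.
affine = [[2, 0, 1], [0, 2, 1], [0, 0, 1]]
# Non-trivial last row: a projective warp, so w' depends on the point.
projective = [[1, 0, 0], [0, 1, 0], [0.5, 0, 1]]

print(apply_homography(affine, (2, 2)))       # (5.0, 5.0)
print(apply_homography(projective, (2, 2)))   # (1.0, 1.0)
print(apply_homography(projective, (0, 2)))   # (0.0, 2.0)
```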
Bilinear interpolation
• The output pixel value is a weighted average of the pixels in the nearest 2-by-2 neighborhood.
Linear Interpolation (recall)
Bilinear Interpolation
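Bilinear interpolation can be written as three linear interpolations: two along x, then one along y. A minimal sketch, assuming a grayscale image stored as nested lists and a sample position with 0 <= x <= width - 1 and 0 <= y <= height - 1; the function names are illustrative.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return (1 - t) * a + t * b

def bilinear(img, x, y):
    """Sample img at a fractional position (x, y) from its 2x2 neighborhood."""
    h, w = len(img), len(img[0])
    x0, y0 = int(x), int(y)                        # top-left neighbor
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    tx, ty = x - x0, y - y0                        # fractional offsets
    top = lerp(img[y0][x0], img[y0][x1], tx)       # interpolate along x (top row)
    bottom = lerp(img[y1][x0], img[y1][x1], tx)    # interpolate along x (bottom row)
    return lerp(top, bottom, ty)                   # interpolate along y

img = [[10, 20],
       [30, 40]]
print(bilinear(img, 0.5, 0.5))   # center of the four pixels: 25.0
print(bilinear(img, 0.25, 0))    # a quarter of the way from 10 to 20: 12.5
```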
Image resizing
• Machine learning models typically expect a fixed-size input, and computer vision models are no exception: the images used to train a model must all be the same size.
Resizing with OpenCV
Rotate
Rotation with OpenCV
Shifting
Image filtering
1D
2D
Point processing
(How to implement them?)
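A point operation computes each output pixel from the corresponding input pixel alone, so one possible implementation is a function applied independently to every value. This is a sketch under assumptions: 8-bit grayscale values, a hypothetical helper `point_process`, and three example operations chosen for illustration.

```python
def point_process(img, fn):
    """Apply fn independently to every pixel and clip the result to 0..255."""
    return [[max(0, min(255, round(fn(v)))) for v in row] for row in img]

img = [[0, 100],
       [200, 255]]

inverted = point_process(img, lambda v: 255 - v)          # negative image
brighter = point_process(img, lambda v: v + 80)           # add a constant
higher_contrast = point_process(img, lambda v: 1.5 * v)   # multiply by a gain

print(inverted)          # [[255, 155], [55, 0]]
print(brighter)          # [[80, 180], [255, 255]]
print(higher_contrast)   # [[0, 150], [255, 255]]
```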
Filtering
Enhancing Examples
2D discrete-space systems (filters)
Filter example: Moving average
• Also known as a box filter
• The array of weights is called the mask or weight kernel
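A minimal sketch of a moving average (box) filter on a grayscale image stored as nested lists. Leaving border pixels at 0 where the window does not fit is an arbitrary choice made here for brevity; other border policies (padding, cropping) are equally valid.

```python
def box_filter(img, size=3):
    """Replace each interior pixel by the mean of its size x size neighborhood."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(r, h - r):
        for x in range(r, w - r):
            window = [img[y + dy][x + dx]
                      for dy in range(-r, r + 1)
                      for dx in range(-r, r + 1)]
            out[y][x] = sum(window) / len(window)   # every weight is 1 / size^2
    return out

img = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 90, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
smoothed = box_filter(img)
for row in smoothed:
    print(row)
```

The single bright pixel of value 90 is spread over its 3x3 neighborhood as nine values of 10.0, which is the blurring effect of the filter.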
Filter example: Image Segmentation
• Non-contextual: grouping pixels with similar global features.
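Intensity thresholding is a simple example of non-contextual segmentation: each pixel is labeled from its own value only, ignoring its neighbors. The slides do not name a specific method, so thresholding, the function name, and the threshold value are assumptions for illustration.

```python
def threshold_segment(img, t):
    """Label each pixel 1 (foreground) if its intensity is at least t, else 0."""
    return [[1 if v >= t else 0 for v in row] for row in img]

img = [
    [12, 200, 30],
    [180, 25, 220],
]
labels = threshold_segment(img, t=128)
print(labels)   # [[0, 1, 0], [1, 0, 1]]
```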
Filtering properties
Shift invariant
• A linear filter replaces each pixel by a linear combination of its neighbors (and possibly itself). The combination is determined by the filter’s kernel.
• Shift invariant means the same kernel is applied at every pixel location, so shifting the input shifts the output by the same amount.
• Is the moving average system shift invariant?
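The question can be checked numerically on a 1D signal: filter then shift should give the same result as shift then filter. This is a sketch, not a proof. It assumes zero padding at the borders and a signal that stays away from the edges, and the helper names are hypothetical.

```python
def moving_average(signal, size=3):
    """Centered moving average with zero padding at both ends."""
    r = size // 2
    padded = [0] * r + list(signal) + [0] * r
    return [sum(padded[i:i + size]) / size for i in range(len(signal))]

def shift_right(signal, k):
    """Shift by k samples, filling with zeros; samples pushed past the end are dropped."""
    return [0] * k + list(signal[:len(signal) - k])

signal = [0, 0, 3, 6, 3, 0, 0, 0, 0]

filter_then_shift = shift_right(moving_average(signal), 2)
shift_then_filter = moving_average(shift_right(signal, 2))

print(filter_then_shift)
print(shift_then_filter)
print("shift invariant on this example:", filter_then_shift == shift_then_filter)
```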
Linear filtering
Convolution
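As a reference point, here is a minimal sketch of discrete 1D convolution in its "full" form, where the output has len(signal) + len(kernel) - 1 samples. The slide gives only the title, so the 1D setting and the function name are assumptions; the 2D case applies the same idea along both axes.

```python
def convolve1d(signal, kernel):
    """Full discrete convolution: out[n] = sum over k of signal[k] * kernel[n - k]."""
    out = [0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

print(convolve1d([1, 2, 3], [0, 1, 0.5]))   # [0, 1, 2.5, 4, 1.5]

# Convolving a unit impulse with a kernel returns the kernel itself.
print(convolve1d([1], [1, 2, 3]))           # [1, 2, 3]
```

Convolution is symmetric in its two arguments, so swapping signal and kernel gives the same output.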