Computer_vision_part1

Temp

Uploaded by

Sơn Nguyễn Kim

Computer Vision

IFT6758 - Data Science

Sources:
http://www.cs.cmu.edu/~16385/
http://cs231n.stanford.edu/2018/syllabus.html
http://www.cse.psu.edu/~rtc12/CSE486/

What is Computer vision?
“Vision is the act of knowing what is where by looking.” -- Aristotle

• Computer vision is a field of study focused on the problem of helping computers to see.

!2
Computer vision vs. Image processing

• Computer vision is distinct from image processing.

• Image processing is the process of creating a new image from an existing image, typically simplifying or enhancing the content in some way.

• Computer vision is concerned with understanding the content of an image.

!3
CV tasks (4 Rs)

1. Reconstruction: Multiview Geometry, 3D Vision, Shape-from-X

2. Registration: Tracking, Alignment, Optical Flow, Correspondence

3. Reorganization: Clustering, Unsupervised Learning, Segmentation, Perceptual Organization

4. Recognition: Verification, Identification, Detection

!8
Why study Computer Vision?

• Images and movies are everywhere

• Fast-growing collection of useful applications
– building representations of the 3D world from pictures
– automated surveillance (who’s doing what)
– movie post-processing
– face finding

• Greater understanding of human vision

!9
Earth view (3D modelling)

Image from Microsoft’s Virtual Earth

(see also: Google Earth)

!10
Optical character recognition (OCR)

Technology to convert scanned docs to text

• If you have a scanner, it probably came with OCR software

Digit recognition, AT&T labs: http://www.research.att.com/~yann/

License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

!11
Face and smile detection

Many new digital cameras now detect faces

• Canon, Sony, Fuji, …

Who is she?

Sony Cyber-shot® T70 Digital Still Camera

!12
Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns”

Fingerprint scanners on many new laptops, other devices

Face recognition systems now beginning to appear more widely: http://www.sensiblevision.com/
!13
Object recognition

This is becoming real:

• Microsoft Research
• Point & Find, Nokia
• LaneHawk by EvolutionRobotics: “A smart camera is flush-mounted in the checkout lane, continuously watching for items…”

!14
Sports and games

Digimask: put your face on a 3D avatar.

Nintendo Wii has camera-based IR tracking built in.

Sportvision first down line. Nice explanation on www.howstuffworks.com

!15
Robotics

NASA’s Mars Spirit Rover
http://en.wikipedia.org/wiki/Spirit_rover

http://www.robocup.org/

!16
Medical imaging

3D imaging: MRI, CT

Image guided surgery: Grimson et al., MIT

!17
CV Challenges

!18
How do machines see an image?

• Machines see and process everything using numbers, including images and text. How do you convert images to numbers?

!19
Image
• Every number represents the pixel intensity at that particular location. For example, in a grayscale image every pixel contains only one value: the brightness at that location, typically from 0 (black) to 255 (white) for 8-bit images.
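
A minimal sketch of this idea, assuming NumPy and an 8-bit image (the array values are made up for illustration): a grayscale image is just a 2D array with one intensity per pixel.

```python
import numpy as np

# A tiny 4x4 grayscale "image": one 8-bit intensity per pixel.
# 0 is black, 255 is white, values in between are shades of gray.
gray = np.array(
    [
        [0, 64, 128, 255],
        [0, 64, 128, 255],
        [0, 64, 128, 255],
        [0, 64, 128, 255],
    ],
    dtype=np.uint8,
)

print(gray.shape)   # (rows, columns)
print(gray[0, 3])   # intensity at row 0, column 3
```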

!20
What is an image?
• Color images will have multiple values for a single pixel. These values represent the
intensity of respective channels – Red, Green and Blue channels for RGB images,
for instance.
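
As a rough illustration (NumPy assumed, pixel values made up), a color image adds a third axis that holds one value per channel:

```python
import numpy as np

# A 2x2 RGB image: the last axis holds the (R, G, B) values of each pixel.
rgb = np.array(
    [
        [[255, 0, 0], [0, 255, 0]],      # red, green
        [[0, 0, 255], [255, 255, 255]],  # blue, white
    ],
    dtype=np.uint8,
)

red_channel = rgb[:, :, 0]  # 2x2 array holding only the red intensities
print(rgb.shape)            # (height, width, channels)
print(red_channel)
```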

!21
Challenges of recognition

!22
Challenge: Viewpoint variation

!23
Challenge: Viewpoint variation

!24
Challenge: Illumination

Object appearance changes with respect to lighting magnitude and direction.

!25
What is Color?

[Figure: illumination, surface reflection, sensor response]

!26
Color
• Color percepts are a composition of three factors (illumination, surface
reflectance, sensor response)

• We can’t easily factor the color we see in the image to infer illumination and
material (even if sensor properties are fixed and known).

!27
Is The Dress Blue and Black or White and Gold?

This dress managed to draw more than 670,000 simultaneous viewers on Buzzfeed and to convince 900,000 visitors to take a poll.

!28
Challenge: Color Constancy

!29
Challenge: Deformation

!30
Challenge: Occlusion

!31
Challenge: Variation

!32
Distance Metric to compare images

!33
Distance metrics on pixels
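
The slide does not say which metrics are meant. As a hedged sketch, assuming the two most common pixel-wise choices (L1 and L2 distance) and NumPy, with hypothetical helper names:

```python
import numpy as np

def l1_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Sum of absolute pixel differences (Manhattan distance)."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.abs(diff).sum())

def l2_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Square root of the sum of squared pixel differences (Euclidean distance)."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sqrt((diff ** 2).sum()))

img_a = np.array([[10, 20], [30, 40]], dtype=np.uint8)
img_b = np.array([[12, 20], [27, 44]], dtype=np.uint8)

print(l1_distance(img_a, img_b))
print(l2_distance(img_a, img_b))
```

Both functions require the two images to have the same shape, which is one reason images are resized to a common size before comparison.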

!34
CV main Operations

!35
CV pipeline

!36
CV main Operations

!37
CV main Operations

!38
CV main Operations

!39
CV Pipeline

!40
Images as functions

!41
Images as functions

!42
Input Image

• By default, the imread function reads images in the BGR (Blue-Green-Red) format. We can read images in different formats using extra flags in the imread function:

– cv2.IMREAD_COLOR: Default flag for loading a color image
– cv2.IMREAD_GRAYSCALE: Loads images in grayscale format
– cv2.IMREAD_UNCHANGED: Loads images in their given format, including the alpha channel. The alpha channel stores the transparency information: the higher the alpha value, the more opaque the pixel.

!43
CV Pipeline

!44
Image Augmentation

• Data augmentation uses the available data samples to produce new ones by applying image operations such as rotation, scaling, and translation. This makes our model more robust to changes in the input and leads to better generalization.
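
A minimal NumPy-only sketch of the idea. The helper name augment and the particular operations are assumptions chosen for illustration; real pipelines usually apply random rotations, scales, and shifts.

```python
import numpy as np

def augment(image: np.ndarray) -> list:
    """Return a few simple variants of `image` (hypothetical helper)."""
    flipped = image[:, ::-1]                    # horizontal mirror
    rotated = np.rot90(image)                   # 90 degrees counter-clockwise
    shifted = np.roll(image, shift=1, axis=1)   # shift right by one pixel (wraps around)
    return [flipped, rotated, shifted]

image = np.arange(12, dtype=np.uint8).reshape(3, 4)
variants = augment(image)
for variant in variants:
    print(variant.shape)
```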

!45
Image transformations

!46
Image transformations

!47
Image transformations

!48
Transformation: Warping

!49
Forward Warping

!50
Forward Warping
(resizing)

!51
Backward Warping

!52
Backward Warping

!53
Where do the pixels go?

p = (x,y) p’ = (x’,y’)

• Transformation T is a coordinate-changing machine:

p’ = T(p)
• What does it mean that T is global?
– It is the same for any point p
– It can be described by just a few numbers (parameters)
• Let’s consider linear transformations (which can be represented by a 2×2 matrix M):

p’ = M p

!54
Linear Transformation

• Uniform scaling by s:


What is the inverse?
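
For reference, a worked version of the scaling question. The symbol S is introduced here for illustration, and s is assumed to be non-zero. Uniform scaling multiplies both coordinates by s, so the inverse scales by 1/s:

```latex
\[
S = \begin{bmatrix} s & 0 \\ 0 & s \end{bmatrix},
\qquad
S^{-1} = \begin{bmatrix} 1/s & 0 \\ 0 & 1/s \end{bmatrix},
\qquad
S^{-1} S = I .
\]
```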

!55
Linear Transformation

• Rotation by angle θ (about the origin)


What is the inverse?

For rotations:
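
The equation for this slide did not survive extraction. As a sketch of the standard result (the symbol R is an assumption), the inverse of a rotation by θ is a rotation by −θ, which equals the transpose:

```latex
\[
R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix},
\qquad
R(\theta)^{-1} = R(-\theta)
= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
= R(\theta)^{\mathsf{T}} .
\]
```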

!56
Transformation with 2x2 Matrices
• What types of transformations can be represented with a
2x2 matrix?
2D mirror about Y axis?

2D mirror across line y = x?
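
Both mirrors can be written as 2×2 matrices. A short NumPy sketch, with variable names chosen for illustration:

```python
import numpy as np

# Mirror about the Y axis: (x, y) -> (-x, y)
mirror_about_y_axis = np.array([[-1, 0],
                                [0, 1]])

# Mirror across the line y = x: (x, y) -> (y, x)
mirror_across_y_equals_x = np.array([[0, 1],
                                     [1, 0]])

p = np.array([2, 5])
print(mirror_about_y_axis @ p)
print(mirror_across_y_equals_x @ p)
```

A translation, by contrast, moves the origin, so it is not a linear map and cannot be written as a 2×2 matrix.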

!57
Affine Transformation

• Affine transformations are combinations of …
– Linear transformations, and
– Translations

\[
\begin{bmatrix} x' \\ y' \\ w \end{bmatrix}
=
\begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ w \end{bmatrix}
\]

• Properties of affine transformations:
– Origin does not necessarily map to origin
– Lines map to lines
– Parallel lines remain parallel
– Ratios of lengths along a line are preserved
– Closed under composition

!58
Affine Transformation

Any transformation whose matrix has last row [ 0 0 1 ] is called an affine transformation.

!59
Basic transformation

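Translate:
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]

Scale:
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]

2D in-plane rotation:
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]

Shear:
\[
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
=
\begin{bmatrix} 1 & sh_x & 0 \\ sh_y & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]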
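
A sketch of how these 3×3 matrices can be built and composed with NumPy. The function names are hypothetical helpers, not part of any library.

```python
import numpy as np

def translate(tx: float, ty: float) -> np.ndarray:
    return np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]], dtype=float)

def scale(sx: float, sy: float) -> np.ndarray:
    return np.array([[sx, 0, 0], [0, sy, 0], [0, 0, 1]], dtype=float)

def rotate(theta: float) -> np.ndarray:
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)

def shear(shx: float, shy: float) -> np.ndarray:
    return np.array([[1, shx, 0], [shy, 1, 0], [0, 0, 1]], dtype=float)

# Matrices compose right to left: rotate by 90 degrees about the origin first,
# then translate by (10, 0).
M = translate(10, 0) @ rotate(np.pi / 2)

p = np.array([1.0, 0.0, 1.0])   # the point (1, 0) in homogeneous coordinates
p_new = M @ p
print(p_new)
```

The composed matrix still has last row [0 0 1], which illustrates that affine transformations are closed under composition.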

!60
Projective transformation

What happens when we change the last row [ 0 0 1 ] of an affine transformation?

!61
Projective transformation

Called a homography
(or planar perspective map)

!62
Projective transformation

• Projective transformations …
– Affine transformations, and
– Projective warps

\[
\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix}
=
\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}
\begin{bmatrix} x \\ y \\ w \end{bmatrix}
\]

• Properties of projective transformations:

– Origin does not necessarily map to origin
– Lines map to lines
– Parallel lines do not necessarily remain parallel
– Ratios are not preserved
– Closed under composition
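
A hedged NumPy sketch of applying a homography. The helper name apply_homography and the matrix values are made up for illustration, and w’ is assumed to be non-zero for every point.

```python
import numpy as np

def apply_homography(H: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Map an (N, 2) array of points through the 3x3 matrix H."""
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.hstack([points, ones])   # rows are (x, y, 1)
    mapped = homogeneous @ H.T                # rows are (x', y', w')
    return mapped[:, :2] / mapped[:, 2:3]     # divide by w' to return to 2D

# Arbitrary example values: only the last row differs from the identity.
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.5, 0.0, 1.0]])

# End points of two parallel horizontal segments (y = 0 and y = 1).
points = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0], [2.0, 1.0]])
mapped = apply_homography(H, points)
print(mapped)
```

After the mapping, the first segment has direction (1, 0) and the second has direction (1, -0.5), so the two segments are no longer parallel.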
!63
Backward Warping

!64
Bilinear interpolation

Bilinear interpolation: the output pixel value is a weighted average of the pixels in the nearest 2-by-2 neighborhood.
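
A minimal sketch of that weighted average for a grayscale image stored as a NumPy array. The function name bilinear_sample is hypothetical, and the sample location is assumed to lie inside an image of at least 2×2 pixels.

```python
import numpy as np

def bilinear_sample(image: np.ndarray, x: float, y: float) -> float:
    """Sample a grayscale image at the real-valued location (x, y).

    x indexes columns and y indexes rows; 0 <= x <= width - 1 and
    0 <= y <= height - 1 are assumed.
    """
    h, w = image.shape
    # Top-left corner of the 2x2 neighborhood, clamped so x0 + 1 and y0 + 1 stay valid.
    x0 = min(int(np.floor(x)), w - 2)
    y0 = min(int(np.floor(y)), h - 2)
    x1, y1 = x0 + 1, y0 + 1
    dx, dy = x - x0, y - y0
    # Interpolate along x on the two rows, then along y between the results.
    top = (1 - dx) * image[y0, x0] + dx * image[y0, x1]
    bottom = (1 - dx) * image[y1, x0] + dx * image[y1, x1]
    return float((1 - dy) * top + dy * bottom)

image = np.array([[0.0, 10.0],
                  [20.0, 30.0]])
print(bilinear_sample(image, 0.5, 0.5))
```

In backward warping, each output pixel is mapped back into the source image, and a sampler like this one reads the value at the resulting non-integer location.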

!65
Linear Interpolation
(recall)

!66
Linear Interpolation
(recall)

!67
Bilinear Interpolation

!68
Image resizing

• Most machine learning models expect a fixed-size input, and computer vision models are no exception: the images we use for training our model must generally be of the same size.

• Images can be easily scaled up and down

• Different interpolation and downsampling methods are supported by OpenCV. OpenCV’s resize function uses bilinear interpolation by default.

!69
Resizing with OpenCV

!70
Rotate

• Suppose we are building an image classification model for identifying the animal present in an image. Both of the images shown below should be classified as ‘dog’:

!71
Rotation with OpenCV

!72
Shifting

!73
CV Pipeline

!74
Image transformations

!75
Image filtering

!76
Point processing

!77
Point processing
(How to implement them?)
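
The slide's examples were images and are not in the text. As a hedged sketch, a few typical point operations (each output pixel depends only on the input pixel at the same location) can be written with NumPy as follows; the choice of operations and constants is an assumption.

```python
import numpy as np

image = np.array([[0, 50], [128, 255]], dtype=np.uint8)

# Work in a wider integer type, then clip back to [0, 255] so results cannot wrap around.
pixels = image.astype(np.int16)

darker = np.clip(pixels - 128, 0, 255).astype(np.uint8)
brighter = np.clip(pixels + 128, 0, 255).astype(np.uint8)
lower_contrast = (pixels // 2).astype(np.uint8)
inverted = (255 - pixels).astype(np.uint8)

print(darker, brighter, lower_contrast, inverted, sep="\n")
```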

!78
Filtering

!79
Enhancing Examples

!80
Image filtering

!81
2D discrete-space systems
(filters)

!82
Filter example:
Moving average
• Also known as a box filter

• Its weights form the mask (or weight kernel)

• 2D moving average over a 3×3 neighborhood window
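
A minimal NumPy sketch of the 3×3 moving average. The function name is hypothetical, and replicating the edge pixels at the border is one assumed choice among several.

```python
import numpy as np

def box_filter_3x3(image: np.ndarray) -> np.ndarray:
    """3x3 moving average; border pixels are handled by replicating the edges."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the nine shifted copies of the image, then divide by 9.
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

image = np.zeros((5, 5))
image[2, 2] = 90.0   # a single bright pixel
smoothed = box_filter_3x3(image)
print(smoothed)
```

The single bright pixel is spread evenly over its 3×3 neighborhood, which is the smoothing effect the following slides illustrate.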

!83
Filter example:
Moving average

!84
Filter example:
Moving average

!85
Filter example:
Moving average

!86
Filter example:
Moving average

!87
Filter example:
Moving average

Achieve smoothing effect (remove sharp features)

!88
Filter example:
Image Segmentation

!89
Filter example:
Image Segmentation
• Non-contextual: grouping pixels with similar global features.

• Contextual: grouping pixels with similar features and in close locations.

!90
Filtering properties

!91
Shift invariant
• Filter replaces each pixel by a linear combination of its
neighbors (and possibly itself). The combination is
determined by the filter’s kernel.

!92
Shift invariant
Is the moving average system shift invariant?

!93
Shift invariant
Is the moving average system shift invariant?

!94
Linear filtering

Linear filtering means linear combination of neighboring pixel values.

• Is the moving average system a linear system?

• Is thresholding a linear system?
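
Both questions can be probed numerically. The sketch below uses 1D signals for brevity and hypothetical helper names; it checks the superposition property S(a f + b g) = a S(f) + b S(g) on one pair of inputs.

```python
import numpy as np

def moving_average(signal: np.ndarray) -> np.ndarray:
    """1D moving average over 3 samples (zero padding at the ends)."""
    return np.convolve(signal, np.ones(3) / 3.0, mode="same")

def threshold(signal: np.ndarray, t: float = 0.5) -> np.ndarray:
    """Output 1 where the signal exceeds t, else 0."""
    return (signal > t).astype(float)

def passes_superposition(system, f, g, a=2.0, b=-3.0) -> bool:
    """Check S(a f + b g) == a S(f) + b S(g) for one pair of inputs."""
    return bool(np.allclose(system(a * f + b * g), a * system(f) + b * system(g)))

f = np.array([0.0, 1.0, 0.0, 0.4, 0.9])
g = np.array([0.3, 0.3, 0.8, 0.1, 0.0])

print(passes_superposition(moving_average, f, g))
print(passes_superposition(threshold, f, g))
```

A single passing check does not prove linearity, but a single failure is enough to show that thresholding is not linear.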

!95
Convolution

• Any linear, shift-invariant operator can be represented as a convolution!
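
For reference, the usual definition of 2D discrete convolution. The symbols f (input image), h (kernel), and g (output) are assumptions, since the slide's own formula is not in the text:

```latex
\[
g[m, n] = (f * h)[m, n] = \sum_{k} \sum_{l} f[k, l] \, h[m - k, \, n - l]
\]
```

For the 3×3 moving average, h is 1/9 on the 3×3 window centered at the origin and 0 elsewhere.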

!96
