Lecture 01

Uploaded by

jinyaoz

Computer Vision

CS-E4850, 5 study credits

Juho Kannala
Aalto University

Plan for today

• Background
• What is computer vision?
• Why study computer vision?
• Overview of the course
• Lecture 1: Image formation

Credits: Material for slides borrowed from Victor Prisacariu, Andrew Zisserman, Esa Rahtu, James Hays, Derek Hoiem, Svetlana Lazebnik, Steve Seitz, David Forsyth, and others

Course personnel

• Lecturer: Juho Kannala, [email protected]
• Main course assistant: Xiaotian Li, [email protected]

A few words about me

Juho Kannala
Assistant Professor of Computer Vision
• PhD, University of Oulu 2010
• Professor at Aalto since 2016
• Working with computer vision since 2000
• Recent projects and other info available on my homepage: https://users.aalto.fi/~kannalj1/

Motivation - what is computer vision?

Make computers understand images

• What kind of scene?
• Where are the cars?
• How far are the buildings?
• Where are the cars going?
• …

Many data modalities

• 2D or 3D still images
• Video frames
• X-ray
• Ultrasound
• Microscope
• …

What kind of information can be extracted?

• Semantic information
• Geometric information

What do we have here?

… seems pretty easy…

Wrong! Very hard big data problem…

• Hardware perspective:
• RGB stereo images at 30 frames per second -> 100s of MB/s data stream.
• Non-trivial processing for each byte.
• Massive image collections.

• Mathematical perspective:
• Information is highly implicit or lost by perspective projection
• 2D -> 3D mapping is ill-posed and ill-conditioned -> need to use constraints
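
A back-of-the-envelope sketch of where a figure like "100s of MB/s" can come from. The slide only specifies RGB stereo at 30 frames per second; the 1920x1080 resolution and 3 bytes per pixel are assumptions made here for illustration.

```python
def raw_stream_rate(width, height, bytes_per_pixel=3, cameras=2, fps=30):
    """Uncompressed data rate in bytes per second.

    All default values are illustrative assumptions: 3 bytes per RGB pixel,
    2 cameras (stereo), 30 frames per second.
    """
    return width * height * bytes_per_pixel * cameras * fps


# Assumed Full HD resolution; not stated on the slide.
rate = raw_stream_rate(1920, 1080)
print(f"{rate / 1e6:.0f} MB/s")  # 373 MB/s
```

Even before any processing, the raw stream is already in the hundreds of megabytes per second under these assumptions.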

Wrong! Very hard big data problem…

• Artificial intelligence perspective:
• Images have uneven information content
• Computational visual semantics is hard (what does visual stuff mean exactly?)
• If we have limited time, what is the important visual stuff right now?

Still a massive challenge - if we want genuine autonomy.

Natural vision

• Humans see effortlessly, but… it is very hard work for our brains!
• There are billions of neurons in the human brain
• Years of evolution generated hardwired priors.

So why bother?
What are the advantages?

Why does computer vision matter?

• Engineering point of view - computer vision helps to solve many practical problems: business potential
• Scientific point of view - a human-like visual system is one of the grand challenges of Artificial Intelligence (AI)
• AI itself is a grand challenge of computing

Why does computer vision matter?

• Safety
• Health
• Security
• Fun
• Access
• …

Computer vision is already here

• You are surrounded by devices using computer vision
• Imagine what can be done with already installed cameras!

Motivation - Success stories

Recognizing “simple” patterns
Face recognition
Object detection and recognition

Reconstruction: 3D from photo collections

The Visual Turing Test for Scene Reconstruction, Shan, Adams, Curless, Furukawa, Seitz, in 3DV 2013. YouTube video.

A recent commercial 3D reconstruction system

YouTube

Robotics

• NASA’s Mars Rover: see “Computer Vision on Mars”
• Robocup: see www.robocup.org
• STAIRS at Stanford: Saxena et al. 2008

Self-driving cars (Nvidia @ CES 2016)
Visual odometry and SLAM
Augmented Reality (AR) and Virtual Reality (VR)

Image generation

A style-based generator architecture for generative adversarial networks. Karras, Laine, Aila. CVPR 2019.

Current state of affairs

• Many of the previous examples are less than 5 years old!
• Many new applications to appear in the next 5 years
• Strong open source culture
• Many recent state-of-the-art methods are freely available
• See papers from top conferences like CVPR, ECCV, ICCV, and NeurIPS

Rapidly growing area

[Figure: attendees and submissions to the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) by year; the value 5160 is labelled for 2019]

Rapidly growing area

Ref. Google Scholar top publications.

Rapidly growing area - substantial commercial interest

CVPR 2018 sponsors

Plenty of job opportunities

• Companies are looking for computer vision and deep learning experts.
• Big Internet players are investing heavily (Apple, Google, Facebook, Microsoft, Baidu, Tencent, …), as is the car industry (Tesla, BMW, …)
• Strong imaging ecosystem also in Finland

Specifics of this course

Course textbooks

• Szeliski: Computer Vision
• Full copy freely available
• Hartley & Zisserman: Multiple View Geometry in Computer Vision
• Available as an e-book via library
• Forsyth & Ponce: Computer Vision
• Full copy freely available

What will you learn on this course?

• Course content (numbers refer to chapters in Szeliski’s book, 1st edition):
• Image formation and processing (2, 3)
• Feature detection and matching (4)
• Feature based alignment and image stitching (6, 9)
• Optical flow and tracking (8)
• Basics of image classification and convolutional neural networks
• Object recognition and detection (14)
• Structure from motion, stereo and 3D reconstruction (7, 11, 12)

What will you NOT learn on this course?

• Software packages
• PyTorch, TensorFlow, Keras, Caffe, etc.
• We have simple exercises with Python/Matlab though

• In-depth deep learning
• Tweaking architectures, loss functions, etc.
• Note that there exists a separate deep learning course (CS-E4890)

• All the bells and whistles in the state-of-the-art systems
• We concentrate on the basic concepts (get them right and the rest is easier for you)

Organization

• Lectures on Mondays at 8-10 (12 lectures)
• Exercises on Fridays at 12-14 (12 sessions)
• The solutions of weekly homework assignments should be returned before the session
• The solutions are presented in the session
• Guidance available if needed
• Slack and guidance sessions on Thursdays (see MyCourses)
• Presence is not rewarded, only returned homework and the exam count

Requirements

• Get more than 0 points from at least 8 exercise rounds (i.e. solve at least 1 task from 8 different weekly rounds)
• Pass the exam

Hints

• Doing homework takes time but is often a good way to learn in depth
• Try to do more than the minimum - homework points are taken into account in the grading (i.e. weighted exercise points are added to exam points)
• Note that the amount of work and bonus points varies a bit between weeks - exercises are published early so that you can do them in advance if needed

Questions at this point?

Lecture 1: Camera model

Relevant reading

• Chapters 2, 3, and 6 in [Hartley & Zisserman]
• Comprehensive presentation of the core content
• Chapter 2 in [Szeliski]
• Broader overview of image formation

This is (a picture of) a cat

Credits: Victor Prisacariu

Cat lives in a 3D world

The point X in world space projects to the point x in image space.

Credits: Victor Prisacariu

Going from X in 3D to x in 2D

The output would be blurry if the film were simply exposed to the cat.

Pinhole camera

All rays pass through a single point (the center of projection)

What happens in the projection?

• Projection from 3D to 2D -> information is lost

• What properties are preserved?
• Straight lines
• Incidence

• What properties are not preserved?
• Angles
• Lengths
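
The properties listed above can be checked numerically. The following is a minimal sketch, assuming an ideal pinhole camera at the origin looking along the Z axis with focal length f = 1; the function name `project_pinhole` and all point coordinates are made up for illustration.

```python
import numpy as np


def project_pinhole(X, f=1.0):
    """Project 3D points (N x 3, camera frame, Z > 0) to 2D via x = f * X / Z."""
    X = np.asarray(X, dtype=float)
    return f * X[:, :2] / X[:, 2:3]


# Two segments of equal 3D length (1 unit), one at depth 2 and one at depth 4.
near = project_pinhole([[0, 0, 2], [1, 0, 2]])
far = project_pinhole([[0, 0, 4], [1, 0, 4]])
len_near = np.linalg.norm(near[1] - near[0])  # 0.5
len_far = np.linalg.norm(far[1] - far[0])     # 0.25

# Three collinear 3D points: (1, 0, 2) + s * (1, 2, 1) for s = 0, 1, 2.
a, b, c = project_pinhole([[1, 0, 2], [2, 2, 3], [3, 4, 4]])
# The 2D cross product of (b - a) and (c - a) vanishes for collinear points.
area = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
```

The two image lengths differ although the 3D lengths are equal, while `area` stays at zero (up to rounding), so the projected points remain on a straight line.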

Projective geometry - what is lost?

Length is not preserved
Angles are not preserved
Straight lines are still straight

Vanishing points and lines

• Parallel lines in the world intersect in the image at a “vanishing point”

Constructing the vanishing point of a line

Vanishing points and lines

All lines with the same 3D direction have the same vanishing point.
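
A small numerical sketch of the shared vanishing point, assuming an ideal pinhole camera with f = 1. It uses the cross-product construction for lines and intersections that is covered later in this lecture; the helper names and all numeric values are illustrative assumptions.

```python
import numpy as np


def project(X, f=1.0):
    """Pinhole projection of one 3D point to homogeneous image coordinates (x, y, 1)."""
    X = np.asarray(X, dtype=float)
    return np.array([f * X[0] / X[2], f * X[1] / X[2], 1.0])


def image_line(P0, d, f=1.0):
    """Image of the 3D line P0 + s * d, built from two projected points on it."""
    P0, d = np.asarray(P0, dtype=float), np.asarray(d, dtype=float)
    return np.cross(project(P0, f), project(P0 + d, f))


d = np.array([1.0, 0.5, 2.0])          # shared 3D direction (assumed values)
l1 = image_line([0.0, 0.0, 4.0], d)    # first line
l2 = image_line([1.0, -1.0, 5.0], d)   # second line, parallel to the first

v = np.cross(l1, l2)                   # intersection of the two image lines
v = v[:2] / v[2]                       # back to inhomogeneous coordinates
expected = d[:2] / d[2]                # f * (dx / dz, dy / dz) with f = 1
```

Both `v` and `expected` come out as (0.5, 0.25): the vanishing point depends only on the direction d, not on where the lines start.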

Homogeneous coordinates

• The projection $x_1 = f X_1 / X_3$ is non-linear!
• Can be made linear using homogeneous coordinates
• Homogeneous coordinates allow for transforms to be concatenated easily

Homogeneous coordinates

Conversion to homogeneous coordinates: $(x, y) \rightarrow (x, y, 1)^T$

Conversion from homogeneous coordinates: $(x, y, w)^T \rightarrow (x/w, y/w)$

Invariance to scaling: $(x, y, w)^T \sim (kx, ky, kw)^T$ for any $k \neq 0$

E.g. [1,2,3] is the same as [3,6,9] and both represent the same inhomogeneous point [0.33,0.67].
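
The conversions and the scale invariance can be sketched in a few lines. The function names `to_homogeneous` and `from_homogeneous` are hypothetical helpers chosen for this example.

```python
import numpy as np


def to_homogeneous(x):
    """Append a 1 to an inhomogeneous point."""
    x = np.asarray(x, dtype=float)
    return np.append(x, 1.0)


def from_homogeneous(xh):
    """Divide by the last coordinate and drop it (last coordinate must be non-zero)."""
    xh = np.asarray(xh, dtype=float)
    return xh[:-1] / xh[-1]


a = from_homogeneous([1, 2, 3])
b = from_homogeneous([3, 6, 9])   # same point, scaled by 3
h = to_homogeneous([2, 5])        # (2, 5, 1)
```

Both `a` and `b` equal (1/3, 2/3), which is the point quoted in the example above.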
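
Basic geometry in homogeneous coordinates

• Line equation: ax + by + c = 0, i.e. the line is $l = (a, b, c)^T$
• A pixel p in homogeneous coordinates: $p = (x, y, 1)^T$, and p lies on l when $l^T p = 0$
• Line is given by cross product of two points: $l = p_1 \times p_2$
• Intersection of two lines is given by cross product of the lines: $p = l_1 \times l_2$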
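
A short worked example of the two cross-product rules, using NumPy's `np.cross`. The four pixel positions are arbitrary values picked so that the result is easy to verify by hand.

```python
import numpy as np

# Two pixels in homogeneous coordinates (x, y, 1).
p1 = np.array([0.0, 0.0, 1.0])
p2 = np.array([2.0, 2.0, 1.0])
l1 = np.cross(p1, p2)        # line through p1 and p2: (-2, 2, 0), i.e. y = x

q1 = np.array([0.0, 2.0, 1.0])
q2 = np.array([2.0, 0.0, 1.0])
l2 = np.cross(q1, q2)        # line through q1 and q2: (2, 2, -4), i.e. x + y = 2

x = np.cross(l1, l2)         # intersection in homogeneous coordinates: (-8, -8, -8)
x = x[:2] / x[2]             # inhomogeneous point (1, 1)
```

Each point satisfies its own line equation (for example `l1 @ p1` is zero), and the intersection (1, 1) lies on both lines.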
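
3D Euclidean transformation

• Cat moves through 3D space
• The movement of the nose can be described using a Euclidean transform: $X' = R X + t$

Building the 3D rotation matrix R

• R can be built from various representations (Euler angles, quaternion, angle-axis representation; the latter ones are recommended)
• Euler angles represent the rotation using three parameters, one for each axis:

$R_x(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix}$
$R_y(\beta) = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix}$
$R_z(\gamma) = \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}$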
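
One possible way to build R from Euler angles is sketched below. The composition order (rotate about x, then y, then z, so R = Rz Ry Rx) is an assumption of this example; the slides do not fix a convention, and other orders are equally common.

```python
import numpy as np


def rot_x(a):
    """Rotation by angle a (radians) about the x axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def rot_y(a):
    """Rotation by angle a (radians) about the y axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def rot_z(a):
    """Rotation by angle a (radians) about the z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def euler_to_R(ax, ay, az):
    """Assumed convention: rotate about x first, then y, then z."""
    return rot_z(az) @ rot_y(ay) @ rot_x(ax)


R = euler_to_R(0.1, -0.4, 1.2)        # arbitrary example angles
is_orthonormal = np.allclose(R.T @ R, np.eye(3))
has_unit_det = np.isclose(np.linalg.det(R), 1.0)
```

Whatever the convention, the result should be orthonormal with determinant +1, which the last two lines check.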
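
3D Euclidean transformation

• Concatenation of successive transforms is a mess!

Homogeneous coordinates save the day!

• Replace 3D points with homogeneous versions: $X \rightarrow (X^T, 1)^T$
• The Euclidean transform becomes $\begin{bmatrix} X' \\ 1 \end{bmatrix} = \begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix} \begin{bmatrix} X \\ 1 \end{bmatrix}$
• Transformations can now be concatenated by matrix multiplication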
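
The concatenation property can be illustrated with two 4x4 matrices. This is a minimal sketch; `euclidean_4x4` is a hypothetical helper, and the rotation and translations are example values.

```python
import numpy as np


def euclidean_4x4(R, t):
    """Pack a 3x3 rotation R and a translation t into a 4x4 homogeneous matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


# First transform: rotate 90 degrees about z, then shift by (1, 0, 0).
Rz90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
T1 = euclidean_4x4(Rz90, [1.0, 0.0, 0.0])
# Second transform: pure translation by (0, 0, 5).
T2 = euclidean_4x4(np.eye(3), [0.0, 0.0, 5.0])

X = np.array([1.0, 0.0, 0.0, 1.0])      # homogeneous 3D point
step_by_step = T2 @ (T1 @ X)            # apply T1, then T2
combined = (T2 @ T1) @ X                # one pre-multiplied matrix
```

Both routes give (1, 1, 5, 1), so a whole chain of motions collapses into a single matrix product.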

More 3D-3D and 2D-2D transformations

Examples of 2D-2D transforms

Perspective transformation (3D-2D)

Perspective using homogeneous coordinates

Wait! Our setup has several assumptions

• Camera at world origin
• Camera aligned with world coordinates
• Ideal pinhole camera

Removing the initial assumptions

• It is useful to split the overall projection matrix into three parts:
• A part that depends on the internals of the camera (intrinsic)
• A vanilla projection matrix
• A Euclidean transformation between the world and camera frames (extrinsic)

• Assume first that the world is aligned with camera coordinates -> the extrinsic camera matrix is an identity

More realistic setting - camera pose

• Assume the camera is translated and rotated with respect to the world

The camera pose

• The non-ideal camera pose can be taken into account by first rotating and translating points from the world frame to the camera frame

The intrinsic parameters

• Transformation to pixel units from metric units
• Describe the hardware properties of a real camera
• The image plane might be skewed
• The pixels might not be square
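
A minimal sketch of an intrinsic matrix that covers the effects listed above. The parameter names (`fx`, `fy`, `cx`, `cy`, `skew`) follow a common convention and are assumptions here, as are all numeric values; the slide itself does not give the matrix.

```python
import numpy as np


def intrinsic_matrix(fx, fy, cx, cy, skew=0.0):
    """3x3 intrinsic matrix (assumed parameterization).

    fx, fy : focal length in pixel units along x and y (fx != fy: non-square pixels)
    cx, cy : principal point in pixels
    skew   : non-zero when the image axes are not perpendicular
    """
    return np.array([[fx, skew, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])


K = intrinsic_matrix(fx=800.0, fy=820.0, cx=320.0, cy=240.0)

# A point on the ideal (metric) image plane, in homogeneous coordinates.
x_ideal = np.array([0.1, -0.05, 1.0])
u, v, w = K @ x_ideal
pixel = np.array([u / w, v / w])   # (400, 199) in pixel units
```

With these example values, the ideal point (0.1, -0.05) maps to pixel (400, 199).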
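
Summary of steps from scene to image

• Move the scene point $(X_w, 1)^T$ into the camera coordinate system by a 4x4 (extrinsic) Euclidean transformation: $\begin{bmatrix} X_c \\ 1 \end{bmatrix} = \begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix} \begin{bmatrix} X_w \\ 1 \end{bmatrix}$
• Project into the ideal camera via the vanilla perspective transformation: $x' = [\, I \mid 0 \,] (X_c^T, 1)^T$
• Map the ideal image into the real image using the intrinsic matrix: $x = K x'$

Camera projection matrix P

$P = K \, [\, I \mid 0 \,] \begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix} = K \, [\, R \mid t \,]$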
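
The three steps can be chained into a single 3x4 matrix. The sketch below assumes the decomposition P = K [I | 0] T, where T is the 4x4 extrinsic transform and the focal length is carried by K; the function names and all numbers are illustrative assumptions.

```python
import numpy as np


def projection_matrix(K, R, t):
    """Compose intrinsic K, the vanilla projection [I | 0], and the extrinsic 4x4."""
    T = np.eye(4)                                     # extrinsic: world -> camera
    T[:3, :3] = R
    T[:3, 3] = t
    P0 = np.hstack([np.eye(3), np.zeros((3, 1))])     # vanilla projection [I | 0]
    return K @ P0 @ T


def project(P, Xw):
    """Project an inhomogeneous world point to pixel coordinates."""
    xh = P @ np.append(np.asarray(Xw, dtype=float), 1.0)
    return xh[:2] / xh[2]


K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                       # camera axes aligned with the world axes
t = np.array([0.0, 0.0, 4.0])       # world origin lies 4 units in front of the camera
P = projection_matrix(K, R, t)

pixel = project(P, [1.0, 0.5, 0.0])  # (520, 340)
```

The same matrix is obtained more compactly as K [R | t], i.e. `K @ np.hstack([R, t.reshape(3, 1)])`.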

Beyond pinholes: Radial distortion

• Common in wide-angle lenses
• Creates non-linear terms in projection
• Usually handled by solving for the non-linear terms and then correcting the image

[Figure: original and corrected images]
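
A hedged sketch of one widely used way to model and correct radial distortion. The polynomial model with coefficients `k1` and `k2` and the fixed-point inversion are assumptions of this example, not necessarily the method behind the figure; the coefficient values are made up.

```python
import numpy as np


def distort(x, k1, k2=0.0):
    """Apply polynomial radial distortion to ideal normalized points (N x 2).

    Assumed model: x_d = x * (1 + k1 * r^2 + k2 * r^4), with r = |x|.
    """
    x = np.asarray(x, dtype=float)
    r2 = np.sum(x**2, axis=1, keepdims=True)
    return x * (1.0 + k1 * r2 + k2 * r2**2)


def undistort(xd, k1, k2=0.0, iterations=20):
    """Invert the model by fixed-point iteration (adequate for mild distortion)."""
    xd = np.asarray(xd, dtype=float)
    x = xd.copy()                                  # initial guess: no distortion
    for _ in range(iterations):
        r2 = np.sum(x**2, axis=1, keepdims=True)
        x = xd / (1.0 + k1 * r2 + k2 * r2**2)
    return x


pts = np.array([[0.0, 0.0], [0.3, 0.2], [-0.5, 0.4]])
bent = distort(pts, k1=-0.2, k2=0.05)              # negative k1: points pulled inward
recovered = undistort(bent, k1=-0.2, k2=0.05)
```

In practice the coefficients are estimated during camera calibration; here they are simply given, and `recovered` matches `pts` to numerical precision.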

Things to remember

• Pinhole camera model
• Homogeneous coordinates
• Camera projection matrix

The end
