01 Introduction 2023
Instructor: Xu Zhao
About Me
❖ Motivation
❖ Application: a promising and significant direction
toward practical artificial intelligence.
❖ Research: in the CV field, many open problems still need to
be solved by inventing diverse methodologies from
different research domains.
Course information
❖ All in Canvas: https://oc.sjtu.edu.cn/login/canvas
๏ Slides
๏ Video
๏ Assignments
๏ Discussion
๏ Announcement
❖ Keeping in touch: WeChat group
❖ Office hours:
Grading policy
❖ Attendance: 10 %
❖ Final project: 45 %
Syllabus
Event type     Contents                               Hours  Week No.
Unit 0         Introduction                           3      1
Lecture 1      Introduction                           3      1
Lecture 7      Calibration                            3      7
Lecture 8      Stereopsis                             3      8
Lecture 9      Structure from motion                  3      9
Unit IV        Grouping and Fitting                   9      10-12
Lecture 10     Segmentation                           3      10
Lecture 11     Fitting                                3      11
Lecture 12     Registration                           3      12
Unit V         Recognition: high-level vision         9      13-15
Lecture 13     Learning-based recognition             3      13
Lecture 14     BoW model                              3      14
Lecture 15     Object detection                       3      15
Final project  Project presentation and evaluation    3      16
Reference
Other resources
Lecture 1: Introduction
Contents
❖ What is vision?
๏ Psychological perspective
๏ History
❖ Course information
What is vision?
❖ What does it mean, to see?
❖ Philosophical perspective: “To know what is where by looking” (the plain
man’s answer). Vision is the process of discovering from images what is
present in the world and where it is and, furthermore, what actions
are taking place. [David Marr, Vision]
❖ Biological perspective: The special sense by which the qualities of
an object (such as color, luminosity, shape, and size) constituting its
appearance are perceived through a process in which light rays
entering the eye are transformed by the retina into electrical signals
that are transmitted to the brain via the optic nerve. [Merriam-Webster]
Visual perception
Analysis
Sensory process
Environmental stimulus
Challenges: ambiguity
Challenges: illusion
Verywell / JR Bee
Perceptual constancy
Size Lightness
Perceptual constancy
Shape
Orientation
Identification and recognition
❖ Top-down and bottom-up process
Representational framework
Name Purpose Primitives
Discussions
Motivation
๏ 1966: Marvin Minsky at MIT asked his undergraduate student Gerald Jay Sussman to “spend
the summer linking a camera to a computer and getting the computer to describe what it saw”
๏ 1970s: Scene understanding by finding edges and then inferring the 3D structure
๏ 1980s: Qualitative approaches to understanding intensity and shading variations, e.g. shape from
shading
History
❖ 1990s: Existing techniques continued to be explored, and learning-based methods appeared
๏ Factorization techniques
๏ Bundle adjustment
๏ Graph cut
๏ Dimensionality reduction
Applications: OCR
❖ Technology to convert images of text into text
Applications: Vision-based biometrics
❖ Face, Fingerprint, Iris, Palm, Gait…
Facial login without a password…
http://gl.ict.usc.edu/Research/presidentialportrait/
Robotics
Logistics robots
Saxena et al. 2008, STAIR at Stanford
From Boston Dynamics
Applications: Medical
3D imaging
MRI, CT
NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.
Applications: Sports
Michelangelo 1475-1564
Challenges: Occlusion
Magritte, 1957
Klimt, 1913
Mr. Bean
Challenges: Local appearance ambiguity
Assignment