Lec01 Intro
Today
• Introduction to computer vision
• Course overview
• Course requirements
The goal of computer vision
• To bridge the gap between pixels and “meaning”
• Real-time stereo
• Structure from motion
• Reconstruction from Internet photo collections
[Figure: image annotated with the labels sky, building, flag, face, banner, wall, street lamp, bus (two instances), slanted, vertical, and non-rigid moving object]
• Vision is useful
• Vision is interesting
• Vision is difficult
– Half of primate cerebral cortex is devoted to visual processing
– Achieving human-level visual perception is probably “AI-complete”
Why is computer vision difficult?
Challenges: viewpoint variation
Source: J. Koenderink
Shape cues: Texture gradient
Shape and lighting cues: Shading
Source: J. Koenderink
Position and lighting cues: Cast shadows
Source: J. Koenderink
Grouping cues: Similarity (color, texture, proximity)
Grouping cues: “Common fate”
• Possible solutions
– Bring in more constraints (more images)
– Use prior knowledge about the structure of the world
• Need a combination of different methods
Connections to other disciplines
Artificial Intelligence
Computer Vision
Image Processing
Origins of computer vision
Source: S. Seitz
3D urban modeling
Source: S. Seitz
3D urban modeling: Microsoft Photosynth
Source: S. Seitz
Smile detection
http://www.apple.com/ilife/iphoto/
Biometrics
Source: S. Seitz
Optical character recognition (OCR)
Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software
Source: S. Seitz
Mobile visual search: Google Goggles
Mobile visual search: iPhone Apps
Automotive safety
LaneHawk by Evolution Robotics
“A smart camera is flush-mounted in the checkout lane, continuously watching for items.
When an item is detected and recognized, the cashier verifies the quantity of items that
were found under the basket, and continues to close the transaction. The item can remain
under the basket, and with LaneHawk, you are assured to get paid for it…”
Source: S. Seitz
Vision-based interaction (and games)
Sony EyeToy
Assistive technologies
Source: S. Seitz
Vision for robotics, space exploration
NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.
http://www.cs.ubc.ca/spider/lowe/vision.html
Course overview
I. Early vision: Image formation and processing
II. Mid-level vision: Grouping and fitting
III. Multi-view geometry
IV. Recognition
V. Advanced topics
I. Early vision
• Basic image formation and processing
Linear filtering
Edge detection
Cameras and sensors
Light and color
Alignment
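Example (illustrative only, not taken from the course materials): a minimal sketch of two of the topics listed above, linear filtering and edge detection. It assumes a grayscale image stored as a list of rows, and the helper name `filter2d`, the test image, and the threshold are all hypothetical. The filter is applied as cross-correlation; convolution would flip the kernel first.

```python
# Minimal sketch: linear filtering as 2D cross-correlation, followed by a
# Sobel filter used as a crude vertical-edge detector. Pure Python, no libraries.

def filter2d(image, kernel):
    """Slide `kernel` over `image` (lists of rows); 'valid' region only, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            # Weighted sum of the pixels under the kernel window.
            for i in range(kh):
                for j in range(kw):
                    acc += kernel[i][j] * image[r + i][c + j]
            row.append(acc)
        out.append(row)
    return out

# Hypothetical 5x6 test image: dark left half (0), bright right half (10).
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]

# Sobel kernel that responds to left-to-right intensity changes.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

response = filter2d(image, sobel_x)

# Mark positions whose gradient magnitude exceeds a hand-picked threshold.
edges = [[abs(v) > 20 for v in row] for row in response]

for row in response:
    print(row)
```

The response is zero over the flat regions and large only where the window straddles the dark-to-bright boundary, which is the basic idea behind gradient-based edge detection.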
• Instructor:
Svetlana Lazebnik ([email protected])
• Office hours:
By appointment, FB 244
• Textbooks (suggested):
Forsyth & Ponce, Computer Vision: A Modern Approach
Richard Szeliski, Computer Vision: Algorithms and Applications (draft available online)
• Class webpage:
http://www.cs.unc.edu/~lazebnik/spring10
Course requirements
• Philosophy: computer vision is best experienced hands-on
• Participation: 20%
– Come to class regularly
– Ask questions
– Answer questions
Collaboration policy
• Feel free to discuss assignments with each other, but coding
must be done individually