Lec01 Intro

This presentation provides an introduction to the field of computer vision. It covers the basic goals of the field, challenges, and its relevance across different applications, from personal photo albums to robotics. It also introduces course requirements and the connection of computer vision to other disciplines

Uploaded by

cap.cafu

COMP 776: Computer Vision

Today
• Introduction to computer vision
• Course overview
• Course requirements
The goal of computer vision
• To bridge the gap between pixels and “meaning”

What we see vs. what a computer sees


Source: S. Narasimhan
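
To make "what a computer sees" concrete, here is a minimal sketch in Python (the course itself uses MATLAB). The 5x5 pixel values and the helper names `mean_intensity` and `bright_pixels` are made up for illustration. To the machine, the image is only a grid of numbers; even "there is a bright square in the middle" has to be computed.

```python
# A tiny 5x5 grayscale "image": 0 is black, 255 is white.
# The values are invented for illustration only.
image = [
    [10,  10,  10,  10, 10],
    [10, 200, 200, 200, 10],
    [10, 200, 255, 200, 10],
    [10, 200, 200, 200, 10],
    [10,  10,  10,  10, 10],
]


def mean_intensity(img):
    """Average brightness over all pixels."""
    total = sum(sum(row) for row in img)
    count = sum(len(row) for row in img)
    return total / count


def bright_pixels(img, threshold=128):
    """Coordinates (row, col) of pixels brighter than the threshold."""
    return [
        (r, c)
        for r, row in enumerate(img)
        for c, value in enumerate(row)
        if value > threshold
    ]


if __name__ == "__main__":
    print("mean intensity:", mean_intensity(image))
    print("bright pixels:", bright_pixels(image))
```

Thresholding recovers the 3x3 bright block, but nothing in these numbers says what the block is. That is the gap between pixels and meaning.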
What kind of information can we extract from an image?
• Metric 3D information
• Semantic information
Vision as measurement device

• Real-time stereo (NASA Mars Rover)
• Structure from motion (Pollefeys et al.)
• Reconstruction from Internet photo collections (Goesele et al.)


Vision as a source of semantic information

slide credit: Fei-Fei, Fergus & Torralba


Object categorization
• Labels in the example image: sky, building, flag, face, banner, wall, street lamp, bus (two instances), cars

slide credit: Fei-Fei, Fergus & Torralba


Scene and context categorization
• outdoor
• city
• traffic
•…

slide credit: Fei-Fei, Fergus & Torralba


Qualitative spatial information
• Surface labels in the example image: slanted, vertical, horizontal
• Object labels in the example image: non-rigid moving object, rigid moving object (two instances)

slide credit: Fei-Fei, Fergus & Torralba
Why study computer vision?
• Vision is useful: Images and video are everywhere!

– Personal photo albums
– Movies, news, sports
– Surveillance and security
– Medical and scientific images


Why study computer vision?

• Vision is useful
• Vision is interesting
• Vision is difficult
– Half of primate cerebral cortex is devoted to visual processing
– Achieving human-level visual perception is probably “AI-complete”
Why is computer vision difficult?
Challenges: viewpoint variation

Michelangelo 1475-1564 slide credit: Fei-Fei, Fergus & Torralba


Challenges: illumination

image credit: J. Koenderink


Challenges: scale

slide credit: Fei-Fei, Fergus & Torralba


Challenges: deformation

Xu, Beihong 1943

slide credit: Fei-Fei, Fergus & Torralba


Challenges: occlusion

Magritte, 1957 slide credit: Fei-Fei, Fergus & Torralba


Challenges: background clutter
Challenges: Motion
Challenges: object intra-class variation

slide credit: Fei-Fei, Fergus & Torralba


Challenges: local ambiguity

slide credit: Fei-Fei, Fergus & Torralba


Challenges or opportunities?
• Images are confusing, but they also reveal the structure of
the world through numerous cues
• Our job is to interpret the cues!

Image source: J. Koenderink


Depth cues: Linear perspective
Depth cues: Aerial perspective
Depth ordering cues: Occlusion

Source: J. Koenderink
Shape cues: Texture gradient
Shape and lighting cues: Shading

Source: J. Koenderink
Position and lighting cues: Cast shadows

Source: J. Koenderink
Grouping cues: Similarity (color, texture, proximity)
Grouping cues: “Common fate”

Image credit: Arthus-Bertrand (via F. Durand)


Bottom line
• Perception is an inherently ambiguous problem
– Many different 3D scenes could have given rise to a particular 2D picture

• Possible solutions
– Bring in more constraints (more images)
– Use prior knowledge about the structure of the world
• Need a combination of different methods
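
A minimal Python sketch of this ambiguity, assuming an ideal pinhole camera at the origin looking down the Z axis. The function name `project`, the sample points, and the camera shift are illustrative assumptions, not course material. Two different 3D points on the same viewing ray land on the same image position, and a second, shifted view separates them.

```python
def project(point, focal_length=1.0, camera_x=0.0):
    """Pinhole projection of a 3D point (X, Y, Z), Z > 0, onto the image plane.

    camera_x shifts the camera along the X axis (a hypothetical second view).
    """
    X, Y, Z = point
    return (focal_length * (X - camera_x) / Z, focal_length * Y / Z)


# Two different scene points on the same ray through the camera center:
near_point = (1.0, 0.5, 2.0)
far_point = (4.0, 2.0, 8.0)  # the near point scaled by 4

if __name__ == "__main__":
    # One image: both points project to the same 2D location.
    print(project(near_point), project(far_point))
    # A second image from a camera shifted by 1 unit: the projections differ.
    print(project(near_point, camera_x=1.0), project(far_point, camera_x=1.0))
```

The second view is an instance of "bring in more constraints (more images)". A prior such as "objects of this kind are usually about this size" would be an instance of using knowledge about the structure of the world.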
Connections to other disciplines
• Computer Vision (at the center) connects to:
– Artificial Intelligence
– Robotics
– Machine Learning
– Computer Graphics
– Cognitive science
– Neuroscience
– Image Processing
Origins of computer vision

L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.
Computer Vision in the Real World
Special effects: shape and motion capture

Source: S. Seitz
3D urban modeling

Bing maps, Google Streetview

Source: S. Seitz
3D urban modeling: Microsoft Photosynth

http://labs.live.com/photosynth/ Source: S. Seitz


Face detection

Many new digital cameras now detect faces


• Canon, Sony, Fuji, …

Source: S. Seitz
Smile detection

Sony Cyber-shot® T70 Digital Still Camera Source: S. Seitz


Face recognition: Apple iPhoto software

http://www.apple.com/ilife/iphoto/
Biometrics

How the Afghan Girl was Identified by Her Iris Patterns

Source: S. Seitz
Biometrics

• Fingerprint scanners on many new laptops, other devices
• Face recognition systems now beginning to appear more widely
http://www.sensiblevision.com/

Source: S. Seitz
Optical character recognition (OCR)
Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software

• Digit recognition, AT&T labs
• License plate readers
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Source: S. Seitz
Mobile visual search: Google Goggles
Mobile visual search: iPhone Apps
Automotive safety

Mobileye: Vision systems in high-end BMW, GM, Volvo models


• “In mid 2010 Mobileye will launch a world's first application of full
emergency braking for collision mitigation for pedestrians where
vision is the key technology for detecting pedestrians.”
Source: A. Shashua, S. Seitz
Vision in supermarkets

LaneHawk by EvolutionRobotics
“A smart camera is flush-mounted in the checkout lane, continuously watching for items.
When an item is detected and recognized, the cashier verifies the quantity of items that
were found under the basket, and continues to close the transaction. The item can remain
under the basket, and with LaneHawk, you are assured to get paid for it…”
Source: S. Seitz
Vision-based interaction (and games)

Sony EyeToy

Nintendo Wii has camera-based IR tracking built in. See Lee’s work at CMU on clever tricks for using it to create a multi-touch display!

Assistive technologies
Source: S. Seitz
Vision for robotics, space exploration

NASA'S Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.

Vision systems (JPL) used for several tasks


• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.
Source: S. Seitz
The computer vision industry
• A list of companies here:

http://www.cs.ubc.ca/spider/lowe/vision.html
Course overview
I. Early vision: Image formation and processing
II. Mid-level vision: Grouping and fitting
III. Multi-view geometry
IV. Recognition
V. Advanced topics
I. Early vision
• Basic image formation and processing
– Linear filtering
– Edge detection
– Cameras and sensors
– Light and color
– Feature extraction: corner and blob detection
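
As a preview of linear filtering and edge detection, here is a minimal pure-Python sketch of 2D convolution over the "valid" region (no padding). The function name `convolve2d`, the test image, and the choice of a Sobel kernel are illustrative assumptions. The course assignments use MATLAB, which has built-in equivalents.

```python
def convolve2d(image, kernel):
    """2D convolution restricted to positions where the kernel fits entirely."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # True convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    output = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * flipped[i][j]
            out_row.append(acc)
        output.append(out_row)
    return output


# A vertical step edge: dark on the left, bright on the right.
step_edge = [[0, 0, 0, 10, 10, 10] for _ in range(4)]

# Horizontal-derivative Sobel kernel (a common edge-detection filter).
sobel_x = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Box kernel: sums each 3x3 neighborhood (divide by 9 for a mean filter).
box = [[1, 1, 1] for _ in range(3)]

if __name__ == "__main__":
    print(convolve2d(step_edge, sobel_x))  # large magnitude only near the edge
    print(convolve2d(step_edge, box))      # the step becomes a gradual ramp
```

The derivative filter responds only where intensity changes, and the box filter smooths the step into a ramp. The sign of the Sobel response depends on the kernel flip, so edge detectors usually look at its magnitude.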


II. “Mid-level vision”
• Fitting and grouping

Alignment

Fitting: Least squares


Hough transform
RANSAC
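
A minimal Python sketch of how least squares and RANSAC fit together for line fitting, assuming lines of the form y = m*x + b. The function names, the tolerance, the iteration count, and the sample data are all illustrative assumptions. RANSAC repeatedly fits a line to two random points, keeps the candidate with the most inliers, and then refits by least squares on those inliers.

```python
import random


def fit_line_least_squares(points):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def ransac_line(points, iterations=200, tolerance=0.5, seed=0):
    """Return ((m, b), inliers) for the candidate line with the most inliers."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    best_inliers = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate: not representable as y = m*x + b
        m = (y2 - y1) / (x2 - x1)
        b = y1 - m * x1
        inliers = [(x, y) for x, y in points if abs(y - (m * x + b)) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Final refit on the consensus set only.
    return fit_line_least_squares(best_inliers), best_inliers


# Ten points exactly on y = 2x + 1, plus three gross outliers.
inlier_points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outlier_points = [(2.5, 15.0), (5.5, -4.0), (8.5, 30.0)]
data = inlier_points + outlier_points

if __name__ == "__main__":
    print("least squares on all points:", fit_line_least_squares(data))
    (m, b), inliers = ransac_line(data)
    print("RANSAC + refit:", (m, b), "with", len(inliers), "inliers")
```

Plain least squares uses every point, so the outliers pull the line away from the true one. RANSAC ignores them because they never agree with a line that many other points support.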
III. Multi-view geometry
• Stereo
• Epipolar geometry
• Affine structure from motion: Tomasi & Kanade (1993)
• Projective structure from motion
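
As a small preview of stereo, here is a Python sketch of the standard depth-from-disparity relation for an idealized rectified camera pair, Z = f * B / d. The function name and the numbers are illustrative assumptions, not values from the course.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth (meters) of a point seen by two rectified, parallel cameras.

    disparity_px:    horizontal shift of the point between the two images (pixels)
    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two camera centers (meters)
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical rig: 800-pixel focal length, 10 cm baseline.
    for d in (40.0, 20.0, 10.0):
        print(d, "px ->", depth_from_disparity(d, 800.0, 0.1), "m")
```

Halving the disparity doubles the estimated depth, so distant points have small disparities and their depth estimates are more sensitive to matching errors.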


IV. Recognition

• Patch description and matching
• Clustering and visual vocabularies
• Bag-of-features models
• Classification

Sources: D. Lowe, L. Fei-Fei
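
A minimal Python sketch of the bag-of-features idea, assuming a visual vocabulary (cluster centers) has already been computed. The 2D "descriptors", the three-word vocabulary, and the function names are toy assumptions for illustration; real patch descriptors have many more dimensions. Each descriptor is assigned to its nearest visual word, and the image is summarized by the histogram of word counts.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(vocabulary)), key=lambda k: sq_dist(descriptor, vocabulary[k]))


def bag_of_features(descriptors, vocabulary):
    """Histogram counting how many descriptors fall on each visual word."""
    histogram = [0] * len(vocabulary)
    for d in descriptors:
        histogram[nearest_word(d, vocabulary)] += 1
    return histogram


# Toy vocabulary of three visual words and five patch descriptors.
vocabulary = [(0, 0), (10, 0), (0, 10)]
descriptors = [(1, 1), (9, 1), (0, 8), (2, 0), (11, -1)]

if __name__ == "__main__":
    print(bag_of_features(descriptors, vocabulary))
```

The histogram discards where each patch was in the image. A classifier can then be trained on these fixed-length vectors.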


V. Advanced Topics
• Time permitting…

– Segmentation
– Face detection
– Articulated models
– Motion and tracking


Basic Info

• Instructor:
Svetlana Lazebnik ([email protected])

• Office hours:
By appointment, FB 244

• Textbooks (suggested):
Forsyth & Ponce, Computer Vision: A Modern Approach
Richard Szeliski, Computer Vision: Algorithms and
Applications (draft available online)

• Class webpage:
http://www.cs.unc.edu/~lazebnik/spring10
Course requirements
• Philosophy: computer vision is best experienced hands-on

• Programming assignments: 50%


– Four assignments
– Expect the first one in the next couple of classes
– Brush up on your MATLAB skills (see web page for tutorial)

• Final assignment: 30%


– Recognition competition
– Winner gets a prize!

• Participation: 20%
– Come to class regularly
– Ask questions
– Answer questions
Collaboration policy
• Feel free to discuss assignments with each other, but coding
must be done individually

• Feel free to incorporate code or tips you find on the Web,
provided this doesn’t make the assignment trivial and you
explicitly acknowledge your sources

• Remember: I can Google too (and I have copies of
everybody’s assignments from the last two years this class
was offered)
For next time
• Self-study: MATLAB tutorial
• Reading: cameras and image formation (F&P chapter 1)
