Lec 01 CompVision N DIP Intro
Digital Image Processing and Computer Vision
Introduction
A color image is just three functions pasted together. We can write this as a
“vector-valued” function:
$$f(x, y) = \begin{bmatrix} r(x, y) \\ g(x, y) \\ b(x, y) \end{bmatrix}$$
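As a rough illustration of the vector-valued view, the sketch below stacks three sampled channel functions into one array, so that each pixel holds a 3-vector. The tiny 2x3 image and its channel values are made up for the example, and NumPy is an assumed tool rather than anything the lecture prescribes.

```python
import numpy as np

# Hypothetical 2x3 image: three scalar functions r, g, b sampled on the same grid.
height, width = 2, 3
r = np.full((height, width), 255, dtype=np.uint8)                     # red channel
g = np.zeros((height, width), dtype=np.uint8)                         # green channel
b = np.arange(height * width, dtype=np.uint8).reshape(height, width)  # blue channel

# Stacking along the last axis gives f(x, y) = [r, g, b] at every pixel.
f = np.stack([r, g, b], axis=-1)

# Arrays are indexed [row, column], i.e. [y, x].
x, y = 2, 1
pixel = f[y, x]
print(f.shape, pixel)
```

Note that array indexing is row-first, so the value of f at (x, y) is read as `f[y, x]`.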
Digital Image
A digitized image is one in which:
Spatial coordinates and grayscale values have been made discrete.
Intensities are sampled on a regularly spaced grid in the x and y directions and quantized to
8 bits (256 values) per point for black and white,
3x8 bits per point for color images.
It is stored as a 2D array of gray-level values. The array elements are called pixels and are
identified by their x, y coordinates.
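The two discretization steps can be sketched as follows: a continuous intensity function is evaluated on a regular grid (spatial sampling) and each value is rounded to one of 2^bits levels (quantization). The function names `digitize` and `ramp`, the unit-square domain, and the [0, 1] intensity range are assumptions made for this example only.

```python
import numpy as np

def digitize(intensity, width, height, bits=8):
    """Sample a continuous intensity function on a regular grid and quantize it."""
    if not 1 <= bits <= 8:
        raise ValueError("this sketch stores results as uint8, so bits must be 1..8")
    levels = 2 ** bits
    xs = np.linspace(0.0, 1.0, width)      # regularly spaced x positions
    ys = np.linspace(0.0, 1.0, height)     # regularly spaced y positions
    grid_x, grid_y = np.meshgrid(xs, ys)   # spatial sampling grid
    samples = np.clip(intensity(grid_x, grid_y), 0.0, 1.0)
    # Quantization: map [0, 1] onto the integer levels 0 .. levels - 1.
    return np.round(samples * (levels - 1)).astype(np.uint8)

def ramp(x, y):
    """Hypothetical scene: brightness grows from left (0) to right (1)."""
    return x

image = digitize(ramp, width=5, height=4)
print(image.shape, image.dtype, image.min(), image.max())
```

With `bits=8` the result has 256 possible gray levels; lowering `bits` shows the banding caused by coarser quantization.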
What is a Digital Image? (cont…)
Common image formats include:
1 sample per point (B&W or Grayscale)
3 samples per point (Red, Green, and Blue)
4 samples per point (Red, Green, Blue, and “Alpha”, a.k.a. Opacity)
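The number of samples per point shows up directly as the size of the last array axis. The sketch below also blends an RGBA image over a background to show what the alpha (opacity) sample is for. It assumes straight (non-premultiplied) alpha stored in 0..255, which is one common convention but not the only one; the image sizes and colors are invented for the example.

```python
import numpy as np

gray = np.zeros((4, 6), dtype=np.uint8)      # 1 sample per point
rgb = np.zeros((4, 6, 3), dtype=np.uint8)    # 3 samples per point
rgba = np.zeros((4, 6, 4), dtype=np.uint8)   # 4 samples per point

# Hypothetical content: semi-transparent pure red over a white background.
rgba[..., 0] = 255   # red
rgba[..., 3] = 128   # alpha, roughly 50% opaque
background = np.full((4, 6, 3), 255, dtype=np.uint8)

# Alpha blending: result = alpha * foreground + (1 - alpha) * background.
alpha = rgba[..., 3:4].astype(np.float64) / 255.0
blended = alpha * rgba[..., :3] + (1.0 - alpha) * background
blended = np.round(blended).astype(np.uint8)
print(gray.shape, rgb.shape, rgba.shape, blended[0, 0])
```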
Image restoration techniques were used to improve image quality before fixing the problem.
Image Processing
Image Compression
Computer Vision
Make computers understand images and video.
Computing properties of the 3D world from visual data (measurement)
Algorithms and representations to allow a machine to recognize objects, people, scenes, and
activities. (perception and interpretation)
…
What is Computer Vision?
Computer vision is the science and technology of machines that see.
Concerned with the theory for building artificial systems that obtain information from images.
The image data can take many forms, such as a video sequence, depth images, views from
multiple cameras, or multi-dimensional data from a medical scanner.
DIP to CV
The continuum from image processing to computer vision can be broken up
into low-, mid- and high-level processes
Vision is really hard
Vision is Challenging
Inverse problems
A priori knowledge is required
Complexity is extensive
Non-local operations
Related disciplines
Artificial intelligence
Machine learning
Graphics
Computer vision
Image processing
Cognitive science
Algorithms
Vision and graphics
[Figure labels: graphics, vision, 3D geometry, estimation, physics]
Why does vision matter?
Images and video are everywhere!
Industry and Applications
Film and Video
Editing
Special effects
Image Database
Content based image retrieval
visual search of products
Face recognition
Industrial Automation and Inspection
vision-guided robotics
Inspection systems
Medical and Biomedical
Surgical assistance
Sensor fusion
Vision based diagnosis
Astronomy
Astronomical Image Enhancement
Chemical/Spectral Analysis
Industry and Applications
Aerial Photography
Image Enhancement
Missile Guidance
Geological Mapping
Robotics
Autonomous Vehicles
Security and Safety
Biometric verification (face, iris)
Surveillance (fences, swimming pools)
Military
Tracking and localizing
Detection
Missile guidance
Cruise missiles
Traffic and Road Monitoring
Traffic monitoring
Adaptive traffic lights
Key Processes in Image Analysis
[Figure: block diagram of the key stages. Boxes: Problem Domain, Image Acquisition, Image Enhancement, Image Restoration, Processing, Segmentation, Feature Extraction, Object Recognition, Image Compression]
Image Acquisition
Image Enhancement
Image Restoration
Processing
Segmentation
Representation & Description
Object Recognition
Image Compression
VISION CHALLENGES
viewpoint variation
Michelangelo 1475-1564
Illumination
Scale
Deformation
Occlusion
Background Clutter
Object intra-class variation
Local ambiguity
Challenges or opportunities?
Images are confusing, but they also reveal the structure of the world
through numerous cues
Our job is to interpret the cues!
Possible solutions
Bring in more constraints (or more images)
Use prior knowledge about the structure of the world
Need both exact measurements and statistical inference!
Some more Applications of IP and CV
Image Enhancement
Denoising
Inpainting
Texture Synthesis
Industry and Applications
Image Demosaicing
Face detection
Almost all digital cameras now detect faces
Optical character recognition (OCR)
Technology to convert scanned documents to text
• If you have a scanner, it probably came with OCR software
Age recognition
Smile recognition
Smile detection?
“How the Afghan Girl was Identified by Her Iris Patterns” Read the story
Who is she?
Facial Expression Recognition
http://www.youtube.com/watch?v=M1WgnisIyPQ&feature=related
Earth Viewers (3D Modeling)
Flickr photos
3D model
Mobileye
Vision systems currently in high-end BMW, GM, Volvo models
By 2010: 70% of car manufacturers.
Video demo
BMW 5 series
Kinect
Assistive technologies
Nintendo Wii has camera-based IR tracking built in. See Lee’s work at CMU on clever tricks for
using it to create a multi-touch display!
Virtual Fitting
2015
Interactive Games: Kinect
Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
Vision in space
NASA's Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.
[Figure labels: Verification; Multi-scale Edge Detection; Convex Grouping Hypotheses; Line fitting]
Ebrahim Emami, Touqeer Ahmad, George Bebis, Ara Nefian, and Terry Fong, "Crater
Detection Using Unsupervised Algorithms and Convolutional Neural Networks",
IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 8, 2019.
Robotics
Geometric Operations
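A geometric operation moves pixels to new positions without changing their values. As one possible illustration (the slide itself does not name a specific operation), the sketch below resizes an image by nearest-neighbor inverse mapping. The function name `resize_nearest` is made up for this example.

```python
import numpy as np

def resize_nearest(image, new_height, new_width):
    """Resize by inverse mapping: each output pixel copies its nearest source pixel."""
    height, width = image.shape[:2]
    # For every output row/column, compute the source row/column it maps back to.
    rows = (np.arange(new_height) * height / new_height).astype(int)
    cols = (np.arange(new_width) * width / new_width).astype(int)
    return image[rows[:, None], cols[None, :]]

image = np.array([[1, 2],
                  [3, 4]], dtype=np.uint8)
doubled = resize_nearest(image, 4, 4)
print(doubled)
```

Mapping from output back to input guarantees that every output pixel receives a value, which a forward mapping from input to output does not.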
Point Operations
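In a point operation, each output pixel depends only on the input pixel at the same location. Two common examples, chosen here for illustration and assuming 8-bit grayscale input, are the image negative and gamma correction. The function names are hypothetical.

```python
import numpy as np

def negative(image):
    """Invert an 8-bit image: dark becomes bright and vice versa."""
    return 255 - image

def gamma_correct(image, gamma):
    """Apply the power law s = 255 * (r / 255) ** gamma to an 8-bit image."""
    normalized = image.astype(np.float64) / 255.0
    return np.round(255.0 * normalized ** gamma).astype(np.uint8)

image = np.array([[0, 100, 255]], dtype=np.uint8)
inverted = negative(image)
brightened = gamma_correct(image, 0.5)   # gamma < 1 brightens mid-tones
print(inverted, brightened)
```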
Geometric and Point Operations
Spatial Operations
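In a spatial (neighborhood) operation, each output pixel is computed from a small window around the corresponding input pixel. A minimal sketch, assuming a 2D grayscale image and replicated borders, is the 3x3 mean filter below. The function name and the border handling are choices made for this example.

```python
import numpy as np

def mean_filter_3x3(image):
    """Replace each pixel with the average of its 3x3 neighborhood."""
    # Replicate edge pixels so border pixels also have a full neighborhood.
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    height, width = image.shape
    result = np.zeros((height, width), dtype=np.float64)
    # Sum the nine shifted copies of the image, one per window offset.
    for dy in range(3):
        for dx in range(3):
            result += padded[dy:dy + height, dx:dx + width]
    return result / 9.0

impulse = np.zeros((5, 5))
impulse[2, 2] = 9.0
smoothed = mean_filter_3x3(impulse)
print(smoothed)
```

Filtering a single bright pixel spreads its value evenly over the 3x3 window around it, which is exactly the blurring effect of the filter.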
Global Operations
Image domain
Frequency domain
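In a global operation, every output pixel can depend on every input pixel. Moving between the image domain and the frequency domain with the 2D Fourier transform is a standard example. The sketch below applies an ideal low-pass filter using NumPy's FFT routines; the function name and the `cutoff` parameter (a radius in frequency-index units) are assumptions made for this illustration.

```python
import numpy as np

def ideal_low_pass(image, cutoff):
    """Keep only frequencies within `cutoff` of the zero frequency."""
    # Image domain -> frequency domain, with the zero frequency shifted to the center.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    height, width = image.shape
    v = np.arange(height) - height // 2
    u = np.arange(width) - width // 2
    distance = np.sqrt(v[:, None] ** 2 + u[None, :] ** 2)
    mask = distance <= cutoff
    # Frequency domain -> image domain.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real

image = np.arange(16.0).reshape(4, 4)
only_mean = ideal_low_pass(image, cutoff=0)    # keeps just the average brightness
unchanged = ideal_low_pass(image, cutoff=10)   # keeps every frequency
print(only_mean[0, 0])
```

With `cutoff=0` only the zero-frequency term survives, so every output pixel equals the mean of the whole input image.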
Multi-Resolution
Low resolution
High resolution
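A multi-resolution representation stores the same image at several resolutions, from high to low. One simple way to build such a pyramid, used here as an assumption (practical systems often smooth with a Gaussian filter before subsampling), is to average non-overlapping 2x2 blocks at each level. The function names are hypothetical.

```python
import numpy as np

def downsample_by_2(image):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    height, width = image.shape
    height, width = height - height % 2, width - width % 2   # drop an odd last row/column
    blocks = image[:height, :width].astype(np.float64)
    blocks = blocks.reshape(height // 2, 2, width // 2, 2)
    return blocks.mean(axis=(1, 3))

def build_pyramid(image, levels):
    """Return [full resolution, half resolution, quarter resolution, ...]."""
    pyramid = [image.astype(np.float64)]
    for _ in range(levels - 1):
        if min(pyramid[-1].shape) < 2:
            break   # too small to halve again
        pyramid.append(downsample_by_2(pyramid[-1]))
    return pyramid

image = np.arange(64.0).reshape(8, 8)
pyramid = build_pyramid(image, levels=3)
print([level.shape for level in pyramid])
```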
What skills do you need?
Strong programming skills (e.g., C, C++, Python, Matlab)
Good knowledge of Data Structures and Algorithms
Good skills in analyzing algorithm performance (i.e., time and memory requirements).
Strong background in mathematics, especially in:
Linear Algebra
Probability and Statistics
Numerical Analysis
Geometry
Calculus
Textbook
Digital Image Processing, Rafael C. Gonzalez & Richard E. Woods
Computer Vision: Algorithms and Applications, Richard Szeliski, http://szeliski.org/Book/