EECS 432-Advanced Computer Vision Notes Series 1

An Introduction to Computer Vision

Ying Wu
Electrical Engineering & Computer Science
Northwestern University
Evanston, IL 60208
[email protected]

Contents
1 What Motivates Us?

2 What Is This Area About?
2.1 Application Areas
2.1.1 Human-Computer Interaction
2.1.2 Intelligent Environments
2.1.3 Multimedia
2.1.4 Intelligent Robots
2.2 Fundamental Research Issues
2.2.1 Image Processing and Computer Vision
2.2.2 Machine Learning and Pattern Recognition

3 What Is Computer Vision?
3.1 Image Formation
3.2 Low-level Image Processing
3.3 Low-level Vision
3.4 Middle-level Vision
3.5 High-level Vision

4 What Is This Course Going To Cover?

1 What Motivates Us?
An interesting question we often ask is what the next generation of computers will be
like. To answer it, let us recall our first encounter with a computer. In my case,
I waved my hands and said “how are you” to the machine. Obviously, there was
no answer at all.
It has long been a dream that computers would be able to see and think, and this dream
has driven us to explore a variety of research issues. Although computers keep getting
faster, they are still quite dull, since they can neither see nor perform even
simple reasoning. Obviously, we are not satisfied with using our computers merely as a
calculator, a word processor, a CD player, or a game station; instead, we expect computers
to do more intelligent things, as human beings do. For example,

• Can computers identify me by looking at my face or even my gait?

• Can computers know where I am looking and what I am doing?

• Can computers tell what is a car and what is not a car?

• Can computers learn something by themselves?

• Can computers summarize a video for me?

• ···

2 What Is This Area About?

With its interdisciplinary nature, this area involves fundamental research in image
processing, computer vision/graphics, machine learning, pattern recognition, biomechanics
and even psychology. Figure 1 shows the big picture of this area. At the top are several major
application areas such as human-computer interaction, robotics, virtual environments, and
multimedia. The common foundation for such applications includes computer vision, image
processing and speech processing. When the area was in its infancy, audio and visual
processing relied on ad hoc approaches; we are now pursuing more intelligent approaches
based on machine learning and pattern recognition, trying to achieve a kind of
artificial intelligence.

2.1 Application Areas

We can imagine what a visually capable and intelligent computer could do! We expect a
revolution in the next generation of computers: we will no longer use mice and keyboards.
Computers could understand our actions and our language, they could think and return
some kind of smart result in response to our commands, and they could even perform
some missions on behalf of human beings. Last but not least, we expect rapid progress
in the near future in such areas as intelligent human-computer interaction, robotics, virtual
environments, intelligent environments, and multimedia.

[Figure 1 diagram. Recoverable labels: Biomechanics, Psychology; Human-Computer
Interaction, Multimedia, Robotics, Virtual Environments; Computer Vision, Computer
Graphics, Speech Processing, Image Processing; Machine Learning, Pattern Recognition, A.I.]

Figure 1: The big picture of the entire area

2.1.1 Human-Computer Interaction

Research on human-computer interaction is no longer just the design of devices and
psychological experiments on window layouts; it has evolved to a new stage: intelligent
interaction. One aspect is that computers should be able to accept audio and visual sensory
inputs, analyze and interpret them, and then provide intuitive feedback by
synthesizing speech, video or actions. Fundamentally, besides speech recognition, computers
should be able to recognize, interpret and understand human actions and behaviors from
visual inputs.

2.1.2 Intelligent Environments

Intelligent environments, or smart environments, are physical spaces that can
react automatically and intelligently to human activities. For example, when a
person enters, the system could tell that someone has come in, and even identify who s/he is,
and then turn on the lights. When the person sits on a sofa and points to a TV, the TV will be
turned on. When s/he says “I want some news”, the TV will be switched to a channel that
is broadcasting news at that moment.

2.1.3 Multimedia
Multimedia is a vague term, and different people emphasize different things. We are
particularly interested in the analysis of multimedia content. An interesting question we
ask is what is inside this picture or what this video means, which involves the quite
challenging task of image/video understanding. Many appealing applications have been
proposed but are yet to be accomplished. Given just a photo of Sophie Marceau, without
knowing her name, computers could search the Internet and find tons of her photos and movies.
When you get tired of watching a long movie, computers could automatically summarize it in
maybe five minutes.

2.1.4 Intelligent Robots

Robots have been given quite good mechanical abilities, but they are still mere machines
because they can neither see nor think. Honda has built a humanoid robot, ASIMO,
which can walk like a human being. However, it is blind, dumb and dull. We expect to see
ASIMO move around by itself.

2.2 Fundamental Research Issues

Fundamental research in image processing, computer vision, machine learning and pattern
recognition is an important part of the foundation of these application topics.

2.2.1 Image Processing and Computer Vision

Image processing is quite a broad research area, not just filtering, compression, and
enhancement. Beyond these, we are also interested in the question, “what is in images?”,
i.e., content analysis of visual inputs, which is part of the main task of computer vision.
The study of computer vision could make possible such tasks as 3D reconstruction of scenes,
motion capture, and object recognition, which are crucial for even higher-level intelligence
such as image and video understanding, and motion understanding.

2.2.2 Machine Learning and Pattern Recognition

Visual perception itself is an intelligent process, not just an imaging process. Through vision,
human beings are able to perceive the lighting, color, texture, shape and motion of the outside
world. The intelligence lies in the inference of such high-level concepts from images.
This is quite easy for human beings, but it is still very unclear how computers can achieve
that level of intelligence. Recognition is one of the most fundamental problems for machines,
i.e., recognizing a pre-stored pattern in new situations by comparing inputs with a set of
templates or models. However, the problem is how to construct these templates or models.
For example, what would be appropriate templates for recognizing faces under different
view directions or different lighting? The most challenging aspect of visual recognition
lies in the fact that too many factors affect imaging, and it is impossible to
model every factor, such as lighting and motion, explicitly. So, people ask, “can computers
‘learn’ the model from examples?”, so that models could be learned implicitly, instead of
constructed explicitly.
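
To make the idea of learning templates from examples concrete, here is a minimal sketch in Python. It is only an illustration under strong assumptions: each image is assumed to be already reduced to a short feature vector, the class template is simply the mean of its examples (a nearest-centroid classifier), and the numbers and the names `learn_templates` and `recognize` are made up for this example.

```python
from math import dist

def learn_templates(examples):
    """Average the feature vectors of each class to obtain one template per class."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def recognize(templates, features):
    """Return the label whose template is closest (Euclidean distance) to the input."""
    return min(templates, key=lambda label: dist(templates[label], features))

# Hypothetical 2-D feature vectors; real systems would use far richer features.
training = [
    ([0.9, 0.1], "face"),
    ([0.8, 0.2], "face"),
    ([0.1, 0.9], "non-face"),
    ([0.2, 0.7], "non-face"),
]
templates = learn_templates(training)
label = recognize(templates, [0.7, 0.3])  # closest to the learned "face" template
```

Nobody wrote the templates by hand here; they come out of the labelled examples, which is the sense in which the model is learned rather than constructed.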

3 What Is Computer Vision?

In my understanding, computer vision is basically the task of inferring, from visual inputs,
the different factors that affect images and videos, such as camera model, lighting, color,
texture, shape and motion. A rough structure of machine vision is illustrated in Figure 2. In
a word, computer vision is the inverse of the forward process of image formation
and graphics. In this sense, as many people agree, vision is a much more challenging problem
than computer graphics, because it is full of uncertainties.

[Figure 2 diagram. Recoverable labels, from top to bottom: Animation, HCI, Robotics,
Multimedia; Image Understanding, Video Understanding; Object Recognition; IBR,
3D Reconstruction, Motion Capturing; Geometric Modeling, Visual Tracking; Multiview
Geometry, Stereo, SfM, Segmentation; Image Matching, Optical Flow, Motion Analysis;
Texture Analysis, Low-level Image Proc., Image Primitives, Color Analysis, Radiometry;
Image & Video; Image Formation; Camera Model, Lighting, Color, Texture, Motion, Shape;
Object Dynamics, Geometry]

Figure 2: What is computer vision?

3.1 Image Formation

Image formation studies the forward process of producing images and videos. It is an
important research topic for both vision and graphics. To understand how a real image is
produced, the nature of the visual sensors, i.e., cameras, should be studied. In terms of the
geometrical aspects of cameras, people have looked into pinhole cameras, cameras with lenses
and even omnidirectional cameras. In terms of physical aspects, factors such as the focal
lengths and dynamic ranges of CCD and CMOS cameras have been investigated.
Besides the imaging device, it is also important to study factors arising from the objects
and scenes themselves, such as lighting, color, texture, motion and shape, which largely affect
the appearance of images and video.
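
As a small illustration of the forward process, the following Python sketch projects 3D points through an ideal pinhole camera. It assumes points are given in camera coordinates with the optical axis along Z and ignores lenses, distortion and pixel sampling; the function name `project_pinhole` is only illustrative.

```python
def project_pinhole(point, focal_length):
    """Project a 3-D point (X, Y, Z) in camera coordinates onto the image plane.

    Ideal pinhole model: x = f * X / Z, y = f * Y / Z.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)

# Two different 3-D points on the same ray through the pinhole.
near = project_pinhole((1.0, 0.5, 2.0), focal_length=1.0)
far = project_pinhole((2.0, 1.0, 4.0), focal_length=1.0)
same_image_point = near == far  # True: depth is lost in the projection
```

Both points land on the same image location, which hints at why going back from images to the scene is the harder direction.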

3.2 Low-level Image Processing

Low-level image processing is not vision itself, but the pre-processing step for vision. The
basic task is to extract fundamental image primitives for further processing, through
operations such as edge detection, corner detection, filtering, and morphology.
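
To give a flavor of such primitives, here is a minimal edge detection sketch in plain Python. It uses the Sobel operator as one common choice (the notes do not prescribe a particular detector), represents a grayscale image as a list of rows, and uses a threshold picked by hand for this toy image.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def filter3x3(image, kernel, r, c):
    """Correlate a 3x3 kernel with the image neighbourhood centred at (r, c)."""
    return sum(kernel[i][j] * image[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))

def sobel_edges(image, threshold):
    """Mark interior pixels whose gradient magnitude exceeds the threshold."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = filter3x3(image, SOBEL_X, r, c)
            gy = filter3x3(image, SOBEL_Y, r, c)
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[r][c] = 1
    return edges

# Toy 5x6 image: dark left half (0) and bright right half (10).
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = sobel_edges(image, threshold=20)
# Interior rows come out as [0, 0, 1, 1, 0, 0]: the edge sits at the boundary.
```

Border pixels are simply left unmarked in this sketch; a fuller implementation would have to decide how to pad the image.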

3.3 Low-level Vision

Based on low-level image processing, low-level vision tasks can be performed, such as image
matching, optical flow computation and motion analysis. Image matching is basically the task
of finding correspondences between two or more images. These images could show the same
scene taken from different viewpoints, or a moving scene taken by a fixed camera, or both.
Establishing image correspondences is a fundamentally important problem in vision for both
geometry recovery and motion recovery. Without exaggeration, image matching is part of the
foundation of vision.
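
A minimal sketch of image matching, reduced to a single image row for brevity: a small patch from one row is compared against every position in another row using the sum of squared differences (SSD), and the best-scoring position is taken as the correspondence. SSD is just one possible similarity measure, and the data and the name `match_along_row` are made up for illustration.

```python
def ssd(a, b):
    """Sum of squared differences between two equally long intensity sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def match_along_row(left_row, right_row, start, width):
    """Find where the patch left_row[start:start + width] best matches in right_row."""
    patch = left_row[start:start + width]
    candidates = range(len(right_row) - width + 1)
    return min(candidates, key=lambda s: ssd(patch, right_row[s:s + width]))

left_row = [5, 5, 9, 7, 3, 5, 5, 5]
right_row = [5, 5, 5, 5, 9, 7, 3, 5]  # same pattern, shifted two pixels to the right
best = match_along_row(left_row, right_row, start=2, width=3)
shift = best - 2  # displacement of the patch between the two rows
```

On textureless regions (for example a patch of all 5s) many positions would score equally well, which is one reason matching is hard in practice.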
Optical flow is a kind of image observation of motion, but it is not the true motion. Since
it only measures the optical changes in images, the aperture problem is unavoidable.
Nevertheless, camera motion or object motion can be estimated from optical flow.
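
The aperture problem can be illustrated with a small sketch. It assumes the usual brightness constancy constraint, I_x u + I_y v + I_t = 0, which gives one equation for the two unknown flow components (u, v) at each pixel. A least-squares estimate over a small window (in the style of Lucas and Kanade, used here only as an example) needs to invert a 2x2 matrix built from the image gradients, and that matrix is singular when the window contains only a straight edge.

```python
def structure_matrix(patch):
    """Sum of gradient outer products over the interior of a patch.

    Gradients use central differences. The returned values (sxx, sxy, syy)
    stand for the 2x2 matrix [[sxx, sxy], [sxy, syy]] that a window-based
    least-squares flow estimate would have to invert.
    """
    sxx = sxy = syy = 0.0
    for r in range(1, len(patch) - 1):
        for c in range(1, len(patch[0]) - 1):
            ix = (patch[r][c + 1] - patch[r][c - 1]) / 2.0
            iy = (patch[r + 1][c] - patch[r - 1][c]) / 2.0
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    return sxx, sxy, syy

def determinant(m):
    """Determinant of the symmetric 2x2 matrix given as (sxx, sxy, syy)."""
    sxx, sxy, syy = m
    return sxx * syy - sxy * sxy

# Vertical edge: intensity changes only in the horizontal direction.
edge = [[0, 0, 10, 10] for _ in range(4)]
# Corner: bright lower-right quadrant, so gradients point in two directions.
corner = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 10, 10], [0, 0, 10, 10]]

edge_det = determinant(structure_matrix(edge))      # zero: flow along the edge is unobservable
corner_det = determinant(structure_matrix(corner))  # positive: both flow components are constrained
```

For the edge patch only the motion component across the edge can be recovered; the corner patch constrains both components.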

3.4 Middle-level Vision

There are two major aspects in middle-level vision: (1) inferring the geometry and (2)
inferring the motion. These two aspects are not independent but highly related. A simple
question is “can we estimate geometry based on just one image?”. In general, the answer is
no: we need at least two images. They could be taken from two cameras or come from the
motion of the scene.
Some fundamental parts of geometric vision include multiview geometry, stereo and structure
from motion (SfM), which accomplish the step from 2D to 3D by inferring 3D scene information
from 2D images. Based on that, geometric modelling constructs 3D models of
objects and scenes, such that 3D reconstruction and image-based rendering become
possible.
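
As a sketch of how two images yield 3D information, consider the simplest stereo setup. The assumptions are strong and are not taken from the notes: two identical pinhole cameras, rectified so that a scene point appears on the same row in both images, with a known focal length in pixels and a known baseline between the cameras. Depth then follows from the disparity d as Z = f B / d.

```python
def depth_from_disparity(focal_length_px, baseline, disparity_px):
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d.

    Assumes identical pinhole cameras, a purely horizontal baseline, and a
    disparity measured in pixels between the two matched image points.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline / disparity_px

# Hypothetical numbers: 800-pixel focal length, 0.1 m baseline.
z_far = depth_from_disparity(800.0, 0.1, 20.0)   # about 4 m
z_near = depth_from_disparity(800.0, 0.1, 40.0)  # about 2 m: larger disparity, closer point
```

The disparity itself comes from matching image points between the two views, so the quality of the recovered depth depends directly on the quality of the correspondences.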
Another task of middle-level vision is to answer the question “how does the object move?”.
First, we should know which areas in the images belong to the object, which is the task
of image segmentation. Image segmentation has been a challenging fundamental problem
in computer vision for decades. Segmentation can be based on spatial similarities and
continuities. However, the uncertainty cannot be overcome for a static image. By also
considering motion continuities, we hope the uncertainty of segmentation can be alleviated.
On top of that are visual tracking and visual motion capture, which estimate 2D and 3D
motions, including deformable motions and articulated motions.
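
Segmentation based on spatial similarity and continuity can be sketched with simple region growing. This is only one of many possible schemes and is chosen here for brevity: starting from a seed pixel, neighbouring pixels are added as long as their intensity stays close to the seed's. The tolerance and the toy image are made up for illustration.

```python
from collections import deque

def grow_region(image, seed, tolerance):
    """Collect 4-connected pixels whose intensity is within `tolerance` of the seed's."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Toy image: a dark region (values 1 and 2) next to a bright one (values 8 and 9).
image = [
    [1, 1, 9, 9],
    [1, 2, 9, 8],
    [1, 1, 1, 9],
]
dark = grow_region(image, seed=(0, 0), tolerance=1)  # the 7 dark pixels connected to the seed
```

The result depends on the seed and the tolerance, a small instance of the uncertainty that segmenting a single static image cannot avoid.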

3.5 High-level Vision

High-level vision infers semantics, for example, object recognition and scene
understanding. A question that has remained challenging for many decades is how to achieve
invariant recognition, i.e., recognizing a 3D object from different view directions. There
have been two approaches to recognition: model-based recognition and learning-based
recognition. It is worth noting that these two approaches have developed in a spiral over
the history of the field.
At an even higher level are image understanding and video understanding. We are
interested in answering questions like “Is there a car in the image?”, “Is this video a drama
or an action movie?”, or “Is the person in the video jumping?” Based on the answers to these
questions, we should be able to fulfill different tasks in intelligent human-computer
interaction, intelligent robots, smart environments and content-based multimedia.

4 What Is This Course Going To Cover?

This course will cover the most fundamental aspects of computer vision and machine
learning. Details are available in the course syllabus.
