Introduction To Pattern Recognition PDF
CSCE 666 Pattern Analysis | Ricardo Gutierrez-Osuna | CSE@TAMU 9
What makes a good feature vector?
The quality of a feature vector is related to its ability to discriminate
examples from different classes
Examples from the same class should have similar feature values
Examples from different classes should have different feature values
More feature properties
[Figure: scatter plots contrasting good features with bad features, and illustrating linear separability, non-linear separability, highly correlated features, and multi-modal classes]
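One way to put a number on "similar within a class, different between classes" is a Fisher-style score for a single feature: the squared distance between the class means divided by the sum of the class variances. The sketch below is an illustration only; the score, the function name and the sample values are assumptions, not something the slides prescribe.

```python
from statistics import mean, pvariance


def fisher_score(class_a, class_b):
    """Squared gap between class means over the summed class variances.

    Larger values mean the feature separates the two classes better.
    """
    gap = mean(class_a) - mean(class_b)
    return gap ** 2 / (pvariance(class_a) + pvariance(class_b))


# Hypothetical measurements of two candidate features for two classes.
good_a, good_b = [1.0, 1.2, 0.8], [3.0, 3.2, 2.8]   # tight clusters, far apart
bad_a, bad_b = [1.0, 3.0, 2.0], [1.5, 2.5, 2.0]     # same mean, heavy overlap

good_score = fisher_score(good_a, good_b)
bad_score = fisher_score(bad_a, bad_b)
print(good_score > bad_score)
```

A feature whose classes share the same mean scores zero here, which matches the intuition that it carries no discriminatory information on its own.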
Classifiers
The task of a classifier is to partition feature
space into class-labeled decision regions
Borders between decision regions are called
decision boundaries
The classification of a feature vector $x$ consists of
determining which decision region it belongs to,
and assigning $x$ to this class
A classifier can be represented as a set of
discriminant functions $g_i(x)$, $i = 1, \ldots, C$
The classifier assigns a feature vector $x$
to class $\omega_i$ if $g_i(x) > g_j(x)$ for all $j \neq i$
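[Figure: feature space partitioned into decision regions R1, R2, R3, R4]
[Figure: block diagram of a classifier. Features x1, x2, x3, ..., xd feed the discriminant functions g1(x), g2(x), ..., gC(x); their outputs, together with costs, go to a select-max stage that produces the class assignment]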
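The "select max" view of a classifier can be sketched in a few lines: evaluate one discriminant function per class and return the class whose function is largest. The linear form of the discriminants, the weights and the class names below are hypothetical and chosen only for illustration.

```python
def make_linear(weights, bias):
    """Build a linear discriminant g(x) = w . x + b (the linear form is an assumption)."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(x, discriminants):
    """Return the label whose discriminant function is largest at x."""
    return max(discriminants, key=lambda label: discriminants[label](x))


# Hypothetical discriminants for a three-class, two-feature problem.
discriminants = {
    "class_1": make_linear([1.0, 0.0], 0.0),
    "class_2": make_linear([0.0, 1.0], 0.0),
    "class_3": make_linear([-1.0, -1.0], 0.0),
}

print(classify([2.0, 0.5], discriminants))    # class_1
print(classify([0.1, 3.0], discriminants))    # class_2
print(classify([-1.0, -2.0], discriminants))  # class_3
```

The decision boundaries are implicit: they are the points where the two largest discriminants are equal.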
Pattern recognition approaches
Statistical
Patterns classified based on an underlying statistical model of the features
The statistical model is defined by a family of class-conditional probability
density functions $p(x \mid \omega_i)$
Neural
Classification is based on the response of a network of processing units
(neurons) to an input stimulus (pattern)
Knowledge is stored in the connectivity and strength of the synaptic weights
Trainable, non-algorithmic, black-box strategy
Very attractive since
it requires minimal a priori knowledge
with enough layers and neurons, ANNs can create any complex decision region
Syntactic
Patterns classified based on measures of structural similarity
Knowledge is represented by means of formal grammars or relational
descriptions (graphs)
Used not only for classification, but also for description
Typically, syntactic approaches formulate hierarchical descriptions of complex
patterns built up from simpler subpatterns
Example: neural, statistical and structural OCR [Schalkoff, 1992]
[Figure: statistical, structural and neural* recognition of the character "A". Recoverable labels: feature extraction (# intersections, # right oblique lines, # left oblique lines, # horizontal lines, # holes) producing a feature vector x; a probabilistic model; density plots of P(f1, f2 | e_i) over Feature #1 and Feature #2; a parser]
*Neural approaches may also employ feature extraction
A simple pattern recognition problem
Consider the problem of recognizing the letters L,P,O,E,Q
Determine a sufficient set of features
Design a tree-structured classifier
[Figure: tree-structured classifier beginning at "Start", with decision nodes C>0?, V>0?, O>0? and H>0? (YES/NO branches) and leaves P, E, L, O, Q]

Features

Character  Vertical        Horizontal      Oblique         Curved
           straight lines  straight lines  straight lines  lines
L          1               1               0               0
P          1               0               0               1
O          0               0               0               1
E          1               3               0               0
Q          0               0               1               1
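The feature table above is enough to write the tree-structured classifier as nested tests. The sketch below is one possible layout, not necessarily the one drawn on the slide, since the branch structure of the figure did not survive. It uses the node tests C>0, V>0 and O>0 from the figure labels. One assumption is flagged in the code: the labels read H>0?, but L and E both have H>0 in the table, so the last test is written as H>1 to tell them apart.

```python
# Feature counts from the table: V = vertical, H = horizontal,
# O = oblique straight lines, C = curved lines.
FEATURES = {
    "L": {"V": 1, "H": 1, "O": 0, "C": 0},
    "P": {"V": 1, "H": 0, "O": 0, "C": 1},
    "O": {"V": 0, "H": 0, "O": 0, "C": 1},
    "E": {"V": 1, "H": 3, "O": 0, "C": 0},
    "Q": {"V": 0, "H": 0, "O": 1, "C": 1},
}


def classify_letter(f):
    """Walk a small decision tree over the four stroke counts."""
    if f["C"] > 0:                      # curved strokes: P, O or Q
        if f["V"] > 0:
            return "P"
        return "Q" if f["O"] > 0 else "O"
    # No curved strokes: L or E. Assumption: test H > 1, because H > 0
    # holds for both letters in the table and could not separate them.
    return "E" if f["H"] > 1 else "L"


for letter, counts in FEATURES.items():
    print(letter, "->", classify_letter(counts))
```

Each letter reaches a leaf after at most three tests, and no leaf needs all four features.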
The pattern recognition design cycle
Data collection
Probably the most time-intensive component of a PR project
How many examples are enough?
Feature choice
Critical to the success of the PR problem
Garbage in, garbage out
Requires basic prior knowledge
Model choice
Statistical, neural and structural approaches
Parameter settings
Training
Given a feature set and a blank model, adapt the model to explain the
data
Supervised, unsupervised and reinforcement learning
Evaluation
How well does the trained model do?
Overfitting vs. generalization
Consider the following scenario
A fish processing plant wants to automate the process of sorting
incoming fish according to species (salmon or sea bass)
The automation system consists of
a conveyor belt for incoming products
two conveyor belts for sorted products
a pick-and-place robotic arm
a vision system with an overhead CCD camera
a computer to analyze images and control the robot arm
[Duda, Hart and Stork, 2001]
[Figure: system layout showing the incoming conveyor belt, the overhead CCD camera, the computer, the robot arm, and the two output conveyor belts (bass and salmon)]
Sensor
The vision system captures an image as a new fish enters the sorting area
Preprocessing
Image processing algorithms, e.g., adjustments for average intensity
levels, segmentation to separate fish from background
Feature extraction
Suppose we know that, on the average, sea bass is larger than salmon
From the segmented image we estimate the length of the fish
Classification
Collect a set of examples from both species
Compute the distribution of lengths for both
classes
Determine a decision boundary (threshold)
that minimizes the classification error
We estimate the classifier's probability
of error and obtain a discouraging result of 40%
What do we do now?
[Figure: histograms of length (count vs. length) for sea bass and salmon, with the decision boundary marked]
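Choosing the length threshold can be sketched as a brute-force search: try a cut between every pair of neighbouring length values and keep the one with the fewest misclassified training examples. The lengths below are made-up numbers used only to exercise the code, so the resulting error rate is not the 40% quoted above; the function name and the rule "longer than the threshold means sea bass" are likewise assumptions.

```python
def best_threshold(salmon, sea_bass):
    """Rule: length > t means sea bass. Return (t, error_rate) with the lowest training error."""
    values = sorted(set(salmon) | set(sea_bass))
    # Candidate cuts sit halfway between neighbouring observed lengths.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    total = len(salmon) + len(sea_bass)
    best_t, best_err = None, float("inf")
    for t in candidates:
        errors = sum(x > t for x in salmon) + sum(x <= t for x in sea_bass)
        err = errors / total
        if err < best_err:              # strict: keep the first of tied cuts
            best_t, best_err = t, err
    return best_t, best_err


# Hypothetical lengths (cm). The two classes overlap heavily on purpose.
salmon_lengths = [30, 34, 36, 40, 44, 48]
sea_bass_lengths = [38, 42, 46, 50, 54, 58]

threshold, error_rate = best_threshold(salmon_lengths, sea_bass_lengths)
print(threshold, error_rate)
```

With overlapping distributions no threshold does well, which is the situation that motivates looking for better features.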
Improving the performance of our PR system
Determined to achieve a recognition rate of 95%, we try a number of
features
Width, area, position of the eyes w.r.t. mouth...
only to find out that these features contain no discriminatory information
Finally we find a good feature: average intensity of the scales
We combine length and average
intensity of the scales to improve
class separability
We compute a linear discriminant
function to separate the two classes,
and obtain a classification rate of 95.7%
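[Figure: histograms of average scale intensity (count vs. avg. scale intensity) for sea bass and salmon, with the decision boundary marked]
[Figure: scatter plot of length vs. avg. scale intensity for sea bass and salmon, with the linear decision boundary]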
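The slides do not say how the linear discriminant is computed. As one possibility, the sketch below uses Fisher's linear discriminant for two features, with the boundary placed midway between the class means. The method choice, the function names and the sample points (avg. scale intensity, length) are all assumptions for illustration, and the data are invented, so the 95.7% figure is not reproduced.

```python
def mean2(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter2(points, m):
    """2x2 scatter matrix of points around their mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_discriminant(class_a, class_b):
    """Return (w, w0) such that g(x) = w . x + w0 is positive on the class_a side."""
    ma, mb = mean2(class_a), mean2(class_b)
    sa, sb = scatter2(class_a, ma), scatter2(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    w0 = -(w[0] * mid[0] + w[1] * mid[1])   # boundary passes through the midpoint
    return w, w0


def g(x, w, w0):
    return w[0] * x[0] + w[1] * x[1] + w0


# Hypothetical (avg. scale intensity, length) measurements.
sea_bass = [(6.0, 50), (7.0, 44), (6.5, 56), (7.5, 48)]
salmon = [(3.0, 40), (4.0, 46), (3.5, 34), (2.5, 42)]

w, w0 = fisher_discriminant(sea_bass, salmon)
correct = sum(g(x, w, w0) > 0 for x in sea_bass) + sum(g(x, w, w0) < 0 for x in salmon)
print(correct / (len(sea_bass) + len(salmon)))
```

On this toy data the intensity weight dominates the length weight, which echoes the slide's point that scale intensity is the more informative feature.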
Cost vs. classification rate
Our linear classifier was designed to minimize the overall
misclassification rate
Is this the best objective function for our fish processing plant?
The cost of misclassifying salmon as sea bass is that the end customer will
occasionally find a tasty piece of salmon when he purchases sea bass
The cost of misclassifying sea bass as salmon is an upset end customer
who finds a piece of sea bass purchased at the price of salmon
Intuitively, we could adjust the decision boundary to minimize this cost
function
[Figure: two scatter plots of length vs. avg. scale intensity for sea bass and salmon: the original decision boundary, and the new decision boundary adjusted for cost]
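Moving the boundary to reflect unequal costs can be sketched on a single feature: instead of counting errors, weight each kind of error by its cost and pick the threshold with the lowest total. The intensity values, the 1-to-5 cost ratio and the rule "intensity above the threshold means sea bass" are hypothetical choices made for this illustration.

```python
def min_cost_threshold(salmon, sea_bass, cost_salmon_as_bass, cost_bass_as_salmon):
    """Rule: intensity > t means sea bass. Return (t, total_cost) with the lowest cost."""
    values = sorted(set(salmon) | set(sea_bass))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_t, best_cost = None, float("inf")
    for t in candidates:
        salmon_as_bass = sum(x > t for x in salmon)      # salmon sold as sea bass
        bass_as_salmon = sum(x <= t for x in sea_bass)   # sea bass sold as salmon
        cost = (cost_salmon_as_bass * salmon_as_bass
                + cost_bass_as_salmon * bass_as_salmon)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost


# Hypothetical avg. scale intensity values; one sea bass is unusually dark.
salmon = [2.0, 3.0, 4.0, 5.0, 5.5]
sea_bass = [3.5, 6.0, 6.5, 7.0, 8.0]

equal_costs = min_cost_threshold(salmon, sea_bass, 1.0, 1.0)
unequal_costs = min_cost_threshold(salmon, sea_bass, 1.0, 5.0)
print(equal_costs, unequal_costs)
```

With equal costs the search minimizes the error count. Once selling sea bass as salmon is made five times as expensive, the threshold drops so that no sea bass is labeled salmon, at the price of more salmon ending up on the sea bass belt.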
The issue of generalization
The recognition rate of our linear classifier (95.7%) met the design
specs, but we still think we can improve the performance of the system
We then design an ANN with five hidden
layers, a combination of logistic and
hyperbolic tangent activation functions,
train it with the Levenberg-Marquardt
algorithm and obtain an impressive
classification rate of 99.9975% with
the following decision boundary
Satisfied with our classifier, we integrate the system and deploy it to the
fish processing plant
After a few days, the plant manager calls to complain that the system is
misclassifying an average of 25% of the fish
What went wrong?
[Figure: scatter plot of length vs. avg. scale intensity for salmon and sea bass, with the ANN's decision boundary]
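The gap between training performance and field performance can be reproduced with a much simpler stand-in for the ANN. The sketch below swaps in a 1-nearest-neighbour rule, which memorizes every training example, and compares it with a hand-picked threshold on held-out data. The data, the two noisy training labels and the threshold of 5.0 are contrived assumptions meant only to make the effect visible.

```python
def nearest_neighbor_predict(train, x):
    """Label of the training example closest to x (memorizes the training set)."""
    return min(train, key=lambda item: abs(item[0] - x))[1]


def threshold_predict(x, t=5.0):
    """Simple rule: intensity above t means sea bass (t is hand-picked)."""
    return "sea bass" if x > t else "salmon"


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


# Hypothetical (avg. scale intensity, label) pairs. Two training examples are atypical.
train = [(2.0, "salmon"), (3.0, "salmon"), (4.0, "salmon"), (6.5, "salmon"),
         (3.2, "sea bass"), (6.0, "sea bass"), (7.0, "sea bass"), (8.0, "sea bass")]
test = [(2.4, "salmon"), (3.3, "salmon"), (3.15, "salmon"), (4.2, "salmon"),
        (6.4, "sea bass"), (6.6, "sea bass"), (7.4, "sea bass"), (5.8, "sea bass")]


def nn(x):
    return nearest_neighbor_predict(train, x)


print("1-NN      train:", accuracy(nn, train), "test:", accuracy(nn, test))
print("threshold train:", accuracy(threshold_predict, train),
      "test:", accuracy(threshold_predict, test))
```

The memorizing rule is perfect on the data it was fitted to and poor on new fish, while the simpler rule does worse on the training set and better on the held-out set. Estimating performance on data not used for training is what exposes the difference.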