
Paradigms for Pattern Recognition

Different Paradigms for Pattern Recognition
• There are several paradigms in use to solve the pattern recognition
problem.

• The two main paradigms are

1. Statistical Pattern Recognition
2. Syntactic Pattern Recognition

• Of the two, statistical pattern recognition has been more popular and
has received major attention in the literature.

• The main reason is that most practical problems in this area involve
noisy data and uncertainty, and probability and statistics are good
tools for dealing with such problems.

• On the other hand, formal language theory provides the background for
syntactic pattern recognition. Systems based on such linguistic tools,
more often than not, are not ideally suited to deal with noisy
environments. However, they are powerful in dealing with well-structured
domains. Also, there has recently been growing interest in statistical
pattern recognition because of the influence of statistical learning theory.

• This naturally prompts us to orient the material in this course towards
statistical classification and clustering.

Statistical Pattern Recognition

• In statistical pattern recognition, we use vectors to represent patterns
and class labels drawn from a label set.

• The abstractions typically deal with probability densities/distributions
of points in multi-dimensional spaces, trees and graphs, rules, and
vectors themselves.

• Because of the vector space representation, it is meaningful to talk of
subspaces/projections and similarity between points in terms of distance
measures.

• There are several soft computing tools associated with this notion. Soft
computing techniques are tolerant of imprecision, uncertainty and
approximation. These tools include neural networks, fuzzy systems and
evolutionary computation.

• For example, the vectorial representation of points and classes is also
employed by

– neural networks,
– fuzzy set and rough set based pattern recognition schemes.

• In pattern recognition, we assign labels to patterns. This is achieved
using a set of semantically labelled patterns; such a set is called the
training data set. It is obtained in practice based on inputs from
experts.

• In Figure 1, there are patterns of Class ‘X’ and Class ‘+’.

[Figure 1: Example set of patterns. A scatter plot over features f1
(horizontal axis) and f2 (vertical axis) showing training patterns X1 to
X9 of Class ‘X’ and Class ‘+’, together with the test pattern P.]

• The pattern P is a new sample (test sample) which has to be assigned
either to Class ‘X’ or Class ‘+’. There are different possibilities; some
of them are listed below (a code sketch following the list illustrates
several of them).

– The nearest neighbour classifier (NNC): Here, P is assigned to the
class of its nearest neighbour. Note that pattern X1 (labelled ‘X’)
is the nearest neighbour of P. So, the test pattern P is assigned
the class label ‘X’. The nearest neighbour classifier is explained in
Module 7.
– The K-nearest neighbour classifier (KNNC) is based on the class
labels of the K nearest neighbours of the test pattern P. Note that
patterns X1 (from class ‘X’), X6 (from class ‘+’) and X7 (from
class ‘+’) are the first three (K = 3) neighbours. A majority (2 out
of 3) of the neighbours are from class ‘+’. So, P is assigned the
class label ‘+’. We discuss the KNNC in Module 7.
– Decision stump classifier: In this case, each of the two features is
considered for splitting; the one which provides the best separation
between the two classes is chosen. The test pattern is classified
based on this split. So, in the example, the test pattern P is
classified based on whether its first feature (x-coordinate) value is
less than a threshold A or not. If it is less than A, then the class
is ‘X’; else it is ‘+’. In Figure 1, P is assigned to class ‘X’. A
generalization of the decision stump is the decision tree classifier.

– Separating line as decision boundary: In Figure 1, the two classes
may be characterized in terms of the boundary patterns falling
on the support lines. In the example, pattern X1 (class ‘X’) falls
on one line (say line 1) and patterns X5 and X7 (of class ‘+’)
fall on a parallel line (line 2). So, any pattern closer to line 1 is
assigned the class label ‘X’ and, similarly, patterns closer to line 2
are assigned the class label ‘+’. We discuss classifiers based on such
linear discriminants in Module 12. Neural networks and support
vector machines (SVMs) are members of this category.

– It is possible to use a combination of classifiers to classify a test
pattern. For example, P could be classified using weighted nearest
neighbours. Suppose such a weighted classifier assigns a weight of
0.4 to the first neighbour (pattern X1, labelled ‘X’), a weight of
0.35 to the second neighbour (pattern X6 from class ‘+’) and a
weight of 0.25 to the third neighbour (pattern X7 from class ‘+’).
We first add the weights of the neighbours of P coming from the
same class. So, the sum of the weights for class ‘X’, WX, is 0.4, as
only the first neighbour is from ‘X’. The sum of the weights for
class ‘+’, W+, is 0.6 (0.35 + 0.25), corresponding to the remaining
two neighbours (X6 and X7) from class ‘+’. So, P is assigned the
class label ‘+’.
– In a system that is built to classify humans into tall, medium and
short, the abstractions, learnt from examples, facilitate assigning
one of these class labels (tall, medium or short) to a newly
encountered human. Here, the class labels are semantic; they convey
some meaning.
– In the case of clustering, we can also group a collection of unlabelled
patterns; in such a case, the labels assigned to each group of
patterns are syntactic, simply the cluster identity.
– Several times, there is a large training data set which could be
used directly for classification. In such a context, clustering can
be used to generate abstractions of the data, and these abstractions
can be used for classification. For example, the set of patterns
corresponding to each of the classes can be clustered to form
subclasses. Each such subclass (cluster) can be represented by a single
prototypical pattern; these representative patterns can be used to
build the classifier instead of the entire data set.
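
The sketch below, in plain Python, illustrates the NNC, the KNNC, the
weighted nearest-neighbour combination and the decision stump discussed
above. The 2-D coordinates and the test pattern P are invented for
illustration (they only mimic the layout of Figure 1), and the neighbour
weights 0.4, 0.35 and 0.25 are the ones used in the weighted example;
with these assumed coordinates, the outputs mirror the discussion: ‘X’
for the NNC and the stump, ‘+’ for the KNNC and the weighted variant.

import math
from collections import Counter

# Hypothetical (f1, f2) coordinates with class labels; they are invented
# here and only mimic the layout of Figure 1.
training_data = [
    ((1.0, 2.0), 'X'), ((1.4, 3.0), 'X'), ((1.0, 1.0), 'X'), ((2.1, 2.1), 'X'),
    ((2.6, 1.9), '+'), ((2.8, 2.4), '+'), ((4.5, 1.5), '+'),
    ((5.0, 2.8), '+'), ((4.8, 1.0), '+'),
]
P = (2.2, 2.0)  # test pattern, also invented

def distance(a, b):
    # Euclidean distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nnc(test, data):
    # Nearest neighbour classifier: label of the single closest pattern.
    _, label = min(data, key=lambda item: distance(test, item[0]))
    return label

def knnc(test, data, k=3):
    # KNNC: majority vote among the K nearest neighbours.
    neighbours = sorted(data, key=lambda item: distance(test, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def weighted_knnc(test, data, weights=(0.4, 0.35, 0.25)):
    # Weighted variant: add the given weights per class over the first
    # len(weights) neighbours and pick the class with the larger sum.
    neighbours = sorted(data, key=lambda item: distance(test, item[0]))
    sums = Counter()
    for w, (_, label) in zip(weights, neighbours):
        sums[label] += w
    return sums.most_common(1)[0][0]

def decision_stump(test, data):
    # Try a threshold on each single feature, keep the split that makes
    # the fewest errors on the training data, and classify with it.
    best = None  # (errors, feature index, threshold, left label, right label)
    for f in range(2):
        values = sorted({x[f] for x, _ in data})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between two values
            left = [label for x, label in data if x[f] < t]
            right = [label for x, label in data if x[f] >= t]
            for l_lab, r_lab in (('X', '+'), ('+', 'X')):
                errors = left.count(r_lab) + right.count(l_lab)
                if best is None or errors < best[0]:
                    best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return l_lab if test[f] < t else r_lab

print('NNC           :', nnc(P, training_data))             # 'X'
print('KNNC (K = 3)  :', knnc(P, training_data))            # '+'
print('Weighted KNNC :', weighted_knnc(P, training_data))   # '+'
print('Decision stump:', decision_stump(P, training_data))  # 'X'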

Importance of Representation

• It is possible to directly use a classification rule without generating
any abstraction, for example by using the NNC.

• In such a case, the notion of proximity/similarity (or distance) is used
to classify patterns.

• Such a similarity function is computed based on the representation of
patterns; the representation scheme plays a crucial role in classification.

• A pattern is represented as a vector of feature values.

• The features which are used to represent patterns are important. We
illustrate this with the help of the following example.

Example

Consider the following data where humans are to be categorized into tall
and short. The classes are represented using the feature Weight.

Weight of human (in kg)    Class label
40                         tall
50                         short
60                         tall
70                         short

If a newly encountered person weighs 46 kg, then he/she may be assigned
the class label short because 46 is closer to 50. However, such an
assignment does not appeal to us because we know that weight and the
class labels tall and short do not correlate well; a feature such as
Height is more appropriate. Module 2 deals with representation of
patterns and classes.
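
The point can be checked with a minimal Python sketch, using the table
above as the training data: 1-NN on the single feature Weight indeed
outputs short for a 46 kg person, even though the feature tells us
little about the class.

# 1-NN on the single feature Weight, using the table above as training data.
data = [(40, 'tall'), (50, 'short'), (60, 'tall'), (70, 'short')]

def classify_by_weight(weight):
    # Return the label of the training sample whose weight is closest.
    _, label = min(data, key=lambda item: abs(item[0] - weight))
    return label

print(classify_by_weight(46))  # prints 'short': 46 is closest to 50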

Assignment
1. Consider a collection of data items bought in a supermarket. The
features include the cost of the item, the volume of the item and the
class label. The data is shown in the following table. Consider a new
item with cost = 34 and volume = 8. How do you classify this item using
the NNC? How about the KNNC with K = 3?

Item no.   Cost (in Rs.)   Volume (in cm³)   Class label
1          10              6                 inexpensive
2          15              6                 inexpensive
3          25              6                 inexpensive
4          50              10                expensive
5          45              10                expensive
6          47              12                expensive

2. Consider the problem of classifying objects into triangles and rectangles.
Which paradigm do you use? Provide an appropriate representation.

3. Consider a variant of the previous problem where the classes are small
circle and big circle. How do you classify such objects?


Further Reading

[1] is an introductory book on Pattern Recognition with several worked-out
examples. [2] is an excellent book on Pattern Classification. [5] is a
book on data mining. [3] is a book on artificial intelligence which
discusses learning and pattern recognition techniques as a part of
artificial intelligence. The use of neural networks for pattern
classification is covered in [4].

References

[1] V. Susheela Devi and M. Narasimha Murty. Pattern Recognition: An
Introduction. Universities Press, Hyderabad, 2011.

[2] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John
Wiley and Sons, 2000.

[3] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach.
Pearson India, 2003.

[4] C. M. Bishop. Neural Networks for Pattern Recognition. Oxford
University Press, New Delhi, 2003.

[5] P. N. Tan, M. Steinbach, and V. Kumar. Introduction to Data Mining.
Pearson India, 2007.
