
Compiled By: Er. Santosh Pandeya
Pattern Recognition
 A pattern is everywhere around us in this digital world. A pattern can either be seen physically or observed
mathematically by applying algorithms.
Example: the colors on clothes, speech patterns, etc. In computer science, a pattern is represented using a
vector of feature values.

 Pattern recognition is the ability of machines to identify patterns in data, and then use those patterns to make
decisions or predictions using computer algorithms.
 It is a vital component of modern artificial intelligence (AI) systems.
 Pattern recognition is the use of computer algorithms to recognize regularities and patterns in data.
 This recognition can be done for various input types, such as biometric recognition, color recognition, and
image recognition.
 It is the use of machine learning algorithms to identify patterns.
 It classifies data based on statistical information or knowledge gained from patterns and their representation.



Features of Pattern Recognition

 It has great precision in recognizing patterns.
 It can recognize unfamiliar objects.
 It can recognize objects accurately from various angles.
 It can recover patterns in instances of missing data.
 It can discover patterns that are partly hidden.



How Does Pattern Recognition Work?
 Pattern recognition is applied to data of all types, including image, video, text, and
audio.
 Since a pattern recognition model can identify recurring patterns in data, predictions
made by such models are quite reliable.

Pattern recognition involves three key steps:

analyzing the input data, extracting patterns, and comparing them with the stored data.

The process can be further divided into two phases:

1. Explorative phase: In this phase, computer algorithms explore data patterns.
2. Descriptive phase: Here, algorithms group the identified patterns and attribute them to new data.



These phases can be further subdivided into the following modules:

 Data collection
 Pre-processing
 Feature extraction
 Classification
 Post-processing
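The five modules can be chained into a single pipeline. A minimal sketch, assuming scikit-learn and its bundled iris dataset (illustrative choices, not part of these notes):

```python
# Sketch of the five-module pipeline: data collection, pre-processing,
# feature extraction, classification, and post-processing.
from sklearn.datasets import load_iris               # data collection
from sklearn.preprocessing import StandardScaler     # pre-processing
from sklearn.decomposition import PCA                # feature extraction
from sklearn.linear_model import LogisticRegression  # classification
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)
model = make_pipeline(StandardScaler(), PCA(n_components=2),
                      LogisticRegression(max_iter=200))
model.fit(X, y)
preds = model.predict(X)          # post-processing: inspect predictions
print((preds == y).mean())        # fraction of training samples recovered
```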



Classification Problems:
 Classification is a task that occurs very frequently in everyday life.
 Essentially it involves dividing up objects so that each is assigned to one of a number of mutually
exhaustive and exclusive categories known as classes.
 The term mutually exhaustive (one of the classes must apply) and exclusive (no two classes can apply at
the same time) simply means that each object must be assigned to precisely one class, i.e. never to more
than one and never to none at all. So, the organization of data into a given class is called classification.

Many practical decision-making tasks can be formulated as classification problems, i.e. assigning people or
objects to one of a number of categories.
For example:
• customers who are likely to buy or not buy a particular product in a supermarket,
• people who are at high, medium or low risk of acquiring a certain illness,
• objects on a radar display which correspond to vehicles, people, buildings or trees,
• the likelihood of rain the next day for a weather forecast (very likely, likely, unlikely, very unlikely).



Applications:
•Image processing, segmentation, and analysis
Pattern recognition is used to give machines the human-like recognition intelligence required in
image processing.
•Computer vision
Pattern recognition is used to extract meaningful features from given image/video samples and is used
in computer vision for various applications like biological and biomedical imaging.
•Seismic analysis
The pattern recognition approach is used for the discovery, imaging, and interpretation of temporal
patterns in seismic array recordings. Statistical pattern recognition is implemented and used in
different types of seismic analysis models.
•Speech recognition
The greatest success in speech recognition has been obtained using pattern recognition paradigms. These
paradigms are used in various speech recognition algorithms that try to avoid the problems of a phoneme
level of description and treat larger units, such as words, as patterns.
•Fingerprint identification
Fingerprint recognition technology is a dominant technology in the biometric market. A number of
recognition methods have been used to perform fingerprint matching out of which pattern recognition
approaches are widely used.

 Classification is learning a function that maps a data item into one of several predefined
classes.

 The target function is also known informally as a classification model.



 Data classification can be done in two steps:

 First step
 A model of a predefined set of data classes or concepts is built.
 The model is constructed by analyzing database tuples (i.e. samples, examples or objects)
described by attributes.
 Each tuple is assumed to belong to a predefined class.
 The individual tuples making up the training set are referred to as training samples.
 Since the class label of each training sample is provided, this step is also called supervised
learning (i.e. the learner is told to which class each training sample belongs).



 Second step
 The model is used for classification.
 First, the predictive accuracy of the model (or classifier) is estimated.
 There are several methods for estimating classifier accuracy.
 Test samples are randomly selected and are independent of the training samples.
 The accuracy of the model is measured on this test set:
 the known class label of each test sample is compared with the learned model's class prediction.
 If the accuracy is acceptable, the model can be used to classify future data tuples or objects for
which the class label is not known. (Such data are called "unknown" or "previously unseen" data.)
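The two steps above can be sketched as follows; scikit-learn, its digits dataset, and a decision tree classifier are illustrative assumptions, not prescribed by these notes:

```python
# Step 1: learn a model from labeled training tuples.
# Step 2: estimate its accuracy on independently selected test samples.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
# Test samples are chosen randomly, independent of the training samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # step 1
accuracy = clf.score(X_test, y_test)                                # step 2
print(round(accuracy, 3))
```

If the accuracy is acceptable, `clf.predict` can then be applied to previously unseen tuples.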



Evaluating Classifiers
 An algorithm that implements classification, especially in a concrete implementation, is known as a
classifier.
 The term "classifier" sometimes also refers to the mathematical function, implemented by a classification
algorithm, that maps input data to a category.

a) Bayesian Classifiers:
 They are statistical classifiers.
 They can predict class membership probabilities, e.g. the probability that a given tuple belongs to a
particular class.
 Based on Bayes' Theorem.



b) Naive Bayesian Classifier
 Studies comparing classification algorithms have found that the simple Bayesian classifier known as the
naive Bayesian classifier is comparable in performance with decision tree and selected neural network
classifiers.
 It is based on conditional probability.
 Naive Bayesian classifiers assume that the effect of an attribute value on a given class is
independent of the values of the other attributes. This assumption is called "class conditional
independence".
 It is made to simplify the computations involved, and in this sense the classifier is considered "naive".
 It makes use of all attributes contained in the data and analyzes them individually, as though they were
equally important and independent of each other.
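A minimal sketch of such a classifier, assuming scikit-learn's GaussianNB (an illustrative choice: it models each attribute as an independent Gaussian per class, i.e. class conditional independence):

```python
# Naive Bayes: each attribute contributes independently to the
# class membership probability.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
nb = GaussianNB().fit(X, y)

# Class membership probabilities for the first sample: one column
# per class, summing to 1.
probs = nb.predict_proba(X[:1])
print(probs.round(3))
```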



Bayes Theorem
 X: a data sample whose class label is unknown.
 H: some hypothesis, such as that the data sample X belongs to a specified class C.

 P(H | X): the probability that the hypothesis H holds given the observed data sample X. It is also called the
posterior probability.
• e.g. suppose the world of data samples consists of fruits, described by their color and shape.
Suppose that X is red and round, and that H is the hypothesis that X is an apple.
• Then P(H | X) reflects our confidence that X is an apple given that we have seen that X is red and round. In
contrast, P(H) is the prior probability, or a priori probability, of H. Similarly, P(X) is the prior probability of
X.
 P(X | H): the probability of X conditioned on H, also called the likelihood. That is, it is the probability that X
is red and round given that we know that X is an apple.



Bayes Theorem

Bayes' theorem computes the posterior probability from the likelihood and the prior probabilities:

P(H | X) = P(X | H) P(H) / P(X)


Bayes Theorem Numerical

 While watching a game of Champions League football in a café, you observe
someone who is clearly supporting Manchester United in the game. Using Bayes'
rule, calculate the probability that they were actually born within 30 miles of
Manchester. Assume that:
• The probability that a randomly selected person in a typical local
bar environment is born within 30 miles of Manchester is 1/20
• The chance that a person born within 30 miles of Manchester actually supports
Manchester United is 7/10
• The probability that a person not born within 30 miles of Manchester supports
Manchester United is 1/10
Solution:
Let M: born within 30 miles of Manchester
N: NOT born within 30 miles of Manchester
S: supporter of Manchester United
Given that:
P(M) = 1/20
P(N) = 1 - 1/20 = 19/20
P(S|M) = 7/10
P(S|N) = 1/10
P(M|S) = ?



We have, from Bayes' theorem,
P(M|S) = P(M)·P(S|M) / [P(M)·P(S|M) + P(N)·P(S|N)]
       = (1/20 × 7/10) / (1/20 × 7/10 + 19/20 × 1/10)
       = (7/200) / (26/200)
       = 7/26 Ans
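The calculation above can be checked numerically with Python's `fractions` module:

```python
# Verifying the café example with exact fractions.
from fractions import Fraction

p_m = Fraction(1, 20)          # P(M): born within 30 miles of Manchester
p_n = 1 - p_m                  # P(N) = 19/20
p_s_given_m = Fraction(7, 10)  # P(S|M): supports United given born nearby
p_s_given_n = Fraction(1, 10)  # P(S|N): supports United given born elsewhere

# Bayes' rule: posterior = prior * likelihood / total probability of S
p_m_given_s = (p_m * p_s_given_m) / (p_m * p_s_given_m + p_n * p_s_given_n)
print(p_m_given_s)  # 7/26
```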



c) k-Nearest Neighbor Classifiers
Nearest neighbor classifiers are based on learning by analogy.
The training samples are described by n-dimensional numeric attributes.
Each sample represents a point in an n-dimensional space, i.e. all training samples are stored in an
n-dimensional pattern space.
When given an unknown sample, a k-nearest neighbor classifier searches the pattern space for the k
training samples that are closest to the unknown sample.
These k training samples are the k "nearest neighbors" of the unknown sample.
"Closeness" is defined in terms of Euclidean distance, where the Euclidean distance between two
points, X = (x1, x2, ..., xn) and Y = (y1, y2, ..., yn), is:

d(X, Y) = sqrt( (x1 - y1)^2 + (x2 - y2)^2 + ... + (xn - yn)^2 )
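The nearest-neighbor search can be sketched directly with NumPy; the toy 2-D points and the choice k = 3 are illustrative:

```python
# k-NN sketch: compute the Euclidean distance from the unknown sample
# to every stored training point, take the k closest, and vote.
import numpy as np

train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.1, 4.9], [0.9, 1.1]])
labels = np.array([0, 0, 1, 1, 0])
unknown = np.array([1.1, 1.0])

dists = np.sqrt(((train - unknown) ** 2).sum(axis=1))  # Euclidean distances
k = 3
nearest = np.argsort(dists)[:k]     # indices of the k nearest neighbors
predicted = np.bincount(labels[nearest]).argmax()  # majority vote
print(predicted)  # 0
```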



Contd….

•Nearest neighbor classifiers are instance-based or lazy learners in that they store all of the training
samples and do not build a classifier until a new (unlabeled) sample needs to be classified. This
contrasts with eager learning methods such as decision tree induction and backpropagation, which
construct a generalization model before receiving new samples to classify.
•Lazy learners can incur expensive computational costs when the number of potential neighbors (i.e.,
stored training samples) with which to compare a given unlabeled sample is great. Therefore, they
require efficient indexing techniques.
•Nearest neighbor classifiers can also be used for prediction, i.e. to return a real-valued prediction
for a given unknown sample.



Training , Testing, Validation
Training Dataset:
•a set of examples used for learning
•The sample of data used to fit the model.

Validation Dataset:
•The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning
model hyperparameters.
•The evaluation becomes more biased as skill on the validation dataset is incorporated into the model
configuration.

Test Dataset:
•The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.
•a set of examples used only to assess the performance of a fully-trained classifier
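One common way to obtain the three datasets is two successive random splits; scikit-learn's `train_test_split` and a 60/20/20 ratio are illustrative choices, not prescribed by these notes:

```python
# 60/20/20 train/validation/test split via two successive random splits.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 150 samples
# First split: 60% training, 40% held out.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0)
# Second split: divide the held-out 40% equally into validation and test.
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 90 30 30
```

The model is fit on the training set, hyperparameters are tuned against the validation set, and the test set is touched only once, at the end.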



Over fitting and Complexity

 Overfitting refers to a model that models the training data too well.
 Overfitting happens when a model learns the detail and noise in the training data to the extent that it
negatively impacts the performance of the model on new data.
 This means that the noise or random fluctuations in the training data are picked up and learned as concepts
by the model.
 The problem is that these concepts do not apply to new data and negatively impact the model's ability to
generalize.
 Overfitting is more likely with nonparametric and nonlinear models that have more flexibility when learning
a target function.
 As such, many nonparametric machine learning algorithms also include parameters or techniques to limit
and constrain how much detail the model learns.



For example, decision trees are a nonparametric machine learning algorithm that is very flexible and is subject to
overfitting the training data. This problem can be addressed by pruning a tree after it has learned, in order to
remove some of the detail it has picked up.
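A sketch of this effect with a decision tree; the breast cancer dataset and the depth limit (a simple stand-in for pruning) are illustrative assumptions:

```python
# An unrestricted decision tree memorizes the training data (near-perfect
# training score); constraining its depth limits the detail it learns.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
pruned = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

print(full.score(X_tr, y_tr))    # near-perfect: memorizes training data
print(full.score(X_te, y_te))    # generalization is weaker
print(pruned.score(X_te, y_te))  # constrained tree on the same test set
```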
 Complexity (running time) increases with the dimension d.
 A lot of methods have at least O(nd²) complexity, where n is the number of samples —
for example, if we need to estimate a covariance matrix. So as d becomes large, O(nd²) complexity may be
too costly.
 If d is large, then n, the number of samples, may be too small for accurate parameter estimation.
 For example, a covariance matrix has d² parameters:
for accurate estimation, n should be much bigger than d²;
otherwise the model is too complicated for the data.



Assignment-03
1. What are the main types of classification problems in pattern recognition?
2. How does a binary classification problem differ from a multi-class classification problem?
3. What are some common metrics used to evaluate the performance of a classifier?
4. How does the k-Nearest Neighbors (k-NN) algorithm work?
5. Describe the impact of feature scaling on the performance of k-NN.
6. Why is it important to split your dataset into training, testing, and validation sets?
7. Explain the concept of cross-validation and its advantages.
8. What is overfitting in the context of machine learning and pattern recognition?
9. How can overfitting be detected and prevented?
10.Explain the role of regularization in controlling model complexity.

- Must be submitted within 7 days.

