Santosh Pandeya
Pattern Recognition
Patterns are everywhere in the digital world. A pattern can either be observed physically or detected mathematically by applying algorithms.
Example: the colors on clothes, speech patterns, etc. In computer science, a pattern is represented using a vector of feature values.
Pattern recognition is the ability of machines to identify patterns in data, and then use those patterns to make
decisions or predictions using computer algorithms.
It’s a vital component of modern artificial intelligence (AI) systems.
Pattern recognition is the use of computer algorithms to recognize regularities and patterns in data.
This recognition can be applied to various input types, such as biometric data, color, and images.
It is the use of machine learning algorithms to identify patterns.
It classifies data based on statistical information or knowledge gained from patterns and their representation.
Pattern recognition proceeds in two phases:
1. Explorative phase: In this phase, computer algorithms explore the data for patterns.
2. Descriptive phase: Here, algorithms group the identified patterns and attribute them to new data.
A pattern recognition system typically involves the following stages:
• Data collection
• Pre-processing
• Feature extraction
• Classification
• Post-processing
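For illustration, here is a minimal sketch of these stages in Python; the dataset, scaler, feature extractor, and classifier are assumptions chosen for demonstration (using scikit-learn), not prescribed by these notes:

```python
# A minimal sketch of the pattern recognition stages: data collection,
# pre-processing, feature extraction, classification, post-processing.
from sklearn.datasets import load_digits              # data collection
from sklearn.preprocessing import StandardScaler      # pre-processing
from sklearn.decomposition import PCA                 # feature extraction
from sklearn.linear_model import LogisticRegression   # classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipeline = Pipeline([
    ("scale", StandardScaler()),          # normalize raw pixel values
    ("features", PCA(n_components=20)),   # reduce to 20 feature dimensions
    ("classify", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X_train, y_train)

# Post-processing: inspect the results, e.g. overall accuracy.
print("Test accuracy:", pipeline.score(X_test, y_test))
```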
Many practical decision-making tasks can be formulated as classification problems, i.e. assigning people or objects to one of a number of categories. For example:
• customers who are likely or unlikely to buy a particular product in a supermarket,
• people who are at high, medium, or low risk of acquiring a certain illness,
• objects on a radar display which correspond to vehicles, people, buildings, or trees,
• the likelihood of rain the next day for a weather forecast (very likely, likely, unlikely, very unlikely).
Classification is learning a function that maps a data item into one of several predefined classes.
First Step
• A model of a predefined set of data classes or concepts is made.
• The model is constructed by analyzing database tuples (i.e. samples, examples, or objects) described by attributes.
• Each tuple is assumed to belong to a predefined class.
• The individual tuples making up the training set are referred to as training samples.
• Since the class label of each training sample is provided, this step is also called supervised learning (i.e. the learner is told to which class each training sample belongs).
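A minimal sketch of this first step, assuming a toy set of labeled tuples and a scikit-learn decision tree (both are assumptions for demonstration only):

```python
# Each training sample is a tuple described by attributes, and each
# tuple is assumed to belong to a predefined class (its label).
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training samples: [age, income] tuples with class labels.
training_samples = [[25, 30000], [40, 80000], [35, 60000], [22, 20000]]
class_labels = ["no_buy", "buy", "buy", "no_buy"]  # labels given => supervised

# First step: construct the model by analyzing the labeled tuples.
model = DecisionTreeClassifier().fit(training_samples, class_labels)
print(model.predict([[30, 50000]]))  # classify a new, unlabeled tuple
```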
a) Bayesian Classifiers:
They are statistical classifiers.
They can predict class membership probabilities, e.g. the probability that a given tuple belongs to a particular class.
They are based on Bayes' theorem.
P(H | X): the probability that the hypothesis H holds given the observed data sample X. It is also called the posterior probability, or a posteriori probability, of H.
• e.g. suppose the world of data samples consists of fruits, described by their color and shape.
Suppose that X is red and round, and that H is the hypothesis that X is an apple.
• Then P(H | X) reflects our confidence that X is an apple given that we have seen that X is red and round. In
contrast, P(H) is the prior probability, or a priori probability of H. Similarly, P(X) is the prior probability of
X.
P(X | H): the posterior probability of X conditioned on H. That is, it is the probability that X is red and round given that we know that it is true that X is an apple.
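These quantities are related by Bayes' theorem:

$$P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)}$$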
Example (applying the theorem to events M, N, and S):

$$P(M \mid S) = \frac{P(M)\, P(S \mid M)}{P(M)\, P(S \mid M) + P(N)\, P(S \mid N)} = \frac{\frac{1}{20} \cdot \frac{7}{10}}{\frac{1}{20} \cdot \frac{7}{10} + \frac{19}{20} \cdot \frac{1}{10}} = \frac{7}{26}$$
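As a sketch of a Bayesian classifier in practice, assuming scikit-learn's Gaussian naive Bayes and its bundled iris dataset (both assumptions; the notes do not prescribe a library):

```python
# Gaussian naive Bayes: estimates P(H) and P(X|H) from the training
# samples, then uses Bayes' theorem to compute P(H|X) for new tuples.
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
clf = GaussianNB().fit(X, y)

# predict_proba returns the posterior class-membership probabilities
# P(H|X) for each class hypothesis H.
print(clf.predict_proba(X[:1]))
print(clf.predict(X[:1]))  # the class with the highest posterior
```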
• Nearest neighbor classifiers are instance-based or lazy learners in that they store all of the training samples and do not build a classifier until a new (unlabeled) sample needs to be classified. This contrasts with eager learning methods such as decision tree induction and back propagation, which construct a generalization model before receiving new samples to classify.
• Lazy learners can incur expensive computational costs when the number of potential neighbors (i.e., stored training samples) with which to compare a given unlabeled sample is large. Therefore, they require efficient indexing techniques.
• Nearest neighbor classifiers can also be used for prediction, i.e. to return a real-valued prediction for a given unknown sample.
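A minimal k-nearest-neighbor sketch, assuming scikit-learn (the choice of k = 3, the kd-tree index, and the dataset are illustrative assumptions):

```python
# k-NN is a lazy learner: fit() essentially just stores the training
# samples; the real work happens at prediction time, when neighbors
# of the new sample are searched.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# algorithm="kd_tree" requests an indexing structure so neighbor
# search stays efficient as the number of stored samples grows.
knn = KNeighborsClassifier(n_neighbors=3, algorithm="kd_tree").fit(X, y)
print(knn.predict(X[:1]))
```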
Validation Dataset:
• The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters.
• The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.
Test Dataset:
• The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.
• A set of examples used only to assess the performance of a fully-trained classifier.
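For illustration, one common way to carve out the three datasets, assuming scikit-learn's train_test_split and a 60/20/20 split (both are assumptions for demonstration):

```python
# Split the data into training, validation, and test sets by
# applying train_test_split twice (60% / 20% / 20% here).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# First split off the test set (20%), which is touched only once,
# at the very end, to evaluate the final model.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Then split the remainder into training (60%) and validation (20%);
# the validation set is used repeatedly while tuning hyperparameters.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0)
```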
Overfitting refers to a model that models the training data too well.
Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data.
This means that the noise or random fluctuations in the training data are picked up and learned as concepts by the model.
The problem is that these concepts do not apply to new data and negatively impact the model's ability to generalize.
Overfitting is more likely with nonparametric and nonlinear models that have more flexibility when learning a target function.
As such, many nonparametric machine learning algorithms also include parameters or techniques to limit and constrain how much detail the model learns.
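As a sketch of one such constraint, assuming a scikit-learn decision tree (a flexible nonparametric model) whose max_depth parameter limits how much detail is learned; the dataset and depth value are illustrative assumptions:

```python
# An unconstrained decision tree can memorize the training data
# (overfit); capping max_depth limits how much detail and noise it
# can learn, which typically narrows the train/test accuracy gap.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (None, 3):  # None = grow the tree fully; 3 = constrained
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: train={tree.score(X_train, y_train):.2f}, "
          f"test={tree.score(X_test, y_test):.2f}")
```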