Classification

Classifiers

Generative classifiers
• A generative classifier tries to learn the model that generates the data behind the scenes by estimating the assumptions and distributions of the model.
• It then uses this to predict unseen data, because it assumes that the model it learned captures the real model.
• A generative model such as Naive Bayes assumes that all the features are conditionally independent given the class.
• A generative model learns the joint probability distribution p(x, y).
• An example is the Naive Bayes classifier (see the sketch below).
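A minimal sketch (not from the slides), assuming scikit-learn and its built-in iris dataset: GaussianNB is a generative classifier in the sense above, since it estimates a class prior p(y) and per-class feature distributions p(x|y), i.e. the joint distribution p(x, y).

from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

model = GaussianNB()
model.fit(X, y)              # estimates class priors and per-class feature Gaussians
print(model.predict(X[:5]))  # predicts the class y that maximises p(y) * p(x|y)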
Discriminative Classifiers

• A discriminative classifier tries to model the class directly from the observed data. It makes fewer assumptions about the distributions but depends heavily on the quality of the data.
• A discriminative model does not assume anything about the independence of the features.
• A discriminative model learns the conditional probability distribution p(y|x).
• Examples are Logistic Regression, Support Vector Machines and Nearest Neighbour (see the sketch below).
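A minimal sketch (not from the slides), again assuming scikit-learn and the iris dataset: LogisticRegression is a discriminative classifier in the sense above, since it models p(y|x) directly without modelling how x is generated.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

clf = LogisticRegression(max_iter=1000)
clf.fit(X, y)                    # learns weights for p(y|x) directly
print(clf.predict_proba(X[:3]))  # conditional class probabilities p(y|x)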
When to use a generative classifier or a discriminative classifier?

• In practice, discriminative classifiers outperform generative classifiers if you have a lot of data.
• Machine learning models have one sole purpose: to generalize well.
• Generalization is the model's ability to give sensible outputs for sets of inputs that it has never seen before.
• A model that generalizes well is a model that is neither underfit nor overfit.
Underfitting and overfitting
The cause of poor performance in machine learning is either
overfitting or underfitting the data.

Underfitting: when the model is too simple, both training and test errors are large
(the model has "not learned enough").

Overfitting: when the model is too complex, the training error is small but the test error is
large (the model has learned "too much").
Overfitting in Machine Learning

• Overfitting refers to a model that models the training data too well.
• Overfitting happens when our model does not generalize well from our training data to unseen data.
• Overfitting is the case where the overall cost is very small, but the generalization of the model is unreliable. This is due to the model learning "too much" from the training data set.
• Overfitting happens when a model learns the detail and noise in the training data to the extent that it negatively impacts the performance of the model on new data.
• If we train for too long, we will overfit the data, which means that we have learnt about the noise and inaccuracies in the data as well as the actual function. The model that we learn will therefore be much too complicated and will not be able to generalise.
• The problem is that these details do not apply to new data and negatively impact the model's ability to generalize.
• If our model does much better on the training set than on the test set, then we are likely overfitting (see the sketch below).
• Overfitting is more likely with nonparametric and nonlinear models that have more flexibility when learning a target function.
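A minimal sketch (illustrative, not from the slides), assuming scikit-learn and its breast-cancer dataset: a very flexible model such as 1-nearest-neighbour scores perfectly on its own training data but noticeably worse on held-out data, and that gap is the overfitting signal described above.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
print("train accuracy:", clf.score(X_train, y_train))  # 1.0: every training point is its own nearest neighbour
print("test accuracy: ", clf.score(X_test, y_test))    # noticeably lower: a large gap suggests overfitting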
Reasons for Model Overfitting
• Limited Training Size

• High Model Complexity


Underfitting in Machine Learning

• Underfitting refers to a model that can neither model the training data nor generalize to new data.
• An underfit machine learning model is not a suitable model, and this will be obvious because it will have poor performance on the training data.
• Underfitting is the case where the model has "not learned enough" from the training data, resulting in low generalization and unreliable predictions.
How To Limit Overfitting

• Both overfitting and underfitting can lead to poor model performance, but by far the most common problem in applied machine learning is overfitting.
• There are two important techniques that you can use when evaluating machine learning algorithms to limit overfitting:
1. Use a resampling technique to estimate model accuracy (see the sketch below).
2. Hold back a validation dataset (cross-validation).
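A minimal sketch (not from the slides), assuming scikit-learn and the iris dataset: k-fold cross-validation is one such resampling technique, where each fold is held back once as a validation set and the accuracies are averaged.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation: the data is split into 5 folds and each fold is held back once
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores, scores.mean())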
Cross validation
• Leave one out:
• Take (n − 1) data values as the training set and one data value for the test set.
• Then again take (n − 1) data values, leaving a different single data value out for testing.
• Repeat this for each data value and then take the average (see the sketch below).
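A minimal sketch (not from the slides), assuming scikit-learn and the iris dataset: LeaveOneOut generates exactly the splits described above, with each of the n samples used once as the test set.

from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# One split per sample: train on n-1 values, test on the one left out, then average
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=LeaveOneOut())
print(scores.mean())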
Repeated random subsampling
• Select the ratio of the data which will act as the training set and which will act as the test set.
For example: 80% training set & 20% test set. Repeat with several random splits and average the results (see the sketch below).
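A minimal sketch (not from the slides), assuming scikit-learn and the iris dataset: ShuffleSplit draws a fresh random 80/20 split on every repetition, which is the repeated random subsampling described above.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit, cross_val_score

X, y = load_iris(return_X_y=True)

splitter = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)  # 10 random 80%/20% splits
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=splitter)
print(scores.mean())  # accuracy averaged over the 10 random splits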
Supervised learning
• Supervised learning typically begins with an
established set of data and a certain
understanding of how that data is classified.
• Supervised learning is intended to find
patterns in data that can be applied to an
analytics process.
• This data has labeled features that define the
meaning of data.
• For example, there could be millions of images of animals, each including an explanation of what the animal is, and from this you can create a machine learning application that distinguishes one animal from another.
• By labeling this data about types of animals, you may have hundreds of categories of different species.
• When the label is continuous, it is a regression.
• When the data comes from a finite set of values, it is known as classification.
Supervised learning

Supervised learning divides into two tasks: Classification and Regression.
• Regression: a real number associated with a feature vector.
e.g. to predict a house price from training data.
• Classification: a discrete value associated with a feature vector.
e.g. to predict gender.
What is classification?
A machine learning task that deals with identifying the class to which an
instance belongs

A classifier performs classification

[Diagram: a test instance with attributes (a1, a2, … an) is fed to the classifier, which outputs a discrete-valued class label, e.g. Issue Loan? {Yes, No}]
Types of learners in classification
There are two types of learners in classification: lazy learners and eager learners.
• Lazy learners

• Eager learners
Lazy learners
• Lazy learners simply store the training data and wait until a test instance appears.
• When it does, classification is conducted based on the most related data in the stored training data.
• Compared to eager learners, lazy learners have less training time but take more time to predict.
• Ex. k-nearest neighbour, case-based reasoning
Eager learners

• Eager learners construct a classification model based on the given training data before receiving data for classification.
• An eager learner must be able to commit to a single hypothesis that covers the entire instance space.
• Due to the model construction, eager learners take a long time to train and less time to predict.
• Ex. Decision Tree, Naive Bayes, Artificial Neural Networks
Classification learning

Training phase: learning the classifier from the available data, using the 'training set' (labeled).
Testing phase: testing how well the classifier performs, using the 'testing set'.
1. The Classification challenge
Learning of binary classification
• Given: a set of m examples (xi, yi), i = 1, 2, …, m, sampled from some distribution D, where xi ∈ R^n and yi ∈ {−1, +1}.
• Find: a function f: R^n → {−1, +1} which classifies 'well' examples xj sampled from D.

Comments
• The function f is usually a statistical model whose parameters are learnt from the set of examples.
• The set of examples is called the 'training set'.
• Y is called the 'target variable', or 'target'.
• Examples with yi = +1 are called 'positive examples'; examples with yi = −1 are called 'negative examples'.
Some Real life applications
• Systems Biology – Gene expression microarray data:
• Text categorization: spam detection
• Face detection: Signature recognition: Customer discovery
• Medicine: Predict if a patient has heart ischemia by a
spectral analysis of his/her ECG.
• Fraud detection
Microarray data
Separate malignant from healthy
tissues based on the mRNA
expression profile of the tissue.
Face detection
• Discriminating human faces from non-faces.
Signature recognition
• Recognize signatures by structural similarities
which are difficult to quantify.
• Does a signature belong to a specific person, say Tony Blair, or not?
Classification problem

[Scatter plot: feature axes x1 and x2, with unlabelled points marked '?' that need to be classified]
Classification algorithms
– Fisher linear discriminant
– KNN
– Decision tree
– Neural networks
– SVM
– Naïve Bayes
– AdaBoost
– Many, many more…

– Each one has its own properties with respect to bias, speed, accuracy, transparency…
Nearest neighbor
• The simplest approach.
• Makes no a priori assumptions.
• Remembers the training set.
• Calculates the distance from the new x to each stored example and assigns it the label of its nearest neighbour.
• Problem: with noisy data we can get wrong answers.
Properties of KNN
• The following two properties define KNN well −
• Lazy learning algorithm − KNN is a lazy learning algorithm because it does not have a specialized training phase and uses all of the data during classification.
• Non-parametric learning algorithm − KNN is also a non-parametric learning algorithm because it doesn't assume anything about the underlying data.
KNN in simple terms
• We take some number k of nearest neighbours and assign the new x the majority label among its k nearest neighbours (see the sketch below).
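A minimal sketch (not from the slides), assuming scikit-learn and the iris dataset: KNeighborsClassifier stores the training data at fit time and classifies a new point by a majority vote over its k nearest training points.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)  # k = 3 nearest neighbours
knn.fit(X_train, y_train)                  # "training" just stores the data (lazy learner)
print(knn.score(X_test, y_test))           # accuracy on held-out data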
Performance of ML algorithms
There are various metrics which we can use to evaluate the performance of ML algorithms, for classification as well as regression algorithms. We must carefully choose the metrics for evaluating ML performance because −

• How the performance of ML algorithms is measured and compared will depend entirely on the metric you choose.

• How you weight the importance of various characteristics in the result will be influenced completely by the metric you choose.
Performance Metrics for Classification Problems

Confusion Matrix
• It is the easiest way to measure the performance of a classification problem where the output can be of two or more types of classes.
• A confusion matrix is nothing but a table with two dimensions, viz. "Actual" and "Predicted"; furthermore, both dimensions have "True Positives (TP)", "True Negatives (TN)", "False Positives (FP)" and "False Negatives (FN)".
• True Positives (TP) − the case when both the actual class and the predicted class of the data point are 1.
• True Negatives (TN) − the case when both the actual class and the predicted class of the data point are 0.
• False Positives (FP) − the case when the actual class of the data point is 0 and the predicted class is 1.
• False Negatives (FN) − the case when the actual class of the data point is 1 and the predicted class is 0.
We can use the confusion_matrix function of sklearn.metrics to compute the confusion matrix of our classification model, as sketched below.
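A minimal sketch with illustrative labels (not from the slides): sklearn's confusion_matrix puts the actual classes in the rows and the predicted classes in the columns.

from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # actual class labels (illustrative values)
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]  # the classifier's predictions

# For binary 0/1 labels the result is laid out as:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_actual, y_predicted))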
Accuracy

• It is the most common performance metric for classification algorithms. It may be defined as the number of correct predictions made as a ratio of all predictions made. We can easily calculate it from the confusion matrix with the help of the following formula (see the sketch below):

Accuracy = (TP + TN) / (TP + TN + FP + FN)
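A minimal sketch (not from the slides), reusing the illustrative labels above: accuracy computed from the confusion-matrix counts matches sklearn.metrics.accuracy_score.

from sklearn.metrics import accuracy_score, confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # illustrative labels
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_actual, y_predicted).ravel()
print((tp + tn) / (tp + tn + fp + fn))        # accuracy from the formula above
print(accuracy_score(y_actual, y_predicted))  # the same value from sklearn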
If the class data is unbalanced
• Some classes have many more values than other classes.
• If +ve values > −ve values, or −ve values > +ve values, then accuracy is not a perfect measure of performance.
• Recall and precision can be used as performance measures instead.
Performance measures
• Recall may be defined as the number of positives returned by our ML model. We can easily calculate it from the confusion matrix with the help of the following formula:

Recall = sensitivity = TP / (TP + FN)

• Precision, used in document retrieval, may be defined as the fraction of correct documents among those returned by our ML model. We can easily calculate it from the confusion matrix with the help of the following formula (see the sketch below):

Precision = TP / (TP + FP)

(Note: specificity is a different measure, TN / (TN + FP), and should not be confused with precision.)
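A minimal sketch (not from the slides), reusing the same illustrative labels: recall and precision via sklearn.metrics.

from sklearn.metrics import precision_score, recall_score

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # illustrative labels
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]

print(recall_score(y_actual, y_predicted))     # TP / (TP + FN)
print(precision_score(y_actual, y_predicted))  # TP / (TP + FP)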
Advantages of KNN
• It is a very simple algorithm to understand and interpret.
• It is very useful for nonlinear data because there is no assumption about the data in this algorithm.
• It is a versatile algorithm, as we can use it for classification as well as regression.
• It has relatively high accuracy, but there are much better supervised learning models than KNN.
Disadvantages of KNN
• It is a computationally somewhat expensive algorithm because it stores all the training data.
• It requires high memory storage compared to other supervised learning algorithms.
• Prediction is slow when N (the number of training examples) is large.
• It is very sensitive to the scale of the data as well as to irrelevant features.
• Choosing k can be tricky.
