U02Lecture07 Classification
Unit 02 Lecture 07
Classification
A sliding cutoff can then be used to convert the score to a
decision.
The general approach is as follows:
1. Establish a cutoff probability for the class of interest, above
which we consider a record as belonging to that class.
2. Estimate (with any model) the probability that a record
belongs to the class of interest.
3. If that probability is above the cutoff probability, assign the
new record to the class of interest.
The higher the cutoff, the fewer the records predicted as 1—
that is, as belonging to the class of interest.
The lower the cutoff, the more the records predicted as 1.
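As a quick illustration (a minimal sketch, not taken from the lecture notebook), the cutoff step can be applied to the predicted probabilities of any fitted model:

import numpy as np

# Hypothetical predicted probabilities of class 1 for five records
proba = np.array([0.91, 0.42, 0.67, 0.08, 0.55])

cutoff = 0.5                        # raise it to predict fewer 1s, lower it to predict more
predictions = (proba >= cutoff).astype(int)
print(predictions)                  # [1 0 1 0 1]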
Naive Bayes Classifier
Naive Bayes is one of the most straightforward and fast classification algorithms, and it is suitable for large volumes of data.
The Naive Bayes classifier is successfully used in various applications such as spam filtering, text classification, sentiment analysis, and recommender systems.
It uses Bayes' theorem of probability to predict the class of unknown records.
Whenever you perform classification, the first step is to understand the problem and identify potential features and the label.
Features are those characteristics or attributes that affect the value of the label.
For example, in the case of loan distribution, bank managers identify the customer's occupation, income, age, location, previous loan history, transaction history, and credit score. These characteristics are known as features that help the model classify customers.
Naive Bayes Classifier
Classification has two phases: a learning phase and an evaluation phase.
In the learning phase, the classifier trains its model on a given dataset, and in the evaluation phase, it tests the classifier's performance.
Performance is evaluated on the basis of various parameters such as accuracy, error, precision, and recall.
Naive Bayes Classifier
The Naive Bayes classifier is a fast, accurate, and reliable algorithm.
Naive Bayes classifiers have high accuracy and speed on large datasets.
The Naive Bayes classifier assumes that the effect of a particular feature in a class is independent of the other features.
For example, whether a loan applicant is desirable or not depends on his/her income, previous loan and transaction history, age, and location. Even if these features are interdependent, they are still considered independently.
This assumption simplifies computation, and that is why it is considered naive. This assumption is called class-conditional independence.
Naive Bayes Classifier
Bayes' theorem is also known as Bayes' rule or Bayes' law. It is used to determine the probability of a hypothesis given prior knowledge, and it depends on conditional probability.
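In its standard form, for a hypothesis H and observed evidence E, the theorem reads:

P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}

where P(H) is the prior probability of the hypothesis, P(E | H) is the likelihood of the evidence given the hypothesis, and P(H | E) is the posterior probability used for prediction.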
Naive Bayes Classifier
There are two likelihood tables: Likelihood Table 1 shows the prior probabilities of the labels, and Likelihood Table 2 shows the posterior probabilities.
Naive Bayes Classifier
Probability of playing:
P(Yes | Overcast) = P(Overcast | Yes) P(Yes) / P(Overcast) ..... (1)
Calculate the prior probabilities:
P(Overcast) = 4/14 = 0.29
P(Yes) = 9/14 = 0.64
Calculate the likelihood:
P(Overcast | Yes) = 4/9 = 0.44
Put these probabilities into equation (1):
P(Yes | Overcast) = 0.44 * 0.64 / 0.29 = 0.97 (higher)
Similarly, calculate the probability of not playing:
P(No | Overcast) = P(Overcast | No) P(No) / P(Overcast) ..... (2)
Calculate the prior probabilities:
P(Overcast) = 4/14 = 0.29
P(No) = 5/14 = 0.36
Calculate the likelihood:
P(Overcast | No) = 0/5 = 0
Put these probabilities into equation (2):
P(No | Overcast) = 0 * 0.36 / 0.29 = 0
The probability of the 'Yes' class is higher, so you can conclude that if the weather is overcast, the players will play.
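The same numbers can be checked quickly in Python, using only the counts quoted above (the exact fractions give 1.0; the rounded slide values give roughly 0.97):

p_overcast = 4 / 14            # prior probability of Overcast
p_yes = 9 / 14                 # prior probability of Yes
p_overcast_given_yes = 4 / 9   # likelihood of Overcast given Yes

p_yes_given_overcast = p_overcast_given_yes * p_yes / p_overcast
print(p_yes_given_overcast)    # 1.0 with exact fractions; about 0.97 with the rounded values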
Naive Bayes Classifier with multiple features
Now suppose you want to calculate the probability of playing when the weather is overcast and the temperature is mild.
Probability of playing:
Bayes' theorem, combined with the naive independence assumption, says:
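Written out in its standard form (the denominator P(Overcast, Mild) is the same for both classes, so it can be dropped when only comparing the classes):

P(\mathrm{Yes} \mid \mathrm{Overcast}, \mathrm{Mild}) \propto P(\mathrm{Overcast} \mid \mathrm{Yes})\; P(\mathrm{Mild} \mid \mathrm{Yes})\; P(\mathrm{Yes})

P(\mathrm{No} \mid \mathrm{Overcast}, \mathrm{Mild}) \propto P(\mathrm{Overcast} \mid \mathrm{No})\; P(\mathrm{Mild} \mid \mathrm{No})\; P(\mathrm{No})

The class with the larger value is the predicted class.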
Naive Bayes Classifier with multiple features
Frequency table totals: Play = Yes: 9, Play = No: 5, Total: 14 records.
Types of Naive Bayes Classifier
There are three types of Naive Bayes model, which are given below:
Gaussian: The Gaussian model assumes that features follow a normal distribution. This means that if predictors take continuous values instead of discrete ones, the model assumes these values are sampled from a Gaussian distribution.
Multinomial: The Multinomial Naive Bayes classifier is used when the data is multinomially distributed. It is primarily used for document classification problems, i.e., deciding which category a particular document belongs to, such as sports, politics, education, etc. The classifier uses the frequency of words as the predictors.
Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the predictor variables are independent Boolean variables, such as whether a particular word is present or not in a document. This model is also well known for document classification tasks.
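All three variants are available in scikit-learn; a minimal sketch of the corresponding classes (assuming scikit-learn is installed):

from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

gnb = GaussianNB()      # continuous features assumed to be normally distributed
mnb = MultinomialNB()   # count features, e.g. word frequencies in documents
bnb = BernoulliNB()     # binary features, e.g. word present / absent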
Naive Bayes Classifier (colab code)
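A minimal sketch of the two phases (learning and evaluation) with scikit-learn's GaussianNB, using the built-in Iris data as a stand-in dataset; the actual Colab notebook may differ:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load a small illustrative dataset (continuous features, so Gaussian NB is appropriate)
X, y = load_iris(return_X_y=True)

# Learning phase: fit the classifier on the training split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
model = GaussianNB()
model.fit(X_train, y_train)

# Evaluation phase: assess performance on the held-out test split
y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))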
Discriminant Analysis
Discriminant analysis is the earliest statistical classifier.
It was introduced by R. A. Fisher in 1936.
While discriminant analysis encompasses several techniques, the most commonly used is linear discriminant analysis, or LDA.
It has many other applications and, like principal component analysis (PCA), can also be used for dimensionality reduction.
Discriminant Analysis
Linear Discriminant Analysis (LDA) is a supervised learning algorithm used
for classification tasks in machine learning. It is a technique used to find
a linear combination of features that best separates the classes in a
dataset.
LDA works by projecting the data onto a lower-dimensional space that
maximizes the separation between the classes. It does this by finding
a set of linear discriminants that maximize the ratio of between-class
variance to within-class variance. In other words, it finds the
directions in the feature space that best separate the different classes
of data.
LDA assumes that the data has a Gaussian distribution. It also assumes
that the data is linearly separable, meaning that a linear decision
boundary can accurately classify the different classes.
To understand discriminant analysis, it is first necessary to introduce the
concept of covariance between two or more variables.
Covariance Matrix
The covariance measures the relationship between two variables x and z.
Denote the mean of each variable by x̄ and z̄.
The covariance s_x,z between x and z is given by:
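This is the standard sample covariance (written here with the usual n - 1 divisor, matching the sample variance convention), for n records with paired values (x_i, z_i):

s_{x,z} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(z_i - \bar{z})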
Covariance Matrix
As with the correlation coefficient, positive values indicate a positive relationship and negative values indicate a negative relationship.
Correlation, however, is constrained to be between -1 and 1, whereas the scale of the covariance depends on the scale of the variables x and z.
The covariance matrix Σ for x and z consists of the individual variable variances, s_x^2 and s_z^2, on the diagonal (where row and column are the same variable) and the covariances between variable pairs on the off-diagonals:
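Written out for the two variables x and z:

\hat{\Sigma} = \begin{bmatrix} s_x^2 & s_{x,z} \\ s_{x,z} & s_z^2 \end{bmatrix}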
Fisher’s Linear Discriminant
Fisher's linear discriminant distinguishes variation between groups, on the one hand, from variation within groups on the other.
Dividing the records into two groups, linear discriminant analysis (LDA) focuses on maximizing the "between" sum of squares SS_between (measuring the variation between the two groups) relative to the "within" sum of squares SS_within (measuring the within-group variation).
LDA projects data from a D-dimensional feature space down to a D'-dimensional space (D > D') in a way that maximizes the variability between the classes while reducing the variability within the classes.
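Stated as the standard multivariate criterion, LDA seeks the weight vector w that maximizes the ratio of between-class to within-class scatter:

w^{*} = \underset{w}{\arg\max}\; \frac{w^{T} S_B\, w}{w^{T} S_W\, w}

where S_B and S_W are the between-class and within-class scatter matrices.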
Fisher’s Linear Discriminant
For implementation and example see colab notebook
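A minimal sketch along the same lines, using scikit-learn's LinearDiscriminantAnalysis on the built-in Wine data as a stand-in; the actual notebook may differ:

from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load a small multi-class dataset as a stand-in (3 classes, 13 features)
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Fit LDA as a classifier; n_components controls the reduced dimension D'
lda = LinearDiscriminantAnalysis(n_components=2)
X_train_proj = lda.fit_transform(X_train, y_train)   # data projected to 2 dimensions
print("Test accuracy:", accuracy_score(y_test, lda.predict(X_test)))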
Logistic Regression
Approximately 70% of problems in data science are classification problems.
Logistic regression is a common and useful regression method for solving binary classification problems.
Logistic regression can be used for various classification problems such as:
spam detection,
diabetes prediction,
whether a given customer will purchase a particular product or churn to another competitor,
whether a user will click on a given advertisement link or not.
Logistic regression describes and estimates the relationship between one dependent binary variable and the independent variables.
Logistic Regression
Logistic regression is a statistical method for predicting binary classes.
The outcome or target variable is dichotomous in nature.
Dichotomous means there are only two possible classes.
For example, it can be used for cancer detection problems. It computes the probability of an event occurring.
It is a special case of the generalized linear model in which the target variable is categorical in nature.
Logistic regression uses the log of odds as the dependent variable. It predicts the probability of occurrence of a binary event using a logit function.
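Concretely, for predictors x_1, ..., x_k the model can be written in its standard form as:

\log\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
\qquad\text{equivalently}\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}}

where p is the probability that the binary event occurs.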
Logistic Regression
Logistic regression assumptions:
The dependent variable is binary or dichotomous,
i.e. It fits into one of two clear-cut categories.
There should be no, or very little, multicollinearity between the
predictor variables
The independent variables should be linearly related to the log
odds.
Logistic regression requires fairly large sample sizes
Logistic Regression
Log-odds: In very simplistic terms, log odds are an alternate
way of expressing probabilities.
In order to understand log odds, it’s important to understand a
key difference between odds and probabilities:
odds are the ratio of something happening to something
not happening,
while probability is the ratio of something happening to
everything that could possibly happen.
Example: if you and your friend play 10 games of tennis and you win 4 out of the 10 games,
the odds of you winning are 4 to 6 (or, as a fraction, 4/6),
while the probability of you winning is 4 out of 10 (or, as a fraction, 4/10).
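In symbols, with p = 0.4 for the tennis example:

\text{odds} = \frac{p}{1-p} = \frac{0.4}{0.6} \approx 0.67,
\qquad
\log(\text{odds}) = \ln\!\left(\frac{0.4}{0.6}\right) \approx -0.41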
Logistic Regression
The sigmoid function, also called the logistic function, gives an 'S'-shaped curve that can take any real-valued number and map it into a value between 0 and 1.
If the curve goes to positive infinity, y predicted will
become 1, and
if the curve goes to negative infinity, y predicted
will become 0.
If the output of the sigmoid function is more than
0.5, we can classify the outcome as 1 or YES,
and if it is less than 0.5, we can classify it as 0 or
NO.
The output can be interpreted as a probability. For example, if the output is 0.75, we can say that there is a 75 percent chance that the patient will suffer from cancer.
Types of Logistic Regression
Types of Logistic Regression:
Binary Logistic Regression: The target variable has only two
possible outcomes such as Spam or Not Spam, Cancer or No
Cancer.
Multinomial Logistic Regression: The target variable has three
or more nominal categories such as predicting the type of
Wine.
Ordinal Logistic Regression: the target variable has three or
more ordinal categories such as restaurant or product rating
from 1 to 5.
Logistic Regression code
Let's build the diabetes prediction model.
First, load the required Pima Indians Diabetes dataset using the pandas read_csv function. You can download the data from the following link:
https://www.kaggle.com/uciml/pima-indians-diabetes-database
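A minimal sketch of the model-building steps, assuming the Kaggle file has been saved locally as diabetes.csv with its usual 'Outcome' label column; the actual notebook may differ:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Load the Pima Indians Diabetes data (assumes the Kaggle CSV is saved locally)
df = pd.read_csv("diabetes.csv")

# 'Outcome' is the binary label (1 = diabetic); all other columns are features
X = df.drop(columns=["Outcome"])
y = df["Outcome"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=16)

# Fit the logistic regression model and evaluate it on the test split
logreg = LogisticRegression(max_iter=1000)
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))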
Evaluating Classification Models
It is common in predictive modeling to train a number of different
models, apply each to a holdout sample, and assess their
performance.
Model validation refers to the process in which a trained model is evaluated with a testing data set.
Fundamentally, the assessment process attempts to learn which
model produces the most accurate and useful predictions.
A simple way to measure classification performance is to count
the proportion of predictions that are correct, i.e., measure the
accuracy.
Accuracy is simply the proportion of records classified correctly:
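Expressed in terms of true/false positives and negatives:

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}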
Evaluating Classification Models
In most classification algorithms, each case is assigned an
“estimated probability of being a 1.”
The default decision point, or cutoff, is typically 0.50 or 50%.
If the probability is above 0.5, the classification is “1”;
otherwise it is “0”.
Evaluating Classification Models
Confusion Matrix: The confusion matrix is a table showing the
number of correct and incorrect predictions categorized by type
of response.
It is often used to measure the performance of classification
models.
It tells you what your machine learning algorithm did right and what it did wrong.
The matrix displays the number of true positives (TP), true
negatives (TN), false positives (FP), and false negatives (FN)
produced by the model on the test data.
Each row of the matrix represents the instances in an actual class
while each column represents the instances in a predicted class.
The name "confusion" comes from the fact that the matrix makes it easy to see whether the system is confusing two classes (i.e., commonly mislabeling one as the other).
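For a binary problem the layout is therefore:

                     Predicted: 1            Predicted: 0
Actual: 1            True Positive (TP)      False Negative (FN)
Actual: 0            False Positive (FP)     True Negative (TN)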
Evaluating Classification Models (ML Performance Metrics)
Precision is a metric that measures how often a machine learning model correctly predicts the positive class. In other words, it measures how well you guess the label in question; the goal is to minimize mistakes when guessing positive labels.
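In terms of the confusion-matrix counts, the standard definitions of precision and the closely related recall are:

\text{Precision} = \frac{TP}{TP + FP}
\qquad
\text{Recall} = \frac{TP}{TP + FN}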
Evaluating Classification Models
https://www.geeksforgeeks.org/confusion-matrix-machine-learning/
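A minimal sketch of computing the confusion matrix and the related metrics with scikit-learn, using hypothetical label vectors:

from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score

# Hypothetical actual and predicted labels for ten records
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Note: scikit-learn orders the labels ascending, so row/column 0 correspond to class 0
print(confusion_matrix(y_true, y_pred))        # rows = actual, columns = predicted
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))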
Evaluating Classification Models (AUC-ROC Curve)
The ROC curve is a graphical representation of the effectiveness of a binary classification model.
It plots the true positive rate (TPR) vs. the false positive rate (FPR) at different classification thresholds.
AUC stands for Area Under the Curve; the AUC represents the area under the ROC curve.
TPR and FPR both range between 0 and 1, so the area will always lie between 0 and 1, and a greater AUC value denotes better model performance.
The goal is to maximize this area in order to have the highest TPR and lowest FPR at the given threshold.
The AUC measures the probability that the model will assign a randomly chosen positive instance a higher predicted probability than a randomly chosen negative instance.
Evaluating Classification Models (AUC-ROC Curve)
Basically, TPR (also called recall or sensitivity) is the proportion of positive examples that are correctly identified.
It represents the ability of the model to correctly identify positive instances and is calculated as follows:
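TPR = \frac{TP}{TP + FN}

The false positive rate plotted on the other axis of the ROC curve is defined analogously:

FPR = \frac{FP}{FP + TN}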
Evaluating Classification Models (AUC-ROC Curve)
As said earlier, the ROC is simply the plot of TPR against FPR across all possible thresholds, and the AUC is the entire area beneath this ROC curve.
Let us look at AUC-ROC from a probabilistic point of view: AUC measures how well a model is able to distinguish between the classes.
In the plot, the black dots are the TPR and FPR values at different probability thresholds.
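A minimal sketch of computing and plotting an ROC curve with scikit-learn, using hypothetical labels and scores:

from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# Hypothetical true labels and predicted probabilities of class 1
y_true  = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3, 0.7, 0.5]

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # TPR and FPR at each threshold
print("AUC:", roc_auc_score(y_true, y_score))

plt.plot(fpr, tpr, marker="o")      # the markers are the individual threshold points
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.show()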
Strategies for imbalanced data
Balanced Dataset: In a Balanced dataset, there is approximately
equal distribution of classes in the target column.
Imbalanced Dataset: In an Imbalanced dataset, there is a highly
unequal distribution of classes in the target column.
Example : Suppose there is a Binary Classification problem with
the following training data:
Total Observations : 1000
Target variable class is either ‘Yes’ or ‘No’.
Case 1:
If there are 900 'Yes' and 100 'No', then this represents an imbalanced dataset, as there is a highly unequal distribution of the two classes.
Case 2:
If there are 550 'Yes' and 450 'No', then this represents a balanced dataset, as there is an approximately equal distribution of the two classes.
Strategies for imbalanced data
An imbalanced data distribution generally happens when observations in one of the classes are much more or less frequent than in the other classes.
This problem is prevalent in examples such as fraud detection, anomaly detection, facial recognition, etc.
Standard ML techniques such as decision trees and logistic regression have a bias towards the majority class, and they tend to ignore the minority class. They tend to predict only the majority class and hence badly misclassify the minority class compared with the majority class.
In more technical terms, if we have an imbalanced data distribution in our dataset, then our model becomes prone to a situation in which the minority class has negligible or very low recall.
Strategies for imbalanced data
Hence, there is a significant amount of difference between the
sample sizes of the two classes in an Imbalanced Dataset.
Problems with an imbalanced dataset:
Algorithms may get biased towards the majority class and thus tend to predict the majority class as output.
An imbalanced dataset gives a misleading accuracy score.
Two main approaches to balancing data:
Up/down sampling
SMOTE (Synthetic Minority Oversampling Technique)
Strategies for imbalanced data (Over/Up-Sample Minority Class)
Using RandomOverSampler:
This can be done with the help of the RandomOverSampler method present in imblearn.
This function randomly replicates data points belonging to the minority class, sampling with replacement (by default).
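A minimal sketch, assuming the imbalanced-learn (imblearn) package is installed and using a synthetic imbalanced dataset:

from collections import Counter
from imblearn.over_sampling import RandomOverSampler
from sklearn.datasets import make_classification

# Hypothetical imbalanced dataset: roughly 90% class 0, 10% class 1
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
print("Before:", Counter(y))

# Randomly replicate minority-class records until the classes are balanced
ros = RandomOverSampler(random_state=42)
X_res, y_res = ros.fit_resample(X, y)
print("After :", Counter(y_res))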
Strategies for imbalanced data (Down/Under-sample Majority Class)
Using RandomUnderSampler
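A minimal under-sampling sketch with the same synthetic dataset and the imblearn RandomUnderSampler:

from collections import Counter
from imblearn.under_sampling import RandomUnderSampler
from sklearn.datasets import make_classification

# Same hypothetical imbalanced dataset as above
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
print("Before:", Counter(y))

# Randomly drop majority-class records until the classes are balanced
rus = RandomUnderSampler(random_state=42)
X_res, y_res = rus.fit_resample(X, y)
print("After :", Counter(y_res))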
Summary