
Confusion Matrix

Introduction
❖Machine Learning deals with two types of
problems: Regression problems and
Classification problems.
❖Regression techniques or models are used when
our dependent variable is continuous in nature
whereas Classification techniques are used
when the dependent variable is categorical.
❖When a Machine Learning model is built, various evaluation metrics are used to check the quality or performance of the model.
❖For classification models, metrics such as Accuracy, Confusion Matrix, Classification report (i.e., Precision, Recall, F1 score), and the AUC-ROC curve are used.
What is a Confusion Matrix?
❖A Confusion Matrix is a visual representation of the Actual vs. Predicted values.
❖It measures the performance of our Machine Learning classification model and is laid out as a table.
What is a Confusion Matrix?
❖A Confusion matrix is an N x N matrix used for
evaluating the performance of a classification
model, where N is the number of target classes.
❖The matrix compares the actual target values
with those predicted by the machine learning
model.
• For a binary classification problem, we would have a 2 x 2 matrix as shown below with 4 values:

                Predicted: 1 (Positive)     Predicted: 0 (Negative)
Actual: 1       True Positive (TP)          False Negative (FN)
Actual: 0       False Positive (FP)         True Negative (TN)

✓The above 2-dimensional array is the Confusion Matrix, which evaluates the performance of a classification model on a set of predicted values for which the true (or actual) values are known.
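A minimal sketch of how such a matrix is obtained in practice, assuming Python with scikit-learn (the slides do not prescribe any library); note that scikit-learn arranges it as [[TN, FP], [FN, TP]], i.e., with the negative class in the first row, unlike the layout drawn above.

from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels for eight data points.
y_actual    = [0, 0, 1, 1, 0, 1, 0, 1]
y_predicted = [0, 1, 1, 1, 0, 0, 0, 1]

# Rows are actual classes, columns are predicted classes, sorted as [0, 1].
cm = confusion_matrix(y_actual, y_predicted)
print(cm)
# [[3 1]    <- TN, FP
#  [1 3]]   <- FN, TP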
How does the Confusion Matrix evaluate
the model’s performance?
• Now, all we need to know is how this matrix
works on the dataset.
Feature 1 … Feature n | Target (has cancer: 1, doesn't have cancer: 0) | Prediction (what our model predicted)
…                     | 1                                              | 1
…                     | 0                                              | 1
…                     | 1                                              | 1
…                     | 1                                              | 0
…                     | 0                                              | 0
…                     | 0                                              | 0
…                     | 1                                              | 1
…                     | 1                                              | 1
…                     | 0                                              | 0
❖In the above classification dataset, Feature 1,
Feature 2, and up to Feature n are the
independent variables.
❖For the Target (dependent variable) we have assigned 1 to the positive value (i.e., has cancer) and 0 to the negative value (i.e., doesn't have cancer).
❖After we have trained our model and obtained its predictions, we can evaluate its performance; this is how the confusion matrix would look.
Confusion matrix

                Predicted: 1    Predicted: 0
Actual: 1       TP = 4          FN = 1
Actual: 0       FP = 1          TN = 3
✓ TP = 4, There are four cases in the dataset where the
model predicted 1 and the target was also 1
✓ FP = 1, There is only one case in the dataset where
the model predicted 1 but the target was 0
✓ FN = 1, There is only one case in the dataset where
the model predicted 0 but the target was 1
✓ TN = 3, There are three cases in the dataset where the
model predicted 0 and the target was also 0
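As a sanity check, the four counts can be tallied directly from the table with a few lines of plain Python (a sketch; only the target and prediction columns are needed, so the feature columns are left out).

# (target, prediction) pairs copied from the 9-row cancer dataset above.
pairs = [(1, 1), (0, 1), (1, 1), (1, 0), (0, 0),
         (0, 0), (1, 1), (1, 1), (0, 0)]

tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # actual 1, predicted 1
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # actual 0, predicted 1
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # actual 1, predicted 0
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # actual 0, predicted 0

print(tp, fp, fn, tn)   # 4 1 1 3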
Elements of Confusion Matrix
✓ It represents the different combinations of
Actual VS Predicted values.
1. True Positive (TP)
• The predicted value matches the actual value.
• The actual value was positive and the model
predicted a positive value.
2. True Negative (TN)
• The predicted value matches the actual value.
• The actual value was negative and the model predicted a negative value.
3. False Positive (FP) – Type 1 error
• The predicted value was falsely predicted
• The actual value was negative but the model
predicted a positive value.
• Also known as the Type 1 error.
4. False Negative (FN) – Type 2 error
• The predicted value was falsely predicted.
• The actual value was positive but the model
predicted a negative value.
• Also known as the Type 2 error.
• Suppose we had a classification dataset with
1000 data points.
• We fit a classifier on it and get the below
confusion matrix:
The different values of the Confusion matrix
would be as follows:
• True Positive (TP) = 560; meaning 560
positive class data points were correctly
classified by the model.
• True Negative (TN) = 330; meaning 330
negative class data points were correctly
classified by the model.
• False Positive (FP) = 60; meaning 60 negative
class data points were incorrectly classified as
belonging to the positive class by the model.
• False Negative (FN) = 50; meaning 50 positive
class data points were incorrectly classified as
belonging to the negative class by the model.
• This turned out to be a pretty decent classifier for our dataset, considering the relatively large numbers of true positive and true negative values.
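A small sketch of how this 1000-point matrix could be stored and unpacked, assuming NumPy and scikit-learn's [[TN, FP], [FN, TP]] layout (neither of which the slides mention).

import numpy as np

cm = np.array([[330, 60],    # TN, FP
               [ 50, 560]])  # FN, TP

# ravel() flattens the 2 x 2 matrix row by row: TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()
print(tp, tn, fp, fn)        # 560 330 60 50
print(cm.sum())              # 1000 data points in total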
Why is the Confusion Matrix important?
• The following metrics, derived from the confusion matrix, determine how well our model performs:
1. Accuracy
2. Precision (Positive Prediction Value)
3. Recall (True Positive Rate or Sensitivity)
4. F beta Score
1. ACCURACY:
Accuracy is the proportion of correctly (True) predicted results out of the total.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

= (4 + 3) / 9 ≈ 0.78

• Accuracy should be considered when TP and TN are more important and the dataset is balanced, because in that case the model will not get biased by the class distribution.
• But in real-life classification problems, imbalanced class distributions are common.
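A minimal sketch of the accuracy formula applied to both worked examples above (the helper function is purely illustrative).

def accuracy(tp, tn, fp, fn):
    # Accuracy = (TP + TN) / (TP + TN + FP + FN)
    return (tp + tn) / (tp + tn + fp + fn)

print(round(accuracy(tp=4,   tn=3,   fp=1,  fn=1),  2))   # 0.78 (9-row cancer example)
print(round(accuracy(tp=560, tn=330, fp=60, fn=50), 2))   # 0.89 (1000-point example)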
Why Do We Need a Confusion Matrix?
• Let's say you want to predict how many people are infected with a contagious virus before they show symptoms, and isolate them from the healthy population.
• The two values for our target variable would
be: Sick and Not Sick.
Our dataset is an example of an imbalanced dataset
• There are 960 data points for the negative class and only 40 data points for the positive class.
This is how we’ll calculate the accuracy:
Let’s see how our model performed:
• The total outcome values are:
• TP = 30, TN = 930, FP = 30, FN = 10
• So, the accuracy for our model turns out to be:
• (30 + 930) / 1000 = 96%! Not bad!
• But it gives the wrong idea about the result. Think about it.
• The figure makes it sound as if our model can say, "I can predict sick people 96% of the time."
• However, it is doing the opposite.
• It is identifying the people who will not get sick with 96% accuracy, while the sick keep spreading the virus!
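A short sketch of why this accuracy figure is misleading: with 960 "Not Sick" and 40 "Sick" cases, a trivial baseline that never predicts "Sick" at all reaches the very same 96% accuracy (the baseline is hypothetical, not part of the slides).

# The model's confusion-matrix values from the virus example.
tp, tn, fp, fn = 30, 930, 30, 10
model_accuracy = (tp + tn) / (tp + tn + fp + fn)      # 0.96

# Baseline that always predicts "Not Sick": TP = 0, TN = 960, FP = 0, FN = 40.
baseline_accuracy = (0 + 960) / 1000                  # also 0.96

print(model_accuracy, baseline_accuracy)              # 0.96 0.96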
Precision vs. Recall
• Let us see how to calculate Precision and Recall; these two metrics determine whether our model is reliable or not.
2. PRECISION:
• Out of the total predicted positive values,
how many were actually positive.

Precision = TP / (TP + FP) = 4/5 = 0.8


3. RECALL:
• Out of the total actual positive values, how
many were correctly predicted as positive.

Recall= TP / (TP + FN) = 4/5 = 0.8


• We can easily calculate Precision and Recall for our model by plugging the values into the above equations:
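A minimal sketch of the same computation using scikit-learn's metric helpers on the 9-row cancer example (the library choice is an assumption, not something the slides specify).

from sklearn.metrics import precision_score, recall_score

y_actual    = [1, 0, 1, 1, 0, 0, 1, 1, 0]   # target column from the table above
y_predicted = [1, 1, 1, 0, 0, 0, 1, 1, 0]   # prediction column

print(precision_score(y_actual, y_predicted))  # 0.8 -> TP / (TP + FP) = 4 / 5
print(recall_score(y_actual, y_predicted))     # 0.8 -> TP / (TP + FN) = 4 / 5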
When is Precision used?
✓Precision is a useful metric in cases where False Positives are a higher concern than False Negatives.
✓Precision is important in music or video recommendation systems, e-commerce websites, etc., where wrong results could be harmful to the business.
When should precision be considered?
• Taking a use case of Spam Detection, suppose
the mail is not spam (0), but the model has
predicted it as spam (1) which is FP.
• In this scenario, one can miss the important
mail.
• So, here we should focus on reducing the FP
and must consider precision in this case.
When is Recall used?
• Recall is important in medical cases where the
actual positive cases should not go undetected!
• In our example, Recall would be a better metric
because we don’t want to accidentally discharge
an infected person and let them mix with the
healthy population thereby spreading the
contagious virus.
When should recall be considered?
• In Cancer Detection, suppose a person has cancer (1), but the model predicts 0, which is an FN. This could be a disaster.
• So, in this scenario, we should focus on reducing the FN and must consider recall.
• Based on the problem statement: whenever the FP has a greater impact, go for Precision, and whenever the FN is more important, go for Recall.
4. F beta SCORE
• In some use cases, both precision and recall
are important.
• Also, even in use cases where either precision or recall individually plays the more important role, we may want to combine both into a single score to get the most complete picture.
Selecting the beta value
• F-1 Score (beta = 1):
• When FP and FN are both equally important. This allows the model to consider both precision and recall equally using a single score.
• The F-1 Score is the Harmonic Mean of precision and recall.
• Smaller beta value, such as beta = 0.5:
• When the impact of FP is high. This gives more weight to precision than to recall.
• Higher beta value, such as beta = 2:
• When the impact of FN is high, giving more weight to recall and less to precision.
• In general, F-beta = (1 + beta^2) * Precision * Recall / (beta^2 * Precision + Recall); see the sketch below.
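A short sketch of the F-beta formula in action. The precision (30 / 60 = 0.5) and recall (30 / 40 = 0.75) plugged in below are derived from the virus example's confusion matrix; the helper function is illustrative only.

def f_beta(precision, recall, beta):
    # F-beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.5, 0.75   # precision and recall of the virus-detection model
print(round(f_beta(p, r, beta=0.5), 3))  # 0.536 -> precision weighted more heavily
print(round(f_beta(p, r, beta=1),   3))  # 0.6   -> the F-1 score (harmonic mean)
print(round(f_beta(p, r, beta=2),   3))  # 0.682 -> recall weighted more heavily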
