
Accuracy, Precision, Recall & F1 Score: Interpretation of Performance Measures

A confusion matrix is a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known. So, let's talk about its four parameters first.

True positives and true negatives are the observations that are correctly predicted (typically shown in green in a color-coded confusion matrix). We want to minimize false positives and false negatives, so they are typically shown in red. These terms are a bit confusing, so let's take each term one by one and understand it fully.

True Positives (TP) - These are the correctly predicted positive values, which means that the value of the actual class is yes and the value of the predicted class is also yes. E.g., the actual class indicates that this passenger survived and the predicted class tells you the same thing.

True Negatives (TN) - These are the correctly predicted negative values, which means that the value of the actual class is no and the value of the predicted class is also no. E.g., the actual class says this passenger did not survive and the predicted class tells you the same thing.

False positives and false negatives occur when the actual class contradicts the predicted class.

False Positives (FP) - When the actual class is no but the predicted class is yes. E.g., the actual class says this passenger did not survive, but the predicted class tells you that this passenger will survive.

False Negatives (FN) - When the actual class is yes but the predicted class is no. E.g., the actual class indicates that this passenger survived, but the predicted class tells you that the passenger will die.
Once you understand these four parameters, we can calculate Accuracy, Precision, Recall and F1 score.
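As a minimal sketch of how these four counts can be tallied in practice, the snippet below counts them from parallel lists of actual and predicted labels. The "yes"/"no" labels, the function name, and the sample data are illustrative assumptions, not taken from the original model.

```python
# Minimal sketch: tally TP, TN, FP, FN from parallel lists of actual and
# predicted class labels. Labels and sample data are illustrative only.
def confusion_counts(actual, predicted, positive="yes"):
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1      # actual yes, predicted yes
        elif a != positive and p != positive:
            tn += 1      # actual no, predicted no
        elif a != positive and p == positive:
            fp += 1      # actual no, predicted yes
        else:
            fn += 1      # actual yes, predicted no
    return tp, tn, fp, fn

actual    = ["yes", "no", "yes", "no", "yes"]
predicted = ["yes", "no", "no",  "no", "yes"]
print(confusion_counts(actual, predicted))  # (2, 2, 0, 1)
```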

Accuracy - Accuracy is the most intuitive performance measure; it is simply the ratio of correctly predicted observations to the total observations. One may think that if we have high accuracy then our model is the best. Yes, accuracy is a great measure, but only when you have symmetric datasets where the numbers of false positives and false negatives are almost the same. Otherwise, you have to look at other parameters to evaluate the performance of your model. For our model, we have got 0.803, which means our model is approximately 80% accurate.

Accuracy = (TP + TN) / (TP + FP + FN + TN)
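As a small illustrative sketch, accuracy can be computed directly from the four counts. The counts below are made up for the example; they are not the Titanic model's actual numbers.

```python
# Accuracy from the four confusion-matrix counts; counts here are made up.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

print(accuracy(tp=50, tn=35, fp=10, fn=5))  # 0.85 for these illustrative counts
```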

Precision - Precision is the ratio of correctly predicted positive observations to the total predicted positive observations. The question this metric answers is: of all passengers that were labeled as survived, how many actually survived? High precision relates to a low false positive rate. We have got 0.788 precision, which is pretty good.

Precision = TP / (TP + FP)

What do you notice about the denominator? The denominator is actually the Total Predicted Positive, so the formula can also be read as Precision = TP / Total Predicted Positive. Immediately, you can see that precision tells you how precise your model is: of those predicted positive, how many are actually positive.

Precision is a good measure to use when the cost of a False Positive is high. For instance, consider email spam detection. A false positive means that a non-spam email (actual negative) has been identified as spam (predicted positive). The email user might lose important emails if the precision of the spam detection model is not high.
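A minimal sketch of the precision calculation, with an added guard for the edge case where nothing is predicted positive (the guard and the example counts are my own assumptions):

```python
# Precision: of everything flagged positive, what fraction really is positive?
def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

# In the spam example: tp = spam correctly flagged, fp = good mail wrongly flagged.
print(precision(tp=90, fp=10))  # 0.9 for these illustrative counts
```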

Recall (Sensitivity) - Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class. The question recall answers is: of all the passengers that truly survived, how many did we correctly label as survived? We have got a recall of 0.631, which is good for this model as it is above 0.5.

Recall = TP / (TP + FN)
There you go! So, recall calculates how many of the actual positives our model captures by labeling them as positive (True Positives). Applying the same understanding, recall should be the metric we use to select our best model when there is a high cost associated with a False Negative.

For instance, consider fraud detection or sick patient detection. If a fraudulent transaction (actual positive) is predicted as non-fraudulent (predicted negative), the consequence can be very bad for the bank.

Similarly, in sick patient detection, if a sick patient (actual positive) goes through the test and is predicted as not sick (predicted negative), the cost associated with that False Negative will be extremely high if the sickness is contagious.
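A matching sketch for recall, again with an assumed zero-division guard and illustrative counts of my own:

```python
# Recall: of all actual positives, what fraction did the model catch?
def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

# In the fraud example: tp = frauds caught, fn = frauds missed.
print(recall(tp=80, fn=20))  # 0.8 for these illustrative counts
```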

F1 score - The F1 Score is the harmonic mean of Precision and Recall. Therefore, this score takes both false positives and false negatives into account. Intuitively it is not as easy to understand as accuracy, but F1 is usually more useful than accuracy, especially if you have an uneven class distribution. Accuracy works best if false positives and false negatives have similar costs. If the cost of false positives and false negatives is very different, it is better to look at both Precision and Recall. In our case, the F1 score is 0.701.

F1 Score = 2*(Recall * Precision) / (Recall + Precision)
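As a quick sanity check of this formula against the precision (0.788) and recall (0.631) quoted above, here is a small sketch (the function name and the guard are my own):

```python
# F1 is the harmonic mean of precision and recall.
def f1(p, r):
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0

print(round(f1(0.788, 0.631), 3))  # 0.701, matching the value quoted above
```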

The F1 Score is needed when you want to seek a balance between Precision and Recall. Right, so what is the difference between F1 Score and Accuracy then? We have previously seen that accuracy can be inflated by a large number of True Negatives, which in most business circumstances we do not focus on much, whereas False Negatives and False Positives usually have business costs (tangible and intangible). Thus, the F1 Score may be a better measure to use if we need to seek a balance between Precision and Recall and there is an uneven class distribution (a large number of Actual Negatives).

So, whenever you build a model, this should help you figure out what these parameters mean and how well your model has performed. As a worked example, consider the following confusion matrix:
              Predicted: Yes   Predicted: No
Actual: Yes   78 (TP)          89 (FN)
Actual: No    55 (FP)          66 (TN)

Accuracy = (TP + TN) / (TP + FP + TN + FN)

= (78 + 66) / (78 + 66 + 55 + 89)

= 144 / 288

= 0.50 = 50% Accuracy (accuracy always lies between 0 and 1)
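A short sketch that reproduces this calculation from the matrix above (the variable names are mine):

```python
# Worked example: counts taken from the confusion matrix above.
tp, fn, fp, tn = 78, 89, 55, 66
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.5
```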
