Exp7_MLAI2
07
To evaluate the performance or quality of the model, different metrics are used, and these
metrics are known as performance metrics or evaluation metrics.
In a classification problem, the category or class of the data is identified based on the training
data. The model learns from the given dataset and then classifies the new data into classes
or groups based on the training. It predicts class labels as the output, such as Yes or No, 0 or
1, Spam or Not Spam, etc. To evaluate the performance of a classification model, different
metrics are used, and some of them are as follows:
o Accuracy
o Confusion Matrix
o Precision
o Recall
o F1-Score
o AUC-ROC (Area Under the Curve)
1.Accuracy
The accuracy metric is one of the simplest classification metrics to implement. It is the
ratio of the number of correct predictions to the total number of predictions.
It can be formulated as:
Accuracy = Number of correct predictions / Total number of predictions
To implement an accuracy metric, we can compare ground truth and predicted values in a
loop, or we can also use the scikit-learn module for this.
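For instance, a minimal sketch of the loop-based approach (the y_true and y_pred labels below are made-up values used only for illustration):

y_true = [0, 1, 1, 0, 1, 1, 0, 0, 1, 1]   # ground-truth labels
y_pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 0]   # model predictions

# Count how many predictions match the ground truth
correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)
print("Accuracy:", accuracy)   # prints: Accuracy: 0.7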
2.Confusion Matrix :
A confusion matrix is a tabular representation of prediction outcomes of any binary
classifier, which is used to describe the performance of the classification model on a set of
test data when true values are known.
A confusion matrix is a fundamental tool for evaluating the performance of a classification
model. It provides a summary of the prediction results on a classification problem, showing
how well the model's predictions match the actual labels. The matrix layout enables easy
identification of the types of errors the model is making.
True Positive (TP): The model correctly predicts the positive class.
True Negative (TN): The model correctly predicts the negative class.
False Positive (FP): The model incorrectly predicts the positive class when it's actually
negative (also known as a "Type I error").
False Negative (FN): The model incorrectly predicts the negative class when it's actually
positive (also known as a "Type II error").
Accuracy: using the values in the confusion matrix, accuracy can be computed as
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Example :
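A minimal sketch, assuming scikit-learn is installed; the y_true and y_pred labels are made-up values used only for illustration. It builds the confusion matrix and computes accuracy from it; the printed result is shown under Output below.

from sklearn.metrics import confusion_matrix, accuracy_score

y_true = [0, 1, 1, 0, 1, 1, 0, 0, 1, 1]   # actual labels
y_pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 0]   # predicted labels

cm = confusion_matrix(y_true, y_pred)     # rows = actual class, columns = predicted class
print("Confusion Matrix:")
print(cm)
print("Accuracy:", accuracy_score(y_true, y_pred))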
Output:
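Confusion Matrix:
[[3 1]
 [2 4]]
Accuracy: 0.7
Here the first row corresponds to the actual negative class (TN = 3, FP = 1) and the second row to the actual positive class (FN = 2, TP = 4).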
Precision :
The precision metric is used to overcome the limitation of accuracy. Precision
determines the proportion of positive predictions that were actually correct. It is
calculated as the ratio of true positives to the total number of positive predictions
(true positives plus false positives):
Precision = TP / (TP + FP)
Example :
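A minimal sketch using scikit-learn's precision_score on the same illustrative labels; its printed result is shown under Output below.

from sklearn.metrics import precision_score

y_true = [0, 1, 1, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 0]

# Precision = TP / (TP + FP) = 4 / (4 + 1)
print("Precision:", precision_score(y_true, y_pred))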
Output :
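Precision: 0.8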
Recall or Sensitivity :
It is similar to the Precision metric; however, it measures the proportion of actual
positives that were identified correctly. It is calculated as the ratio of true positives to
the total number of actual positives, i.e., samples either correctly predicted as positive
or incorrectly predicted as negative (true positives plus false negatives):
Recall = TP / (TP + FN)
Example :
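A minimal sketch using scikit-learn's recall_score on the same illustrative labels; its printed result is shown under Output below.

from sklearn.metrics import recall_score

y_true = [0, 1, 1, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 0]

# Recall = TP / (TP + FN) = 4 / (4 + 2)
print("Recall:", round(recall_score(y_true, y_pred), 4))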
Output :
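Recall: 0.6667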
F1-Score :
The F-score or F1-Score is a metric to evaluate a binary classification model on the basis
of predictions that are made for the positive class. It is calculated with the help of
Precision and Recall and provides a single score that represents both. The F1-Score is
the harmonic mean of Precision and Recall, assigning equal weight to each of them:
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
Example :
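A minimal sketch using scikit-learn's f1_score on the same illustrative labels; its printed result is shown under Output below.

from sklearn.metrics import f1_score

y_true = [0, 1, 1, 0, 1, 1, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1, 1, 0]

# F1 = 2 * (Precision * Recall) / (Precision + Recall)
# With Precision = 0.8 and Recall = 0.6667, F1 is roughly 0.7273
print("F1-Score:", round(f1_score(y_true, y_pred), 4))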
Output :
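F1-Score: 0.7273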
AUC-ROC :
Sometimes we need to visualize the performance of the classification model on charts;
then, we can use the AUC-ROC curve. It is one of the popular and important metrics for
evaluating the performance of the classification model.
Firstly, let's understand the ROC (Receiver Operating Characteristic) curve. ROC is a
graph that shows the performance of a classification model at different threshold
levels. The curve is plotted between two parameters, which are:
o True Positive Rate (TPR)
o False Positive Rate (FPR)
TPR, or True Positive Rate, is a synonym for Recall and can be calculated as
TPR = TP / (TP + FN), while FPR, or False Positive Rate, is calculated as
FPR = FP / (FP + TN).
To calculate the value at every point on a ROC curve, we could evaluate the classification
model multiple times with different classification thresholds, but this would not be very
efficient. So, for this, an efficient method is used, which is known as AUC.
AUC stands for Area Under the ROC Curve. As its name suggests, AUC measures the
two-dimensional area under the entire ROC curve.
AUC evaluates the performance across all possible thresholds and provides an aggregate
measure. The value of AUC ranges from 0 to 1: a model whose predictions are 100%
wrong has an AUC of 0.0, whereas a model whose predictions are 100% correct has an
AUC of 1.0.
Example :
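A minimal sketch, assuming scikit-learn and matplotlib are installed. The predicted probabilities in y_score are made-up values; in practice they would come from a trained classifier, for example model.predict_proba(X_test)[:, 1]. The sketch computes the ROC curve points, plots the curve, and reports the AUC.

from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

y_true  = [0, 1, 1, 0, 1, 1, 0, 0, 1, 1]                         # actual labels
y_score = [0.2, 0.9, 0.4, 0.3, 0.8, 0.7, 0.1, 0.6, 0.75, 0.35]   # predicted probabilities of class 1

# TPR and FPR at each candidate threshold
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Aggregate performance across all thresholds
print("AUC:", roc_auc_score(y_true, y_score))

# Plot the ROC curve against the diagonal of a random classifier
plt.plot(fpr, tpr, label="ROC curve")
plt.plot([0, 1], [0, 1], linestyle="--", label="Random classifier")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()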