Confusion Matrix

The document discusses the importance of various metrics for evaluating classification models, especially in the context of imbalanced datasets. It introduces key metrics such as accuracy, precision, recall, and F1-score, explaining their definitions, use-cases, and limitations. Additionally, it provides examples of how to compute these metrics using Python and emphasizes the significance of a confusion matrix in understanding model performance.


🎯 Why Are These Metrics Important?

When we build a classification model, we need a way to measure how well it performs — especially
when the data is imbalanced (e.g., 90% healthy, 10% sick).
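
For example, here is a minimal sketch (with made-up labels, purely for illustration): on a 90%/10% split, a model that always predicts "healthy" scores 90% accuracy while catching none of the sick patients, which is exactly why the metrics below matter.

from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced labels: 90 healthy (0), 10 sick (1)
y_true_imb = [0] * 90 + [1] * 10
# A useless model that always predicts "healthy"
y_pred_imb = [0] * 100

print("Accuracy:", accuracy_score(y_true_imb, y_pred_imb))  # 0.9, looks great
print("Recall:  ", recall_score(y_true_imb, y_pred_imb))    # 0.0, every sick patient is missed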

💡 Confusion Matrix (The Base of Everything)


A confusion matrix is a summary table for classification results:

                    Predicted
                    0   |   1
            ---------------------
Actual  0   |      TN   |  FP
        1   |      FN   |  TP

Where:
TP (True Positive): Correctly predicted positive class
TN (True Negative): Correctly predicted negative class
FP (False Positive): Incorrectly predicted positive (Type I error)
FN (False Negative): Incorrectly predicted negative (Type II error)
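
To connect the four cells to code, here is a minimal sketch with made-up toy labels (not this document's example): for binary labels 0/1, scikit-learn's confusion_matrix uses the same [[TN, FP], [FN, TP]] layout as the table above, so .ravel() unpacks the counts directly.

from sklearn.metrics import confusion_matrix

# Toy labels, purely for illustration
y_true_cm = [1, 0, 1, 1, 0, 0]
y_pred_cm = [1, 0, 0, 1, 1, 0]

# confusion_matrix returns [[TN, FP], [FN, TP]] for binary labels 0/1;
# ravel() flattens it so the four counts can be unpacked in one line
tn, fp, fn, tp = confusion_matrix(y_true_cm, y_pred_cm).ravel()
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)  # TN: 2 FP: 1 FN: 1 TP: 2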

✅ Accuracy
Definition:

The fraction of all predictions that are correct.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Use-case:
Good for balanced datasets.

Weakness:
Can be misleading for imbalanced data.

Example:

from sklearn.metrics import accuracy_score

y_true = [1, 0, 1, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1]
print("Accuracy:", accuracy_score(y_true, y_pred))

Output:

Accuracy: 0.7142857142857143   (≈ 0.714; 5 out of 7 predictions are correct)

🎯 Precision
Definition:

Of all predicted positives, how many were actually positive?

Precision = TP / (TP + FP)

Use-case:
Important when false positives are costly (e.g., spam detection, cancer diagnosis).

Example:

from sklearn.metrics import precision_score

print("Precision:", precision_score(y_true, y_pred))

Output:


Precision: 0.75 (3 correct positives out of 4 predicted positives)

🎯 Recall (Sensitivity or True Positive Rate)


Definition:

Of all actual positives, how many were correctly predicted?

Recall = TP / (TP + FN)

Use-case:
Important when missing positives is dangerous (e.g., detecting fraud or cancer).

Example:

from sklearn.metrics import recall_score

print("Recall:", recall_score(y_true, y_pred))

Output:


Recall: 0.75 (3 out of 4 actual positives were caught)

🎯 F1-Score
Definition:

Harmonic mean of Precision and Recall — a balanced metric.

F1 = 2 × (Precision × Recall) / (Precision + Recall)

Use-case:
When you want to balance precision and recall, especially on imbalanced datasets.

Example:

from sklearn.metrics import f1_score

print("F1-Score:", f1_score(y_true, y_pred))

Output:


F1-Score: 0.75

🧮 Full Breakdown with Confusion Matrix


Let's compute the metrics manually to understand them better:

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred)
print("Confusion Matrix:\n", cm)

Output:


[[2 1]
[1 3]]

From this (scikit-learn orders the binary matrix as [[TN, FP], [FN, TP]]):
TN = 2, FP = 1, FN = 1, TP = 3

Now manually:

TP = 3
TN = 2
FP = 1
FN = 1

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * (precision * recall) / (precision + recall)

print("Manual Accuracy:", accuracy)
print("Manual Precision:", precision)
print("Manual Recall:", recall)
print("Manual F1-Score:", f1)

📊 Visualize Metrics (Bar Plot)


import matplotlib.pyplot as plt

metrics = ['Accuracy', 'Precision', 'Recall', 'F1-Score']
values = [accuracy, precision, recall, f1]

plt.figure(figsize=(8, 5))
plt.bar(metrics, values, color='skyblue')
plt.ylim(0, 1)
plt.title('Performance Metrics')
plt.ylabel('Score')
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

✅ Summary Table
Metric      Best When...                                Worst When...
Accuracy    Data is balanced                            Data is imbalanced
Precision   False positives are costly                  You need to catch all positives
Recall      False negatives are costly                  False positives don't matter as much
F1-Score    Need balance between precision & recall     You care only about one (P or R)
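
To make the precision/recall trade-off in the table concrete, here is a small sketch with made-up labels and two hypothetical predictors: a lenient one that flags almost everything (high recall, low precision) and a strict one that flags almost nothing (high precision, low recall).

from sklearn.metrics import precision_score, recall_score

y_true_demo = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # 4 positives, 6 negatives

lenient = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]       # flags 7 cases as positive
strict  = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]       # flags only 1 case as positive

# Lenient: catches every positive (recall 1.0) but raises 3 false alarms (precision ~0.57)
print("Lenient -> precision:", precision_score(y_true_demo, lenient),
      "recall:", recall_score(y_true_demo, lenient))

# Strict: every flagged case is correct (precision 1.0) but misses 3 positives (recall 0.25)
print("Strict  -> precision:", precision_score(y_true_demo, strict),
      "recall:", recall_score(y_true_demo, strict))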

📌 Bonus: Classification Report


Scikit-learn gives all metrics per class:

from sklearn.metrics import classification_report

print(classification_report(y_true, y_pred, target_names=['Class 0', 'Class 1']))

Output:

              precision    recall  f1-score   support

     Class 0       0.67      0.67      0.67         3
     Class 1       0.75      0.75      0.75         4

    accuracy                           0.71         7
   macro avg       0.71      0.71      0.71         7
weighted avg       0.71      0.71      0.71         7

Natural next topics to explore in the same depth are macro/micro/weighted averaging, ROC AUC, and how to apply these metrics to multiclass or multilabel classification.
