Lecture 2.3
• Error Analysis
• Train/Test Split, Validation Set
• Confusion Matrix
• Accuracy, Precision, Recall, F-measure, ROC Curve

$\text{False positive rate (type-I error)} = \dfrac{FP}{FP + TN}$
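The topics above include the train/test split with a validation set. Here is a minimal sketch using scikit-learn's train_test_split, where the toy data, split sizes, and random_state are illustrative assumptions rather than anything from the lecture:

```python
# A minimal sketch of a train/validation/test split (assumes
# scikit-learn is installed); X and y are toy placeholder data.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)   # 50 toy samples, 2 features each
y = np.array([0, 1] * 25)           # toy binary labels

# First hold out 20% of the data as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Split the remainder again for validation:
# 0.25 of the remaining 80% = 20% of the original data.
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42, stratify=y_train)

print(len(X_train), len(X_val), len(X_test))  # 30 10 10
```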
Components of ROC Curve
• X-axis: False Positive Rate (FPR)
• Y-axis: True Positive Rate (TPR)
• Curve: Plots TPR against FPR for various threshold values
• Diagonal Line: Represents a random classifier (no predictive power)
  – The area under this line is 0.5
• Area Under the Curve (AUC): The AUC score measures the overall performance of the model
  – An AUC of 1.0 indicates a perfect classifier, while 0.5 indicates a model with no discriminative ability.
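The threshold sweep can be made concrete with scikit-learn's roc_curve and roc_auc_score; the labels and scores below are hypothetical classifier outputs used purely for illustration:

```python
# A minimal sketch of generating an ROC curve and its AUC (assumes
# scikit-learn is installed); y_true and y_scores are hypothetical.
from sklearn.metrics import roc_curve, roc_auc_score

y_true   = [0, 0, 1, 1, 0, 1, 1, 0, 1, 0]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.9, 0.5, 0.6, 0.3]

# roc_curve sweeps the decision threshold and returns one
# (FPR, TPR) point per threshold; AUC summarizes the whole curve.
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
auc = roc_auc_score(y_true, y_scores)

for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
print(f"AUC = {auc:.2f}")
```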
Confusion Matrix Generation
20 samples (predicted label vs. true label):

Predicted: 1 1 1 1 1 1 1 1 0 0 0 1 1 0 0 1 1 1 0 0
True:      1 1 0 1 0 0 1 1 0 0 1 1 0 0 0 1 1 0 1 0

Counting the four outcomes gives the confusion matrix:

                             Actually Positive (P, 1)   Actually Negative (N, 0)
Predicted Positive (PP, 1)           8 (TP)                      5 (FP)
Predicted Negative (PN, 0)           2 (FN)                      5 (TN)

From these counts:

$\text{False positive rate (type-I error)} = \dfrac{FP}{FP + TN} = \dfrac{5}{5 + 5} = 0.5$

$\text{Accuracy} = \dfrac{TP + TN}{TP + FP + FN + TN} = \dfrac{8 + 5}{8 + 5 + 5 + 2} = 0.65$

$\text{Recall (sensitivity, } SEN\text{)} = \dfrac{TP}{TP + FN} = \dfrac{8}{8 + 2} = 0.8$

$\text{Precision} = \dfrac{TP}{TP + FP} = \dfrac{8}{13} = 0.62$

$F\text{-measure} = \dfrac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} = \dfrac{2 \times 0.62 \times 0.8}{0.62 + 0.8} = 0.70$
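These numbers can be reproduced programmatically. A minimal sketch, assuming scikit-learn is available, using the 20 predicted/true pairs from the table above:

```python
# A minimal sketch reproducing the slide's metrics (assumes
# scikit-learn is installed); y_pred / y_true are the 20 labels above.
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             f1_score, precision_score, recall_score)

y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
y_true = [1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

# confusion_matrix returns [[TN, FP], [FN, TP]] for labels (0, 1).
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, fp, fn, tn)                   # 8 5 2 5

print(accuracy_score(y_true, y_pred))   # 0.65
print(precision_score(y_true, y_pred))  # 0.615... (~0.62)
print(recall_score(y_true, y_pred))     # 0.8
print(f1_score(y_true, y_pred))         # 0.695... (~0.70)
print(fp / (fp + tn))                   # false positive rate = 0.5
```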
ROC Generation
(Generated from the same 20 predicted/true label pairs as the confusion-matrix example above.)
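With only hard 0/1 predictions (no continuous scores), this table yields a single operating point on the ROC plane; tracing the full curve requires sweeping a threshold over real-valued scores, as in the roc_curve sketch earlier. A plain-Python sketch of that single point:

```python
# A minimal sketch: hard 0/1 predictions give one ROC point.
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
y_true = [1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

tp = sum(p == 1 and t == 1 for p, t in zip(y_pred, y_true))
fp = sum(p == 1 and t == 0 for p, t in zip(y_pred, y_true))
fn = sum(p == 0 and t == 1 for p, t in zip(y_pred, y_true))
tn = sum(p == 0 and t == 0 for p, t in zip(y_pred, y_true))

tpr = tp / (tp + fn)   # 8 / 10 = 0.8
fpr = fp / (fp + tn)   # 5 / 10 = 0.5
print((fpr, tpr))      # the classifier's single ROC point: (0.5, 0.8)
```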