
Lecture 2.3

• Error Analysis
• Train/Test Split, validation set
• Confusion Matrix
• Accuracy, Precision, Recall, F-measure, ROC curve

Dr. Mainak Biswas


Train/Test Split in Machine Learning
• Train-test split is a machine learning technique
that divides a dataset into two subsets: a training
set and a testing set
• It's a model validation process that helps assess
how well a machine learning model will perform
on new data
• Typical Split Ratios (an 80/20 split is sketched in the code after this list)
– 80% for training and 20% for testing
– 70% for training and 30% for testing
– 90% for training and 10% for testing (for large dataset)
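A minimal sketch of an 80/20 split using scikit-learn; the synthetic dataset is only a stand-in for real data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic data as a placeholder for a real dataset.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# 80% training / 20% testing; stratify keeps class proportions similar in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # (800, 10) (200, 10)
```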
Validation Set
• The validation set is an additional subset of the dataset
used to tune the model's hyper-parameters and evaluate
its performance during training
• It acts as an intermediary between the training set and the
test set
• Purpose of a Validation Set
– Hyperparameter Tuning
– Early Stopping
– Model Selection
• Train/Validation/Test Split (a code sketch follows this list)
– Training Set: Used to train the model
– Validation Set: Used to tune hyperparameters and evaluate the
model during training
– Test Set: Used to assess the final performance on unseen data
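One common way to obtain all three subsets is to call train_test_split twice; a minimal sketch assuming scikit-learn, with illustrative 60/20/20 proportions that are not taken from the slide:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# First carve off the test set (20% of the data) ...
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# ... then split the remaining 80% into training and validation
# (0.25 of the remaining 80% = 20% of the original data).
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 600 200 200
```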
Confusion Matrix
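The confusion matrix tabulates predicted labels against actual labels, giving the counts of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN) used by the metrics below. A minimal sketch with scikit-learn and toy labels; note that for 0/1 labels confusion_matrix lays the result out as [[TN, FP], [FN, TP]]:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 1, 0, 0, 1, 0]   # actual labels (toy example)
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]   # model predictions (toy example)

cm = confusion_matrix(y_true, y_pred)   # rows = actual class, columns = predicted class
tn, fp, fn, tp = cm.ravel()
print(cm)                               # [[3 1]
                                        #  [1 3]]
print("TP:", tp, "FP:", fp, "FN:", fn, "TN:", tn)
```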



Accuracy = (TP + TN) / (TP + TN + FP + FN)

Recall measures the proportion of correctly predicted positive
observations out of all actual positives.
True positive rate (TPR), recall, sensitivity (SEN) = TP / (TP + FN) = TP / P

Precision measures the proportion of correctly predicted positive
observations out of all predicted positives.
Precision = TP / (TP + FP)

The F-score (or F1-score) is a metric that combines precision and
recall into a single score, providing a balance between the two.
It's especially useful when the data is imbalanced.
F-Measure = (2 · Precision · Recall) / (Precision + Recall)

False Positive Rate (FPR) is a measure used in binary classification to quantify how
often a model incorrectly predicts a positive outcome for a negative instance.
False positive rate (FPR, type-I error) = FP / (FP + TN)
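These definitions can be checked directly from the four counts; the counts in the sketch below are placeholders, not numbers from the lecture:

```python
# Placeholder counts for illustration.
tp, tn, fp, fn = 40, 30, 10, 20

accuracy  = (tp + tn) / (tp + tn + fp + fn)                 # 0.70
recall    = tp / (tp + fn)                                  # true positive rate / sensitivity, ~0.67
precision = tp / (tp + fp)                                  # 0.80
f_measure = 2 * precision * recall / (precision + recall)   # ~0.73
fpr       = fp / (fp + tn)                                  # false positive rate (type-I error), 0.25

print(accuracy, recall, precision, f_measure, fpr)
```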
ROC Curve
• An ROC (Receiver Operating Characteristic) plot
is a graphical representation used to evaluate the
performance of a binary classification model
• It illustrates the trade-off between the True
Positive Rate (TPR) and the False Positive Rate
(FPR) at various threshold settings for a classifier.
Here's a breakdown of its meaning and
components
True positive rate (TPR), recall, sensitivity (SEN) = TP / (TP + FN) = TP / P

False positive rate (FPR, type-I error) = FP / (FP + TN)
Components of ROC Curve
• X-axis: False Positive Rate (FPR)
• Y-axis: True Positive Rate (TPR)
• Curve: Plots TPR against FPR for various threshold
values
• Diagonal Line: Represents a random classifier (no
predictive power)
– The area under this line is 0.5
• Area Under the Curve (AUC): The AUC score
measures the overall performance of the model (see the code sketch after this list)
– An AUC of 1.0 indicates a perfect classifier, while 0.5
indicates a model with no discriminative ability.
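A minimal sketch of generating the curve and its AUC with scikit-learn; the classifier and synthetic data are illustrative assumptions, not part of the lecture:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]        # probability of the positive class

fpr, tpr, thresholds = roc_curve(y_test, scores)  # one (FPR, TPR) point per threshold
print("AUC:", roc_auc_score(y_test, scores))      # 0.5 = random, 1.0 = perfect
# Plotting fpr on the x-axis against tpr on the y-axis gives the ROC curve.
```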
Confusion Matrix Generation
Predicted   True
    1         1
    1         1
    1         0
    1         1
    1         0
    1         0
    1         1
    1         1
    0         0
    0         0
    0         1
    1         1
    1         0
    0         0
    0         0
    1         1
    1         1
    1         0
    0         1
    0         0

                              Actually Positive (P) (1)   Actually Negative (N) (0)
Predicted Positive (PP) (1)           8 (TP)                      5 (FP)
Predicted Negative (PN) (0)           2 (FN)                      5 (TN)

False positive rate (type-I error) = FP / (FP + TN) = 5 / 10 = 0.5
Accuracy = (8 + 5) / (8 + 5 + 5 + 2) = 0.65
Recall, sensitivity (SEN) = TP / (TP + FN) = 8 / (8 + 2) = 0.8
Precision = TP / (TP + FP) = 8 / 13 = 0.62
F-Measure = (2 × 0.62 × 0.8) / (0.62 + 0.8) = 0.70
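The same numbers can be reproduced with scikit-learn on the 20 predicted/true pairs above (a sketch; before rounding, precision is about 0.615 and the F-measure about 0.696):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# The 20 (predicted, true) pairs from the slide.
pred = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
true = [1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

print(accuracy_score(true, pred))    # 0.65
print(recall_score(true, pred))      # 0.8
print(precision_score(true, pred))   # ~0.615 (0.62 on the slide)
print(f1_score(true, pred))          # ~0.696 (0.70 on the slide)
```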
ROC Generation
Predicted True
1 1
1 1
1 0
1 1
1 0
1 0
1 1
1 1
0 0
0 0
0 1
1 1
1 0
0 0
0 0
1 1
1 1
1 0
0 1
0 0
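With only hard 0/1 predictions, the ROC for this data reduces to a single operating point (TPR = 0.8, FPR = 0.5); tracing a full curve requires continuous classifier scores. A sketch that feeds the hard predictions to roc_curve, which then returns just the two corner points plus that single point:

```python
from sklearn.metrics import roc_curve, auc

pred = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]  # predicted (slide data)
true = [1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]  # actual (slide data)

fpr, tpr, thresholds = roc_curve(true, pred)
print(list(zip(fpr, tpr)))     # [(0.0, 0.0), (0.5, 0.8), (1.0, 1.0)]
print("AUC:", auc(fpr, tpr))   # 0.65; with real scores, more thresholds would fill out the curve
```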

