3 - PERFORMANCE MEASURES

Performance measures for regression are used to evaluate learning algorithms and form an important aspect of machine learning. In the formulas below:

❑ Pi - predicted values
❑ Ai - observed values / actual values
MEAN ABSOLUTE ERROR (MAE)

MAE = (1/n) Σ |Pi − Ai|

❑ Pi - predicted values
❑ Ai - observed values
❑ n - number of observations

If the direction of the errors is kept (i.e., the differences are not made absolute), the measure is called the Mean Bias Error (MBE): the mean of the signed errors, MBE = (1/n) Σ (Pi − Ai).

Worked example with actual values [3, −0.5, 2, 7] and predicted values [2.5, 0.0, 2, 8]:

MAE = (|3 − 2.5| + |−0.5 − 0.0| + |2 − 2| + |7 − 8|) / 4 = 2 / 4 = 0.5

MEAN SQUARED ERROR (MSE)

MSE = (1/n) Σ (Pi − Ai)²

• Measures the average of the squares of the errors
❑ Pi - predicted values
❑ Ai - observed values
❑ n - number of observations

For the same example:

MSE = ((3 − 2.5)² + (−0.5 − 0.0)² + (2 − 2)² + (7 − 8)²) / 4 = 1.5 / 4 = 0.375
ROOT MEAN SQUARED ERROR (RMSE)

RMSE = √( (1/n) Σ (Pi − Oi)² )

❑ Pi - predicted values
❑ Oi - observed values
❑ n - number of observations
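To make these formulas concrete, here is a minimal Python sketch (plain Python, no libraries) that reproduces the worked example above; the predicted values [2.5, 0.0, 2, 8] are the ones implied by the example arithmetic:

# Reproduce the worked example: MAE = 0.5, MSE = 0.375.
actual = [3, -0.5, 2, 7]
predicted = [2.5, 0.0, 2, 8]
n = len(actual)

mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / n    # 0.5
mbe = sum(p - a for p, a in zip(predicted, actual)) / n         # signed errors kept: 0.25
mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n  # 0.375
rmse = mse ** 0.5                                               # sqrt(0.375) ~ 0.612

print(mae, mbe, mse, rmse)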
CONFUSION MATRIX

In a classification problem, you can represent the errors using a confusion matrix:

                         PREDICTED CLASS
                         Class=Yes    Class=No
ACTUAL     Class=Yes     a            b
CLASS      Class=No      c            d

❑ a: TP (true positive)
❑ b: FN (false negative)
❑ c: FP (false positive)
❑ d: TN (true negative)
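As an illustration, a minimal Python sketch that counts the four cells from label lists; the y_actual and y_predicted values here are made-up sample data, not from the slides:

# Count the four confusion-matrix cells for a binary classifier.
y_actual    = ["Yes", "Yes", "No", "No", "Yes", "No"]
y_predicted = ["Yes", "No",  "No", "Yes", "Yes", "No"]

tp = sum(1 for a, p in zip(y_actual, y_predicted) if a == "Yes" and p == "Yes")  # a
fn = sum(1 for a, p in zip(y_actual, y_predicted) if a == "Yes" and p == "No")   # b
fp = sum(1 for a, p in zip(y_actual, y_predicted) if a == "No"  and p == "Yes")  # c
tn = sum(1 for a, p in zip(y_actual, y_predicted) if a == "No"  and p == "No")   # d

print(tp, fn, fp, tn)  # 2 1 1 2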
Type-I and Type-II error: a Type-I error is a false positive (c, FP), while a Type-II error is a false negative (b, FN).
Metrics for Performance Evaluation…
ACCURACY

                         PREDICTED CLASS
                         Class=Yes    Class=No
ACTUAL     Class=Yes     a (TP)       b (FN)
CLASS      Class=No      c (FP)       d (TN)

Most widely-used metric:

Accuracy = (a + d) / (a + b + c + d) = (TP + TN) / (TP + TN + FP + FN)
PRECISION, RECALL AND F1-SCORE

Precision = TP / (TP + FP): the proportion of predicted positive cases that are actually positive.

Recall = TP / (TP + FN): the percentage of positive cases that you were able to catch, e.g. the proportion of actual spam emails correctly identified by the model.

F1-score = 2 · (Precision · Recall) / (Precision + Recall): the harmonic mean of Precision and Recall. It balances the two metrics.
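A small sketch computing all four metrics from the cell counts; the counts (2, 1, 1, 2) are the illustrative ones from the previous snippet:

# Standard metrics from the four confusion-matrix cells.
tp, fn, fp, tn = 2, 1, 1, 2

accuracy  = (tp + tn) / (tp + tn + fp + fn)                # (a + d) / (a + b + c + d)
precision = tp / (tp + fp)                                 # of predicted positives, how many are real
recall    = tp / (tp + fn)                                 # of actual positives, how many were caught
f1 = 2 * precision * recall / (precision + recall)         # harmonic mean

print(accuracy, precision, recall, f1)  # 0.667 0.667 0.667 0.667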
TPR = TP / (TP + FN): fraction of positive instances predicted as positive.

FPR = FP / (FP + TN): fraction of negative instances predicted as positive.

                    PREDICTED CLASS
                    Yes        No
ACTUAL    Yes       a (TP)     b (FN)
          No        c (FP)     d (TN)
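Sweeping the decision threshold and recording (FPR, TPR) at each setting produces the points of a ROC curve, introduced next. A minimal sketch; y_true and y_score are made-up illustrative values, not from the slides:

# Compute (FPR, TPR) points at several thresholds over classifier scores.
y_true  = [1, 1, 1, 0, 1, 0, 0, 0]                    # 1 = positive class
y_score = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]   # classifier confidence

for threshold in [0.2, 0.5, 0.75]:
    pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn)   # sensitivity
    fpr = fp / (fp + tn)   # 1 - specificity
    print(threshold, fpr, tpr)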
RECEIVER OPERATING CHARACTERISTICS (ROC) CURVE

The ROC curve plots TPR (sensitivity) against FPR (1 − specificity) as the decision threshold is varied.

Sensitivity = TP / (TP + FN)        Specificity = TN / (TN + FP)

For a multi-class problem, sensitivity and specificity can be computed per class (one class vs. the rest). Example confusion matrix (rows = predicted, columns = actual):

                       Actual
Predicted     Cat      Dog      Horse
Cat           12       102      93
Dog           112      23       77
Horse         83       92       17

Compute Sensitivity and Specificity for each class: Cat, Dog, Horse.
Sensitivity(Cat) = TP / (TP + FN) = 12 / (12 + (112 + 83)) = 12 / 207 ≈ 0.06

Specificity(Cat) = TN / (TN + FP) = (23 + 77 + 92 + 17) / ((23 + 77 + 92 + 17) + (102 + 93)) = 209 / 404 ≈ 0.52
Sensitivity(Dog) = TP / (TP + FN) = 23 / (23 + (102 + 92)) = 23 / 217 ≈ 0.11

Specificity(Dog) = TN / (TN + FP) = (12 + 93 + 83 + 17) / ((12 + 93 + 83 + 17) + (112 + 77)) = 205 / 394 ≈ 0.52
Sensitivity(Horse) = TP / (TP + FN) = 17 / (17 + (77 + 93)) = 17 / 187 ≈ 0.09

Specificity(Horse) = TN / (TN + FP) = (12 + 102 + 112 + 23) / ((12 + 102 + 112 + 23) + (83 + 92)) = 249 / 424 ≈ 0.59
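The same per-class computations can be done programmatically. A minimal sketch over the 3×3 matrix above (rows = predicted, columns = actual), reproducing the values just derived:

# Per-class sensitivity and specificity from a multi-class confusion matrix.
classes = ["Cat", "Dog", "Horse"]
cm = [[12, 102, 93],    # predicted Cat
      [112, 23, 77],    # predicted Dog
      [83,  92, 17]]    # predicted Horse
n = len(classes)
total = sum(sum(row) for row in cm)

for k, name in enumerate(classes):
    tp = cm[k][k]
    fn = sum(cm[i][k] for i in range(n)) - tp   # actual k, predicted something else
    fp = sum(cm[k][j] for j in range(n)) - tp   # predicted k, actually something else
    tn = total - tp - fn - fp
    print(name, round(tp / (tp + fn), 2), round(tn / (tn + fp), 2))
# Cat 0.06 0.52, Dog 0.11 0.52, Horse 0.09 0.59

accuracy = sum(cm[i][i] for i in range(n)) / total  # overall accuracy: 52/611 ~ 0.09

The same loop works unchanged for any number of classes, such as the 4×4 matrix below.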
Confusion Matrix for four classes (rows = predicted, columns = actual):

                        Actual
Predicted    Thing 1   Thing 2   Thing 3   Thing 4
Thing 1      12        102       93        56
Thing 2      112       23        77        36
Thing 3      83        92        17        45
Thing 4      12        45        68        48
SAMPLE QUESTIONS

1. What does a confusion matrix primarily help to evaluate in a classification model?

2. In a confusion matrix for a binary classifier, what does the term 'True Positives' (TP) refer to?
A. The instances correctly labeled as the negative class
B. The instances incorrectly labeled as the negative class
C. The instances correctly labeled as the positive class
D. The instances incorrectly labeled as the positive class

3. What does the 'False Negative' (FN) cell of a confusion matrix represent in a binary classification problem?
TRAINING ERROR AND GENERALIZATION ERROR

Training error is the error a model makes on the data it was trained on; generalization error is the error it makes on new, unseen data.

The data used to train a machine learning model affects the correctness of the predictions it makes. Inadequate data may produce inconsistent results; in general, the more data you have, the higher the accuracy you can achieve.
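A minimal sketch of observing the gap between the two errors, assuming scikit-learn is available and using a synthetic dataset (none of these names or numbers come from the slides):

# Compare training error with an estimate of generalization error.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # unpruned tree
train_error = 1 - model.score(X_train, y_train)  # training error: near 0 for an unpruned tree
test_error  = 1 - model.score(X_test, y_test)    # held-out estimate of generalization error

print(train_error, test_error)  # a large gap suggests overfitting

Holding out a test set like this is the standard way to estimate generalization error, since the error on the training data alone is usually optimistic.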