L10a - Machine Learning Basic Concepts
Machine Learning:
Basic Concepts
by Samuel I. G. Situmeang
Objectives
Machine Learning: Basic Concepts Artificial Intelligence
Terminology
A Few Quotes
Data everywhere!
Source: https://www.domo.com/
Data types
Applications of ML
• Spam filtering
• Credit card fraud detection
• Digit recognition on checks, zip codes
• Detecting faces in images
• MRI image analysis
• Recommendation system
• Search engines
• Handwriting recognition
• Scene classification
• etc...
Research progress
Products
• More compute
• More data
• Better algorithms → Need more people who understand the
algorithms!
Interdisciplinary field
ML versus Statistics
http://statweb.stanford.edu/~jhf/ftp/dm-stat.pdf
Alan Turing proposed the concept of a learning machine in 1950 (in the
same paper that proposed the Turing test).
• Traditional programming: Data + Program → Computer → Output
• Supervised learning
  • Classification, e.g., Logistic Regression, Decision Tree, KNN, Random Forest, SVM, and Naive Bayes
  • Numeric prediction/forecasting/regression, e.g., Linear Regression, KNN, Gradient Boosting, and AdaBoost
• Unsupervised learning
  • Clustering, e.g., K-Means
  • Pattern discovery, e.g., Apriori, FP-Growth, and Eclat
• Semi-supervised learning
• Reinforcement learning, e.g., Q-Learning, Temporal Difference (TD), and Deep Adversarial Networks
Source: https://en.proft.me/
Unsupervised learning:
Learning a model from unlabeled data.
Supervised learning:
Learning a model from labeled data.
Unsupervised Learning
• Clustering/segmentation:
  $f: \mathbb{R}^d \to \{C_1, \dots, C_k\}$ (a set of clusters)
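As a rough illustration of a clustering function that maps a point in ℝ^d to one of k clusters, here is a minimal k-means sketch in plain Python. K-Means is only one possible choice; the function name `kmeans`, the toy points, and the fixed iteration count are assumptions made for this example, not part of the original slides.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Minimal k-means sketch. Returns (centroids, f), where f maps a point to a cluster index."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise centroids with k data points

    def f(p):
        # Index of the nearest centroid under squared Euclidean distance.
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[f(p)].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster received no points
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, f

# Hypothetical 2-D data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centroids, f = kmeans(points, k=2)
print([f(p) for p in points])
```

The returned `f` plays the role of the clustering function: it needs no labels, only the geometry of the data.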
Supervised Learning
Classification:
Regression: $f: \mathbb{R}^d \to \mathbb{R}$, where $f$ is called a regressor.
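To make the idea of a regressor concrete, the sketch below fits a linear function from ℝ^d to ℝ by gradient descent on the mean squared error. The linear form, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration.

```python
def fit_linear_regressor(X, y, lr=0.1, epochs=5000):
    """Fit f(x) = w . x + b by gradient descent on the mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Residuals f(x_i) - y_i for the current parameters.
        residuals = [sum(wj * xj for wj, xj in zip(w, x)) + b - yi
                     for x, yi in zip(X, y)]
        grad_w = [2.0 / n * sum(r * x[j] for r, x in zip(residuals, X))
                  for j in range(d)]
        grad_b = 2.0 / n * sum(residuals)
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return lambda x: sum(wj * xj for wj, xj in zip(w, x)) + b

# Hypothetical data generated from y = 2*x1 - x2 + 0.5 (no noise).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [0.5, 2.5, -0.5, 1.5, 3.5]
regressor = fit_linear_regressor(X, y)
print(round(regressor((3, 2)), 3))
```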
Training set → ML Algorithm → Model (f)
Example: income, gender, age, family status, zipcode → Model (f) → Credit amount ($) or Credit yes/no
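The pipeline above (training set in, model f out) can be sketched with a 1-nearest-neighbour learner. The function `train_1nn`, the two features (income in thousands, age), and every data value below are hypothetical and exist only to show the shape of the workflow.

```python
def train_1nn(training_set):
    """'ML algorithm': takes (features, label) pairs and returns a model f."""
    def f(x):
        # Predict the label of the closest training example (squared Euclidean distance).
        nearest = min(training_set,
                      key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
        return nearest[1]
    return f

# Hypothetical training set: (income in thousands, age) -> credit decision.
training_set = [
    ((20, 25), "no"),
    ((25, 40), "no"),
    ((60, 35), "yes"),
    ((80, 50), "yes"),
]
f = train_1nn(training_set)
print(f((70, 45)), f((22, 30)))
```

Swapping the labels for amounts would turn the same pipeline into the "Credit amount ($)" regression variant.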
Training error:
$$E_{\mathrm{train}}(f) = \sum_{i=1}^{n} \mathrm{loss}(y_i, f(x_i))$$
• Examples of loss functions:
  • Classification error:
    $$\mathrm{loss}(y_i, f(x_i)) = \begin{cases} 1 & \text{if } \mathrm{sign}(y_i) \neq \mathrm{sign}(f(x_i)) \\ 0 & \text{otherwise} \end{cases}$$
  • Least square loss:
    $$\mathrm{loss}(y_i, f(x_i)) = (y_i - f(x_i))^2$$
• We aim to have $E_{\mathrm{train}}(f)$ small, i.e., minimize $E_{\mathrm{train}}(f)$.
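The training error and the two loss functions can be written out directly. In this sketch the model `f`, the data, and the convention that `sign(0)` counts as positive are assumptions for illustration.

```python
def sign(v):
    # Assumed convention: zero is treated as positive.
    return 1 if v >= 0 else -1

def zero_one_loss(y, fx):
    """Classification error: 1 if the signs disagree, 0 otherwise."""
    return 1 if sign(y) != sign(fx) else 0

def squared_loss(y, fx):
    """Least square loss."""
    return (y - fx) ** 2

def training_error(f, data, loss):
    """Sum of the chosen loss over all (x_i, y_i) pairs in the training set."""
    return sum(loss(y, f(x)) for x, y in data)

# Hypothetical 1-D training set with labels in {-1, +1}.
data = [(-2.0, -1), (-1.0, -1), (0.5, 1), (2.0, 1), (-0.5, 1)]
f = lambda x: x  # hypothetical model: the score is the input itself

print(training_error(f, data, zero_one_loss))
print(training_error(f, data, squared_loss))
```

Only the last example is misclassified, so the classification error counts one mistake, while the squared loss also penalises how far each score is from its label.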
Overfitting-Underfitting and
Regularization
Overfitting/underfitting
[Figure: test error and training error curves]
Avoid overfitting
Regularization: Intuition
We want to minimize:
$$\sum_{i=1}^{n} \mathrm{loss}(y_i, f(x_i)) + C \times R(f)$$
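One way to see the effect of the penalty term is a one-parameter model f(x) = w·x with squared loss and R(f) = w² (an L2 penalty). Both choices, as well as the data and the helper names, are assumptions for this sketch; the slide itself does not fix a particular loss or R.

```python
def regularized_objective(w, data, C):
    """Sum of squared losses plus C times the penalty R(f) = w**2."""
    data_term = sum((y - w * x) ** 2 for x, y in data)
    return data_term + C * w ** 2

def best_weight(data, C):
    """Closed-form minimiser of the objective above for f(x) = w * x."""
    return sum(x * y for x, y in data) / (sum(x * x for x, _ in data) + C)

# Hypothetical data lying roughly on y = 2x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
for C in (0.0, 1.0, 10.0):
    w = best_weight(data, C)
    print(C, round(w, 4), round(regularized_objective(w, data, C), 4))
```

As C grows, the minimiser is pulled toward zero: the fit to the training data gets worse in exchange for a simpler model.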
Example: Split the data randomly into 60% for training, 20% for
validation and 20% for testing.
Source: https://towardsdatascience.com/
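A random 60/20/20 split like the one in the example can be sketched as follows. The function name `train_val_test_split`, the fixed seed, and the use of integers as stand-ins for examples are assumptions for illustration.

```python
import random

def train_val_test_split(examples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the examples and cut them into training, validation, and test parts."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

examples = list(range(100))  # stand-ins for 100 labelled examples
train, val, test = train_val_test_split(examples)
print(len(train), len(val), len(test))
```

Shuffling before cutting matters: if the data is ordered (for example by class or by date), a plain slice would give the three parts different distributions.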
$\sum_{j=1} E(D_j)$
Confusion matrix
• Accuracy:
  • How many of the samples are classified correctly?
  • A = 9/10 = 0.9
Error Analysis
• Confusion Matrix
  • How do classes get confused?

                 Predicted
                 C1   C2   C3
    Actual  C1    3    0    1
            C2    0    3    1
            C3    1    0    1

• Useful:
  • Find classes that get confused with others
  • Develop better features to solve the problem
Evaluation: Multi-class

          C1      C2      C3
    P     0.75    1       0.333
    R     0.75    0.75    0.5
    F1    0.75    0.86    0.4
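The per-class precision, recall, and F1 values can be recomputed from the three-class confusion matrix shown under Error Analysis (rows are actual classes, columns are predicted classes). The helper name `per_class_metrics` is an assumption for this sketch.

```python
# Confusion matrix from the Error Analysis slide: rows = actual, columns = predicted.
confusion = [
    [3, 0, 1],  # actual C1
    [0, 3, 1],  # actual C2
    [1, 0, 1],  # actual C3
]

def per_class_metrics(cm):
    """Return a list of (precision, recall, F1) tuples, one per class."""
    n = len(cm)
    metrics = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, actually another class
        fn = sum(cm[c]) - tp                       # actually c, predicted another class
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        metrics.append((p, r, f1))
    return metrics

for name, (p, r, f1) in zip(["C1", "C2", "C3"], per_class_metrics(confusion)):
    print(name, round(p, 3), round(r, 3), round(f1, 3))
```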
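• Micro-average of Precision $= \dfrac{\sum_{i=1}^{N} \mathrm{TP}_i}{\sum_{i=1}^{N} \mathrm{TP}_i + \sum_{i=1}^{N} \mathrm{FP}_i}$
• Micro-average of Recall $= \dfrac{\sum_{i=1}^{N} \mathrm{TP}_i}{\sum_{i=1}^{N} \mathrm{TP}_i + \sum_{i=1}^{N} \mathrm{FN}_i}$
• Macro-average of Precision $= \dfrac{\sum_{i=1}^{N} P_i}{N}$
• Macro-average of Recall $= \dfrac{\sum_{i=1}^{N} R_i}{N}$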
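The four averages can be sketched from per-class counts of true positives, false positives, and false negatives. The counts below are read off the three-class confusion matrix from the Error Analysis slide; the function name `micro_macro` is an assumption, and the sketch assumes every class has at least one actual and one predicted sample so that no denominator is zero.

```python
def micro_macro(tp, fp, fn):
    """Return (micro-P, micro-R, macro-P, macro-R) from per-class counts."""
    n = len(tp)
    micro_p = sum(tp) / (sum(tp) + sum(fp))
    micro_r = sum(tp) / (sum(tp) + sum(fn))
    macro_p = sum(t / (t + f) for t, f in zip(tp, fp)) / n  # mean of per-class precision
    macro_r = sum(t / (t + f) for t, f in zip(tp, fn)) / n  # mean of per-class recall
    return micro_p, micro_r, macro_p, macro_r

# Per-class counts for C1, C2, C3.
tp = [3, 3, 1]
fp = [1, 0, 2]
fn = [1, 1, 1]
print([round(v, 3) for v in micro_macro(tp, fp, fn)])
```

Micro-averaging pools the counts, so frequent classes dominate; macro-averaging gives every class the same weight regardless of its size.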
Evaluation: Multi-class
• Majority class baseline
  • Accuracy = 0.8
  • Macro-F1 = 0.296
• Macro-F1:
  • Should be used in binary classification when both classes are important
  • e.g., males/females when the class distribution is 80%/20%
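The gap between accuracy and Macro-F1 for a majority-class baseline can be reproduced with a small sketch. It assumes three classes with an 80/10/10 split, which is consistent with the figures 0.8 and 0.296 above; the labels and helper names are hypothetical.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1, using F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(classes)

# Hypothetical labels: 80% C1, 10% C2, 10% C3.
y_true = ["C1"] * 8 + ["C2"] + ["C3"]
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)  # baseline: always predict the majority class

print(accuracy(y_true, y_pred))
print(round(macro_f1(y_true, y_pred, ["C1", "C2", "C3"]), 3))
```

The baseline never predicts a minority class, so those classes score an F1 of zero and pull the macro average down even though accuracy looks high.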
Using the same confusion matrix, calculate the measures just introduced.

    Actual class \ Predicted class   cancer = yes   cancer = no   Total   Recognition (%)
    cancer = yes                     90             210           300     30.00 (sensitivity)
    cancer = no                      140            9560          9700    98.56 (specificity)
    Total                            230            9770          10000   96.50 (accuracy)
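The recognition rates in the table, plus precision and F1 for the positive class, can be checked with a few lines. The variable names are assumptions; the four counts come straight from the table.

```python
# Counts from the cancer confusion matrix (positive class: cancer = yes).
tp, fn = 90, 210    # actual yes: predicted yes / predicted no
fp, tn = 140, 9560  # actual no:  predicted yes / predicted no

sensitivity = tp / (tp + fn)  # recall on the positive class
specificity = tn / (tn + fp)  # recall on the negative class
accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)

print(f"sensitivity = {sensitivity:.2%}")
print(f"specificity = {specificity:.2%}")
print(f"accuracy    = {accuracy:.2%}")
print(f"precision   = {precision:.2%}")
print(f"F1          = {f1:.4f}")
```

Accuracy is high because the negative class dominates, while sensitivity shows that most actual cancer cases are missed.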
• Mean squared error: $\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, where $y_i$ is the actual expected output and $\hat{y}_i$ is the model's prediction.
• The higher this value, the worse the model is. It is never negative, since we're squaring the individual prediction-wise errors before summing them, and it is zero for a perfect model.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} = \sqrt{\mathrm{MSE}}$$
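A short sketch of both regression measures follows. The function names and the four target/prediction pairs are assumptions for illustration.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of the MSE, expressed in the same units as the targets."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical targets and predictions.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
print(mse(y_true, y_pred), round(rmse(y_true, y_pred), 4))
```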
Terminology review
References