04 Machine Learning Overview
Machine Learning
● Section Overview:
○ What is Machine Learning?
○ What is Deep Learning?
○ Difference between Supervised and
Unsupervised Learning
○ Supervised Learning Process
○ Evaluating performance
○ Overfitting
What is Machine Learning?
● Machine Learning
○ A method of data analysis that automates
analytical model building.
● Neural Networks
○ A type of machine learning architecture
modeled after biological neurons.
● Deep Learning
○ A neural network with more than one
hidden layer.
Machine Learning Process
● Data Acquisition
● Data Cleaning
● Split the cleaned data into Training Data and
Test Data
● Model Training & Building (on the training data)
● Model Testing (on the test data)
● Adjust Model Parameters, then retrain and
retest as needed
● Model Deployment
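The split into training data and test data can be sketched as follows. This is a minimal illustration, not the course's own code: the helper name `train_test_split`, the 30% test fraction, the seed, and the toy dataset are all assumptions made for the example.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical cleaned dataset of (feature, label) pairs.
data = [(x, 2 * x) for x in range(10)]
train, test = train_test_split(data)
print(len(train), len(test))  # 7 3
```

The model is built only from `train`; `test` is held back so that model testing measures performance on data the model has not seen.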
Supervised Learning
● Overfitting
○ The model fits too closely to the noise in
the training data.
○ This often results in low error on training
sets but high error on test/validation
sets.
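A small sketch of this pattern, under stated assumptions: the data points are made up, and the "memorizer" (a nearest-neighbor lookup) stands in for any model flexible enough to fit training noise exactly. It is compared with the simple trend y = 2x.

```python
def mean_squared_error(model, points):
    """Average squared difference between model(x) and y."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Hypothetical noisy samples of the trend y = 2x.
train = [(1, 2.5), (2, 3.4), (3, 6.6), (4, 7.5)]
test = [(1.5, 3.0), (2.5, 5.0), (3.5, 7.0)]

def memorizer(x):
    """Return the y of the nearest training x, which reproduces the noise."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def trend(x):
    """The simple underlying trend, with no noise fitted."""
    return 2 * x

# The memorizer has zero training error but a larger test error.
print(mean_squared_error(memorizer, train), mean_squared_error(memorizer, test))
# The trend model has some training error but generalizes better here.
print(mean_squared_error(trend, train), mean_squared_error(trend, test))
```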
Machine Learning
[Figure: data plotted against X, shown first with a good model fit and then with an overfit model]
Machine Learning
● Underfitting
○ Model does not capture the underlying trend
of the data and does not fit the data well
enough.
○ Low variance but high bias.
○ Underfitting is often a result of an
excessively simple model.
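As a rough illustration of underfitting, the sketch below fits a constant model (always predict the training mean) to data that follows a linear trend. The data and the choice of a constant model are assumptions for the example; the point is only that an overly simple model has high error on both sets.

```python
def mean_squared_error(model, points):
    """Average squared difference between model(x) and y."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

# Hypothetical data following the trend y = 2x.
train = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0)]
test = [(5, 10.0), (6, 12.0)]

mean_y = sum(y for _, y in train) / len(train)

def constant_model(x):
    """Ignore x and always predict the training mean: too simple for a trend."""
    return mean_y

train_error = mean_squared_error(constant_model, train)
test_error = mean_squared_error(constant_model, test)
print(train_error, test_error)  # 5.0 37.0
```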
Machine Learning
[Figure: data plotted against X, shown with an underfit model]
Machine Learning
[Figure: error plotted against training time (epochs), shown for a good model and for a bad model]
Model Evaluation
● Pass a test image from X_test to the trained
model.
● The model outputs a prediction on the test
image (for example, DOG or CAT).
● Look up the correct label from y_test (for
example, DOG).
● Compare the prediction to the correct label:
○ DOG == DOG? The prediction is correct.
○ DOG == CAT? The prediction is incorrect.
Model Evaluation
● Accuracy
○ Accuracy in classification problems is
the number of correct predictions
made by the model divided by the total
number of predictions.
Model Evaluation
● Accuracy
○ For example, if the X_test set was 100
images and our model correctly
predicted 80 images, then we have
80/100.
○ 0.8 or 80% accuracy.
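The same calculation as a short sketch. The function name `accuracy` and the particular label lists are assumptions for illustration; only the 80-out-of-100 ratio comes from the example above.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the correct labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical test set: 100 labels, of which the model gets 80 right.
y_test = ["dog"] * 50 + ["cat"] * 50
predictions = ["dog"] * 40 + ["cat"] * 10 + ["cat"] * 40 + ["dog"] * 10
print(accuracy(predictions, y_test))  # 0.8
```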
Model Evaluation
● Accuracy
○ Accuracy is useful when target classes
are well balanced.
○ In our example, we would have roughly
the same number of cat images as we
have dog images.
Model Evaluation
● Accuracy
○ Accuracy is not a good choice with
unbalanced classes!
○ Imagine we had 99 images of dogs and
1 image of a cat.
○ If our model was simply a line that
always predicted dog we would get 99%
accuracy!
○ In this situation we’ll want to understand
recall and precision
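A quick sketch of the 99-dogs-and-1-cat scenario. The `always_dog` baseline and the placeholder inputs are assumptions for illustration; the code only shows that high accuracy can coexist with never finding the minority class.

```python
def always_dog(image):
    """A baseline that ignores its input and predicts "dog" every time."""
    return "dog"

# 99 dog labels and 1 cat label; the images themselves are placeholders.
y_test = ["dog"] * 99 + ["cat"]
predictions = [always_dog(None) for _ in y_test]

accuracy = sum(p == y for p, y in zip(predictions, y_test)) / len(y_test)
cats_found = sum(p == "cat" and y == "cat" for p, y in zip(predictions, y_test))
print(accuracy, cats_found)  # 0.99 0
```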
Model Evaluation
● Recall
○ Ability of a model to find all the relevant
cases within a dataset.
○ The precise definition of recall is the
number of true positives divided by the
number of true positives plus the
number of false negatives.
Model Evaluation
● Precision
○ Ability of a classification model to
identify only the relevant data points.
○ Precision is defined as the number of
true positives divided by the number of
true positives plus the number of false
positives.
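Both definitions can be computed directly from counts of true positives, false positives, and false negatives. The sketch below is illustrative: the function name, the label lists, and the convention of returning 0.0 when a denominator is zero are assumptions, not part of the definitions.

```python
def precision_recall(predictions, labels, positive):
    """Return (precision, recall) for the class treated as positive."""
    pairs = list(zip(predictions, labels))
    tp = sum(p == positive and y == positive for p, y in pairs)  # true positives
    fp = sum(p == positive and y != positive for p, y in pairs)  # false positives
    fn = sum(p != positive and y == positive for p, y in pairs)  # false negatives
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

labels      = ["cat", "cat", "cat", "cat", "dog", "dog", "dog", "dog"]
predictions = ["cat", "cat", "cat", "dog", "cat", "cat", "cat", "dog"]
print(precision_recall(predictions, labels, positive="cat"))  # (0.5, 0.75)
```

Here the model finds 3 of the 4 cats (recall 0.75), but only 3 of its 6 "cat" predictions are right (precision 0.5).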
Model Evaluation
● F1-Score
○ In cases where we want to find an
optimal blend of precision and recall we
can combine the two metrics using what
is called the F1 score.
Model Evaluation
● F1-Score
○ The F1 score is the harmonic mean of
precision and recall, taking both metrics
into account in the following equation:
$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$
Model Evaluation
● F1-Score
○ We use the harmonic mean instead of a
simple average because it punishes
extreme values.
○ A classifier with a precision of 1.0 and a
recall of 0.0 has a simple average of 0.5
but an F1 score of 0.
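The comparison can be checked with a few lines of code. This is a minimal sketch; the function name `f1_score` and the choice to return 0.0 when precision and recall are both zero are assumptions.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print((1.0 + 0.0) / 2)       # simple average: 0.5
print(f1_score(1.0, 0.0))    # 0.0
print(f1_score(0.5, 0.75))   # 0.6
```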
Model Evaluation
● Confusion Matrix
[Figure: confusion matrix]
Evaluating Performance: Regression
Evaluating Regression
● Clustering
○ Grouping together unlabeled data
points into categories/clusters
○ Data points are assigned to a cluster
based on similarity
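One way to picture similarity-based assignment is to give each point to its nearest cluster center. The sketch below assumes 2-D points, squared Euclidean distance as the similarity measure, and hand-picked centers; a real clustering algorithm would also have to learn the centers.

```python
def assign_to_clusters(points, centers):
    """Assign each 2-D point to the index of its nearest center."""
    def squared_distance(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    return [
        min(range(len(centers)), key=lambda i: squared_distance(p, centers[i]))
        for p in points
    ]

points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centers = [(1.0, 1.5), (8.5, 9.0)]  # hypothetical cluster centers
print(assign_to_clusters(points, centers))  # [0, 0, 1, 1]
```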
Machine Learning
● Anomaly Detection
○ Attempts to detect outliers in a
dataset
○ For example, fraudulent transactions
on a credit card.
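A very simple way to flag outliers is a z-score rule: mark values that lie more than a chosen number of standard deviations from the mean. This is only one possible technique, chosen here for brevity; the function name, the threshold of 2.0, and the transaction amounts are assumptions for the example.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical card transaction amounts, one of them unusually large.
amounts = [12.0, 15.5, 9.9, 14.2, 11.3, 13.8, 10.4, 950.0]
print(find_outliers(amounts))  # [950.0]
```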
Machine Learning
● Dimensionality Reduction
○ Data processing techniques that
reduce the number of features in a
data set, either for compression or to
better understand underlying trends
within the data set.
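As a rough sketch of reducing the number of features, the code below drops columns whose values barely vary. This variance filter is a deliberately simple stand-in, chosen as an assumption for the example; the function name, the threshold, and the data are hypothetical.

```python
import statistics

def drop_low_variance_features(rows, min_variance=0.01):
    """Keep only the columns whose variance exceeds min_variance."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns)
            if statistics.pvariance(col) > min_variance]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, keep

# Hypothetical data set: the middle feature is constant and carries no signal.
rows = [
    [1.0, 5.0, 0.2],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.4],
]
reduced, kept = drop_low_variance_features(rows)
print(kept)      # [0, 2]
print(reduced)   # [[1.0, 0.2], [2.0, 0.9], [3.0, 0.4]]
```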
Machine Learning
● Unsupervised Learning
○ It’s important to note that these are
situations where we don’t have the
correct answer for historical data!
○ This means evaluation is much
harder and more nuanced!
Unsupervised Process
[Diagram: Data Acquisition → Data Cleaning → Model Training & Building → Transformation → Model Deployment, with Test Data shown above the pipeline]
Machine Learning