FIGURE 2.4 The volume of the unit hypersphere for different numbers of dimensions.
make predictions on data inputs. In the next section we consider how to evaluate how well
an algorithm actually achieves this.
2.2.1 Overfitting
Unfortunately, things are a little bit more complicated than that, since we might also want
to know how well the algorithm is generalising as it learns: we need to make sure that we
do enough training that the algorithm generalises well. In fact, there is at least as much
danger in over-training as there is in under-training. The number of degrees of variability in
most machine learning algorithms is huge — for a neural network there are lots of weights,
and each of them can vary. This is undoubtedly more variation than there is in the function
we are learning, so we need to be careful: if we train for too long, then we will overfit the
data, which means that we have learnt about the noise and inaccuracies in the data as well
as the actual function. Therefore, the model that we learn will be much too complicated,
and won’t be able to generalise.
FIGURE 2.5 The effect of overfitting is that rather than finding the generating function
(as shown on the left), the neural network matches the inputs perfectly, including the
noise in them (on the right). This reduces the generalisation capabilities of the network.
Figure 2.5 shows this by plotting the predictions of some algorithm (as the curve) at
two different points in the learning process. On the left of the figure the curve fits the
overall trend of the data well (it has generalised to the underlying general function), but
the training error would still not be that close to zero since it passes near, but not through,
the training data. As the network continues to learn, it will eventually produce a much
more complex model that has a lower training error (close to zero), meaning that it has
memorised the training examples, including any noise component of them, so that it has
overfitted the training data.
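As a rough illustration of this effect, the following sketch (my own example, not the book's; the sinusoidal generating function, the noise level, and the polynomial degrees are all assumed for illustration) fits noisy samples with a simple and a very flexible polynomial model. The flexible model drives the training error towards zero but does worse on fresh data drawn from the same generating function, which is the overfitting shown on the right of Figure 2.5.

import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Noisy samples of an assumed generating function y = sin(2*pi*x)
x = np.sort(rng.uniform(0, 1, 20))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.15, x.shape)

for degree in (3, 12):
    # Fit a polynomial of the given degree; the high degree has enough
    # freedom to chase the noise in the training points
    model = Polynomial.fit(x, y, deg=degree)
    train_error = np.mean((model(x) - y) ** 2)

    # Fresh points from the same generating function measure generalisation
    x_new = rng.uniform(0, 1, 200)
    y_new = np.sin(2 * np.pi * x_new) + rng.normal(0, 0.15, x_new.shape)
    test_error = np.mean((model(x_new) - y_new) ** 2)

    print(f"degree {degree}: training error {train_error:.4f}, "
          f"error on new data {test_error:.4f}")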
We want to stop the learning process before the algorithm overfits, which means that
we need to know how well it is generalising at each timestep. We can’t use the training data
for this, because we wouldn’t detect overfitting, but we can’t use the testing data either,
because we’re saving that for the final tests. So we need a third set of data to use for this
purpose, which is called the validation set because we’re using it to validate the learning so
far. This is known as cross-validation in statistics. It is part of model selection: choosing the
right parameters for the model so that it generalises as well as possible.
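A minimal sketch of this idea follows (my own example with an assumed linear model, learning rate, and dataset sizes, not code from the book): at each timestep the model is updated on the training set, its error is monitored on the validation set, and the weights that gave the lowest validation error so far are kept, so that training can be stopped before overfitting sets in.

import numpy as np

rng = np.random.default_rng(1)

# A small regression problem with noisy targets (assumed data for illustration)
X_train, X_valid = rng.normal(size=(40, 5)), rng.normal(size=(20, 5))
true_w = rng.normal(size=5)
y_train = X_train @ true_w + rng.normal(0, 0.5, 40)
y_valid = X_valid @ true_w + rng.normal(0, 0.5, 20)

w = np.zeros(5)
best_w, best_valid_error = w.copy(), np.inf
eta = 0.01  # learning rate (an assumed value)

for step in range(500):
    # One gradient-descent step on the training error
    grad = 2 * X_train.T @ (X_train @ w - y_train) / len(y_train)
    w -= eta * grad

    # Check generalisation on the validation set at each timestep and
    # remember the weights with the lowest validation error
    valid_error = np.mean((X_valid @ w - y_valid) ** 2)
    if valid_error < best_valid_error:
        best_valid_error, best_w = valid_error, w.copy()

print("lowest validation error:", best_valid_error)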
FIGURE 2.6 The dataset is split into different sets, some for training, some for validation,
and some for testing.
The validation and test sets should also be reasonably large. Generally, the exact proportion of training to
testing to validation data is up to you, but it is typical to do something like 50:25:25 if you
have plenty of data, and 60:20:20 if you don’t. How you do the splitting can also matter.
Many datasets are presented with the first set of datapoints being in class 1, the next in
class 2, and so on. If you pick the first few points to be the training set, the next the test
set, etc., then the results are going to be pretty bad, since the training set will not have seen all the
classes. This can be dealt with by randomly reordering the data first, or by assigning each
datapoint randomly to one of the sets, as is shown in Figure 2.6.
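The sketch below (an assumed helper name split_data and an assumed 50:25:25 default, not code from the book) shows one way to do this with NumPy: the datapoints are randomly reordered before being cut into training, validation, and test sets, so that each set can contain examples of every class.

import numpy as np

def split_data(X, y, fractions=(0.5, 0.25, 0.25), seed=None):
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))      # random reordering of the datapoints
    X, y = X[order], y[order]

    n_train = int(fractions[0] * len(X))
    n_valid = int(fractions[1] * len(X))
    train = (X[:n_train], y[:n_train])
    valid = (X[n_train:n_train + n_valid], y[n_train:n_train + n_valid])
    test = (X[n_train + n_valid:], y[n_train + n_valid:])
    return train, valid, test

# Example: 8 datapoints whose classes arrive in order, as described above
X = np.arange(16).reshape(8, 2)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
train, valid, test = split_data(X, y, seed=3)
print(train[1], valid[1], test[1])   # the class labels are now randomly spread across the sets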
If you are really short of training data, so that having a separate validation set raises the worry
that the algorithm won’t be sufficiently trained, then it is possible to perform
leave-some-out, multi-fold cross-validation. The idea is shown in Figure 2.7. The dataset is
randomly partitioned into K subsets, and one subset is used as a validation set, while the
algorithm is trained on all of the others. A different subset is then left out as the validation set
and a new model is trained on the rest, repeating the same process for all of the different subsets. Finally,
the model that produced the lowest validation error is tested and used. We’ve traded off
data for computation time, since we’ve had to train K different models instead of just one.
In the most extreme case of this there is leave-one-out cross-validation, where the algorithm
is validated on just one piece of data, training on all of the rest.
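The sketch below (my own illustration with assumed names train_model and error for the learning algorithm and its error measure; a least-squares fit stands in for the learner) follows this recipe: the data are randomly partitioned into K subsets, each subset is held out in turn as the validation set while a model is trained on all of the others, and the model with the lowest validation error is returned. Setting K equal to the number of datapoints gives leave-one-out cross-validation.

import numpy as np

def kfold_cross_validation(X, y, K, train_model, error, seed=None):
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), K)   # random partition into K subsets

    best_model, best_error = None, np.inf
    for k in range(K):
        valid_idx = folds[k]                             # the subset left out this time
        train_idx = np.hstack([folds[j] for j in range(K) if j != k])

        model = train_model(X[train_idx], y[train_idx])  # train on all of the other subsets
        valid_error = error(model, X[valid_idx], y[valid_idx])
        if valid_error < best_error:
            best_model, best_error = model, valid_error

    return best_model, best_error

# Usage with a least-squares fit as a stand-in learner
rng = np.random.default_rng(4)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 30)
fit = lambda Xt, yt: np.linalg.lstsq(Xt, yt, rcond=None)[0]
mse = lambda w, Xv, yv: np.mean((Xv @ w - yv) ** 2)
print(kfold_cross_validation(X, y, K=5, train_model=fit, error=mse, seed=6))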