Machine Learning - Lecture 16 (Student)

The document discusses the evolution and fundamentals of deep learning, particularly focusing on neural networks, which gained popularity after 2010 due to advancements in architecture and the availability of large datasets. It explains single-layer and multilayer neural networks, detailing their structure, activation functions, and training processes, including the use of regularization techniques. The document also highlights the performance improvements of neural networks over traditional models in tasks such as digit classification.

Lecture 16: Deep Learning

⚫ The cornerstone of deep learning is the neural network.
⚫ Neural networks rose to fame in the late 1980s. Then along came SVMs,
boosting, and random forests, and neural networks fell somewhat from favor.
Part of the reason was that neural networks required a lot of tinkering, while
the new methods were more automatic.
⚫ Neural networks resurfaced after 2010 with the new name deep learning, with
new architectures, additional bells and whistles, and a string of success stories
on some niche problems such as image and video classification and speech and
text modeling.

⚫ Many in the field believe that the major reason for these successes is the
availability of ever-larger training datasets, made possible by
the wide-scale use of digitization in science and industry.
⚫ In this chapter we discuss the basics of neural networks and deep learning, and
then go into some of the specializations for specific problems, such as
convolutional neural networks (CNNs) for image classification,
and recurrent neural networks (RNNs) for time series and other
sequences.

10.1 Single Layer Neural Networks


⚫ A neural network takes an input vector of p variables X = (X1, X2,...,Xp) and
builds a nonlinear function f(X) to predict the response Y .
⚫ Figure 10.1 shows a simple feed-forward neural network for modeling a
quantitative response using p = 4 predictors.

⚫ In the terminology of neural networks, the four features X1,...,X4 make up the
units in the input layer. The arrows indicate that each of the
inputs from the input layer feeds into each of the K hidden units
(we get to pick K; here we chose 5).
⚫ The neural network model has the form

   f(X) = β0 + Σ_{k=1}^{K} βk hk(X) = β0 + Σ_{k=1}^{K} βk g(wk0 + Σ_{j=1}^{p} wkj Xj).   (10.1)

⚫ It is built up here in two steps. First the K activations Ak, k = 1, . . . , K, in the
hidden layer are computed as functions of the input features X1,...,Xp,

   Ak = hk(X) = g(wk0 + Σ_{j=1}^{p} wkj Xj),   (10.2)

where g(z) is a nonlinear activation function that is specified in advance. These K
activations from the hidden layer then feed into the output layer, resulting in

   f(X) = β0 + Σ_{k=1}^{K} βk Ak,

a linear regression model in the K = 5 activations. All the parameters β0,...,βK
and w10,...,wKp need to be estimated from data.
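As a concrete illustration (not from the text), here is a minimal NumPy sketch of the
forward pass in (10.1) and (10.2), assuming g is the ReLU activation and using random
placeholder weights:

import numpy as np

def relu(z):
    # g(z) = max(0, z), applied elementwise
    return np.maximum(0, z)

def single_layer_forward(x, W, w0, beta, beta0):
    # x:     input vector of length p
    # W:     K x p matrix of hidden-layer weights w_kj
    # w0:    length-K vector of hidden-layer intercepts w_k0
    # beta:  length-K vector of output weights beta_k
    # beta0: output intercept beta_0
    A = relu(w0 + W @ x)          # K activations, equation (10.2)
    return beta0 + beta @ A       # f(X), the output of the model (10.1)

# Example with p = 4 predictors and K = 5 hidden units (random weights)
rng = np.random.default_rng(0)
x = rng.normal(size=4)
f = single_layer_forward(x, rng.normal(size=(5, 4)), rng.normal(size=5),
                         rng.normal(size=5), 0.0)
print(f)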
⚫ In the early instances of neural networks, the sigmoid activation
function was favored,

   g(z) = e^z / (1 + e^z) = 1 / (1 + e^{-z}),

which is the same function used in logistic regression to convert a linear
function into probabilities between zero and one (see Figure 10.2).
⚫ The preferred choice in modern neural networks is the ReLU (rectified linear unit)
activation function, which takes the form

   g(z) = (z)+ = { 0 if z < 0;  z otherwise }.
A ReLU activation can be computed and stored more efficiently than a sigmoid
activation. Although it thresholds at zero, because we apply it to a linear
function (10.2) the constant term will shift this inflection point.
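For a quick numerical sense of the two activation functions (illustrative values only):

import numpy as np

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
sigmoid = 1 / (1 + np.exp(-z))   # squashes any input into (0, 1)
relu = np.maximum(0, z)          # zero below the threshold, identity above it
print(np.round(sigmoid, 3))      # approximately [0.119 0.378 0.5 0.622 0.881]
print(relu)                      # [0. 0. 0. 0.5 2.]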

⚫ Fitting a neural network requires estimating the unknown parameters in (10.1).
For a quantitative response, typically squared-error loss is used, so
that the parameters are chosen to minimize

   Σ_{i=1}^{n} (yi - f(xi))^2.
⚫ Details about how to perform this minimization are provided in Section 10.7.

10.2 Multilayer Neural Networks


⚫ Modern neural networks typically have more than one hidden layer,
and often many units per layer.
⚫ In theory a single hidden layer with a large number of units has the ability to
approximate most functions. However, the learning task of discovering a good
solution is made much easier with multiple layers, each of
modest size.
⚫ We will illustrate a large dense network on the famous and
publicly available MNIST handwritten digit dataset. Figure 10.3 shows
examples of these digits.
⚫ The idea is to build a model to classify the images into their correct
digit class 0-9. Every image has p = 28 × 28 = 784 pixels, each of which is an
eight-bit grayscale value between 0 and 255 representing the relative amount
of the written digit in that tiny square. These pixels are stored in the input
vector X (in, say, column order). The output is the class label, represented by a
vector Y = (Y0, Y1,...,Y9) of 10 dummy variables, with a one in the position
corresponding to the label, and zeros elsewhere. In the machine learning
community, this is known as one-hot encoding. There are 60,000
training images, and 10,000 test images.
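For illustration (not part of the text), the conversion from an integer class label to
the 10-dimensional dummy vector can be written as:

import numpy as np

def one_hot(labels, num_classes=10):
    # Each row is a length-10 dummy vector with a single one in the label's position
    Y = np.zeros((len(labels), num_classes))
    Y[np.arange(len(labels)), labels] = 1
    return Y

print(one_hot(np.array([3, 0, 9])))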

⚫ Figure 10.4 shows a multilayer network architecture that works well
for solving the digit-classification task. It differs from Figure 10.1 in
several ways:

➢ It has two hidden layers L1 (256 units) and L2 (128 units) rather than
one. Later we will see a network with seven hidden layers.
➢ It has ten output variables, rather than one. In this case the variables
really represent a single qualitative variable and so are quite dependent.
➢ The loss function used for training the network is tailored for the
multiclass classification task.
⚫ The first hidden layer is as in (10.2), with

   Ak^(1) = hk^(1)(X) = g(wk0^(1) + Σ_{j=1}^{p} wkj^(1) Xj)   (10.10)

for k = 1,...,K1. The second hidden layer treats the activations Ak^(1) of the first
hidden layer as inputs and computes new activations

   Aℓ^(2) = hℓ^(2)(X) = g(wℓ0^(2) + Σ_{k=1}^{K1} wℓk^(2) Ak^(1))   (10.11)

for ℓ = 1,...,K2.
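An illustrative NumPy sketch of how the two hidden layers chain together; the shapes
follow Figure 10.4, and the random weights are placeholders:

import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(0, z)

X = rng.random(784)                                # one flattened 28 x 28 image
W1, b1 = rng.normal(size=(256, 784)), np.zeros(256)
W2, b2 = rng.normal(size=(128, 256)), np.zeros(128)

A1 = relu(b1 + W1 @ X)     # first-layer activations, as in (10.10)
A2 = relu(b2 + W2 @ A1)    # second-layer activations, as in (10.11)
print(A1.shape, A2.shape)  # (256,) (128,)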

⚫ We have introduced additional superscript notation such as hℓ^(2)(X) and wℓj^(2)
in (10.10) and (10.11) to indicate to which layer the activations and
weights belong, in this case layer 2. The notation W1 in Figure
10.4 represents the entire matrix of weights that feed from the input layer to
the first hidden layer L1. This matrix will have 785 × 256 = 200,960 elements;
there are 785 rather than 784 because we must account for the intercept or bias
term. Each element Ak^(1) feeds to the second hidden layer L2 via the matrix of
weights W2 of dimension 257 × 128 = 32,896.
Note:

⚫ We now get to the output layer, where we now have ten responses rather
than one. The first step is to compute ten different linear models similar to our
single model (10.1),

   Zm = βm0 + Σ_{ℓ=1}^{K2} βmℓ Aℓ^(2)

for m = 0, 1,..., 9. The matrix B stores all 129 × 10 = 1,290 of these weights.


Note:
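These weight counts can be checked directly; the "+ 1" terms account for the
intercept/bias units (illustrative arithmetic only):

W1 = (784 + 1) * 256   # input layer -> L1: 200,960 weights
W2 = (256 + 1) * 128   # L1 -> L2:           32,896 weights
B  = (128 + 1) * 10    # L2 -> output layer:  1,290 weights
print(W1, W2, B, W1 + W2 + B)   # 200960 32896 1290 235146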

⚫ If these were all separate quantitative responses, we would simply set each
fm(X) = Zm and be done. However, we would like our estimates to represent
class probabilities fm(X) = Pr(Y = m | X), just like in multinomial
logistic regression in Section 4.3.5. So we use the special softmax activation
function (see (4.13) on page 141),

   fm(X) = Pr(Y = m | X) = e^{Zm} / Σ_{ℓ=0}^{9} e^{Zℓ}.

This ensures that the 10 numbers behave like probabilities
(they are non-negative and sum to one). Even though the goal is to build a
classifier, our model actually estimates a probability for each of the 10
classes. The classifier then assigns the image to the class with the highest
estimated probability.
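A small numerical illustration of the softmax step (the Z values below are made up):

import numpy as np

Z = np.array([2.0, 1.0, 0.1, -1.0, 0.5, 0.0, -0.3, 1.5, 0.2, -2.0])  # ten Z_m values
probs = np.exp(Z) / np.exp(Z).sum()   # softmax: non-negative and sums to one
print(probs.sum())                    # 1.0
print(probs.argmax())                 # class with the highest estimated probability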
⚫ To train this network, since the response is qualitative, we look for coefficient
estimates that minimize the negative multinomial log-likelihood

   - Σ_{i=1}^{n} Σ_{m=0}^{9} yim log(fm(xi)),

also known as the cross-entropy. Details on how to minimize this cross-
entropy objective are given in Section 10.7.
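A direct (illustrative) computation of this objective for two made-up training examples:

import numpy as np

# y_onehot: n x 10 one-hot labels; probs: n x 10 predicted class probabilities
y_onehot = np.array([[0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
                     [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
probs = np.full((2, 10), 0.05)
probs[0, 3] = 0.55   # most predicted mass on the true class of example 1
probs[1, 0] = 0.55   # and of example 2
cross_entropy = -np.sum(y_onehot * np.log(probs))
print(cross_entropy)   # -(log 0.55 + log 0.55), approximately 1.196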
⚫ Table 10.1 compares the test performance of the neural network with two
simple models presented in Chapter 4 that make use of linear decision
boundaries: multinomial logistic regression and linear discriminant analysis.
The improvement of neural networks over both of these linear methods is
dramatic (see Table 10.1).

⚫ Adding the number of coefficients in W1, W2 and B, we get 235,146 in
all, more than 33 times the number 785 × 9 = 7,065 needed for multinomial
logistic regression. Recall that there are 60,000 images in the training set.
⚫ While this might seem like a large training set, there are almost four times as
many coefficients in the neural network model as there are observations in the
training set! To avoid overfitting, some regularization is needed. In this
example, we used two forms of regularization: ridge regularization, which is
similar to ridge regression from Chapter 6, and dropout regularization.
We discuss both forms of regularization in Section 10.7.
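As an informal illustration (not the text's notation) of what each form of
regularization does during training:

import numpy as np

def ridge_penalized_loss(loss, weights, lam=1e-4):
    # Ridge regularization: add lambda times the sum of squared weights to the objective
    return loss + lam * sum(np.sum(W ** 2) for W in weights)

def dropout(activations, rate=0.4, rng=np.random.default_rng(0)):
    # Dropout regularization: randomly zero a fraction `rate` of the activations
    # at training time, scaling the survivors so the expected activation is unchanged
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)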

Computer Session
⚫ A Single Layer Network on the Hitters Data
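A minimal sketch of what this lab could look like using the Keras API; the textbook's
lab may use a different framework, and the file path, hidden-layer size, and training
settings below are assumptions:

import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers

# Load and prepare the Hitters data (path and preprocessing are assumptions)
hitters = pd.read_csv("Hitters.csv").dropna()
y = hitters["Salary"].to_numpy()
X = pd.get_dummies(hitters.drop(columns=["Salary"]), drop_first=True).to_numpy(dtype="float32")
X = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize the predictors

# Single hidden layer with ReLU activations and squared-error loss, as in (10.1)-(10.3)
model = tf.keras.Sequential([
    layers.Input(shape=(X.shape[1],)),
    layers.Dense(50, activation="relu"),
    layers.Dropout(0.4),
    layers.Dense(1),
])
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
model.fit(X, y, epochs=100, batch_size=32, validation_split=0.33, verbose=0)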

⚫ A Multilayer Network on the MNIST Digit Data
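A minimal sketch of the Figure 10.4 network using the Keras API, with ridge-style (L2)
weight penalties and dropout as the two forms of regularization discussed above; the
hyperparameter values are assumptions:

import tensorflow as tf
from tensorflow.keras import layers, regularizers

# MNIST: 60,000 training and 10,000 test images of 28 x 28 = 784 pixels
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

model = tf.keras.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(256, activation="relu", kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.4),
    layers.Dense(128, activation="relu", kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),   # softmax output -> class probabilities
])

# Cross-entropy objective; labels are integers 0-9, so use the sparse variant
model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=10, batch_size=128, validation_split=0.1, verbose=0)
print(model.evaluate(x_test, y_test, verbose=0))   # [test loss, test accuracy]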

