Lect 12 - Deep Feed Forward NN - Review

The document provides a comprehensive review of deep feedforward neural networks, covering topics such as linear and non-linear classifiers, optimization techniques, and the importance of activation functions. It discusses the structure and function of perceptrons, various types of neural networks, and methods for loss optimization and regularization to prevent overfitting. Additionally, it highlights practical aspects of training neural networks, including gradient descent algorithms and adaptive learning rates.

1

DEEP FEED FORWARD NEURAL NETWORKS - A COMPLETE REVIEW

Umarani Jayaraman
Outline
2

• Linear classifier
• Perceptron
• Non-linear classifiers
• MLP, Neural Networks
• Optimization Techniques
• Loss Optimization – Gradient Descent
• Batch Optimization – Batch, Stochastic and Mini-Batch
• Overfitting – Dropout, Early Stopping
3
Linear Classifier - The Perceptron
The structural building block of deep learning
The Perceptron: Forward Propagation
4
The Perceptron: Forward Propagation
5
The Perceptron: Forward Propagation
6
The Perceptron: Forward Propagation
7
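To make the forward pass concrete, here is a minimal NumPy sketch of a single perceptron, assuming a sigmoid activation and illustrative weights; the slide diagrams define the exact notation, so treat the variable names below as placeholders.

import numpy as np

def sigmoid(z):
    # Non-linear activation g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def perceptron_forward(x, w, b):
    # Weighted sum of inputs plus bias, passed through the activation:
    # y_hat = g(b + x . w)
    z = b + np.dot(x, w)
    return sigmoid(z)

# Example with assumed (illustrative) values
x = np.array([2.0, 3.0])   # two input features
w = np.array([0.5, -1.0])  # weights (placeholder values)
b = 1.0                    # bias
print(perceptron_forward(x, w, b))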
Common activation function
8
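The common choices referenced on this slide (sigmoid, tanh, ReLU) can be written in a few lines; this is a generic sketch, not tied to any particular framework.

import numpy as np

def sigmoid(z):
    # Squashes input to (0, 1); useful for probabilities
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Squashes input to (-1, 1), zero-centered
    return np.tanh(z)

def relu(z):
    # Passes positive values through, zeros out negatives
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))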
Importance of Activation Functions
9

• What is the purpose of an activation function?
• The purpose of activation functions is to introduce non-linearity into the network
• What if we wanted to build a neural network to distinguish green points from red points?
Importance of Activation Functions
10

• The purpose of activation functions is to introduce non-linearity into the network
• Linear activation functions can only produce a linear decision boundary, even when trained to minimal error
Importance of Activation Functions
11

• The purpose of activation functions is to introduce non-linearity into the network
• Non-linearities allow us to approximate arbitrarily complex functions
The perceptron: Example
12
The perceptron: Example
13
The perceptron: Example
14
The perceptron: Example
15
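The example slides work through a perceptron with fixed weights; the exact numbers are in the figures, but the computation has the following shape (the weights below are assumed for illustration only).

import numpy as np

# Illustrative weights only; the slide figures define the actual example values
b = 1.0                     # bias weight w0
w = np.array([3.0, -2.0])   # weights w1, w2

def decision(x):
    # z = w0 + w1*x1 + w2*x2; the sign of z picks the side of the decision boundary
    z = b + np.dot(w, x)
    y_hat = 1.0 / (1.0 + np.exp(-z))   # sigmoid output in (0, 1)
    return z, y_hat

print(decision(np.array([-1.0, 2.0])))   # z = 1 - 3 - 4 = -6, so y_hat is roughly 0.002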
16 Neural Networks
Building Neural Networks with Perceptrons
The perceptron: Simplified
17
The perceptron: Simplified
18

• This is also called a Single Layer, Single Output network
Neural Network: Single Layer Multiple Output or Multi-output Perceptron
19

• Because all inputs are densely connected to all outputs
• These layers are called Dense layers, or sometimes fully connected layers
Dense layer from scratch in TensorFlow
20
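Slide 20 shows a dense layer written from scratch in TensorFlow; the original code is in the figure, so the version below is a minimal reconstruction of the same idea (class and variable names are my own).

import tensorflow as tf

class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        # Trainable parameters: weight matrix and bias vector
        self.W = self.add_weight(shape=(input_dim, output_dim), initializer="random_normal")
        self.b = self.add_weight(shape=(1, output_dim), initializer="zeros")

    def call(self, inputs):
        # Forward propagation: z = x W + b, then a non-linear activation
        z = tf.matmul(inputs, self.W) + self.b
        return tf.math.sigmoid(z)

layer = MyDenseLayer(input_dim=2, output_dim=3)
print(layer(tf.constant([[1.0, 2.0]])))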
Neural Network: Single Layer Multiple Output or Multi-output Perceptron
21
Neural Network: Multiple Layer Multiple Output
22

• Multiple Layer Neural Networks
Multiple Layer Neural Network
23
Multi Output Perceptron (Multiple Layer Perceptron)
24
Deep Neural Networks
25
Deep Neural Networks
26
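Since a deep feedforward network is just dense layers stacked in sequence, a hedged Keras sketch (layer widths and activations chosen arbitrarily here, not taken from the slides) is:

import tensorflow as tf

# A deep feedforward network: inputs -> hidden layers -> output
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability output
])

x = tf.random.normal((4, 2))   # batch of 4 samples, 2 features each
print(model(x).shape)          # (4, 1)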
Summary
27

• The perceptron
• Activation Functions
• Neural Networks
• Types of Neural Networks
• Deep Neural Networks
28 Neural Networks
Applying Neural Networks
Example Problem
29

• How do we gain expertise in the field of deep learning?
• Let's start with a simple two-feature model
  • x1 = Number of lectures you attend
  • x2 = Hours spent on each topic and the final project
Example Problem
30–33

• How do we gain expertise in the field of deep learning?
Quantifying loss
34

• The loss of our network measures the cost incurred from incorrect predictions (misclassified samples)
Empirical Loss
35

• The empirical loss (mean loss) measures the total loss over our entire dataset; say there are n samples
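In symbols, using the standard form of the mean loss (the notation follows the usual convention rather than the slide figure):

J(W) = \frac{1}{n} \sum_{i=1}^{n} \mathcal{L}\left( f\big(x^{(i)}; W\big),\; y^{(i)} \right)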
i) Binary cross entropy loss
36

• Cross entropy loss can be used with models that output a probability between 0 and 1
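A hedged sketch of this loss using TensorFlow's built-in binary cross entropy (labels and predictions below are made-up values):

import tensorflow as tf

y_true = tf.constant([1.0, 0.0, 1.0])   # ground-truth labels
y_pred = tf.constant([0.9, 0.2, 0.6])   # predicted probabilities in (0, 1)

# Built-in binary cross entropy, averaged over the samples
bce = tf.keras.losses.BinaryCrossentropy()
print(bce(y_true, y_pred).numpy())

# Equivalent hand-written form: -(1/n) * sum( y*log(p) + (1-y)*log(1-p) )
manual = -tf.reduce_mean(y_true * tf.math.log(y_pred) + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
print(manual.numpy())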
ii) Mean Squared Error Loss
37

• Mean squared error loss can be used with regression models that output continuous real numbers
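Similarly for regression, a minimal sketch with made-up values:

import tensorflow as tf

y_true = tf.constant([2.5, 0.0, 1.0])   # continuous targets
y_pred = tf.constant([2.0, 0.5, 1.0])   # model outputs

# Mean squared error: (1/n) * sum( (y - y_hat)^2 )
mse = tf.keras.losses.MeanSquaredError()
print(mse(y_true, y_pred).numpy())
print(tf.reduce_mean(tf.square(y_true - y_pred)).numpy())   # same value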
38 Neural Networks
Training Neural Networks
Training Neural Networks
39
Loss optimization
40

• The goal is to find the weights that achieve the lowest error
Loss optimization
41

• The goal is to find the weights that achieve the lowest error
Loss optimization
42
Loss optimization
43

• Randomly pick an initial weight (w0, w1)
Loss optimization
44

• Compute the gradient
Loss Optimization
45

• Take a small step in the opposite direction of the gradient
Gradient Descent
46

• Repeat until convergence
Gradient Descent
47
Gradient Descent
48
Loss optimization for linear regression
49
Gradient Descent Algorithm
50
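The algorithm on slide 50 follows the usual loop: initialize the weights, compute the gradient of the loss, step against it, and repeat until convergence. A minimal sketch, assuming TensorFlow's automatic differentiation and an arbitrary toy loss chosen only for illustration:

import tensorflow as tf

w = tf.Variable([0.0, 0.0])   # initial weights (w0, w1)
lr = 0.1                      # learning rate (eta)

def loss_fn(w):
    # Toy convex loss purely for illustration
    return tf.reduce_sum((w - tf.constant([3.0, -2.0])) ** 2)

for step in range(100):
    with tf.GradientTape() as tape:
        loss = loss_fn(w)
    grad = tape.gradient(loss, w)   # dJ/dw
    w.assign_sub(lr * grad)         # step in the opposite direction of the gradient

print(w.numpy())   # converges toward [3, -2]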
Gradient Descent
51
Computing Gradients (Step 3): Backpropagation
52

• How does a small change in one weight (w2) affect the final loss J(W)?
Computing Gradients: Backpropagation
53

• How does a small change in one weight (w2) affect the final loss J(W)?
• Let's use the chain rule
Computing Gradients: Backpropagation
54

• How does a small change in one weight (w2) affect the final loss J(W)?
• Let's use the chain rule
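Written out, the chain rule for a weight w2 in the last layer is (using y-hat for the network output; the slide figures fix the exact symbols, so this notation is assumed):

\frac{\partial J(W)}{\partial w_2} = \frac{\partial J(W)}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial w_2}

and for an earlier weight w1, applying the chain rule again through the hidden activation z1:

\frac{\partial J(W)}{\partial w_1} = \frac{\partial J(W)}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z_1} \cdot \frac{\partial z_1}{\partial w_1}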
Computing Gradients: Backpropagation
55
Computing Gradients: Backpropagation
56
Computing Gradients: Backpropagation
57

• Repeat this for every weight in the network, using gradients from later layers
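In practice, frameworks do this repeated chain-rule bookkeeping automatically. A hedged TensorFlow sketch of obtaining per-weight gradients for a small model (shapes and data are arbitrary):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
loss_fn = tf.keras.losses.BinaryCrossentropy()

x = tf.random.normal((16, 2))                            # arbitrary batch
y = tf.cast(tf.random.uniform((16, 1)) > 0.5, tf.float32)

with tf.GradientTape() as tape:
    loss = loss_fn(y, model(x))
# Backpropagation: gradients of the loss w.r.t. every weight in the network
grads = tape.gradient(loss, model.trainable_variables)
print([g.shape for g in grads])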
58 Neural Networks in Practice

Loss Optimization
Batch Optimization
Overfitting
Training Neural Networks is Difficult
59

• "Visualizing the Loss Landscape of Neural Nets", Dec 2017
• Loss functions can be difficult to optimize
Loss Optimization
60

• The loss is extremely non-convex
Loss Optimization
61

• The loss function is optimized through gradient descent
Loss optimization
62

• How can we set the learning rate?
Loss optimization
63

• Small learning rates converge slowly and get stuck in false local minima
• Large learning rates overshoot, become unstable, and diverge
• Stable learning rates converge smoothly and avoid local minima
Loss optimization
64

• How to deal with this?
• Idea 1: Try lots of different learning rates and see what works “right”
• Idea 2: Do something smarter! Design an adaptive learning rate that “adapts” to the landscape
Adaptive Learning Rates
65

• Learning rates are no longer fixed (a small sketch of one adaptive scheme follows this list)
• They can be larger or smaller depending on:
  • How large the gradient is
  • How fast learning is happening
  • The size of particular weights
  • Etc.
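One common way to make the rate adapt in practice is a decay schedule or an adaptive optimizer; a small sketch (hyperparameter values below are arbitrary):

import tensorflow as tf

# Decay the learning rate as training progresses
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.01, decay_steps=1000, decay_rate=0.9)

# Adam additionally scales each weight's step using running gradient statistics
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)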
Gradient Descent Algorithms
66

Algorithms (a usage sketch is shown below):
• SGD
• Adam
• Adadelta
• Adagrad
• RMSProp
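In TensorFlow these are available as Keras optimizer classes; a hedged sketch of selecting one (class names reflect the tf.keras API, version details may vary):

import tensorflow as tf

optimizers = {
    "SGD":      tf.keras.optimizers.SGD(learning_rate=0.01),
    "Adam":     tf.keras.optimizers.Adam(learning_rate=0.001),
    "Adadelta": tf.keras.optimizers.Adadelta(),
    "Adagrad":  tf.keras.optimizers.Adagrad(),
    "RMSprop":  tf.keras.optimizers.RMSprop(),
}
# Any of these can be passed to model.compile(optimizer=..., loss=...)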
67 Neural Networks in Practice

Loss Optimization
Batch Optimization
Overfitting
Gradient Descent
68
Gradient Descent
69

• It can be computationally intensive to compute
Stochastic Gradient Descent
70

• Easy to compute, but very noisy
71

• Fast to compute, and a much better estimate of the true gradient
Mini-batches while training
72

• More accurate estimation of the gradient
• Smoother convergence
• Allows for larger learning rates
• Mini-batches lead to fast training (see the sketch below)
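A hedged sketch of mini-batch training with Keras, where batch_size controls how many samples each gradient estimate averages over (data and sizes are made up):

import tensorflow as tf

x = tf.random.normal((1000, 2))                             # toy dataset: 1000 samples, 2 features
y = tf.cast(tf.random.uniform((1000, 1)) > 0.5, tf.float32)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Each gradient step averages over a mini-batch of 32 samples
model.fit(x, y, batch_size=32, epochs=5, verbose=0)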
Error minimization with iterations
73
74 Neural Networks in Practice

Loss Optimization
Batch Optimization
Over-fitting
The problem of Over-fitting
75

• It is also known as the problem of generalization
The problem of Over-fitting
76
The problem of Over-fitting
77
Regularization
78

• What is it?
  • A technique that constrains our optimization problem to discourage overly complex models
• Why do we need it?
  • To improve generalization of our model on unseen data
Regularization 1: Dropout
79

• During training, randomly set some activations to 0
Regularization 1: Dropout
80–81

• During training, randomly set some activations to 0
• Typically ‘drop’ 50% of activations in a layer
• Forces the network to not rely on any one node
Regularization 1: Dropout
82
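A hedged Keras sketch of dropout with the 50% rate mentioned above (layer sizes are arbitrary); the Dropout layer is active only during training and is disabled at inference:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),   # randomly zero 50% of activations during training
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])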
Regularization 2: Early Stopping
83–89

• Stop training before we have a chance to overfit
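In Keras this is usually done with a callback that watches the validation loss; a minimal sketch (the patience value is arbitrary, and x_train/x_val below are placeholders):

import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",            # watch validation loss
    patience=5,                    # stop after 5 epochs with no improvement
    restore_best_weights=True)     # roll back to the best epoch

# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[early_stop])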
Summary
90
Thank you
91
Extra Slides
92
Sources:
93

• Loss Functions
  • https://deeplearningdemystified.com/article/fdl-3
  • https://gombru.github.io/2018/05/23/cross_entropy_loss/
