Neural Networks and Cost Function
Andrew Jarrett
11/30/19
Review

- Perceptrons take several inputs and produce one output. They use weights and biases to decide the output.
- Sigmoid neurons
  - A sigmoid neuron allows the output to be any number in between zero and one, instead of only 0 or 1 (see the sketch after this list).
- The main objective is to create an algorithm which lets us find the weights and biases so that the output from the network approximates y(x) for each of the training examples x.
  - In the textbook example this y(x) is a 10-dimensional vector.
  - Each dimension corresponds to one digit of output; in this case the 1 means the output is an 8.
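As a rough illustration (an assumption added for this write-up, not code from the slides or textbook), the Python sketch below shows how a perceptron and a sigmoid neuron each combine the same inputs, weights, and bias into an output; the sigmoid output lands smoothly in between zero and one.

```python
import numpy as np

def sigmoid(z):
    # Squash any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical example: three inputs, three weights, and one bias.
x = np.array([0.5, 0.1, 0.9])   # inputs
w = np.array([0.4, -0.2, 0.7])  # weights
b = -0.3                        # bias

# Perceptron: output is 0 or 1, depending on the sign of w.x + b.
perceptron_output = 1 if np.dot(w, x) + b > 0 else 0

# Sigmoid neuron: output is a smooth value between 0 and 1.
sigmoid_output = sigmoid(np.dot(w, x) + b)

print(perceptron_output, sigmoid_output)
```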
Cost Function

- The equation below is the quadratic cost function, or mean squared error:

      C(w, b) = 1/(2n) * Σ_x || y(x) − a ||²

- C is a function of the weights w and biases b, n is the number of training inputs, a is the vector of actual outputs from the network for training input x, and the sum runs over all of the training inputs x.
- C will approach 0 when y(x) is approximately equal to the output a for all the training inputs.
- This cost is used because it is a smooth function of the weights and biases, unlike the number of images correctly classified, which does not change smoothly.
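To make the cost concrete, here is a small Python sketch (the helper and values are assumptions for illustration, not from the slides) that evaluates the quadratic cost for two 10-dimensional training examples, matching the textbook digit setup.

```python
import numpy as np

def quadratic_cost(desired_outputs, actual_outputs):
    # Mean squared error: C = 1/(2n) * sum over x of ||y(x) - a||^2.
    n = len(desired_outputs)
    total = sum(np.sum((y - a) ** 2) for y, a in zip(desired_outputs, actual_outputs))
    return total / (2.0 * n)

# Hypothetical 10-dimensional outputs, one dimension per digit.
y1 = np.zeros(10); y1[8] = 1.0          # desired output: the digit 8
a1 = np.full(10, 0.1); a1[8] = 0.85     # network output close to y1

y2 = np.zeros(10); y2[3] = 1.0          # desired output: the digit 3
a2 = np.full(10, 0.05); a2[3] = 0.9     # network output close to y2

print(quadratic_cost([y1, y2], [a1, a2]))  # small value, since a ≈ y(x)
```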
Gradient Descent

- Gradient descent is used to solve minimization problems, i.e. to find the minimum of a function.
- Visualize a ball rolling down a hill until it comes to a complete stop at the bottom.
- The "ball" sits at a position v and moves by a small amount Δv; the resulting change in the cost is the ΔC in the equation below:

      ΔC ≈ ∇C · Δv

- The previous equation allows us to choose a value of Δv that makes ΔC negative. Therefore we pick:

      Δv = −η ∇C, so that ΔC ≈ −η ||∇C||² ≤ 0

- η is a small positive parameter, the learning rate; it dictates how fast the program will learn.
- The resulting update rule shows how the "ball" is rolling down the hill:

      v → v′ = v − η ∇C

- Summary: gradient descent works by repeatedly computing the gradient of the cost function and then moving in the opposite direction.
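Here is a minimal Python sketch of that update rule, v → v − η∇C, applied to a simple two-variable bowl-shaped cost (the cost function and starting point are assumptions chosen for illustration).

```python
import numpy as np

def C(v):
    # A simple bowl-shaped cost: C(v) = v1^2 + v2^2, minimized at v = (0, 0).
    return v[0] ** 2 + v[1] ** 2

def grad_C(v):
    # Gradient of the cost above: (2*v1, 2*v2).
    return np.array([2 * v[0], 2 * v[1]])

eta = 0.1                      # learning rate: small positive parameter
v = np.array([3.0, -4.0])      # starting position of the "ball"

# Repeatedly move a small step in the direction opposite to the gradient.
for step in range(100):
    v = v - eta * grad_C(v)

print(v, C(v))  # v is now very close to the minimum at (0, 0)
```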
Gradient Descent and Learning

- Applying the same update rule to the network means repeatedly adjusting each weight and bias in the direction that decreases the cost:

      w_k → w_k′ = w_k − η ∂C/∂w_k
      b_l → b_l′ = b_l − η ∂C/∂b_l
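As a sketch of that learning rule on the smallest possible case, the code below trains a single sigmoid neuron on one example under the quadratic cost; the data, learning rate, and number of steps are assumptions for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hypothetical training example for a single sigmoid neuron.
x = np.array([0.5, 0.1, 0.9])   # inputs
y = 1.0                         # desired output
w = np.array([0.4, -0.2, 0.7])  # weights
b = -0.3                        # bias
eta = 0.5                       # learning rate

for step in range(1000):
    a = sigmoid(np.dot(w, x) + b)
    # Gradient of the quadratic cost C = 0.5*(a - y)^2 for this one example,
    # using sigmoid'(z) = a*(1 - a).
    delta = (a - y) * a * (1 - a)   # dC/dz
    w = w - eta * delta * x         # w_k -> w_k - eta * dC/dw_k
    b = b - eta * delta             # b   -> b   - eta * dC/db

print(sigmoid(np.dot(w, x) + b))    # output has moved close to the desired y = 1.0
```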
Stochastic Gradient Descent

- A problem occurs when using this cost function with gradient descent.
- To find ∇C we must compute the gradient ∇C_x for every training input x separately and then average them:

      ∇C = 1/n * Σ_x ∇C_x

- This is extremely time consuming if the number of training samples is large.
- Stochastic gradient descent instead estimates ∇C by computing the gradient for a small random sample (mini-batch) of training inputs rather than the entire training set at once.
- This would be helpful in our project due to the large data source.
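A minimal Python sketch of the mini-batch idea (the model, data, and batch size below are assumptions for illustration): instead of averaging ∇C_x over all n training inputs, each step averages over a small random sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n training inputs for a one-parameter model a = w*x,
# with per-example cost C_x = 0.5 * (w*x - y)^2.
n = 100_000
xs = rng.uniform(-1.0, 1.0, size=n)
ys = 2.0 * xs                       # the "true" relationship is y = 2*x

def grad_Cx(w, x, y):
    # Gradient of the per-example cost 0.5*(w*x - y)^2 with respect to w.
    return (w * x - y) * x

w = 0.0
eta = 0.5
batch_size = 32

for step in range(2000):
    # Estimate the full gradient from a small random mini-batch,
    # instead of summing over all n training inputs.
    idx = rng.integers(0, n, size=batch_size)
    grad_estimate = np.mean([grad_Cx(w, xs[i], ys[i]) for i in idx])
    w = w - eta * grad_estimate

print(w)  # close to the true value 2.0
```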
