Gradient Descent for Neural Networks in Python

Deep learning

Uploaded by sakshishetty149

IMPLEMENT THE GRADIENT DESCENT ALGORITHM FOR A THREE-STAGE FULLY CONNECTED NEURAL NETWORK USING PYTHON AND BASIC PACKAGES SUCH AS NUMPY AND MATPLOTLIB

EXPERIMENT 1
STRUCTURE OF THE NEURAL NETWORK

• Three-Stage Network:
• Input Layer: Takes the input data.
• Hidden Layer: Processes input from the input layer.
• Output Layer: Produces the final output.
GRADIENT DESCENT

• Definition: Gradient Descent is an optimization algorithm used to minimize the cost function by iteratively adjusting the weights.
• A first-order optimization algorithm used to find the minima (primarily) or, in its gradient-ascent form, the maxima of a given function.
• Steps Involved:

1. Calculate the Predicted Output (Forward Propagation)
2. Compute the Loss/Error
3. Calculate the Gradient of the Loss (Backward Propagation)
4. Update the Weights Using the Gradients
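The four steps above can be sketched for a single linear neuron with a mean-squared-error loss; the values and names below are illustrative, not from the original code:

```python
import numpy as np

x = np.array([[1.0, 2.0]])        # one training example, two features
y = np.array([[1.0]])             # target output
w = np.array([[0.5], [0.5]])      # initial weights
lr = 0.1                          # learning rate

y_hat = x @ w                          # 1. forward propagation
loss = np.mean((y_hat - y) ** 2)       # 2. compute the loss (MSE)
grad = 2 * x.T @ (y_hat - y) / len(x)  # 3. gradient of the loss w.r.t. w
w = w - lr * grad                      # 4. update the weights
```

After this single update the prediction for `x` moves from 1.5 to exactly the target 1.0, because for one example a full gradient step with this learning rate happens to land on the minimum.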
REAL-TIME WORKING OF THE GRADIENT DESCENT ALGORITHM
• Initialize Weights: Start with random weights.
• Forward Propagation: Calculate the predicted output using current weights.
• Compute Loss: Measure the error between predicted and actual output.
• Back Propagation: Calculate the gradient of the loss function w.r.t. the weights.
• Update Weights: Adjust weights in the direction of the negative gradient.
• Repeat: Iterate through the above steps until convergence.
THE MATHEMATICS

• We assume a parabola here, as it is helpful in visualizing the basic principles of gradient descent.
• Parameters are updated based on the gradient.
• The core concept is moving towards a minimum.

• A single parabola has only one minimum, and that minimum is the global minimum.
• We take the cost (loss) function in mean squared error form.
• The aim is to minimize the cost.
• Certain parameters are involved here, along with the input and output.
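The parabola case can be sketched directly; the function f(w) = (w − 3)² and its starting point are illustrative choices:

```python
# Gradient descent on the parabola f(w) = (w - 3)**2,
# whose single minimum (at w = 3) is also its global minimum.
def f(w):
    return (w - 3) ** 2

def grad_f(w):
    return 2 * (w - 3)   # analytic derivative of f

w = 0.0      # initial guess
lr = 0.1     # learning rate
for _ in range(100):
    w -= lr * grad_f(w)  # step in the direction of the negative gradient
```

Each step multiplies the distance to the minimum by (1 − 2·lr) = 0.8, so after 100 iterations w is numerically indistinguishable from 3.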
FINDING MINIMA

• Gradient Descent in Action:
• Start at an initial point.
• Compute the gradient (slope) of the cost function.
• Take steps proportional to the negative gradient.
• Continue until the slope is near zero, indicating a minimum.

• Challenges:
• May get stuck in local minima.
• Requires careful tuning of parameters.
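The local-minimum pitfall can be shown on a hypothetical non-convex function, f(w) = (w² − 1)² + 0.2w, which has a global minimum near w ≈ −1.02 and a shallower local minimum near w ≈ 0.97 (the function and starting point are illustrative):

```python
def grad(w):
    # Derivative of f(w) = (w**2 - 1)**2 + 0.2*w
    return 4 * w * (w ** 2 - 1) + 0.2

w = 1.0      # start inside the basin of the *local* minimum
lr = 0.05
for _ in range(500):
    w -= lr * grad(w)
# w settles near 0.97 and never reaches the global minimum near -1.02
```

Starting from w = 1.0, every gradient step keeps the iterate in the right-hand basin, so the algorithm converges to the local minimum rather than the global one.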
LEARNING RATE

• Definition: The learning rate is a hyperparameter that controls how much to change the model (i.e., how big the steps are) in response to the estimated error each time the model weights are updated.
• Importance: Determines the size of the steps taken towards the minimum of the cost function.
• Effects:
• Too High: Can cause the algorithm to converge too quickly to a suboptimal solution, or even diverge.
• Too Low: Results in slow convergence and can leave the algorithm stuck in a local minimum.
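These effects can be seen by minimizing the same parabola f(w) = w² with three different learning rates (the specific values are chosen for illustration):

```python
def run(lr, steps=50, w=1.0):
    # Repeatedly step against the gradient of f(w) = w**2, which is 2*w.
    for _ in range(steps):
        w -= lr * 2 * w
    return w

good = run(0.1)    # converges smoothly toward 0
slow = run(0.001)  # barely moves: very slow convergence
bad = run(1.1)     # overshoots the minimum on every step and diverges
```

With lr = 0.1 each step multiplies w by 0.8 (fast decay); with lr = 0.001 by 0.998 (almost no progress); with lr = 1.1 by −1.2, so the magnitude of w grows without bound.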
BATCH SIZE

• Definition: The number of training examples utilized in one iteration.
• Variants:
• Batch Gradient Descent: Uses the entire dataset to compute the gradient.
• Stochastic Gradient Descent (SGD): Uses one training example per iteration.
• Mini-batch Gradient Descent: Uses a subset of the training data (a mini-batch) for each iteration.
• Importance: Affects the stability and speed of convergence.
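The three variants differ only in which slice of the data the gradient is averaged over; a sketch for a linear model with MSE loss (dataset and sizes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # 100 examples, 3 features
y = X @ np.array([1.0, -2.0, 0.5])     # targets from known weights
w = np.zeros(3)
lr = 0.1

def mse_grad(Xb, yb, w):
    # Gradient of the mean squared error over the batch (Xb, yb).
    return 2 * Xb.T @ (Xb @ w - yb) / len(Xb)

g_batch = mse_grad(X, y, w)          # batch GD: the whole dataset
g_sgd = mse_grad(X[:1], y[:1], w)    # SGD: a single example
g_mini = mse_grad(X[:16], y[:16], w) # mini-batch GD: a small subset

w -= lr * g_mini                     # one mini-batch update
```

The single-example gradient is a noisy estimate of the full-batch one; mini-batches trade some of that noise for extra computation per step.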
NUMBER OF ITERATIONS (EPOCHS)

• Definition: The number of times the entire training dataset is passed forward and backward through the neural network.
• Importance: Determines how many times the weights are updated.
• Effects:
• More Iterations: Typically improve accuracy but require more computational resources.
• Fewer Iterations: Can lead to underfitting, where the model fails to learn adequately from the training data.
CODING

• Importing Libraries
1. NumPy: Provides support for arrays and matrices.
2. Matplotlib: Used for plotting graphs.
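These two imports are conventionally written as:

```python
import numpy as np               # arrays, matrices, and linear algebra
import matplotlib.pyplot as plt  # plotting graphs (e.g., training curves)
```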
CODING HINTS

• Defining the Neural Network Class: Encapsulate the neural network logic within a class for better organization.
• Initialization of Weights: Randomly initialize the weights and set a learning rate for the gradient descent.
• Forward Propagation: Compute the dot product of inputs and weights.
• Backward Propagation and Gradient Calculation: Use the derivative of the sigmoid function to calculate the gradients.
• Gradient Descent Method: Iteratively update the weights to minimize the cost function.
• Training the Network: Define the dataset and train the neural network using gradient descent.
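The hints above can be assembled into a minimal sketch; the class and method names (`NeuralNetwork`, `forward`, `backward`, `train`), the XOR dataset, and the hyperparameters are illustrative choices, not taken from the original code:

```python
import numpy as np

class NeuralNetwork:
    def __init__(self, n_inputs, n_hidden, n_outputs, lr=0.5):
        rng = np.random.default_rng(42)
        # Randomly initialize the two weight matrices
        # (input -> hidden, hidden -> output) and set a learning rate.
        self.w1 = rng.normal(size=(n_inputs, n_hidden))
        self.w2 = rng.normal(size=(n_hidden, n_outputs))
        self.lr = lr

    @staticmethod
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    @staticmethod
    def sigmoid_deriv(a):
        # Derivative of the sigmoid, expressed via its output a.
        return a * (1.0 - a)

    def forward(self, X):
        # Forward propagation: dot products of inputs and weights.
        self.hidden = self.sigmoid(X @ self.w1)
        self.output = self.sigmoid(self.hidden @ self.w2)
        return self.output

    def backward(self, X, y):
        # Backward propagation: gradients of the MSE cost w.r.t. weights,
        # followed by a gradient-descent weight update.
        err_out = (self.output - y) * self.sigmoid_deriv(self.output)
        err_hid = err_out @ self.w2.T * self.sigmoid_deriv(self.hidden)
        self.w2 -= self.lr * self.hidden.T @ err_out
        self.w1 -= self.lr * X.T @ err_hid

    def train(self, X, y, epochs=5000):
        for _ in range(epochs):
            self.forward(X)
            self.backward(X, y)

# Train on XOR, a classic small dataset for a one-hidden-layer network.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
net = NeuralNetwork(2, 4, 1)
loss_before = np.mean((net.forward(X) - y) ** 2)
net.train(X, y)
loss_after = np.mean((net.forward(X) - y) ** 2)
```

Comparing the mean squared error before and after training shows the gradient-descent loop driving the cost down.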
CONCLUSION

• Implemented a three-stage neural network in Python using NumPy and Matplotlib.
• Applied gradient descent to optimize the weights and minimize the cost function.
• Understanding gradient descent provides a foundation for deeper machine-learning exploration, after which more advanced optimization techniques can be explored.
THANK YOU
