
GRADIENT DESCENT

Gradient descent is an optimization algorithm that follows the negative gradient of an objective function in order to locate the minimum of the function.

Gradient Descent Optimization

Gradient descent is an optimization algorithm.

It is technically referred to as a first-order optimization algorithm because it explicitly makes use of the first-order derivative of the target objective function.

● Gradient: The first-order derivative of a multivariate objective function.

Specifically, the sign of the gradient tells you whether the target function is increasing or decreasing at that point.

● Positive Gradient: Function is increasing at that point.


● Negative Gradient: Function is decreasing at that point.

Gradient descent refers to a minimization optimization algorithm that follows the negative of the gradient downhill on the target function to locate the minimum of the function.

Similarly, we may refer to gradient ascent for the maximization version of the
optimization algorithm that follows the gradient uphill to the maximum of the
target function.
● Gradient Descent: Minimization optimization that follows the negative of
the gradient to the minimum of the target function.
● Gradient Ascent: Maximization optimization that follows the gradient to
the maximum of the target function.

Gradient Descent Algorithm

The gradient descent algorithm requires a target function that is being optimized and the derivative function for that target function.

The target function f() returns a score for a given set of inputs, and the
derivative function f'() gives the derivative of the target function for a given set
of inputs.

● Objective Function: Calculates a score for a given set of input parameters.
● Derivative Function: Calculates the derivative (gradient) of the objective function for a given set of inputs.
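
As a concrete illustration, this pair of functions might be sketched in Python as follows, using a simple quadratic purely as an example (the same function appears in the worked example below); the names objective() and derivative() are illustrative, not required by the algorithm.

# Illustrative objective function: f(x) = x^2
def objective(x):
    return x ** 2.0

# Its first-order derivative: f'(x) = 2x
def derivative(x):
    return 2.0 * x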

The gradient descent algorithm requires a starting point (x) in the problem, such
as a randomly selected point in the input space.

The derivative is then calculated and a step is taken in the input space that is
expected to result in a downhill movement in the target function, assuming we
are minimizing the target function.

A downhill movement is made by first calculating how far to move in the input
space, calculated as the step size (called alpha or the learning rate) multiplied
by the gradient. This is then subtracted from the current point, ensuring we
move against the gradient, or down the target function.

● x_new = x - alpha * f'(x)
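
In code, assuming the derivative() function sketched above, a current point x, and a chosen step size alpha, a single update could be written as:

# One gradient descent update: step against the gradient, scaled by alpha
x_new = x - alpha * derivative(x)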

The size of the step taken is scaled using a step size hyperparameter.

● Step Size (alpha): Hyperparameter that controls how far to move in the
search space against the gradient each iteration of the algorithm.

If the step size is too small, the movement in the search space will be small and
the search will take a long time. If the step size is too large, the search may
bounce around the search space and skip over the optima.

The process of calculating the derivative of a point and calculating a new point
in the input space is repeated until some stop condition is met. This might be a
fixed number of steps or target function evaluations, a lack of improvement in
target function evaluation over some number of iterations, or the identification
of a flat (stationary) area of the search space signified by a gradient of zero.

● Stop Condition: Decision when to end the search procedure.
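
Putting these pieces together, a minimal sketch of the full procedure might look like the following. It assumes the objective() and derivative() pair sketched above and combines two stop conditions: a fixed number of iterations and a near-zero gradient. All names and values here are illustrative, not a reference implementation.

from random import uniform

def gradient_descent(objective, derivative, x_start, alpha, n_iter, tol=1e-8):
    x = x_start
    for i in range(n_iter):
        gradient = derivative(x)
        # Stop condition: a near-zero gradient signals a flat (stationary) area
        if abs(gradient) < tol:
            break
        # Step against the gradient, scaled by the step size (learning rate)
        x = x - alpha * gradient
        print('>%d: x = %.5f, f(x) = %.5f' % (i, x, objective(x)))
    return x

# Example usage: randomly selected starting point in [-1.0, 1.0]
solution = gradient_descent(objective, derivative, uniform(-1.0, 1.0), alpha=0.1, n_iter=30)
print('Solution: x = %.5f, f(x) = %.5f' % (solution, objective(solution)))

With a step size of 0.1 this search settles near x = 0, the minimum of the quadratic; a much larger step size would cause the updates to overshoot and bounce around the minimum instead.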


Gradient Descent Worked Example

In this section, we will work through an example of applying gradient descent to a simple test optimization function.

First, let’s define an optimization function.

We will use a simple one-dimensional function that squares the input, and we will define the range of valid inputs as -1.0 to 1.0.
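
A sketch of that objective function and its input bounds, consistent with the description above, might look like this (the variable names are illustrative):

# Worked-example objective function: f(x) = x^2
def objective(x):
    return x ** 2.0

# Range of valid inputs
bounds = [-1.0, 1.0]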
