
ML

LECTURE-4
BY
Dr. Ramesh Kumar Thakur
Assistant Professor (II)
School Of Computer Engineering
v Gradient descent is an iterative optimization algorithm used to find the minimum of a function. Here that function is our loss function. We will use the Mean Squared Error (MSE) as the loss function in this topic, as shown below:
v E = \frac{1}{n}\sum_{i=1}^{n}\big[y_i - (a + b\,x_i)\big]^2 = \frac{1}{n}\sum_{i=1}^{n}\big(y_i - \hat{y}_i\big)^2
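As a small illustration (not from the original slides), here is a minimal Python sketch of this loss for the model y_hat = a + b*x; the arrays x and y are hypothetical toy data:

```python
import numpy as np

def mse_loss(x, y, a, b):
    """Mean squared error E for the linear model y_hat = a + b*x."""
    y_hat = a + b * x                 # predictions for every sample
    return np.mean((y - y_hat) ** 2)  # (1/n) * sum of squared residuals

# hypothetical toy data, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])
print(mse_loss(x, y, a=0.0, b=0.0))   # loss at the initial guess a = b = 0
```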

v Understanding Gradient Descent


v Mathematical derivation of Gradient Descent in simple Linear Regression:
v 1. Initially let a = 0 and b = 0. Let L be our learning rate. This controls how much the values of a and b change with each step. L could be a small value like 0.0001 for good accuracy.
v 2. Calculate the partial derivatives of the loss function with respect to a and b, and plug the current values of x, y, a and b into them to obtain the derivative values Da and Db (a code sketch follows the derivation below).
v D_b = \frac{1}{n}\sum_{i=1}^{n} 2\,\big[y_i - (a + b\,x_i)\big](-x_i)

v D_b = \frac{-2}{n}\sum_{i=1}^{n} x_i\,\big[y_i - (a + b\,x_i)\big]

v D_b = \frac{-2}{n}\sum_{i=1}^{n} x_i\,\big(y_i - \hat{y}_i\big)

v Db is the value of the partial derivative with respect to b.

v Similarly, the partial derivative with respect to a is Da:


v D_a = \frac{1}{n}\sum_{i=1}^{n} 2\,\big[y_i - (a + b\,x_i)\big](-1)

v D_a = \frac{-2}{n}\sum_{i=1}^{n} \big[y_i - (a + b\,x_i)\big]

v D_a = \frac{-2}{n}\sum_{i=1}^{n} \big(y_i - \hat{y}_i\big)
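A minimal Python sketch (an assumption about implementation, not part of the slides) that computes Da and Db exactly as derived above:

```python
import numpy as np

def gradients(x, y, a, b):
    """Partial derivatives Da, Db of the MSE loss with respect to a and b."""
    n = len(x)
    y_hat = a + b * x                          # current predictions
    Da = (-2.0 / n) * np.sum(y - y_hat)        # Da = -(2/n) * sum(y_i - y_hat_i)
    Db = (-2.0 / n) * np.sum(x * (y - y_hat))  # Db = -(2/n) * sum(x_i * (y_i - y_hat_i))
    return Da, Db
```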

v Mathematical derivation of Gradient Descent in simple Linear Regression:
v 3. Now we update the current values of b and a using the following equations:
v b = b − L×Db
v a = a − L×Da

v 4. We repeat this process until our loss function is a very small value or ideally 0 (which means 0 error or 100% accuracy). The values of b and a that we are left with will be the optimum values, as in the loop sketched below.
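Putting steps 1–4 together, a minimal gradient-descent loop; this is a sketch assuming the mse_loss and gradients helpers from the earlier snippets, with a hypothetical iteration cap and stopping tolerance:

```python
def gradient_descent(x, y, L=0.0001, max_iters=100_000, tol=1e-6):
    """Fit y ~ a + b*x by repeatedly stepping against the gradient."""
    a, b = 0.0, 0.0                     # step 1: start from a = 0 and b = 0
    for _ in range(max_iters):
        Da, Db = gradients(x, y, a, b)  # step 2: partial derivatives at the current a, b
        b = b - L * Db                  # step 3: update b
        a = a - L * Da                  #         update a
        if mse_loss(x, y, a, b) < tol:  # step 4: stop once the loss is very small
            break                       # (with noisy data the loss plateaus instead of
    return a, b                         #  reaching 0, so max_iters bounds the run)

a, b = gradient_descent(x, y)
print(a, b)                             # optimum intercept and slope for the toy data
```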

v Now going back to our analogy, b can be considered the current position of the person. D is equivalent to the
steepness of the slope and L can be the speed with which he moves. Now the new value of b that we calculate
using the above equation will be his next position, and L×D will be the size of the steps he will take.
v When the slope is steeper (D is larger), he takes longer steps; when it is less steep (D is smaller), he takes smaller steps.
v Finally he arrives at the bottom of the valley which corresponds to our loss = 0.
v Now, with the optimum values of b and a, our model is ready to make predictions!
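For example (a hypothetical continuation of the sketches above), predictions for unseen inputs are just points on the fitted line:

```python
x_new = np.array([5.0, 6.0])   # hypothetical unseen inputs
y_pred = a + b * x_new         # predicted outputs from the fitted line y = a + b*x
print(y_pred)
```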
