
Basic Machine Learning
(Linear Regression using Gradient Descent)

Case Study

Name: Ritik Patiram Singh


Roll No.: 430
Class/Div: SE MECH/B
Batch: B2
What is Machine Learning?
Machine learning teaches computers to do what comes naturally to humans and animals:
learn from experience. Machine learning algorithms use computational methods to
“learn” information directly from data without relying on a predetermined equation as a
model. The algorithms adaptively improve their performance as the number of samples
available for learning increases.

How Does Machine Learning Work?


Machine learning uses two types of techniques: supervised learning, which trains a model
on known input and output data so that it can predict future outputs, and unsupervised
learning, which finds hidden patterns or intrinsic structures in input data.
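
As a minimal sketch of the difference, the snippet below fits a supervised model on labeled data and an unsupervised model on unlabeled data, using scikit-learn (a library not used elsewhere in this report) and made-up toy data:

import numpy as np
from sklearn.linear_model import LinearRegression  # supervised: needs inputs and outputs
from sklearn.cluster import KMeans                 # unsupervised: needs only inputs

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

reg = LinearRegression().fit(X, y)           # learns from known input AND output data
print(reg.predict([[5.0]]))                  # predicts a future output (about 10.0)

km = KMeans(n_clusters=2, n_init=10).fit(X)  # sees only the input data
print(km.labels_)                            # finds hidden structure (cluster labels)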
How Do You Decide Which Algorithm to Use?
Choosing the right algorithm can seem overwhelming: there are dozens of supervised and
unsupervised machine learning algorithms, and each takes a different approach to learning. There is no
single best method and no one-size-fits-all approach. Finding the right algorithm is partly just trial and error; even highly
experienced data scientists can't tell whether an algorithm will work without trying it out. But algorithm
selection also depends on the size and type of data you're working with, the insights you want to get
from the data, and how those insights will be used.
When Should You Use Machine Learning?

Consider using machine learning when you have a complex task or problem involving a
large amount of data and many variables, but no existing formula or equation. For
example, machine learning is a good option if you need to handle situations like these:

• Hand-written rules and equations are too complex, as in face recognition and speech recognition.
• The rules of a task are constantly changing, as in fraud detection from transaction records.
• The nature of the data keeps changing and the program needs to adapt, as in automated trading, energy demand forecasting, and predicting shopping trends.
Linear Regression using Gradient Descent:

Linear Regression:
In statistics, linear regression is a linear approach to modelling the relationship between a
dependent variable and one or more independent variables. Let X be the independent
variable and Y be the dependent variable. We will define a linear relationship between
these two variables as follows:
Y = mX + c

This is the equation for a line that you studied in high school. m is the slope of the line
and c is the y-intercept. Today we will use this equation to train our model with a given
dataset and predict the value of Y for any given value of X.
Our challenge today is to determine the values of m and c such that the line
corresponding to those values is the best-fitting line, i.e. gives the minimum error.
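
As a quick illustration, once m and c are known, prediction is just evaluating the line. The values below are made up, not learned from data:

m, c = 2.0, 1.0    # hypothetical slope and intercept
X = 3.0
Y = m * X + c      # predicted Y = 7.0
print(Y)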

Loss function:
The loss is the error in our predicted values of m and c. Our goal is to minimize this error to
obtain the most accurate values of m and c.
We will use the Mean Squared Error function to calculate the loss.
There are three steps in this function:
1. Find the difference between the actual y and the predicted y value (ŷ = mx + c) for a given x.
2. Square this difference.
3. Find the mean of the squares for every value in X.

E = (1/n) * Σ (y_i - ŷ_i)²

Here y_i is the actual value and ŷ_i is the predicted value. Substituting ŷ_i = m*x_i + c:

E = (1/n) * Σ (y_i - (m*x_i + c))²

So we square the error and find the mean, hence the name Mean Squared Error.
Now that we have defined the loss function, let's get into the interesting part: minimizing
it and finding m and c.
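
As a minimal sketch, the loss can be written as a one-line NumPy function, assuming y and y_pred are NumPy arrays of the same length:

import numpy as np

def mse(y, y_pred):
    # mean of the squared differences between actual and predicted values
    return np.mean((y - y_pred) ** 2)

print(mse(np.array([1.0, 2.0]), np.array([1.5, 1.5])))  # 0.25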

The Gradient Descent Algorithm:


Gradient descent is a name for a generic class of computer algorithms that minimize a
function. These algorithms achieve this end by starting with initial parameter values and
iteratively moving towards a set of parameter values that minimize some cost function or
metric (that's the descent part). The movement toward the best fit is achieved by taking the
derivative of the function with respect to its parameters and stepping in the direction in
which the function decreases fastest, i.e. along the negative gradient (that's the gradient part).

Gradient descent is an important concept in computer science, and an illustrative example
of why computer science has, in some ways, overtaken statistics in importance when it comes to machine
learning: it is a general-purpose tool that can be used to "brute force" an optimal solution
in a wide range of scenarios. It lacks the elegance of a closed-form solution, but it also
avoids the sheer mathematical unpalatability that a statistical solution can involve.

Understanding Gradient Descent

Imagine a valley and a person with no sense of direction who wants to get to the bottom
of the valley. He goes down the slope and takes large steps when the slope is steep and
small steps when the slope is less steep. He decides his next position based on his current
position and stops when he gets to the bottom of the valley which was his goal.
Let's try applying gradient descent to m and c and approach it step by step:
1. Initially let m = 0 and c = 0. Let L be our learning rate. This controls how
much the value of m changes with each step. L could be a small value
like 0.0001 for good accuracy.
2. Calculate the partial derivative of the loss function with respect to m, and
plug in the current values of x, y, m and c in it to obtain the derivative
value D_m:

D_m = (-2/n) * Σ x_i (y_i - ŷ_i)

Similarly, the partial derivative with respect to c is:

D_c = (-2/n) * Σ (y_i - ŷ_i)

3. Now we update the current values of m and c using the following
equations:

m = m - L * D_m
c = c - L * D_c

4. We repeat this process until our loss function is a very small value or
ideally 0 (which means 0 error or 100% accuracy). The values
of m and c that we are left with now will be the optimum values.
(A small worked example of a single update step follows below.)
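
As a hand-checked sketch of one update step, assume a single data point (x = 2, y = 5), m = c = 0 and L = 0.0001 (all values made up):

x, y = 2.0, 5.0
m, c, L = 0.0, 0.0, 0.0001
y_pred = m * x + c            # 0.0
D_m = -2 * x * (y - y_pred)   # -20.0  (n = 1, so the mean is just the single term)
D_c = -2 * (y - y_pred)       # -10.0
m = m - L * D_m               # 0.002: m moves up, towards the data
c = c - L * D_c               # 0.001
print(m, c)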

Now going back to our analogy, m can be considered the current position of the
person. D is equivalent to the steepness of the slope and L can be the speed with which
he moves. The new value of m that we calculate using the above equation will be his
next position, and L * D will be the size of the step he takes. When the slope is steeper
(D is larger) he takes longer steps, and when it is less steep (D is smaller) he takes
smaller steps. Finally he arrives at the bottom of the valley, which corresponds to our loss
= 0.
We repeat the same process to find the value of c as well. With the optimum
values of m and c, our model is ready to make predictions!

Implementing the Model:


Now let's convert everything above into code and see our model in action!

# Making the imports


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (12.0, 9.0)

# Preprocessing Input data


data = pd.read_csv('data.csv')
X = data.iloc[:, 0]
Y = data.iloc[:, 1]
plt.scatter(X, Y)
plt.show()

# Building the model


m = 0
c = 0

L = 0.0001  # The learning rate

epochs = 1000  # The number of iterations to perform gradient descent
n = float(len(X))  # Number of elements in X

# Performing Gradient Descent


for i in range(epochs):
    Y_pred = m*X + c                      # The current predicted value of Y
    D_m = (-2/n) * sum(X * (Y - Y_pred))  # Derivative wrt m
    D_c = (-2/n) * sum(Y - Y_pred)        # Derivative wrt c
    m = m - L * D_m                       # Update m
    c = c - L * D_c                       # Update c

print(m, c)
1.4796491688889395 0.10148121494753726
# Making predictions
Y_pred = m*X + c

plt.scatter(X, Y)
plt.plot([min(X), max(X)], [min(Y_pred), max(Y_pred)], color='red') # predicted
plt.show()

Conclusion:
The biggest advantage gradient descent has is that it requires no knowledge whatsoever of
the fundamentals of the model. We can apply the model we've built without knowing
anything about linear regression. In particular, we don't need to know that linear
regression has a closed-form solution, or what that solution looks like, or how to derive it.
Instead we just pick a metric, compute its derivative, and then use a computer to brute-
force a solution.
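
As a sanity check, the gradient descent result can be compared against that closed-form least-squares solution. The sketch below uses synthetic data, since the report's data.csv is not included here:

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 100)
Y = 1.5 * X + 0.5 + rng.normal(0, 1, 100)  # noisy line with known slope and intercept

# Closed-form least-squares fit (degree-1 polynomial)
m_cf, c_cf = np.polyfit(X, Y, 1)

# Gradient descent, as in the report (more epochs so both estimates converge)
m, c, L, n = 0.0, 0.0, 0.0001, float(len(X))
for _ in range(100000):
    Y_pred = m * X + c
    m -= L * (-2/n) * np.sum(X * (Y - Y_pred))
    c -= L * (-2/n) * np.sum(Y - Y_pred)

print(m_cf, c_cf)  # closed-form estimates
print(m, c)        # gradient descent estimates: very close after enough epochs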
A gradient descent solution to modeling can be applied to any model metric, so long as
the metric has two properties: it's differentiable (most things are) and convex. Convexity
is the property that no matter where you are on the metric surface, the derivative will
point towards a point with better "fit", right up until you get to the bottom of the thing.
Things that are convex include funnels, contact lenses, and, it turns out, the linear
regression parameter space.

References:
https://github.com/chasinginfinity/ml-from-scratch/tree/master/02%20Linear%20Regression%20using%20Gradient%20Descent
https://towardsdatascience.com/linear-regression-using-gradient-descent-97a6c8700931

end of report
