Unit 3c Linear Regression
• Classification Problems
• Prediction of cancer
• Election-win prediction for Sh. Narendra Modi ji
• Diabetes prediction
• Classification of e-mail
Understand the Data……..
Classification & Regression
Data Set: UCI Machine Learning Repository
Google “uci dataset”
Linear Regression
A statistical method that is used for predictive analysis
Makes predictions for continuous/real or numeric variables such as sales, salary,
age, product price, etc.
Linear regression shows a linear relationship, which means it finds how the
value of the dependent variable changes with the value of the
independent variable
Linear regression predicts a
dependent variable value (y) based on a given
independent variable (x)
If there is a single input variable (x), such linear regression is called simple linear regression
y = b*x + a
or
y = a_0 + a_1*x
• Y= Dependent Variable (Target Variable) (ESTIMATED OUTPUT)
• X= Independent Variable (predictor Variable) (INPUT)
• a0= Intercept of the line (Gives an additional degree of freedom) (regression coefficient)
• a1 = Linear regression coefficient (scale factor to each input value) (SLOPE) (regression coefficient)
The goal of the linear regression algorithm is to find the best values for a_0 and a_1
X: amount of fertilizer, y: size of crop (every time we add a unit to X, the dep variable y increases
proportionally)
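As a small worked illustration with made-up (hypothetical) coefficients: if the fitted line is
y = 2 + 0.5*x, then x = 10 units of fertilizer predicts a crop size of y = 2 + 0.5*10 = 7.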
Multiple Linear Regression
If there is more than one input variable, such linear regression is called multiple linear
regression
y= a + b_1*x_1 + b_2*x_2 + b_3*x_3+… + b_n*x_n
or
y= a_0 + a_1*x_1 + a_2*x_2+.. + a_n*x_n
Eg: Add amount of sunlight and rainfall in a growing season to the fertilizer variable, with all 3
affecting y
Averaging the squared errors over all the data points gives the cost function, which is therefore also known as
the Mean Squared Error (MSE) function
Now, using this MSE function we are going to change the values of a_0 and a_1 such that the MSE value settles
at its minimum
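Written out and computed in code (a minimal sketch; the helper below and its inputs are illustrative, not from the slides):

MSE = (1/n) * Σ (y_i − (a_0 + a_1*x_i))²

import numpy as np

def mse(y_true, y_pred):
    # average of the squared differences between actual and predicted values
    return np.mean((y_true - y_pred) ** 2)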
Linear regression: A sample curve-fitting
• Y = m*x + b
• Method of least squares
• Gradient descent approach
Ordinary Least Squares (OLS) Method
y= m*x+b
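A minimal sketch of the ordinary least squares solution for y = m*x + b, using the standard closed-form estimates (the small arrays below are hypothetical):

import numpy as np

def ols_fit(x, y):
    # slope m = cov(x, y) / var(x); intercept b = mean(y) - m * mean(x)
    m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b = y.mean() - m * x.mean()
    return m, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])
m, b = ols_fit(x, y)     # m ≈ 1.94, b ≈ 1.15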
• Gradient descent is a method of updating a_0 and a_1 to reduce the cost function (MSE)
• The idea is that we start with some values for a_0 and a_1 and then we change these
values iteratively to reduce the cost
To find these gradients, we take partial derivatives with respect to a_0 and a_1
The partial derivates are the gradients and they are used to update the values of
a_0 and a_1
Alpha is the learning rate which is a hyper-parameter that you must specify
A smaller learning rate could get you closer to the minima but takes more time to
reach the minima, a larger learning rate converges sooner but there is a chance
that you could overshoot the minima
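A minimal gradient-descent sketch for simple linear regression, assuming NumPy arrays x and y; the gradient expressions come from differentiating the MSE with respect to a_0 and a_1:

import numpy as np

def gradient_descent(x, y, alpha=0.01, epochs=1000):
    a0, a1 = 0.0, 0.0                       # start with some values for a_0 and a_1
    n = len(x)
    for _ in range(epochs):
        y_pred = a0 + a1 * x
        # partial derivatives of the MSE with respect to a_0 and a_1
        d_a0 = (2 / n) * np.sum(y_pred - y)
        d_a1 = (2 / n) * np.sum((y_pred - y) * x)
        a0 -= alpha * d_a0                  # update using the learning rate alpha
        a1 -= alpha * d_a1
    return a0, a1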
Multiple/Multivariate Linear Regression
Simple Linear Regression
It’s the simplest form of Linear Regression that is used when there is a single input
variable for the output variable
Multiple Linear Regression
If we have an additional variable (let’s say “experience”) in the previous equation, then it
becomes a multiple regression:
It’s a form of linear regression that is used when there are two or more predictors
Multiple Linear Regression
Multiple linear regression is used to estimate the relationship between two or more
independent variables and one dependent variable. You can use multiple linear regression
when you want to know:
How strong the relationship is between two or more independent variables and one
dependent variable (e.g. how rainfall, temperature, and amount of fertilizer added
affect crop growth)
The value of the dependent variable at a certain value of the independent variables
(e.g. the expected yield of a crop at certain levels of rainfall, temperature, and
fertilizer addition)
Applications
• Prediction of used-car prices based on make, model, year, shift and color
[make, model, year, shift, color] → car prices
• Prediction of the price for a house in the market based on location, lot
size, number of bedrooms, neighborhood characteristics etc.
[location, lot size, # bedrooms, crime rate, school ratings] → house prices
Simple vs. Multi Linear Regression
Can we use simple linear regression to study our output against all independent
variables separately? NO
Some of these factors will affect the price of the house positively
For example: the larger the area, the higher the price
On the other hand, factors like distance from the workplace, and the crime rate can
influence your estimate of the house negatively
Running separate simple linear regressions will lead to a different outcome for each predictor, when we
are actually interested in a single combined prediction
There may be an input variable that is itself correlated with or dependent on some
other predictor. This can cause wrong predictions and unsatisfactory results
Multiple Linear Regression
The cost is the summation of the squared differences between our predicted values and the actual values,
divided by twice the length of the data set
A smaller mean squared error implies a better performance
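Written out (a sketch consistent with the description above, for m data points with predictions y'_i and actual values y_i):

J = (1 / (2*m)) * Σ (y'_i − y_i)²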
In Simple Linear Regression, we can see how each advertising medium affects
sales when applied without the other two media
However, in practice, all three might be working together to impact net sales. We
did not consider the combined effect of these media on sales.
Multiple Linear Regression solves the problem by taking account of all the
variables in a single expression.
Applying Multi-Regression
This is done by minimizing the Residual Sum of Squares (RSS), which is obtained
by summing the squared differences between actual and predicted outcomes.
Applying Multi Regression to get the coefficients
Output:
Intercept     2.938889
TV            0.045765
radio         0.188530
newspaper    -0.001037
• If we fix the budget for TV & newspaper, then increasing the radio budget by $1000 will lead to an
increase in sales by around 189 units (0.189 * 1000)
• Similarly, by fixing the radio & newspaper budgets, we infer an approximate rise of 46 units of products
per $1000 increase in the TV budget
Applying Multi Regression to get the coefficients
Output:
Intercept     2.938889
TV            0.045765
radio         0.188530
newspaper    -0.001037
• For a unit increase in the TV budget, sales increase by about 0.045 units, and so on.
• For the newspaper budget, since the coefficient is quite negligible (close to zero), it’s evident that
newspaper advertising is not affecting the sales.
• In fact, it’s on the negative side of zero (-0.001) which, if the magnitude were big enough, could have
meant that this agent is rather causing the sales to fall.
• So we don’t have to spend much on newspaper advertising.
Applying Multi Regression to get the coefficients
Output:
Intercept     2.938889
TV            0.045765
radio         0.188530
newspaper    -0.001037
If we run Simple Linear Regression using just the newspaper budget against sales, we’ll observe a
coefficient value of around 0.055, which is quite significant in comparison to what we saw above.
Collinearity
How are these variables correlated with each other?
If two independent variables are too highly correlated, then only one of them
should be used in the regression model.
import pandas as pd

ad = pd.read_csv("Advertising.csv")
ad.corr()   # pairwise correlations between the advertising variables
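A minimal sketch of how the coefficient table shown above could be reproduced with scikit-learn, assuming the Advertising.csv columns are named TV, radio, newspaper and sales:

import pandas as pd
from sklearn.linear_model import LinearRegression

ad = pd.read_csv("Advertising.csv")
X = ad[["TV", "radio", "newspaper"]]        # independent variables (assumed column names)
y = ad["sales"]                             # dependent variable

model = LinearRegression().fit(X, y)
print(model.intercept_)                     # ≈ 2.938889
print(dict(zip(X.columns, model.coef_)))    # ≈ {TV: 0.0458, radio: 0.1885, newspaper: -0.0010}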
Backward Selection: We start with all variables in the model, and remove the variable
that is the least statistically significant. This is repeated until a stopping rule is reached.
For instance, we may stop when there is no further improvement in the model score.
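A minimal backward-selection sketch using statsmodels, assuming significance is judged by p-values and we stop once every remaining predictor falls below a chosen threshold (one possible stopping rule):

import statsmodels.api as sm

def backward_select(X, y, threshold=0.05):
    cols = list(X.columns)
    while cols:
        model = sm.OLS(y, sm.add_constant(X[cols])).fit()
        pvalues = model.pvalues.drop("const")   # significance of each remaining predictor
        worst = pvalues.idxmax()
        if pvalues[worst] > threshold:
            cols.remove(worst)                  # remove the least significant variable
        else:
            break
    return cols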
R squared
The coefficient of determination (R-squared) is a statistical metric that is used to measure
how much of the variation in outcome can be explained by the variation in the
independent variables
R² by itself thus can't be used to identify which predictors should be included in a model
and which should be excluded
0 indicates that the outcome cannot be predicted by any of the independent variables and
1 indicates that the outcome can be predicted without error from the independent
variables
R squared
R² closer to 1 indicates that the model is good and explains the variance in data well. A value
closer to zero indicates a poor model.
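A minimal sketch of computing R² with scikit-learn (reusing the hypothetical Advertising column names from above):

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

ad = pd.read_csv("Advertising.csv")
X, y = ad[["TV", "radio", "newspaper"]], ad["sales"]
model = LinearRegression().fit(X, y)
print(r2_score(y, model.predict(X)))   # closer to 1 => the model explains the variance well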
Implement multi-linear regression to calculate
price
https://github.com/codebasics/py/blob/master/ML/2_linear_reg_multivariate/2_linear_regression_multivariate.ipynb
Nonlinear Regression
Simple linear regression relates two variables (X and Y) with a straight line (y =
mx + b), while nonlinear regression relates the two variables in a nonlinear
(curved) relationship
Nonlinear Regression
If the data shows a curvy trend, then linear regression will not produce very
accurate results when compared to a non-linear regression because, as the
name implies, linear regression presumes that the data is linear
Why Nonlinear Regression?
The scatter plot shows that there
seems to be a strong relationship
between GDP and time, but the
relationship is not linear
The growth starts off slowly, then
from 2005 onward, the growth is
very significant
Finally, it decelerates slightly in
the 2010s
It looks like either a logistic or
exponential function
So, it requires a special estimation
method of the non-linear
regression procedure
Methods to handle nonlinear data
Polynomial Regression
Polynomial Regression
Essentially any relationship that is not linear can be termed non-linear and is
usually represented by a polynomial of degree k (the maximum power of x):
y = a_0 + a_1*x + a_2*x^2 + … + a_k*x^k
Implementation
https://www.geeksforgeeks.org/python-implementation-of-polynomial-regression/
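A minimal polynomial-regression sketch with scikit-learn, using hypothetical synthetic data; PolynomialFeatures expands x into the columns x and x² before the ordinary linear fit:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical curvy data: y roughly follows 1 + 2x + 3x^2 plus noise
x = np.linspace(0, 5, 30).reshape(-1, 1)
y = 1 + 2 * x.ravel() + 3 * x.ravel() ** 2 + np.random.normal(0, 2, 30)

X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)  # columns: x, x^2
model = LinearRegression().fit(X_poly, y)
print(model.intercept_, model.coef_)   # estimates close to 1 and [2, 3]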
Logistic Regression
Classification
Classification is a process of categorizing a given set of data into classes
The process starts with predicting the class of given data points
One method of calculation for measuring accuracy: the root mean square error
For example,
To predict whether an email is spam (1) or not (0)
Logistic Regression builds a regression model to predict the probability that a
given data entry belongs to the category numbered as “1”
Linear vs. Logistic Regression
Sigmoid Function (Threshold=0.5)
Sigmoid Function
Sigmoid Function: σ(z) = 1 / (1 + e^(-z))
z = Σ w_i*x_i + b
  = w_1*x_1 + w_2*x_2 + … + w_n*x_n + b
  = w^T x + b
y' = σ(z) = σ(w^T x + b)
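A minimal sketch of the sigmoid and the resulting prediction rule (threshold 0.5, as above), with hypothetical weights and input:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w = np.array([0.8, -1.2])      # hypothetical trained weights
b = 0.1                        # hypothetical bias
x = np.array([2.0, 1.5])       # one input example

z = np.dot(w, x) + b           # z = w^T x + b
y_prob = sigmoid(z)            # probability that the example belongs to class "1"
y_class = int(y_prob >= 0.5)   # apply the 0.5 threshold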
Sigmoid Function
If ‘z’ goes to +infinity, y' (predicted) becomes 1;
if ‘z’ goes to −infinity, y' becomes 0
The Weights (W) are trained so as to minimize the cost defined as Mean
Squared Error (MSE)
Loss (Error) Function
But we won't use this (MSE) notation, because it leads to a non-convex optimization
problem, which means it contains local optimum points
Loss (Error) Function
CASE 1: If y = 1 ==> L(y',1) = -log(y') ==> we want y' to be as large as possible; but y' cannot be
larger than 1 (sigmoid output), so we want it to be close to 1
CASE 2: If y = 0 ==> L(y',0) = -log(1-y') ==> we want 1-y' to be as large as possible, i.e. y' to be as
small as possible (close to 0)
Loss (Error) Function
For a single training example, the loss function for logistic regression is defined as:
−log(y') if y = 1
−log(1−y') if y = 0
The loss function computes the error for a single training example
The cost function is the average of the loss functions of the entire
training set
It is also known as logarithmic loss or log loss
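Combined into a single expression (a standard form consistent with the two cases above), the loss for one example and the cost over m training examples are:

L(y', y) = −( y*log(y') + (1 − y)*log(1 − y') )
J(w, b) = (1/m) * Σ_i L(y'^(i), y^(i))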
Gradient Descent
• To minimize the cost function, we need to run the gradient descent function on
each parameter i.e.
w = w - alpha * dw
Gradient Descent
• First we initialize w and b to 0,0 (or to random values) and then iteratively improve these
values to reach the minimum of the convex cost function
w = w - alpha * d(J(w,b) / dw) (how much the function slopes in the w direction)
b = b - alpha * d(J(w,b) / db) (how much the function slopes in the b direction)
Gradient Descent
A lower value of “alpha” is preferred, because if the learning rate is too big we may
overshoot the minimum point and keep oscillating in the convex curve
Derivatives (slope of function)
• For example, for f(a) = 3a: if a = 2 then f(a) = 6, and the slope (derivative) is 3
To conclude, the derivative is the slope, and the slope can be different at different points of a
function
Computational Graph
J(a,b,c) = 3(a+bc)
u = bc
v = a+u
J = 3v
Computational Graph
But for derivatives there is a right-to-left (backward) computation to yield the derivative of
the final output variable
Derivatives with Computational Graph
dJ/dv = ?
If we take the value of v and change it a little bit, how would the value of J change?
Derivatives with Computational Graph
dJ/dv = 3 (since J = 3v)
Derivatives with Computational Graph
a = 5 → 5.001
v = a + u = 11 → 11.001
J = 3v = 33 → 33.003
dv/da = 1, dJ/dv = 3
dJ/da = (dJ/dv) * (dv/da) = 3 * 1 = 3
Logistic Regression Derivatives
Then, from right to left, we calculate the derivatives compared to the result:
w1 = w1 - alpha * dw1
w2 = w2 - alpha * dw2
b = b - alpha * db
This computes the derivatives and implements gradient descent w.r.t. a single training example
For m training examples -> accumulate over the examples and divide by m
Logistic Regression Pseudo code
import numpy as np

# One pass of gradient descent over m training examples (two features x1, x2)
# Assumes x1, x2, Y are arrays of length m and alpha is the learning rate
J = 0; dw1 = 0; dw2 = 0; db = 0   # cost and gradient accumulators
w1 = 0; w2 = 0; b = 0             # weights
for i in range(m):
    # Forward pass
    z = w1 * x1[i] + w2 * x2[i] + b
    a = 1 / (1 + np.exp(-z))                                # sigmoid(z)
    J += -(Y[i] * np.log(a) + (1 - Y[i]) * np.log(1 - a))   # log loss
    # Backward pass
    dz = a - Y[i]
    dw1 += dz * x1[i]
    dw2 += dz * x2[i]
    db += dz
J /= m
dw1 /= m
dw2 /= m
db /= m
# Gradient descent update
w1 = w1 - alpha * dw1
w2 = w2 - alpha * dw2
b = b - alpha * db
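For reference, a vectorized sketch of the same step over all m examples at once (assuming X is an n-by-m matrix of inputs, Y a 1-by-m array of labels, and alpha the learning rate):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_step(w, b, X, Y, alpha):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)                           # forward pass for all examples
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))  # average log loss
    dZ = A - Y                                                # backward pass
    dw = np.dot(X, dZ.T) / m
    db = np.sum(dZ) / m
    return w - alpha * dw, b - alpha * db, cost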
Practical: Implement logistic regression to
predict Heart Disease
https://www.w3schools.com/python/python_ml_logistic_regression.asp
https://www.kaggle.com/datasets/dileep070/heart-disease-prediction-using-logistic-regression