
LAB04 RegressionTasks

Uploaded by

areebshoukat26
Copyright
© © All Rights Reserved

Linear Regression and Logistic Regression
Task no. 1:

Explain Linear Regression with example (Lab work)

Answer:

Regression:

Regression analysis is a predictive modelling technique that investigates the relationship between a dependent variable and one or more independent variables.


Uses of regression:

Three major uses for regression analysis are

• Determining the strength of predictors.

• Forecasting an effect.

• Trend forecasting.

Linear Regression:

Linear regression is a statistical method used to find the relationship between one or more independent (predictor) variables and a dependent variable. It solves regression problems. Its graph is a straight line.

Use of linear regression:

Linear regression is used for

• Evaluating trends and sales estimates.

• Analyzing the impact of price changes.

• Assessment of risk in the financial services and insurance domain.

Formula:

y = mx + c

where

y = Output / Dependent variable.

m = Slope.

x = Input / Independent variable.

c = Constant / y-intercept.

For example:

x    y (actual values)
1    3
2    4
3    2
4    4
5    5

First we find the mean of the x and y values using the formula

$$ \text{Mean} = \frac{\text{sum of all values}}{\text{total number of values}} $$

Mean of x:

$$ \bar{x} = \frac{1 + 2 + 3 + 4 + 5}{5} = 3 $$

Mean of y:

$$ \bar{y} = \frac{3 + 4 + 2 + 4 + 5}{5} = 3.6 $$

Now we use the linear equation.

$$ y = mx + c \quad \text{equation (A)} $$

$$ \bar{y} = m\bar{x} + c \quad \text{equation (B)} $$

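Now we find the slope using the formula

$$ m = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} $$

Put the values in the formula.

$$ m = \frac{(1-3)(3-3.6) + (2-3)(4-3.6) + (3-3)(2-3.6) + (4-3)(4-3.6) + (5-3)(5-3.6)}{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2} $$

$$ m = \frac{4}{10} = 0.4 $$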

Now we find c. Put the values in equation (B).

$$ 3.6 = 0.4(3) + c $$

$$ c = 3.6 - 1.2 $$

$$ c = 2.4 $$
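The hand calculation of the slope and intercept can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `fit_line` is an illustrative choice and not part of the lab handout.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept c for the line y = m*x + c."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Numerator of the slope formula: sum of (x - x_mean) * (y - y_mean)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Denominator of the slope formula: sum of (x - x_mean)^2
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line passes through the point of means, which gives c
    c = y_mean - m * x_mean
    return m, c


xs = [1, 2, 3, 4, 5]
ys = [3, 4, 2, 4, 5]
m, c = fit_line(xs, ys)
print(round(m, 4), round(c, 4))  # prints 0.4 2.4
```

The intercept is obtained from the means, exactly as in the step that substitutes the means of x and y into the line equation.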

Predicted values of y:

We find the predicted values of y by substituting m, x and c into equation (A).

$$ y_1 = 1(0.4) + 2.4 = 0.4 + 2.4 = 2.8 $$

$$ y_2 = 2(0.4) + 2.4 = 0.8 + 2.4 = 3.2 $$

$$ y_3 = 3(0.4) + 2.4 = 1.2 + 2.4 = 3.6 $$

$$ y_4 = 4(0.4) + 2.4 = 1.6 + 2.4 = 4.0 $$

$$ y_5 = 5(0.4) + 2.4 = 2.0 + 2.4 = 4.4 $$
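The five predictions can be reproduced in one pass. A small sketch, taking the slope 0.4 and intercept 2.4 from the worked example as given:

```python
m, c = 0.4, 2.4  # slope and intercept from the worked example
xs = [1, 2, 3, 4, 5]

# Apply y = m*x + c to every input; round to hide floating-point noise
predictions = [round(m * x + c, 2) for x in xs]
print(predictions)  # prints [2.8, 3.2, 3.6, 4.0, 4.4]
```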


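R²:

$$ R^2 = \frac{\sum (y_p - \bar{y})^2}{\sum (y - \bar{y})^2} $$

where y_p is the predicted value of y.

$$ R^2 = \frac{(2.8-3.6)^2 + (3.2-3.6)^2 + (3.6-3.6)^2 + (4.0-3.6)^2 + (4.4-3.6)^2}{(3-3.6)^2 + (4-3.6)^2 + (2-3.6)^2 + (4-3.6)^2 + (5-3.6)^2} = \frac{1.6}{5.2} $$

$$ R^2 \approx 0.307 $$

$$ R^2 = 0.307 \times 100\% = 30.7\% $$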

R-squared: The R-squared value is a statistical measure of how close the data are to the fitted regression line. It is also known as the coefficient of determination, or the coefficient of multiple determination for multiple regression.

If R² = 1, the regression line fits the data perfectly and there is no error.

If R² = 0, the regression line explains none of the variation in y.
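The R² value for the worked example can be computed as follows. This sketch assumes the explained-variation form, R² = Σ(predicted − mean)² / Σ(actual − mean)², which reproduces the 0.307 quoted in this lab; the function name `r_squared` is illustrative.

```python
def r_squared(actual, predicted):
    """Explained variation divided by total variation around the mean of y."""
    y_mean = sum(actual) / len(actual)
    explained = sum((p - y_mean) ** 2 for p in predicted)
    total = sum((y - y_mean) ** 2 for y in actual)
    return explained / total


actual = [3, 4, 2, 4, 5]
predicted = [2.8, 3.2, 3.6, 4.0, 4.4]  # values from the fitted line y = 0.4x + 2.4
r2 = r_squared(actual, predicted)
print(round(r2, 4))  # prints 0.3077
```

A value of about 0.31 means the line explains roughly 31% of the variation in y for this small data set.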

Task no. 2:
Implement Linear Regression from scratch (Lab work)
Task no. 3:

Implement Linear Regression on HeadBrain dataset from scratch (Lab work)


Task no. 4:

Implement Linear Regression using built-in Model (Lab work)


Task no. 5:

Explain and implement Logistic Regression on any dataset (Homework)

Answer:

Logistic Regression:

Logistic regression produces results in a binary format and is used to predict the outcome of a categorical dependent variable. It solves classification problems. Its graph is an S-curve.

So the outcome should be discrete / categorical such as:

• 0 or 1.

• Yes or No.

• True or False.

• High or Low.

Logistic regression allows you to analyze a set of variables and predict a categorical outcome. For example, predicting whether or not your sister will get into grad school is a classification problem, so logistic regression is used.


Why are we not using Linear regression in this case?

The reason is that linear regression is used to predict a continuous quantity rather than a categorical one. Here we are going to predict whether or not your sister is going to get into grad school, which is clearly a categorical outcome. When the outcome can take only a fixed set of classes, such as two classes, it is sensible to have a model that predicts the value either as 0 or 1, or as a probability that ranges between zero and one. Linear regression does not have this ability. If we use linear regression to model a binary outcome, the resulting model will not keep the predicted y values within the range 0 to 1, because linear regression works on continuous dependent variables and not on categorical variables. That's why we make use of logistic regression. So linear regression is used to predict continuous quantities, and logistic regression is used to predict categorical quantities.
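To see the problem concretely, the sketch below fits an ordinary least-squares line to a binary outcome and evaluates it at a few inputs. The data (hours of preparation against admitted or not) are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m*x + c."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_mean - m * x_mean


# Hypothetical data: x = hours of preparation, y = 1 if admitted, else 0
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
m, c = fit_line(xs, ys)

low = m * 0 + c    # prediction at x = 0
high = m * 10 + c  # prediction at x = 10
print(round(low, 3), round(high, 3))  # prints -0.4 2.171
```

Both predictions fall outside the range 0 to 1, so they cannot be read as probabilities.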

Logistic regression is a primary classification technique. It is similar to linear regression because both belong to the class of generalized linear models. Despite the word "regression" in its name, logistic regression is mainly used for classification, because the dependent variable we have to predict is categorical in nature.

The logistic regression equation is derived from the same straight-line equation, except that we need to make a few alterations because the output is categorical. Logistic regression does not necessarily calculate the outcome as exactly zero or one; it can also give a probability between zero and one.

Logistic Regression Equation:

The logistic regression equation is derived from the straight-line equation.
Equation of a straight line:

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \varepsilon $$

The range of y is from −∞ to ∞.

where

β0 = c = Constant / y-intercept.

β1 = m = Slope.

x = Input / Independent variable.

y = Output / Dependent variable.

Let's derive the logistic regression equation from the straight-line equation.

$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots $$

In logistic regression, y can only range from 0 to 1.

Now, to get the range of y between 0 and infinity, let's transform y:

$$ \frac{y}{1 - y} $$

When y = 0 this ratio is 0, and as y approaches 1 it tends to infinity.

Now the range is from 0 to infinity.

Let us transform it further to get a range between −∞ and ∞:

$$ \log\left[\frac{y}{1 - y}\right] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots \quad \text{(final logistic regression equation)} $$

How logistic regression works:


Take a look at this graph.
As mentioned, the outcome in logistic regression is categorical. Your outcome will either be 0 or 1, or it will be a probability that ranges between zero and one. Some of you might ask why we have an S curve when we could obviously have a straight line. We use what is known as a sigmoid curve because its values range between zero and one, which is exactly what a probability needs. So maybe our output will be 0.7, which is a probability value. If it is 0.7, it means that your outcome is basically one. That's why we have a sigmoid curve like this.
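The sigmoid curve and the step from a probability to a class can be sketched as below. The 0.5 cut-off is an assumption: the text only says that an output of 0.7 is treated as class 1, which is consistent with that cut-off but does not state it. The names `sigmoid` and `classify` are illustrative.

```python
import math


def sigmoid(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, threshold=0.5):
    """Turn a probability into class 1 or 0 (threshold of 0.5 is assumed)."""
    return 1 if probability >= threshold else 0


# Large negative inputs approach 0, large positive inputs approach 1
curve = [round(sigmoid(z), 3) for z in (-4, 0, 4)]
print(curve)          # prints [0.018, 0.5, 0.982]
print(classify(0.7))  # prints 1
```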

First we take a linear regression equation.

$$ y = \beta_0 + \beta_1 x + \epsilon $$

We want to represent the relationship between p(X) = Pr(Y = 1 | X) and X. Here Pr denotes probability, so this value is the probability that y = 1 given some value of x. If you want to calculate this probability starting from the linear regression model, it will look something like

$$ P(X) = \frac{e^{\beta_0 + \beta_1 x}}{e^{\beta_0 + \beta_1 x} + 1} $$

Now the next step is to calculate something known as the logit function. The logit function is a link function; its inverse is the S curve, or sigmoid curve, which ranges between 0 and 1 and gives the probability of the output variable.

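$$ P(X) = \frac{e^{\beta_0 + \beta_1 x}}{e^{\beta_0 + \beta_1 x} + 1} $$

$$ p \left( e^{\beta_0 + \beta_1 x} + 1 \right) = e^{\beta_0 + \beta_1 x} $$

$$ p \cdot e^{\beta_0 + \beta_1 x} + p = e^{\beta_0 + \beta_1 x} $$

$$ p = e^{\beta_0 + \beta_1 x} - p \cdot e^{\beta_0 + \beta_1 x} $$

$$ p = e^{\beta_0 + \beta_1 x} (1 - p) $$

$$ \frac{p}{1 - p} = e^{\beta_0 + \beta_1 x} $$

$$ \ln\left[\frac{p}{1 - p}\right] = \beta_0 + \beta_1 x $$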
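The end result of the derivation, that the log of the odds equals the linear term, can be checked numerically. In this sketch the coefficients `beta0` and `beta1` are hypothetical values picked only to exercise the formulas.

```python
import math

beta0, beta1 = -3.0, 1.5  # hypothetical coefficients, for illustration only


def probability(x):
    """P(X) = e^(b0 + b1*x) / (e^(b0 + b1*x) + 1)."""
    z = beta0 + beta1 * x
    return math.exp(z) / (math.exp(z) + 1)


def log_odds(p):
    """ln(p / (1 - p)), the left-hand side of the final equation."""
    return math.log(p / (1 - p))


# For each x, the log-odds of P(X) should equal beta0 + beta1 * x
differences = []
for x in (0.0, 1.0, 2.0, 4.0):
    p = probability(x)
    linear = beta0 + beta1 * x
    differences.append(abs(log_odds(p) - linear))
    print(x, round(p, 4), round(log_odds(p), 4), round(linear, 4))
```

The last two columns agree on every row, up to floating-point rounding.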

Uses of logistic regression:

Logistic regression is used across many scientific fields. In Natural Language Processing
(NLP), it’s used to determine the sentiment of movie reviews, while in Medicine it can be
used to determine the probability of a patient developing a particular disease.

Implement Logistic Regression:

The steps are as follows.

• Collect Data: Importing libraries.

• Analyzing Data: Creating different plots to check relationships between them.
• Data Wrangling: Clean the data by removing NaN values and unnecessary columns in the data set.
• Train and Test Data: Build the model on train data and predict the output on test data.
• Accuracy Check: Calculate accuracy to check how accurate your values are.
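The steps listed here can be sketched from scratch in plain Python. Everything specific in this example is an assumption made for illustration: the tiny data set (hours studied against pass or fail), the hand-picked test rows, the learning rate, the number of epochs and the 0.5 threshold. The plotting step is skipped, and missing values are represented with `None` rather than NaN.

```python
import math

# Collect data: hypothetical (hours_studied, passed) rows, one with a missing value
raw_rows = [
    (0.5, 0), (0.75, 0), (1.0, 0), (None, 1), (1.5, 0), (1.75, 0),
    (2.0, 0), (2.5, 0), (3.5, 1), (4.0, 1), (4.25, 1), (4.5, 1),
    (5.0, 1), (5.25, 1), (5.5, 1),
]

# Data wrangling: drop rows whose feature value is missing
rows = [(x, y) for x, y in raw_rows if x is not None]

# Train and test data: hold out four hand-picked rows for testing
held_out = {0.75, 1.75, 4.25, 5.25}
train = [(x, y) for x, y in rows if x not in held_out]
test = [(x, y) for x, y in rows if x in held_out]

# Centre the feature on the training mean so gradient descent behaves well
x_mean = sum(x for x, _ in train) / len(train)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit(data, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the log-loss of p = sigmoid(b0 + b1*(x - x_mean))."""
    b0, b1 = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in data:
            error = sigmoid(b0 + b1 * (x - x_mean)) - y
            grad0 += error
            grad1 += error * (x - x_mean)
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


def predict(x, b0, b1, threshold=0.5):
    """Class 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if sigmoid(b0 + b1 * (x - x_mean)) >= threshold else 0


b0, b1 = fit(train)

# Accuracy check: fraction of test rows predicted correctly
correct = sum(predict(x, b0, b1) == y for x, y in test)
accuracy = correct / len(test)
print("test accuracy:", accuracy)
```

On a real data set the same steps are usually carried out with a data-frame library and a built-in model, and the train/test split is drawn at random rather than picked by hand.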

From Scratch:
Using Built in Commands:
