0% found this document useful (0 votes)
6 views

Linear Regression

Uploaded by

ssreya873
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views

Linear Regression

Uploaded by

ssreya873
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 20

Supervised

Learning
-
Classification
Linear Regression: Prediction Model
X (years of Y (salary, Rs
Given one variable X experience) 1,000)
Goal: Predict value of Y 3 30
Example: 8 57
◦ Given Years of Experience 9 64
◦ Predict Salary 13 72
Questions: 3 36
◦ When X=10, what is Y? 6 43
◦ When X=25, what is Y? 11 59
◦ This is known as regression
21 90
1 20
16 83
LINEAR REGRESSION
• Plot a graph between the cost and area of the house
Area Cost (Lakh) • The area of the house is represented in the X-axis while
(sq.feet) X Y cost is represented in Y-axis
1000 30 • What will Regression Do?
1200 40 • Fit the line through these points
1300 50
1450 70
1495 70
1600 80

Cost

Area
LINEAR REGRESSION

Predict for House area=1100?

Need to represent this line mathematically!


Regression
We assume that we have feature variables:
◦ or independent variables

The target variable is also known as dependent variable.

We are given a dataset of the form where, is a -dimensional feature vector (real), and a real
value

We want to learn a function which given a feature vector predicts a value that is as close as
possible to the value or itself

Minimize error or sum of square of the difference between actual value y i or predicted value :
Linear Regression Example
Linear Regression: Y=3.5*X+23.2
Y =
120

100

80
Salary

60

40

20

0
0 5 10 15 20 25
Years
Linear Regression - Visualization
The Simple Regression Model
Fit as good as possible a regression line through the data points:

Fitted regression line


For example, the i-th
data point
The Simple Regression Model
Definition
A simple regression of y on x explains variable y in terms of a
single variable x

Slope parameter
Intercept

Y =

Dependent variable,
LHS variable,
explained variable, Independent variable,
response variable,… RHS variable,
explanatory variable,
Control variable,…
The Simple Regression Model
Example: Soybean yield and fertilizer

Rainfall,
land quality,
presence of parasites, …
Measures the effect of fertilizer on
yield, holding all other factors fixed

Example: A simple wage equation

Labor force experience,


tenure with current employer,
work ethic, intelligence, …
Measures the change in hourly wage
given another year of education,
holding all other factors fixed
Linear regression
• Univariate Linear regression

Training Set

Learning Algorithm

Estimated Model Representation


price
Size of H h ( x)   0  1 x
house (predicted
(x) y)
Hypothesis

is are the parameters

Lets visualize this hypothesis

Consider 0 = 1.5 and 1 = 0

h(x) = 1.5 + 0x
Hypothesis
Consider 0 = 0 and 1 = 1 Consider 0 = 0 and 1 = 0.5
h(x) = 0 + 1x h(x) = 0 + 0.5x
Linear Regression Example
Linear Regression: Y=3.5*X+23.2

120

100

80
Salary

60

40

20

0
0 5 10 15 20 25
Years
Basic Idea

Learn a linear equation Y =

◦ To be learned:
Example
Salary Dataset
X Y
Years of Salary in
Experience 1000s
3 30
8 57
9 64
13 72
3 36
6 43
11 59
21 90
1 20
16 83
Multivariate models
simple regression model
(Yrs_Experience) x y (Income)

Multivariate or multiple regression model


(IQ) x1

(Work_score) x2
y (Income)
(Yrs_Experience) x3

(Age) x4
More than one prediction attribute

Consider two independent attributes X1, X2 and a dependent variable


Y
For example,
◦ X1=‘years of experience’
◦ X2=‘age’
◦ Y=‘salary’

Equation: Y    1 x1   2 x2
Outliers

Regression is sensitive to outliers:


◦ The line will “tilt” to accommodate very extreme values

Solution: remove the outliers


◦ But make sure that they do not capture useful information
Normalization

In the regression problem sometimes our features may have very different scales:
◦ For example: predict the GDP of a country using the features - number_of_properties and the income
◦ The weights in this case will not be interpretable

Solution: Normalize the features by replacing the values with the z-scores

You might also like