1.a. Regression
R.Kasilingam
Types of Regression
• Simple Regression
• Multiple Regression
• Step-wise Regression
• Hierarchical Regression
• Dummy
• Binary
• Multinomial
• Quantile Regression
Purpose
• To find the influence of one variable (X) on another (Y)
• To find out the extent of that influence
• So one variable is independent (X) and the other variable is dependent (Y)
• The influence is measured by beta
• So the equation is
• Y = a + bX
• a is called the intercept
• b is beta, the regression coefficient
Y=a+bX
• When X = 0 then
• bX = b*0 = 0
• Then Y = a
• a is the point where the line intersects the Y axis
Y=a+bX
• If b = 2, a = 5
• Let us assume X is 10, then
• Y = 5+(2*10) = 25
• Let me increase X by 1, which means X = 11, then
• Y = 5+(2*11) = 27, which means Y increases by 2
• So b or beta is the increase in Y when there is a one-unit increase in X
• So b or beta explains the extent of influence of X on Y
• b or beta is the regression coefficient
Y = 5+2X
X     a     b     Y     Y = a+bX
10    5     2     25    5+(2*10) = 25
11    5     2     27    5+(2*11) = 27
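The worked example above can be checked with a few lines of Python. This is only a sketch: the function name `predict` is chosen here for illustration and is not from the slides.

```python
def predict(a, b, x):
    """Return Y for the line Y = a + bX."""
    return a + b * x

a, b = 5, 2                     # intercept and slope used on the slide
y_at_10 = predict(a, b, 10)     # 5 + (2*10)
y_at_11 = predict(a, b, 11)     # 5 + (2*11)
increase = y_at_11 - y_at_10    # change in Y for a one-unit increase in X

print(y_at_10, y_at_11, increase)
```

The increase in Y equals b, which is why b is read as the extent of influence of X on Y.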
• β = dY/dX (change in Y divided by change in X)
• β = 2/1 = 2
• Beta is the slope of the line
• Difference between correlation and regression
What are a and b?
Y X Y
2 1 3
4 2 5
6 3 7
8 4 9
10 5 11
12 6 13
14 7 15
16 8 17
18 9 19
20 10 21
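One way to answer the question is to fit a line by least squares. The sketch below assumes the table is read as one X column with two separate Y series (the first Y column and the last Y column); that reading, and the helper name `fit_line`, are assumptions made here, not something the slide states.

```python
def fit_line(xs, ys):
    """Least-squares estimates of a (intercept) and b (slope) for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_first = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]     # first Y column
y_second = [3, 5, 7, 9, 11, 13, 15, 17, 19, 21]    # last Y column

a1, b1 = fit_line(x, y_first)
a2, b2 = fit_line(x, y_second)
print(a1, b1)
print(a2, b2)
```

Under this reading both series have the same slope, and only the intercept differs.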
Hypothesis
• Null: There is no influence
• Which means b = 0
• Alternate: There is an influence (b ≠ 0)
Data set
• Family Income
• Total Income – Metric
• Total Monthly Savings - Metric
Testing
• If the sig. value is less than 0.05 then reject the Null
hypothesis
• Sig < .05 – Reject H0
• The sig. value is the probability of committing a Type 1 error
• Type 1 error is rejecting H0 when it is true
• When the sig. value is less than 0.05, the chance of committing this error is below 5%, which is small enough to accept
• So we can reject H0
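The decision rule can be written as a small function. This is a sketch only: the name `decide` is hypothetical, and the two sig. values are taken from the coefficient tables that appear later in the deck (.000 for LCO2, .837 for mpg).

```python
def decide(sig, alpha=0.05):
    """Apply the rule: reject H0 when the sig. value is below alpha."""
    return "Reject H0" if sig < alpha else "Do not reject H0"

print(decide(0.000))   # sig. reported for LCO2
print(decide(0.837))   # sig. reported for mpg
```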
Coefficientsa
Intercept
t statistics
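• ΣY = Na + bΣX
• ΣXY = aΣX + bΣX²
• S y/x is the SE for the regression: S y/x = √( Σ(Y - Ŷ)² / (N - 2) )
• SE for intercept = S y/x * √( 1/N + X̄² / Σ(X - X̄)² )
• SE for beta = S y/x / √( Σ(X - X̄)² )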
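The two normal equations can be solved for a and b, and the standard errors follow from the residuals. The sketch below uses the usual textbook formulas for simple regression (an assumption, since the slide formulas did not survive) and a small hypothetical data set that is not from the slides.

```python
import math

def simple_regression(xs, ys):
    """Solve the normal equations and return a, b and their standard errors."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Normal equations: sum_y = n*a + b*sum_x and sum_xy = a*sum_x + b*sum_x2
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n

    # Standard error of the estimate, with n - 2 degrees of freedom
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s_yx = math.sqrt(sse / (n - 2))

    mean_x = sum_x / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    se_b = s_yx / math.sqrt(sxx)
    se_a = s_yx * math.sqrt(1 / n + mean_x ** 2 / sxx)
    return {"a": a, "b": b, "s_yx": s_yx, "se_a": se_a, "se_b": se_b}

# Hypothetical data, for illustration only
result = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result)
```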
Confidence Interval
• Lower limit = Coefficient - 1.96*SE
• Upper limit = Coefficient + 1.96*SE
• 1.96 corresponds to a 95% confidence level
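As an illustration, the limits can be computed for the LCO2 coefficient reported later in the deck (B = .636, Std. Error = .064). The function name `confidence_interval` is hypothetical.

```python
def confidence_interval(coefficient, se, z=1.96):
    """Return (lower, upper) limits as coefficient -/+ z * SE."""
    return coefficient - z * se, coefficient + z * se

lower, upper = confidence_interval(0.636, 0.064)
print(round(lower, 3), round(upper, 3))
```

Both limits lie above zero, which agrees with the small sig. value reported for that coefficient.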
SE
• Use the standard error of the coefficient to measure the
precision of the estimate of the coefficient. The smaller the
standard error, the more precise the estimate.
• The standard error of the Stiffness (variable) coefficient is
smaller than that of Temp (variable). Therefore, our model
was able to estimate the coefficient for Stiffness with
greater precision.
Save
Predicted and Residuals
• Regression is a prediction technique
• Y = Ŷ + Residuals
• Y = Predicted + Residual
• Y = Ŷ + e
• Residuals = Y - Ŷ
[Chart: Predicted (vertical axis, 0 to 35000) plotted against Income (horizontal axis, 0 to 180000), with linear trendline f(x) = 0.161314408073057 x + 3281.03310190484, R² = 1]
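The trendline from the chart can be used to split an observed value into its predicted part and its residual. In this sketch the income and the actual savings figure are hypothetical values chosen for illustration; only the two coefficients come from the chart.

```python
A = 3281.03310190484      # intercept from the chart trendline
B = 0.161314408073057     # slope from the chart trendline

def predicted(income):
    """Predicted Y for a given income, using the chart trendline."""
    return A + B * income

income = 50000            # hypothetical case
actual = 12000            # hypothetical observed Y for that case
y_hat = predicted(income)
residual = actual - y_hat # Residual = Y - predicted Y

print(round(y_hat, 2), round(residual, 2))
```

Because every predicted value is computed from income through the same straight line, plotting Predicted against Income gives a perfect line, which is consistent with the R² = 1 shown on the chart.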
ANOVA
Model Sum of Squares df Mean Square F Sig.
Degrees of Freedom:
df Regression = number of predictor variables (k) = 1
df Residual = n-k-1 = 552-1-1 = 550
df Total = n-1 = 552-1 = 551
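The same arithmetic as a short sketch, with n = 552 cases and k = 1 predictor as on the slide. The function name `degrees_of_freedom` is chosen here for illustration.

```python
def degrees_of_freedom(n, k):
    """Return (df regression, df residual, df total) for n cases and k predictors."""
    df_regression = k
    df_residual = n - k - 1
    df_total = n - 1
    return df_regression, df_residual, df_total

df_reg, df_res, df_tot = degrees_of_freedom(552, 1)
print(df_reg, df_res, df_tot)
```

The regression and residual degrees of freedom always add up to the total.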
F Value
• R is the correlation value
• It is the relationship between actual Y and predicted Y (Ŷ)
R²
• R² is the degree of determination
• R² = .163
• Σ(Y - Ŷ)²/(n-2) (MSE)
• Explained/Predicted/Regressed
• Unexplained/Not Predicted/Residual
Adjusted R²
• Adjusted R² = 1 - (1 - R²)(n-1)/(n-k-1)
• = 1 - (1 - .163)(551/550) = 1 - .838 = 0.162
• = .5 = .4
• =2.4, =3
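A short sketch of the adjusted R² calculation, using R² = .163, n = 552 and k = 1 from the slides. The standard adjustment formula is assumed, and the F value is derived from R² through the usual identity rather than read from the ANOVA table, so treat it as a cross-check only. Function names are hypothetical.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_from_r_squared(r2, n, k):
    """F value implied by R-squared: explained share per df over unexplained share per df."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

adj = adjusted_r_squared(0.163, 552, 1)
f_value = f_from_r_squared(0.163, 552, 1)
print(adj, f_value)
```

The adjusted value comes out close to the .162 on the slide; any small difference is due to rounding in the slide's intermediate step.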
Adjusted R²
Model Sum of Squares df Mean Square F Sig.
• What is R Square
Standardized-Descriptives
Descriptive – Save Standardized
Changed Data Set
Regression with new variables
Coefficients
                 Unstandardized Coefficients    Standardized Coefficients
Model            B             Std. Error       Beta            t        Sig.
1 (Constant)     -2.776E-16    .039                             .000     1.000
Coefficientsa
                    Ztotalin      Ztotsav
N    Valid          552           552
     Missing        0             0
Mean                .0000000      .0000000
Std. Deviation      1.00000000    1.00000000
Variance            1.000         1.000
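Standardizing a variable means converting it to z-scores, which is why the saved variables have mean 0 and standard deviation 1. The sketch below assumes the sample standard deviation (n-1 in the denominator) and uses hypothetical income values, not the deck's data set.

```python
import statistics

def standardize(values):
    """Convert values to z-scores: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD, n-1 in the denominator
    return [(v - mean) / sd for v in values]

income = [20000, 35000, 50000, 65000, 80000]   # hypothetical values
z_income = standardize(income)

print(statistics.mean(z_income), statistics.stdev(z_income))
```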
Sample Size
• 10 to 15 cases for each predictor
• 50+8k or 104+k, whichever is larger, where k is the number of predictors
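The second rule of thumb can be sketched as a one-line function. The name `minimum_sample_size` is hypothetical.

```python
def minimum_sample_size(k):
    """Larger of 50 + 8k and 104 + k, where k is the number of predictors."""
    return max(50 + 8 * k, 104 + k)

for k in (1, 3, 10):
    print(k, minimum_sample_size(k))
```

With one predictor the rule asks for 105 cases, so the 552 cases in the deck's data set are comfortably above it.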
Yi = β1 + β2Xi + ui
                 Unstandardized Coefficients    Standardized Coefficients
Model            B          Std. Error          Beta            t         Sig.
1 (Constant)     59.234     31.477                              1.882     .062
  mpg            -.267      1.300               -.017           -.206     .837
a. Dependent Variable: sales
                 Unstandardized Coefficients    Standardized Coefficients
Model            B          Std. Error          Beta            t         Sig.
1 (Constant)     -4.618     .888                                -5.203    .000
  LCO2           .636       .064                .809            9.920     .000
a. Dependent Variable: LLE
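The t column in these tables is the coefficient divided by its standard error. The sketch below recomputes it from the rounded B and Std. Error values shown for mpg and LCO2, so the results match the reported t values only approximately. The function name `t_statistic` is chosen here for illustration.

```python
def t_statistic(b, se):
    """t value for a coefficient: B divided by its standard error."""
    return b / se

# (B, Std. Error) as printed in the two coefficient tables
rows = {"mpg": (-0.267, 1.300), "LCO2": (0.636, 0.064)}
t_values = {name: t_statistic(b, se) for name, (b, se) in rows.items()}

for name, t in t_values.items():
    print(name, round(t, 3))
```

A t value close to zero (mpg) goes with a large sig. value, while a large t value (LCO2) goes with a sig. value below 0.05.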
Model Summaryb
Total 504567800.00 11
Total 504567800.00 11
a. Dependent Variable: GOLD
b. Predictors: (Constant), GDP
c. Predictors: (Constant), GDP, GDP_C
d. Predictors: (Constant), GDP, GDP_C, GDP_S
Thank You