Multivar 2 - Simple and Multiple Regression PDF
Faculty of Engineering
Gadjah Mada University
Type of relationship: dependence
Number of predicted variables: one
Type of relationship: single
Measurement scale of the dependent variable: metric
Purpose:
To predict the changes in the dependent variable
as a result of changes in the independent
variables.
What is regression?
The problem of fitting a line to the data, i.e. pairs of
numbers (x,y).
The problem of predicting one variable (y) from
values of another variable (x).
To use data on a quantitative independent variable
to predict or explain variation in a quantitative
dependent variable (Ott, 2001). Prediction refers to
future values; explanation refers to current or past
values. Both require a unit of association.
$y = \beta_0 + \beta_1 x$
where
$\beta_0$ : the intercept, the value of y when x = 0
$\beta_1$ : the slope, the change in y for a one-unit
change in x
$y = \beta_0 + \beta_1 x + \varepsilon$
where
$\varepsilon$ : random error, the deviation of actual y values from their
predicted values (unpredictable and ignored factors)
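$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$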
The mean squared error (MSE)
$\mathrm{MSE} = \dfrac{(n-1)\, s_y^2 - \hat{\beta}_1^2\, (n-1)\, s_x^2}{n-2}$
$MS = SS / \nu$
where
MS : the mean square
SS : the sum of squares
$\nu$ : the number of degrees of freedom = n-1 (sampling)
School      y        x
1         37.01     7.20
2         26.51   -11.71
3         36.51    12.32
4         40.70    14.28
5         37.10     6.31
6         33.90     6.16
7         41.80    12.70
8         33.40    -0.17
9         41.01     9.85
10        37.20    -0.05
11        23.30   -12.86
12        35.20     0.92
13        34.90     4.77
14        33.10    -0.96
15        22.70   -16.04
16        39.70    10.62
17        31.80     2.66
18        31.70   -10.99
19        43.10    15.03
20        41.01    12.77
$\hat{\beta}_1 = \dfrac{3189.88 - 20(3.14)(35.08)}{(20-1)(92.65)} = 0.56$
$\hat{\beta}_0 = 35.08 - 0.56(3.14) = 33.32$
$\mathrm{MSE} = \dfrac{(20-1)(33.84) - (0.56)^2 (20-1)(92.65)}{20-2} = 5.01$
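The hand calculation can be cross-checked with a short script. This is a minimal sketch, not part of the original slides: it assumes the first value column of the school table is y and the second is x (the slides do not label them), and the function name `simple_linear_fit` is made up for illustration.

```python
# School data transcribed from the table (assumed: first column y, second column x).
x = [7.20, -11.71, 12.32, 14.28, 6.31, 6.16, 12.70, -0.17, 9.85, -0.05,
     -12.86, 0.92, 4.77, -0.96, -16.04, 10.62, 2.66, -10.99, 15.03, 12.77]
y = [37.01, 26.51, 36.51, 40.70, 37.10, 33.90, 41.80, 33.40, 41.01, 37.20,
     23.30, 35.20, 34.90, 33.10, 22.70, 39.70, 31.80, 31.70, 43.10, 41.01]


def simple_linear_fit(x, y):
    """Least-squares slope, intercept and MSE for y = b0 + b1*x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)            # (n-1) * s_x^2
    syy = sum((yi - y_bar) ** 2 for yi in y)            # (n-1) * s_y^2
    sxy = sum((xi - x_bar) * yi for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares divided by its n-2 degrees of freedom.
    mse = (syy - slope ** 2 * sxx) / (n - 2)
    return slope, intercept, mse


slope, intercept, mse = simple_linear_fit(x, y)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, MSE = {mse:.2f}")
```

Because the script keeps full precision in the intermediate sums, its results may differ from the rounded hand calculation in the last digit.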
[Figure: Linear regression]
Original equation:
$T = c\,V^{b}$
Calculate the constants c and b!
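One common way to approach this exercise is to take logarithms, which turns the power law into a straight line: ln T = ln c + b ln V, so b is the slope and ln c the intercept of a simple regression of ln T on ln V. The sketch below assumes that approach. The slides give no data here, so the values of V and T are synthetic, generated from hypothetical constants c = 3 and b = -0.5 only to show that the fit recovers them. The name `fit_power_law` is also hypothetical.

```python
import math


def fit_power_law(V, T):
    """Estimate c and b in T = c * V**b by least squares on the log-log scale."""
    u = [math.log(v) for v in V]   # ln V plays the role of x
    w = [math.log(t) for t in T]   # ln T plays the role of y
    n = len(u)
    u_bar = sum(u) / n
    w_bar = sum(w) / n
    b = (sum((ui - u_bar) * wi for ui, wi in zip(u, w))
         / sum((ui - u_bar) ** 2 for ui in u))
    c = math.exp(w_bar - b * u_bar)  # intercept is ln c
    return c, b


# Synthetic, noise-free illustration (hypothetical constants, not from the slides).
V = [1.0, 2.0, 4.0, 8.0]
T = [3.0 * v ** -0.5 for v in V]

c, b = fit_power_law(V, T)
print(f"c = {c:.3f}, b = {b:.3f}")
```

With real measurements the points will not lie exactly on a line, and the fit minimizes squared errors in ln T rather than in T itself.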
What is correlation?
A measure of the linear relationship between two
variables
Measurement of the strength of linear relation
between x and y.
The sample correlation coefficient (r), $-1 \le r \le 1$,
is related to the estimated slope:
$r = \dfrac{s_{xy}}{s_x s_y} = \dfrac{s_{xy}}{\sqrt{s_x^2 s_y^2}} = \hat{\beta}_1 \dfrac{s_x}{s_y}$
The correlation coefficient r is positive if
y tends to increase as x increases; r is negative if y
tends to decrease as x increases; r is zero (or close to
zero) if there is either no relation between changes in x
and changes in y or the relation is nonlinear with no overall linear trend.
x     y
10    25
20    41
20    47
30    59
30    54
30    56
40    49
40    43
50    30
$\bar{x} = 30.00$
$\bar{y} = 44.89$
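A quick computation shows how these data behave under the correlation formula. This is a minimal sketch that assumes the two columns of numbers pair up in the order listed (the slides do not state the pairing explicitly). The function name `sample_correlation` is chosen for illustration.

```python
import math

# Data from the example, assumed to be paired in the order listed.
x = [10, 20, 20, 30, 30, 30, 40, 40, 50]
y = [25, 41, 47, 59, 54, 56, 49, 43, 30]


def sample_correlation(x, y):
    """Sample correlation coefficient r = s_xy / (s_x * s_y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    # The (n-1) factors in s_xy, s_x and s_y cancel, so sums suffice.
    return sxy / math.sqrt(sxx * syy)


r = sample_correlation(x, y)
print(f"r = {r:.3f}")
```

Under that pairing, y rises and then falls as x increases, so r comes out small even though the two variables are clearly related. This illustrates why r measures only the linear part of a relationship.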
For any given set of values $x_1, x_2, x_3, \ldots, x_r$ and the
corresponding values of y, a linear relationship between the
variables is given by
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_r x_r$
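For two independent variables, the least-squares estimates satisfy the normal equations:
$\sum y = n \hat{\beta}_0 + \hat{\beta}_1 \sum x_1 + \hat{\beta}_2 \sum x_2$
$\sum x_1 y = \hat{\beta}_0 \sum x_1 + \hat{\beta}_1 \sum x_1^2 + \hat{\beta}_2 \sum x_1 x_2$
$\sum x_2 y = \hat{\beta}_0 \sum x_2 + \hat{\beta}_1 \sum x_1 x_2 + \hat{\beta}_2 \sum x_2^2$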
y     x1    x2
38    1     5
40    2     5
85    3     5
59    4     5
40    1     10
60    2     10
68    3     10
53    4     10
31    1     15
35    2     15
42    3     15
59    4     15
18    1     20
34    2     20
29    3     20
42    4     20
$\sum x_1 = 40$
$\sum x_2 = 200$
$\sum x_1^2 = 120$
$\sum x_2^2 = 3000$
$\sum x_1 x_2 = 500$
$\sum x_1 y = 1989$
$\sum x_2 y = 8285$
$\sum y = 733$
$\hat{\beta}_0 = 48.2$
$\hat{\beta}_1 = 7.83$
$\hat{\beta}_2 = -1.76$
Hence, the multiple regression equation is
$\hat{y} = 48.2 + 7.83 x_1 - 1.76 x_2$
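The coefficients can be checked by building the three normal equations from the data and solving them numerically. The sketch below is one way to do it in plain Python. The helper `solve_linear_system` is a small Gaussian-elimination routine written for this illustration, not something from the slides, and the data are transcribed from the 16-row table.

```python
# Data transcribed from the table (16 observations).
y = [38, 40, 85, 59, 40, 60, 68, 53, 31, 35, 42, 59, 18, 34, 29, 42]
x1 = [1, 2, 3, 4] * 4
x2 = [5] * 4 + [10] * 4 + [15] * 4 + [20] * 4


def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest pivot into position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


n = len(y)
s_x1x2 = sum(a * b for a, b in zip(x1, x2))
# Coefficient matrix and right-hand side of the normal equations.
A = [
    [n,       sum(x1),                 sum(x2)],
    [sum(x1), sum(a * a for a in x1),  s_x1x2],
    [sum(x2), s_x1x2,                  sum(b * b for b in x2)],
]
rhs = [
    sum(y),
    sum(a * v for a, v in zip(x1, y)),
    sum(b * v for b, v in zip(x2, y)),
]

b0, b1, b2 = solve_linear_system(A, rhs)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
# prints "b0 = 48.1875, b1 = 7.8250, b2 = -1.7550"
```

To the precision used on the slide, these values agree with the coefficients quoted there.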