University of Hong Kong: Administrative Matters
\[
F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)} = \frac{(198.311 - 183.186)/3}{183.186/(353-5-1)} \approx 9.55
\]
Multiple Regression Analysis: Inference
In this case, Table G.3b in the textbook tells us that the 5% critical value for an
F distribution with (3,120) degrees of freedom is 2.68.
We actually want (3,347) degrees of freedom, but this is close.
Since 9.55 is much greater than 2.68, we clearly reject the hypothesis that none of the
performance statistics influence baseball players' salaries.
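The arithmetic behind the F statistic can be checked with a few lines of code. The following is a minimal sketch, not part of the original slides; the function name `f_statistic` and its parameter names are illustrative assumptions, and the numbers are the residual sums of squares quoted for the baseball example.

```python
def f_statistic(ssr_r, ssr_ur, q, n, k):
    """SSR form of the F statistic for q restrictions.

    ssr_r, ssr_ur: residual sums of squares of the restricted and
                   unrestricted models
    q: number of restrictions
    n: number of observations
    k: number of regressors in the unrestricted model
    """
    numerator = (ssr_r - ssr_ur) / q          # average loss of fit per restriction
    denominator = ssr_ur / (n - k - 1)        # unrestricted error variance estimate
    return numerator / denominator


# Baseball salary example, using the values quoted in the text
f_baseball = f_statistic(ssr_r=198.311, ssr_ur=183.186, q=3, n=353, k=5)
print(round(f_baseball, 2))  # 9.55
```

The denominator degrees of freedom, n-k-1 = 347, match the F(3, 347) that Stata reports for the same test.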
Stata can do F tests:
. test bavg hrunsyr rbisyr
 ( 1)  bavg = 0.0
 ( 2)  hrunsyr = 0.0
 ( 3)  rbisyr = 0.0
       F(  3,   347) =    9.55
            Prob > F =    0.0000
Why are the variables jointly significant but not individually significant? It turns out
that two of them, home runs per year and RBIs per year, are highly correlated:
. corr bavg hrunsyr rbisyr
(obs=353)
             |     bavg  hrunsyr   rbisyr
-------------+---------------------------
        bavg |   1.0000
     hrunsyr |   0.1906   1.0000
      rbisyr |   0.3291   0.8907   1.0000
So we know that the best players get paid more, but we can't tell whether this is because
of home runs, batting average, or something else.
It is also possible to test a set of general linear restrictions, as opposed to merely
exclusion restrictions (hypotheses that some of the β parameters are equal to zero).
As an example, we will look at the rationality of housing price assessments.
If assessors of housing prices are doing their job correctly, then their assessment
should include the value of everything you can observe about the house; for example, the
number of bedrooms and the number of square feet.
In a regression context, this means that if you are predicting the sale price of a house
based on its assessed value, you shouldn't need variables for the number of bedrooms
or the size of the property.
If you do, it means the price assessment hasn't taken these variables into account
properly.
Specifically, the model that we have in mind is
\[
\ln(\mathit{price}) = \beta_0 + \beta_1 \ln(\mathit{assess}) + \beta_2 \ln(\mathit{lotsize}) + \beta_3 \ln(\mathit{sqrft}) + \beta_4\, \mathit{bdrms} + u
\]
This is the unrestricted model.
The hypothesis we want to test is:
\[
H_0: \beta_1 = 1,\ \beta_2 = 0,\ \beta_3 = 0,\ \beta_4 = 0
\]
We can construct the restricted model by substituting the restrictions
(β1=1; β2=0; β3=0; β4=0) into the unrestricted model.
The restricted model is then
\[
\ln(\mathit{price}) = \beta_0 + \ln(\mathit{assess}) + u
\]
\[
\ln(\mathit{price}) - \ln(\mathit{assess}) = \beta_0 + u
\]
. d
Contains data from hprice1.dta
  obs:            88
 vars:            10                          17 Mar 2002 12:21
 size:         3,168 (99.5% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
price           float  %9.0g                  house price, $1000s
assess          float  %9.0g                  assessed value, $1000s
bdrms           byte   %9.0g                  number of bdrms
lotsize         float  %9.0g                  size of lot in square feet
sqrft           int    %9.0g                  size of house in square feet
colonial        byte   %9.0g                  =1 if home is colonial style
lprice          float  %9.0g                  log(price)
lassess         float  %9.0g                  log(assess
llotsize        float  %9.0g                  log(lotsize)
lsqrft          float  %9.0g                  log(sqrft)
-------------------------------------------------------------------------------
Sorted by:
The unrestricted model is:
. regress lprice lassess llotsize lsqrft bdrms
      Source |       SS       df       MS              Number of obs =      88
-------------+------------------------------           F(  4,    83) =   70.58
       Model |  6.19607473     4  1.54901868           Prob > F      =  0.0000
    Residual |  1.82152879    83   .02194613           R-squared     =  0.7728
-------------+------------------------------           Adj R-squared =  0.7619
       Total |  8.01760352    87  .092156362           Root MSE      =  .14814
------------------------------------------------------------------------------
      lprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lassess |   1.043065    .151446     6.89   0.000     .7418453    1.344285
    llotsize |   .0074379   .0385615     0.19   0.848    -.0692593    .0841352
      lsqrft |  -.1032384   .1384305    -0.75   0.458     -.378571    .1720942
       bdrms |   .0338392   .0220983     1.53   0.129    -.0101135    .0777918
       _cons |    .263743   .5696647     0.46   0.645    -.8692972    1.396783
------------------------------------------------------------------------------
To estimate the restricted model, we create a new dependent variable equal to
ln(price)-ln(assess) and regress it on a constant only:
. generate lpassess=lprice-lassess
. regress lpassess
      Source |       SS       df       MS              Number of obs =      88
-------------+------------------------------           F(  0,    87) =    0.00
       Model |        0.00     0           .           Prob > F      =       .
    Residual |  1.88014885    87  .021610906           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared =  0.0000
       Total |  1.88014885    87  .021610906           Root MSE      =  .14701
------------------------------------------------------------------------------
    lpassess |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -.0848135   .0156709    -5.41   0.000    -.1159612   -.0536658
------------------------------------------------------------------------------
In this case, q (the number of restrictions) is 4, n (the number of observations) is 88,
and k (the number of variables in the unrestricted model) is 4. The F statistic for the
hypothesis test is:
\[
F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)} = \frac{(1.8801 - 1.8215)/4}{1.8215/(88-4-1)} \approx 0.667
\]
From Table G.3b, the critical value for an F distribution with (4,90) degrees of freedom
is 2.47. We actually want (4,83) degrees of freedom, but this is close. 0.667 is clearly
less than 2.47; therefore we fail to reject the hypothesis that assessed prices accurately
take into account observable characteristics of the property
(β1=1; β2=0; β3=0; β4=0).
Another form of the F statistic that is sometimes seen is the R squared form.
This derivation requires the following facts, from the definition of R squared:
\[
R^2 = \frac{SSE}{SST} = \frac{SST - SSR}{SST}
\]
\[
SSR = SST - R^2 \, SST
\]
\[
SSR = SST \,(1 - R^2)
\]
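As a sketch of how these facts lead to the R squared form, assume the restricted and unrestricted models share the same dependent variable, so that both have the same SST. The subscripts r and ur follow the notation of the SSR form; the steps below are a reconstruction of the standard argument rather than a quotation from the slides.

```latex
\[
SSR_r = SST\,(1 - R^2_r), \qquad SSR_{ur} = SST\,(1 - R^2_{ur})
\]
\[
F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)}
  = \frac{SST\,(R^2_{ur} - R^2_r)/q}{SST\,(1 - R^2_{ur})/(n-k-1)}
  = \frac{(R^2_{ur} - R^2_r)/q}{(1 - R^2_{ur})/(n-k-1)}
\]
```

The cancellation of SST is the key step. In the housing example the restricted model has a different dependent variable, ln(price)-ln(assess), so the two models do not share an SST, and the SSR form should be used there instead.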
Find the critical value for the F test by looking at the appropriate table (G.3 in your
textbook) for an F distribution with (q,n-k-1) degrees of freedom.
You must choose the significance level: G.3a is 10%, G.3b is 5%, and G.3c is 1%.
In this table, q is the numerator degrees of freedom, and n-k-1 is the
denominator degrees of freedom.
If the F statistic exceeds the critical value, you reject the hypothesis. Otherwise,
you fail to reject the hypothesis.
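The decision rule can be written out as a short sketch. The function names and the small lookup table are illustrative assumptions; the table holds only the two 5% critical values quoted in these notes, keyed by the table entry that was actually used rather than by the exact degrees of freedom.

```python
# 5% critical values quoted in the notes (Table G.3b),
# keyed by (numerator df, denominator df) of the table entry used
CRITICAL_5PCT = {(3, 120): 2.68, (4, 90): 2.47}


def degrees_of_freedom(q, n, k):
    """Exact (numerator, denominator) degrees of freedom for the F test."""
    return q, n - k - 1


def f_test_decision(f_stat, table_df, table=CRITICAL_5PCT):
    """Reject when the F statistic exceeds the tabulated critical value."""
    critical = table[table_df]
    return "reject" if f_stat > critical else "fail to reject"


# Baseball example: exact df (3, 347), nearest table entry (3, 120)
print(degrees_of_freedom(3, 353, 5), f_test_decision(9.55, (3, 120)))

# Housing example: exact df (4, 83), nearest table entry (4, 90)
print(degrees_of_freedom(4, 88, 4), f_test_decision(0.667, (4, 90)))
```

Keeping the exact degrees of freedom separate from the table entry makes the approximation explicit: the printed table rarely contains the exact denominator value, so a nearby entry is used.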