$$Y = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k + u$$
$$H_0: \beta_2 = \dots = \beta_k = 0$$
$$H_1: \text{at least one } \beta \neq 0$$
This sequence describes two F tests of goodness of fit in a multiple regression model. The
first relates to the goodness of fit of the equation as a whole.
We will consider the general case where there are k – 1 explanatory variables. For the F
test of goodness of fit of the equation as a whole, the null hypothesis, in words, is that the
model has no explanatory power at all.
© Christopher Dougherty 1999–2006 2
F TESTS OF GOODNESS OF FIT
Of course we hope to reject it and conclude that the model does have some explanatory
power.
The model will have no explanatory power if it turns out that Y is unrelated to any of the
explanatory variables. Mathematically, therefore, the null hypothesis is that all the
coefficients β2, ..., βk are zero.
The alternative hypothesis is that at least one of these coefficients is different from zero.
In the multiple regression model there is a difference between the roles of the F and t tests.
The F test tests the joint explanatory power of the variables, while the t tests test their
explanatory power individually.
In the simple regression model the F test was equivalent to the (two-sided) t test on the
slope coefficient because the ‘group’ consisted of just one variable.
$$F(k-1,\, n-k) = \frac{ESS/(k-1)}{RSS/(n-k)} = \frac{(ESS/TSS)/(k-1)}{(RSS/TSS)/(n-k)} = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$$
The F statistic for the test was defined in the last sequence in Chapter 2. ESS is the
explained sum of squares and RSS is the residual sum of squares.
It can be expressed in terms of R² by dividing the numerator and denominator by TSS, the
total sum of squares.
ESS / TSS is the definition of R². RSS / TSS is equal to (1 – R²). (See the last sequence in
Chapter 2.)
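The equivalence of the two forms of the F statistic can be checked numerically. The following is a minimal Python sketch. The function names and the sums of squares passed to them are hypothetical, chosen only for illustration, and TSS is taken to be ESS + RSS.

```python
def f_from_sums_of_squares(ess, rss, n, k):
    """F(k-1, n-k) computed from the explained and residual sums of squares."""
    return (ess / (k - 1)) / (rss / (n - k))


def f_from_r_squared(r2, n, k):
    """The same statistic written in terms of R^2 = ESS / TSS."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))


# Hypothetical values, not taken from any data set in the text.
ess, rss, n, k = 60.0, 40.0, 25, 5
tss = ess + rss              # assumed decomposition TSS = ESS + RSS
r2 = ess / tss               # definition of R^2

f_direct = f_from_sums_of_squares(ess, rss, n, k)
f_via_r2 = f_from_r_squared(r2, n, k)
print(f_direct, f_via_r2)    # the two values agree up to floating-point error
```

With these made-up numbers both routes give (60/4)/(40/20) = 7.5, which is all the algebra above claims.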
$$S = \beta_1 + \beta_2 ASVABC + \beta_3 SM + \beta_4 SF + u$$
The educational attainment model will be used as an example. We will suppose that S
depends on ASVABC, the ability score, and on SM and SF, the highest grades completed by the
mother and father of the respondent, respectively.
$$H_0: \beta_2 = \beta_3 = \beta_4 = 0$$
The null hypothesis for the F test of goodness of fit is that all three slope coefficients are
equal to zero. The alternative hypothesis is that at least one of them is non-zero.
. reg S ASVABC SM SF
------------------------------------------------------------------------------
           S |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ASVABC |   .1257087   .0098533    12.76   0.000     .1063528    .1450646
          SM |   .0492424   .0390901     1.26   0.208     -.027546    .1260309
          SF |   .1076825   .0309522     3.48   0.001       .04688    .1684851
       _cons |   5.370631   .4882155    11.00   0.000      4.41158    6.329681
------------------------------------------------------------------------------
$$F(k-1,\, n-k) = \frac{ESS/(k-1)}{RSS/(n-k)} \qquad F(3, 536) = \frac{1181/3}{2024/536} = 104.3$$
In this example, k – 1, the number of explanatory variables, is equal to 3 and n – k, the
number of degrees of freedom, is equal to 536.
The numerator of the F statistic is the explained sum of squares divided by k – 1. In the
Stata output these numbers are given in the Model row.
The denominator is the residual sum of squares divided by the number of degrees of
freedom remaining.
Hence the F statistic is 104.3. All serious regression packages compute it for you as part of
the diagnostics in the regression output.
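The arithmetic behind this figure is easy to reproduce. A minimal Python sketch using the sums of squares and degrees of freedom quoted above (the variable names are illustrative):

```python
# Figures quoted in the text for the regression of S on ASVABC, SM and SF.
ess = 1181.0       # explained sum of squares
rss = 2024.0       # residual sum of squares
k_minus_1 = 3      # number of explanatory variables
n_minus_k = 536    # degrees of freedom remaining

f_stat = (ess / k_minus_1) / (rss / n_minus_k)
print(round(f_stat, 1))   # 104.3
```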
$$F(3, 536) = \frac{1181/3}{2024/536} = 104.3 \qquad F_{\text{crit},\,0.1\%}(3, 500) = 5.51$$
The critical value for F(3,536) is not given in the F tables, but we know it must be lower than
the critical value for F(3,500), which is given. At the 0.1% level, this is 5.51. Since 104.3 is
far above 5.51, we easily reject H0 at the 0.1% level.
This result could have been anticipated because both ASVABC and SF have highly
significant t statistics. So we knew in advance that both β2 and β4 were non-zero.
It is unusual for the F statistic not to be significant if some of the t statistics are significant.
In principle it could happen though. Suppose that you ran a regression with 40 explanatory
variables, none being a true determinant of the dependent variable.
Then the F statistic should be low enough for H0 not to be rejected. However, if you are
performing t tests on the slope coefficients at the 5% level, with a 5% chance of a Type I
error, on average 2 of the 40 variables could be expected to have ‘significant’ coefficients.
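The figure of 2 is simply the number of tests multiplied by the significance level. A short Python sketch of that arithmetic follows. The second quantity, the chance of at least one spurious 'significant' coefficient, relies on an added assumption that the 40 t tests are independent, which is not stated in the text and is only an approximation in a real regression.

```python
n_vars = 40     # irrelevant explanatory variables in the regression
alpha = 0.05    # significance level of each t test

# Expected number of coefficients wrongly found 'significant'.
expected_false_positives = n_vars * alpha

# Assumption for illustration only: the 40 t tests are independent.
p_at_least_one = 1 - (1 - alpha) ** n_vars

print(expected_false_positives)
print(round(p_at_least_one, 3))
```

Under that independence assumption, the probability of at least one Type I error among the t tests is roughly 0.87, which is why an isolated significant t statistic is weak evidence when the F statistic is not significant.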
The opposite can easily happen, though. Suppose you have a multiple regression model
which is correctly specified and the R² is high. You would expect to have a highly
significant F statistic.
However, if the explanatory variables are highly correlated and the model is subject to
severe multicollinearity, the standard errors of the slope coefficients could all be so large
that none of the t statistics is significant.
In this situation you would know that your model is a good one, but you are not in a
position to pinpoint the contributions made by the explanatory variables individually.
$$Y = \beta_1 + \beta_2 X_2 + u \qquad RSS_1$$
$$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + u \qquad RSS_2$$
We now come to the other F test of goodness of fit. This is a test of the joint explanatory
power of a group of variables when they are added to a regression model.
For example, in the original specification, Y may be written as a simple function of X2. In
the second, we add X3 and X4.
$$H_0: \beta_3 = \beta_4 = 0$$
$$H_1: \beta_3 \neq 0 \text{ or } \beta_4 \neq 0 \text{ or both } \beta_3 \text{ and } \beta_4 \neq 0$$
The null hypothesis for the F test is that neither X3 nor X4 belongs in the model. The
alternative hypothesis is that at least one of them does, perhaps both.
$$F(\text{cost},\ \text{d.f. remaining}) = \frac{\text{improvement} \,/\, \text{cost}}{\text{remaining unexplained} \,/\, \text{degrees of freedom remaining}}$$
For this F test, and for several others which we will encounter, it is useful to think of the F
statistic as having the structure indicated above.
The ‘improvement’ is the reduction in the residual sum of squares when the change is
made, in this case, when the group of new variables is added.
The ‘cost’ is the reduction in the number of degrees of freedom remaining after making the
change. In the present case it is equal to the number of new variables added, because one
additional parameter is estimated for each of them.
(Remember that the number of degrees of freedom in a regression equation is the number
of observations, less the number of parameters estimated. In this example, it would fall
from n – 2 to n – 4 when X3 and X4 are added.)
The ‘remaining unexplained’ is the residual sum of squares after making the change.
The ‘degrees of freedom remaining’ is the number of degrees of freedom remaining after
making the change.
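The four ingredients just described can be combined in a small helper. This is a sketch only: the function name `incremental_f` and the numbers passed to it are hypothetical and are not taken from the regressions in this sequence.

```python
def incremental_f(rss_before, rss_after, cost, df_remaining):
    """F(cost, df_remaining) for a change that lowers RSS from rss_before to rss_after."""
    improvement = rss_before - rss_after      # reduction in the residual sum of squares
    return (improvement / cost) / (rss_after / df_remaining)


# Hypothetical numbers: two variables added, 40 degrees of freedom remaining.
f_stat = incremental_f(rss_before=100.0, rss_after=80.0, cost=2, df_remaining=40)
print(f_stat)   # (20 / 2) / (80 / 40) = 5.0
```

The resulting value would then be compared with the critical value of F with `cost` and `df_remaining` degrees of freedom.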
. reg S ASVABC
------------------------------------------------------------------------------
           S |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ASVABC |    .148084   .0089431    16.56   0.000     .1305165    .1656516
       _cons |   6.066225   .4672261    12.98   0.000     5.148413    6.984036
------------------------------------------------------------------------------
We will illustrate the test with an educational attainment example. Here is S regressed on
ASVABC using Data Set 21. We make a note of the residual sum of squares.
. reg S ASVABC SM SF
------------------------------------------------------------------------------
           S |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ASVABC |   .1257087   .0098533    12.76   0.000     .1063528    .1450646
          SM |   .0492424   .0390901     1.26   0.208     -.027546    .1260309
          SF |   .1076825   .0309522     3.48   0.001       .04688    .1684851
       _cons |   5.370631   .4882155    11.00   0.000      4.41158    6.329681
------------------------------------------------------------------------------
Now we have added the highest grade completed by each parent. Does parental education
have a significant impact? Well, we can see that a t test would show that SF has a highly
significant coefficient, but we will perform the F test anyway. We make a note of RSS.
The improvement in the fit on adding the parental variables is the reduction in the residual
sum of squares.
The cost is 2 degrees of freedom because 2 additional parameters have been estimated.
The remaining unexplained is the residual sum of squares after adding SM and SF.
$$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u \qquad RSS_1$$
$$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + u \qquad RSS_2$$
This sequence will conclude by showing that t tests are equivalent to marginal F tests when
the additional group of variables consists of just one variable.
Suppose that in the original model Y is a function of X2 and X3, and that in the revised model
X4 is added.
$$H_0: \beta_4 = 0$$
$$H_1: \beta_4 \neq 0$$
The null hypothesis for the F test of the explanatory power of the additional ‘group’ is that
all the new slope coefficients are equal to zero. There is of course only one new slope
coefficient, β4.
The F test has the usual structure. We will illustrate it with an educational attainment model
where S depends on ASVABC and SM in the original model and on SF as well in the revised
model.
. reg S ASVABC SM
------------------------------------------------------------------------------
           S |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ASVABC |   .1328069   .0097389    13.64   0.000     .1136758     .151938
          SM |   .1235071   .0330837     3.73   0.000     .0585178    .1884963
       _cons |   5.420733   .4930224    10.99   0.000     4.452244    6.389222
------------------------------------------------------------------------------
Here is the regression of S on ASVABC and SM. We make a note of the residual sum of
squares.
. reg S ASVABC SM SF
------------------------------------------------------------------------------
           S |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ASVABC |   .1257087   .0098533    12.76   0.000     .1063528    .1450646
          SM |   .0492424   .0390901     1.26   0.208     -.027546    .1260309
          SF |   .1076825   .0309522     3.48   0.001       .04688    .1684851
       _cons |   5.370631   .4882155    11.00   0.000      4.41158    6.329681
------------------------------------------------------------------------------
Now we add SF and again make a note of the residual sum of squares.
The cost is just the single degree of freedom lost when estimating β4.
The remaining unexplained is the residual sum of squares after adding SF.
$$F(1, 536) = \frac{(2069.3 - 2023.6)/1}{2023.6/536} = 12.10 \qquad F_{\text{crit},\,0.1\%} = 10.96$$
$$t_{\text{crit},\,0.1\%} = 3.31$$
The critical value of t at the 0.1% level with 500 degrees of freedom is 3.31. The critical
value with 536 degrees of freedom must be lower. The t statistic for SF, 3.48, exceeds it, so
we reject H0 again.
$$3.48^2 = 12.11$$
It can be shown that the F statistic for the F test of the explanatory power of a ‘group’ of
one variable must be equal to the square of the t statistic for that variable. (The difference
in the last digit is due to rounding error.)
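The rounding explanation can be checked with the figures reported in this sequence. The Python sketch below recomputes the F statistic from the two residual sums of squares and squares the t statistic for SF. As an illustrative choice, t is rebuilt from the coefficient and standard error in the regression output rather than taken from the rounded value 3.48.

```python
# Residual sums of squares quoted in the text.
rss_before = 2069.3    # S regressed on ASVABC and SM
rss_after = 2023.6     # SF added
df_remaining = 536

f_stat = ((rss_before - rss_after) / 1) / (rss_after / df_remaining)

# t statistic for SF, recomputed from the coefficient and standard error
# in the regression output rather than from the rounded value 3.48.
t_stat = 0.1076825 / 0.0309522

print(round(f_stat, 2), round(t_stat ** 2, 2))
```

Both quantities come out at about 12.10. The 12.11 obtained from 3.48 squared reflects the rounding of the t statistic to two decimal places.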
$$t_{\text{crit},\,0.1\%}^2 = 3.31^2 = 10.96 = F_{\text{crit},\,0.1\%}$$
It can also be shown that the critical value of F must be equal to the square of the critical
value of t. (The critical values shown are for 500 degrees of freedom, but this must also be
true for 536 degrees of freedom.)
Hence the conclusions of the two tests must coincide.
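As a small illustration, the two decision rules can be placed side by side using the values quoted above. This is only a sketch: the variable names are illustrative and the critical values are the tabulated ones for 500 degrees of freedom.

```python
# Values quoted in the text (critical values are for 500 degrees of freedom).
f_stat, f_crit = 12.10, 10.96
t_stat, t_crit = 3.48, 3.31

reject_by_f = f_stat > f_crit          # F test of the 'group' of one variable
reject_by_t = abs(t_stat) > t_crit     # two-sided t test on the same coefficient

print(reject_by_f, reject_by_t)        # the two decisions agree
print(round(t_crit ** 2, 2))           # square of the t critical value
```

Because both the statistic and the critical value of the F test are the squares of their t counterparts, comparing F with its critical value is the same comparison as |t| with its critical value.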
This result means that the t test of the coefficient of a variable is a test of its marginal
explanatory power, after all the other variables have been included in the equation.
If the variable is correlated with one or more of the other variables, its marginal explanatory
power may be quite low, even if it genuinely belongs in the model.
If all the variables are correlated, it is possible for all of them to have low marginal
explanatory power and for none of the t tests to be significant, even though the F test for
their joint explanatory power is highly significant.
If this is the case, the model is said to be suffering from the problem of multicollinearity
discussed in the previous sequence.