Faculty of Economics and Political Sciences

Economics Department
Third Year
Econometrics I
TA. Menna Sherif

SOLUTION OF SHEET 2B
MULTIPLE LINEAR REGRESSION MODEL

State with reason whether the following statements are true or false.


Answers:
1. The classical linear regression model assumes that the errors are independent,
have a positive mean and a decreasing variance.

False. The CLRM assumes that the error terms are independent
→ cov(uᵢ, uⱼ | xᵢ, xⱼ) = 0 for i ≠ j;
have a zero mean → E(u | x) = 0;
and have a constant variance → var(u | x) = E(u²) = σ².

2. The linearity assumption states that the parameters and the variables have to
be linear.

False. It assumes only that the model is linear in parameters and not
necessarily in variables. For instance,
Y = β₁X^β₂ eᵘ is not linear, but after the log transformation the equation
becomes ln Y = ln β₁ + β₂ ln X + u, which is linear in parameters. Also,
Y = β₁ + β₂X² + u is linear by the OLS assumptions.

3. If all the values of Xᵢ are equal, the estimation of the parameters in the simple
regression model will be easier.

False. Having equal values of X means that we cannot estimate the OLS
coefficients.
As we know, β̂₂ = (∑YᵢXᵢ − nȲX̄) / (∑Xᵢ² − nX̄²); since the denominator is
equal to zero, we cannot estimate β̂₂, and we also cannot estimate β̂₁
as it depends linearly on β̂₂.
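
A small numeric check (hypothetical data; the Python sketch is an illustration, not part of the sheet) of the zero denominator:

    # With every Xi equal, sum(Xi^2) - n*xbar^2 collapses to zero,
    # so beta2_hat = (sum(Yi*Xi) - n*Ybar*xbar) / 0 is undefined.
    import numpy as np

    X = np.full(10, 5.0)                # every Xi equals 5 (assumed values)
    Y = np.array([3., 4., 2., 5., 4., 3., 5., 2., 4., 3.])

    denominator = np.sum(X**2) - len(X) * X.mean()**2
    print(denominator)                  # 0.0 -> beta2_hat cannot be computed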

4. In the multiple regression model, if we have K independent variables, we have K+1
parameters.
True. They are the intercept plus the coefficient of each independent
variable; but if we have a regression through the origin, the number of
parameters will equal the number of independent variables.

5. The multiple regression model assumes that the vector u is a stochastic vector with
expected value of zero and variance of identity matrix.
False. For a regression model Y = Xβ + u, u is a stochastic error term
with E(u | x) = 0 and var(u | x) = σ²I, that is, a constant variance
(homoscedasticity), not an identity matrix.

6. Even in the case of multicollinearity among the explanatory variables, it is
possible to find the inverse of (X′X).
True. In the case of imperfect multicollinearity.

7. The OLS residuals û are uncorrelated, if this is the case for the errors u.
False. Not necessarily, as we may draw a sample which is characterised by
correlation between the residuals (an unrepresentative sample).

8. If the regressor matrix X in the linear model Y = Xβ + u is not of full rank, then
an unbiased estimator for β does not exist.
True. If the matrix is not of full rank, its determinant would be zero,
and since X⁻¹ = (1/D) adj(X), we cannot get the inverse of the matrix; hence
we cannot estimate the parameters (i.e. the β̂s will not exist).

9. In a 3-variable model: yᵢ = β₁ + β₂xᵢ + β₃zᵢ + uᵢ, if zᵢ = xᵢ², then the
regression model suffers from a problem of perfect multicollinearity.
False. The relationship between zᵢ and xᵢ is non-linear, but
multicollinearity concerns the linear relationships between the
variables.

10. The OLS assumptions are related to the sample characteristics.

False. All the assumptions of the CLRM or OLS model are related to the
population.

11. In the case of perfect multicollinearity among k explanatory variables, the
rank of (X′X) is k.
False. If we have k explanatory variables, the order of (X′X) would be
(k+1)×(k+1). This matrix would be of full rank if its rank were (k+1); but
in the case of perfect multicollinearity, (X′X) is not of full rank,
i.e. its rank would be less than (k+1), so it could be k or less.
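
A minimal sketch (made-up data; the Python/NumPy code is an illustration, not part of the sheet) of the rank deficiency:

    # Two regressors where x3 = 2*x2 exactly. With the intercept column,
    # X is n x (k+1); perfect collinearity drops the rank of X'X below k+1 = 3.
    import numpy as np

    x2 = np.array([1., 2., 3., 4., 5.])
    x3 = 2.0 * x2                       # perfect linear dependence (assumed)
    X = np.column_stack([np.ones(5), x2, x3])

    print(np.linalg.matrix_rank(X.T @ X))  # 2, not the full rank of 3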

12. The normality assumption of the disturbance term is necessary to apply the least
squares method.
False. The normality assumption is only needed for constructing
confidence intervals and hypothesis testing. In other words, it is an
important assumption for drawing inferences about the true values of the
parameters.

13. Even though the error term in the linear regression model is not normally
distributed, OLS estimators are still normally distributed.

False. The normality assumption of the error term is necessary for the
normality of the OLS estimators.
Since Yᵢ is a function of the error term, Yᵢ = β₁ + β₂Xᵢ + uᵢ, if uᵢ is
normally distributed with zero mean and constant variance, then Yᵢ is
normally distributed as well.
Since β̂₁ and β̂₂ are linear in Yᵢ, the estimators are also normally
distributed.

14. Even though the disturbance term in the CLRM is not normally distributed, the OLS
estimators are still unbiased.
True. The OLS estimators' unbiasedness does not depend on the
normality of the error term. It depends on the assumption that the
expected value of the error term is equal to zero. (Estimators'
unbiasedness means that the expectation of the estimator is equal to
the true value of the parameter.)

15. To have unbiased estimators, all the assumptions of OLS have to be satisfied.
False. OLS estimators' unbiasedness depends on the assumptions that
enable us to estimate the model (i.e. a linear model, non-stochastic
explanatory variables, a sample size greater than the number of
parameters, variability in the Xs, correct specification of the model and no
perfect multicollinearity) plus the assumption that the expectation of the
error term is equal to zero.

16. The variance of the estimators is equal to σ²(X′X)⁻¹. True. (It is the
variance-covariance matrix of the OLS estimators.)
17. OLS estimators β̂₁ and β̂₂ are biased, consistent and efficient estimators for the
real parameters β₁ and β₂.
False. OLS estimators are unbiased, consistent and efficient estimators
for the real parameters. (Define each.)

18. In the multiple regression models, the dimension of the variance-covariance
matrix of the errors is n×k.
False. It is n×n.

19. The test of significance for the whole relation is just equal to the test of
significance for the intercept.
False. The test of significance of the whole relation tests the joint
significance of the slope coefficients, which is different from testing the
significance of the intercept.

20. The more the standard error of the parameters increases, the more we can trust
the confidence intervals of these parameters.
False. The larger the standard error of the estimators, the less we can
trust the confidence intervals of the parameters as an inference about the
true value of the parameters.

For example,
β₁ ∈ β̂₁ ± t(n − no. of parameters, α/2) × se(β̂₁)
If the standard error of the estimator increases, the boundaries of the
interval widen. Hence, our confidence in the interval's reliability
will decrease.
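
A small numeric illustration (hypothetical numbers; the 2.05 critical value is an assumed approximation and the Python sketch is not part of the sheet):

    # The same point estimate with a larger standard error gives a wider,
    # hence less informative, confidence interval.
    beta_hat = 1.5                      # hypothetical estimate
    t_crit = 2.05                       # approximate 5% two-tailed critical value

    for se in (0.10, 0.50):
        print(se, (beta_hat - t_crit * se, beta_hat + t_crit * se))
        # the printed interval widens as se grows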

21. A confidence coefficient of 95% means that the probability for the real
parameters to exist between the lower and upper confidence limit is 95%.
False. A confidence coefficient of 95% means that, in repeated sampling,
95 out of 100 intervals constructed in this way will contain the true
parameter; the parameter itself is fixed, so a single computed interval
either contains it or does not.

22. In the case of the multiple regression model, the relation between F and R² will
be negative.
False. In the case of the multiple regression model, there is a positive
relationship between F and R², where

F = (R² / no. of regressors) / ((1 − R²) / (n − no. of parameters))

The higher the R², the greater the numerator and the smaller (1 − R²) in the
denominator, hence the higher the F value.
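
A quick numeric check of this positive relationship (the sample size n = 50 and k = 3 regressors are assumed values; the Python sketch is an illustration, not part of the sheet):

    # F = (R^2 / k) / ((1 - R^2) / (n - k - 1)) rises with R^2.
    def f_stat(r2, n, k):
        return (r2 / k) / ((1 - r2) / (n - k - 1))

    for r2 in (0.2, 0.5, 0.8):
        print(r2, f_stat(r2, n=50, k=3))   # F increases as R^2 increases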

4
23. The R² value is used to compare between different multiple regression models.
False. We use the adjusted R-squared (R̄²) instead of R² to compare
multiple regression models which have different numbers of independent
variables, as it is adjusted for the degrees of freedom, where

R̄² = 1 − (1 − R²) × (n − 1) / (n − no. of parameters)

However, there are two conditions to be satisfied in order to use R̄² in
comparison (a numeric check of the adjustment follows the list):
• The models must have the same dependent variable (i.e. we can't
use R̄² to compare models with Y and lnY).
• The models must have the same sample size (n).
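
A minimal sketch of the adjustment (the R² values, n and parameter counts are all made-up; the Python code is an illustration, not part of the sheet):

    # Adding regressors always raises R^2, but R-bar^2 penalises the
    # degrees of freedom they consume.
    def adj_r2(r2, n, n_params):
        return 1 - (1 - r2) * (n - 1) / (n - n_params)

    print(adj_r2(0.90, n=30, n_params=3))   # about 0.8926
    print(adj_r2(0.91, n=30, n_params=10))  # about 0.8695: lower despite
                                            # the higher R^2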

24. If R² = 1, then R̄² = zero.

False. If R² = 1, then R̄² = 1 − (1 − 1)(n − 1)/(n − no. of parameters) = 1.

25. In the multiple regression model, there is a negative relationship between
σ̂² and R².
True. According to the following rule,

R² = 1 − σ̂² (n − no. of parameters) / ∑(Yᵢ − Ȳ)²

The higher σ̂² is, the greater the numerator of the subtracted term will be,
hence the lower R² would be, other things held constant.

26. If the R² differs greatly from R² adjusted, the model may contain unhelpful
predictors. True. A large gap signals that some regressors raise R²
mechanically without adding explanatory power per degree of freedom.

27. A confidence interval for each unknown parameter is equivalent to a two-
tailed hypothesis test for that parameter.
True. As the confidence interval is built on (± t(α/2)), the same critical
values used in the two-tailed test.

28. A Type I error is the probability of rejecting the null hypothesis when it is
right.
True. A Type I error is the probability of rejecting the null hypothesis
when it is true (i.e. it is the probability of incorrectly rejecting the null
hypothesis). But a Type II error is the probability of failing to reject a
null hypothesis that is wrong.

Multiple choice answers:

Part 1: choose the correct answer

1. Regression analysis with one dependent variable and two or more
independent variables is called _______.
a) indicator regression b) nonlinear regression
c) multiple regression d) time-series regression

2. In the model 𝑦𝑖 = 𝛽𝑜 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + 𝛽3 𝑥3 + 𝑢𝑖 , 𝛽3 is ______________.


a) a partial regression coefficient b) the regression constant
c) the error of prediction d) the response variable

3. The nonlinear model Y = 1/(β₀ + β₁x₁ + ε) can be transformed into a linear model
as ___________.
a) y = log β0 + x log β1 b) 1/y = log β0 + x log β1 + ε
c) 𝟏/𝒚 = 𝜷𝟎 + 𝜷𝟏 𝒙𝟏 + 𝜺 d) log y = log β0 + x log β1

4. Under the matrix notation for the classical linear regression model, Y = Xβ + u,
with n the number of observations and k the number of explanatory variables,
what are the dimensions of u?
a) n×k b) n×1 c) k×1 d) 1×1

5. What are the dimensions of 𝑢̂′ 𝑢̂ ?


a) n×n b) n×1 c) k×k d) 1×1

6. Which one of the following statements best describes the algebraic
representation of the fitted regression line?
a) b) c) d) (the equation options were not preserved; the fitted line is
ŷᵢ = β̂₁ + β̂₂xᵢ)

7. What is the meaning of the term "heteroscedasticity"?
a) The variance of the errors is not constant
b) The variance of the dependent variable is not constant
c) The errors are not linearly independent of one another
d) The errors have non-zero mean

8. Suppose we use an estimator β̂ to estimate some parameter β. If E[β̂] = β,
then we say that β̂ is
a) symmetrical b) BLUE c) non-linear d) unbiased

9. Which one of the following statements best describes a Type II error?


a) It is the probability of incorrectly rejecting the null hypothesis
b) It is equivalent to the power of the test
c) It is equivalent to the size of the test
d) It is the probability of failing to reject a null hypothesis that was
wrong

10. If an estimator is said to be consistent, it is implied that,


a) On average, the estimated coefficient values will equal the true values.
b) The OLS estimator is unbiased and no other unbiased estimator has a
smaller variance.
c) The estimates will converge upon the true values as the sample size
increases.
d) The coefficient estimates will be as close to their true values as possible
for small and large samples.

Part 2: The following estimated model studies the trade-off between time
spent sleeping and working, and looks at other factors affecting sleep:

slêep = 3638.25 − 0.148 totwork − 11.13 educ + 2.20 age
            (112.28)   (0.017)        (5.88)       (1.45)

where n = 706, R² = 0.113, and the standard errors are reported in
parentheses. Given the above information, choose the correct answer.

11. If someone works five more hours per week, the predicted value of sleep
a) becomes 44.4 minutes. b) decreases by 44.4 minutes.
c) increases by 44.4 minutes. d) We cannot decide

(Note: we must have values for all the explanatory variables to get the predicted
value of the dependent variable.)

12. The coefficient of determination shows that:
a) total work, education and age explain much of the variation in time spent
sleeping.
b) total work, education and age do not explain much of the variation in time
spent sleeping.
c) just total work explains the time spent sleeping.
d) None of the above.

13. Given the standard error value of the variable education, we can conclude
that at the 5% significance level:
a) Education has no significant impact on time spent sleeping.
b) Education has a significant impact on time spent sleeping.
c) We cannot conclude whether education has a significant impact or not on time
spent sleeping.
d) None of the above.

t = −11.13 / 5.88 = −1.89, |t| < 2, so we do not reject the null
hypothesis; the variable educ is not significant.
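
A quick cross-check of the last two answers (plain Python; the only assumption beyond the reported figures is that totwork is measured in minutes per week, which the 44.4-minute answer in question 11 implies):

    coef_totwork = -0.148               # reported coefficient on totwork
    coef_educ, se_educ = -11.13, 5.88   # reported coefficient and s.e. on educ

    print(coef_totwork * 5 * 60)        # -44.4: five more weekly work hours
                                        # lower predicted sleep by 44.4 minutes
    print(coef_educ / se_educ)          # -1.89: |t| < 2, educ insignificant at 5%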

Part 3: Given industry data, we estimate the following equation (standard
errors are in parentheses):

log(RD)̂ = −4.38 + 1.084 log(sales) + 0.0217 profmarg
               (0.047)   (0.060)           (0.0218)

where n = 32, R² = 0.918, RD = spending on research and development, sales is
the annual sales measured as a proxy for firm size, and profmarg is profit as
a percentage of sales (the profit margin).

14. Degrees of freedom for the F-test for the significance of the model equal:
a) 2, 30 b) 2, 29 c) 2, 32 d) 2, 31

The DF of the F-test are the degrees of freedom of the numerator (the number of
regressors) and the degrees of freedom of the denominator (n − no. of
parameters).

15. Ceteris paribus, a 1% increase in sales
a) increases R&D spending by 1.084%.
b) increases R&D spending by 1.084 units.
c) has no effect on R&D.
d) uncertain.

16. The 95% confidence interval for the sales elasticity, given that
t-critical = 2.045, equals:
a) (−0.143, 2.311)
b) (1.024, 1.227)
c) (−0.961, 3.129)
d) (0.9613, 1.2067)

β̂ − t(α/2) × se(β̂) < β < β̂ + t(α/2) × se(β̂)
1.084 − 2.045 × (0.06) < β < 1.084 + 2.045 × (0.06)

17. The profit margin has
a) Positive significant impact on R&D spending.
b) Negative significant impact on R&D spending.
c) Insignificant impact on R&D spending.
d) Cannot decide.

t = 0.0217 / 0.0218 = 0.995 < 2, so we do not reject the null hypothesis;
the variable profmarg is not significant at the 95% confidence level.

18. For sales = 70000 and profit margin = 145, the expected value of R&D
spending will be:
a) 4.018586 b) 75878 c) 10423.17 d) 7.4488

log(RD)̂ = −4.38 + 1.084 log(70000) = 0.872086
Taking the antilog (base 10): 10^0.872086 = 7.4488.
We did not substitute the value of profmarg because the variable is not
significant.
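
A short Python sketch reproducing the Part 3 arithmetic from the reported estimates (it assumes, as the antilog step above does, that log is taken base 10; all numbers are copied from the estimated equation):

    import math

    b_sales, se_sales = 1.084, 0.060
    b_prof, se_prof = 0.0217, 0.0218
    t_crit = 2.045                      # given critical value (df = 29)

    print(b_prof / se_prof)             # 0.995 < 2: profmarg insignificant (Q17)
    print((b_sales - t_crit * se_sales,
           b_sales + t_crit * se_sales))  # (0.9613, 1.2067): 95% CI (Q16)
    print(10 ** (-4.38 + b_sales * math.log10(70000)))  # approx 7.4488 (Q18)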
