Midterm 1a Solutions

This document provides instructions and questions for the Economics 2P91 midterm exam to be written on October 5, 2022. It includes instructions on exam timing and materials allowed, as well as 7 questions testing multiple choice, short answer, and long answer concepts. The long answer question involves using ordinary least squares regression to analyze the relationship between wins and average shots against for 5 NHL teams in the 2021-2022 season.

Uploaded by

kyle.krist13

BROCK UNIVERSITY

DEPARTMENT OF ECONOMICS
ECONOMICS 2P91

MIDTERM 1

Wednesday, October 5, 2022, 1:00 - 3:50 pm

Time Allowed: 170 minutes

Name: ____________________________________________
Student ID: ____________________________

Please read the following instructions carefully before you begin.

(1) This is a 170 minute, closed-book exam.

(2) There are 7 questions in this exam. Answer all parts of the questions. Use the answer
booklet provided to write down your answers. Do not write in this exam booklet.

(3) Please keep your desk clear of bags and other belongings. You should only have your
writing materials and your calculator on your desk.

(4) All cellular phones, electronic dictionaries, and other electronic devices other than your
calculator should be left in your bag. Please make sure that your cellular phone
is turned off during the exam.

(5) Write your name and student ID number on your exam booklet.

(6) Note the formula sheet attached at the end of this exam.

(7) Clearly show your work in all questions where computation is required.

(8) Please write legibly.

(9) Round your answers to two decimal places.

Multiple Choice (30 points)
Question 1
A p-value of 0.07 is statistically significant at the
(a) 1% level

(b) 5% level

(c) 10% level

(d) None of these answers is correct

(e) Multiple answers are correct (specify which)
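
The decision rule behind Question 1 can be sketched in a few lines: a result is significant at a given level when the p-value falls below that level. The function name `significant_levels` is hypothetical and used only for illustration.

```python
# Significance levels offered in the question (1%, 5%, 10%).
LEVELS = (0.01, 0.05, 0.10)

def significant_levels(p_value, levels=LEVELS):
    """Return the levels at which the null hypothesis is rejected,
    using the usual rule: reject when the p-value is below the level."""
    return [alpha for alpha in levels if p_value < alpha]

print(significant_levels(0.07))  # [0.1]
```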

Question 2
The variance of the sampling distribution is smaller when
(a) ûi² is smaller

(b) ûi² is larger

(c) (Xi − X̄)² is smaller

(d) n is smaller

(e) None of these answers is correct

(f) Multiple answers are correct (specify which)

Question 3
Which of the following is linear in parameters?
(a) Yi = β0 + ln(β1) ln(Xi) + ui

(b) ln(Yi) = ln(β0) + β1 ln(Xi) + ui

(c) Yi = β0 + β1 X1i + ui

(d) Yi = (β0 / β1) ln(Xi) + ui

(e) None of these answers is correct

(f) Multiple answers are correct (specify which)

Question 4
Imagine we are interested in the effect of schooling on wages. Imagine further that we are
concerned about parents’ socioeconomic status being an omitted variable in the error term.
We have omitted variable bias if

(a) Parents’ socioeconomic status affects wages and is not correlated with schooling.

(b) Parents’ socioeconomic status does not affect wages and is not correlated with schooling.

(c) Parents’ socioeconomic status affects wages and is correlated with schooling.

(d) Parents’ socioeconomic status does not affect wages and is correlated with schooling.

(e) None of these answers is correct

(f) Multiple answers are correct (specify which)

Question 5
If we reject the null hypothesis that β1 = β1,0 , it means that we think

(a) Our estimate of β̂1 is correct.

(b) Our estimate of β̂1 was most likely unlucky.

(c) The sampling distribution of the true value, β1 , isn’t centered on β1,0

(d) The sampling distribution of the true value, β1 , is centered on β1,0

(e) None of these answers is correct

(f) Multiple answers are correct (specify which)

Short Answer (30 points)
Question 6

(a) What is the difference between the population and sample regression functions? Write
out both functions, and explain how they differ.
The population regression function (PRF) is Yi = β0 + β1 Xi + ui. It describes the entire
population and is typically unobserved / unknown. On the other hand, the sample
regression function (SRF) is Ŷi = β̂0 + β̂1 Xi and is estimated using the observed sample
data. In other words, the SRF is an estimate of the unknown PRF, based on a sample
from the population.

(b) What is the difference between homoskedasticity and heteroskedasticity? Sketch a graph
of each.
Homoskedasticity means "same variance" and heteroskedasticity means "different variance".
In a homoskedastic world, the variance of ui does not depend on Xi; it is constant
for each observation. In a heteroskedastic world, the variance of ui depends on Xi, so it can
change across observations. In the sketch for heteroskedasticity, the variance of ui gets
larger as X gets larger, whereas in the sketch for homoskedasticity the variance of ui is
the same no matter what X is.
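
The contrast between the two cases can be illustrated with a small simulation. This is only a sketch with made-up data: the range of X, the error standard deviations (constant at 1 versus 0.5 times X), and the function name `error_variance_by_half` are all assumptions chosen for illustration.

```python
import random
import statistics

def error_variance_by_half(sd_of_u, n=4000, seed=1):
    """Draw X uniformly on [1, 10] and errors u with standard deviation sd_of_u(x).
    Return the sample variance of u for low X (below 5.5) and for high X."""
    rng = random.Random(seed)
    low, high = [], []
    for _ in range(n):
        x = rng.uniform(1, 10)
        u = rng.gauss(0, sd_of_u(x))
        (low if x < 5.5 else high).append(u)
    return statistics.variance(low), statistics.variance(high)

# Homoskedastic: the spread of u is the same for every X.
homo_low, homo_high = error_variance_by_half(lambda x: 1.0)
# Heteroskedastic: the spread of u grows with X.
hetero_low, hetero_high = error_variance_by_half(lambda x: 0.5 * x)

print("homoskedastic:  ", round(homo_low, 2), round(homo_high, 2))
print("heteroskedastic:", round(hetero_low, 2), round(hetero_high, 2))
```

In the homoskedastic run the two variances should be close to each other, while in the heteroskedastic run the high-X variance should be several times the low-X variance.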

(c) What does it mean for OLS to be unbiased? How does this relate to assumption 1, that
the error term and our independent variable are uncorrelated?
Unbiasedness means that E(β̂1) = β1, which means that on average (or “in expectation”)
β̂1 equals the true population value, β1. If assumption 1 is violated, then E(β̂1) ≠ β1;
instead it equals β1 + bias.
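
A Monte Carlo sketch can make the "on average" statement concrete. Everything here is hypothetical: the true slope of 2, the sample size, and the `error_loading` parameter, which makes the error term correlated with X when it is nonzero, are assumptions chosen only to illustrate the idea.

```python
import random
import statistics

def ols_slope(xs, ys):
    """OLS slope: sum of cross-deviations over sum of squared X deviations."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

def average_slope(beta1, error_loading, n=50, reps=2000, seed=0):
    """Average the OLS slope over many simulated samples.
    error_loading = 0 keeps u uncorrelated with X (assumption 1 holds);
    a nonzero value makes u correlated with X (assumption 1 fails)."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        us = [error_loading * x + rng.gauss(0, 1) for x in xs]
        ys = [1.0 + beta1 * x + u for x, u in zip(xs, us)]
        slopes.append(ols_slope(xs, ys))
    return statistics.fmean(slopes)

unbiased = average_slope(beta1=2.0, error_loading=0.0)
biased = average_slope(beta1=2.0, error_loading=0.5)
print(round(unbiased, 2), round(biased, 2))
```

With these settings the first average should land close to the true slope of 2, and the second close to 2.5, that is, the true slope plus a bias of 0.5.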

(d) Why do we need to make assumption 3, outliers are rare (or there are no outliers)? What
would you do if you had an outlier in the data?
Because OLS squares the residuals, a large residual becomes very large, and this can
dramatically shift our value of β̂1 as OLS tries to make that really large residual smaller
(this is what we saw with our 8th grade millionaire example). Because this is a problem,
we need to assume outliers are unlikely.
If we had an outlier we could 1) use the least absolute deviations (LAD) estimator;
2) exclude that observation from the dataset; or 3) try to replace the Yi for that outlier
with the Ŷi we would get from running the regression of Y on X without the outlier. There
are pros and cons to each of these, but you don’t need to know them for now; it is enough
to know that there are some options for dealing with outliers.
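
The sensitivity to a single outlier can be seen in a toy example. The data below are invented for illustration (eight points lying exactly on a line with slope 2, plus one extreme observation) and are not from the course; the sketch only shows option 2, excluding the observation.

```python
def ols_slope(xs, ys):
    """OLS slope: sum of cross-deviations over sum of squared X deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Hypothetical clean data: Y is exactly 2 * X.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.0 * x for x in xs]

# The same data with one extreme observation added.
xs_outlier = xs + [9]
ys_outlier = ys + [100.0]

slope_clean = ols_slope(xs, ys)
slope_with_outlier = ols_slope(xs_outlier, ys_outlier)
print(round(slope_clean, 2), round(slope_with_outlier, 2))  # 2.0 7.47
```

One extra observation more than triples the estimated slope, and dropping it restores the slope of 2.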

(e) In your own words, what does assumption 2, that our data are independently and identically
distributed, mean? Would you be concerned about it if your data were the number of crimes
in each neighbourhood in St. Catharines every year? Why or why not?
Assumption 2 means that having information about the number of crimes in one
neighbourhood of St. Catharines in a given year, say last year, isn’t informative about
the number of crimes for that neighbourhood in another year, say this year. We should
be concerned about assumption 2 here because this is time series data; that is, we’re
looking at the same units of observation multiple times. We saw in class that because
many variables tend to be “sticky” or “persistent”, the values from one year end up
being very close to values from close-by years. This means that assumption 2 would be
violated.

Long Answer (45 points)
Question 7

Obs #   Team          Wins (Yi)    Avg. Shots Against (Xi)
1       Toronto       Y1 = 54      X1 = 29.19
2       Minnesota     Y2 = 53      X2 = 29.45
3       Edmonton      Y3 = 49      X3 = 30.75
4       New Jersey    Y4 = 27      X4 = 30.88
5       Seattle       Y5 = 27      X5 = 28.39

Table 1: Average shots against and wins in 2021-2022 NHL season

(a) Use the OLS formulas provided in class to regress the number of wins (Yi ) on the corre-
sponding average shots against (Xi ). That is, estimate the following regression model:

Yi = β0 + β1 Xi + ui (1)

Report your estimates for β̂0 and β̂1 . Show your work.
The correct value for β̂1 is 0.095, but because of rounding you may have an answer of 0.17
if using the first equation for β̂1 and 0.1 or 0.09 if using the second equation (depending
on rounding).
2.5 marks for plugging the X̄ and Ȳ values into the equation for β̂0.
The correct β̂0 is 39.18, but with rounding your answer might be either 39.03 if you got
0.1 for β̂1 or 36.95 if you got 0.17.
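
As a check on part (a), the estimates can be reproduced from Table 1 with a short script. This is a sketch: the variable names are hypothetical, and the second calculation assumes that the rounding in question is X̄ being rounded to two decimals before it is plugged into the first formula-sheet equation for β̂1.

```python
wins = [54, 53, 49, 27, 27]                         # Yi
shots_against = [29.19, 29.45, 30.75, 30.88, 28.39]  # Xi
n = len(wins)

x_bar = sum(shots_against) / n
y_bar = sum(wins) / n

# Second formula-sheet equation: deviations from the means.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(shots_against, wins))
sxx = sum((x - x_bar) ** 2 for x in shots_against)
beta1_hat = sxy / sxx
beta0_hat = y_bar - beta1_hat * x_bar
print(round(beta1_hat, 3), round(beta0_hat, 2))  # 0.095 39.18

# First formula-sheet equation with X-bar rounded to two decimals first
# (an assumed source of the rounding discrepancy).
x_bar_rounded = round(x_bar, 2)
num = sum(x * y for x, y in zip(shots_against, wins)) / n - x_bar_rounded * y_bar
den = sum(x ** 2 for x in shots_against) / n - x_bar_rounded ** 2
beta1_rounded_means = num / den
print(round(beta1_rounded_means, 2))  # 0.17
```

The first equation subtracts two large, nearly equal numbers, so a small rounding error in X̄ moves the result a lot. The deviation form is much less sensitive.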

(b) The variance of the sampling distribution ($\hat{\sigma}^2_{\hat{\beta}_1}$) is 65.09. Use this information and your
estimates to test the following hypothesis at the 5% significance level (using the t-critical
value of 3.18):

H0 : β1 = 0
H1 : β1 ≠ 0

Based on your result, which conclusions can you draw about the relationship between
team wins and defensive ability?
The correct t-values are 0.01 if using 0.1 or 0.09 for β̂1, and 0.02 if using 0.16 or 0.17 for β̂1.
We compare this to the t-critical value of 3.18, which means we fail to reject the null
hypothesis. Therefore, the estimate is not statistically significant at the 5% significance
level (95% confidence level) and we cannot rule out that there is no relationship between
team wins and defensive ability.
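
The test in part (b) comes down to three steps: take the square root of the variance, divide the estimate by it, and compare the absolute value with the critical value. A minimal sketch, with hypothetical variable names and β̂1 = 0.095 from part (a):

```python
import math

variance_beta1 = 65.09   # variance of the sampling distribution, given in part (b)
beta1_hat = 0.095        # estimate from part (a)
beta1_null = 0.0         # value of beta1 under H0
t_critical = 3.18        # critical value given in the question

se_beta1 = math.sqrt(variance_beta1)
t_stat = (beta1_hat - beta1_null) / se_beta1
reject_null = abs(t_stat) > t_critical

print(round(se_beta1, 2), round(t_stat, 2), reject_null)  # 8.07 0.01 False
```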

(c) Using your estimates, calculate the 95% confidence interval for β̂1.
With SE(β̂1) = √65.09 ≈ 8.07, the correct confidence interval is [-25.56, 25.75], which uses
the t-critical value of 3.18 rather than the large-sample 1.96. Using 1.96 as specified in
the equation and 0.1 as β̂1 gives [-15.71, 15.91]. If you used 0.09 it’s very similar:
[-15.72, 15.90]. If you used 0.17 as β̂1 it’s [-15.64, 15.98], and if 0.16 it’s [-15.65, 15.97].
You can use 3.18 instead of 1.96 in the equation and get full marks. You were not penalized
for using 1.96 instead of 3.18.
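
The interval in part (c) can be evaluated directly from the formula-sheet expression. This sketch assumes the standard error is the square root of the variance stated in part (b), 65.09, and uses β̂1 = 0.095; the function name `confidence_interval` is hypothetical.

```python
import math

def confidence_interval(estimate, standard_error, critical_value):
    """Return (lower, upper) = estimate -/+ critical_value * standard_error."""
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin

se_beta1 = math.sqrt(65.09)  # assumed: SE taken from the variance in part (b)

small_sample = confidence_interval(0.095, se_beta1, 3.18)  # t-critical value
large_sample = confidence_interval(0.095, se_beta1, 1.96)  # formula-sheet value

print([round(bound, 2) for bound in small_sample])
print([round(bound, 2) for bound in large_sample])
```

Either way the interval is very wide and contains zero, which agrees with failing to reject the null hypothesis in part (b).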

Formulas

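$$\hat{\sigma}^2_{\hat{\beta}_1} = \frac{1}{n} \times \frac{\frac{1}{n-2}\sum_{i=1}^{n}(X_i - \bar{X})^2\,\hat{u}_i^2}{\left[\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\right]^2}$$

$$SE(\hat{\beta}_1) = \sqrt{\hat{\sigma}^2_{\hat{\beta}_1}}$$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$

$$\hat{\beta}_1 = \frac{\frac{1}{n}\sum_{i=1}^{n} X_i Y_i - \bar{X}\bar{Y}}{\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2}$$

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2}$$

$$t = \frac{\hat{\beta}_1 - \beta_{1,0}}{SE(\hat{\beta}_1)}$$

$$CI_{95} = \left[\hat{\beta}_1 - 1.96 \times SE(\hat{\beta}_1),\ \hat{\beta}_1 + 1.96 \times SE(\hat{\beta}_1)\right]$$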
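
The variance formula at the top of the sheet can be applied to the Table 1 data as a consistency check on the 65.09 quoted in Question 7(b). This is a sketch under the assumption that the formula is applied to the OLS residuals from the regression in part (a); the function name `slope_variance` is hypothetical.

```python
def slope_variance(xs, ys):
    """Variance of the OLS slope using the first formula on the sheet:
    (1/n) * [ (1/(n-2)) * sum((Xi - Xbar)^2 * ui^2) ] / [ (1/n) * sum((Xi - Xbar)^2) ]^2
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    beta0 = y_bar - beta1 * x_bar
    # OLS residuals: observed Y minus fitted Y.
    residuals = [y - beta0 - beta1 * x for x, y in zip(xs, ys)]
    numerator = sum((x - x_bar) ** 2 * u ** 2 for x, u in zip(xs, residuals)) / (n - 2)
    denominator = (sxx / n) ** 2
    return numerator / denominator / n

shots_against = [29.19, 29.45, 30.75, 30.88, 28.39]
wins = [54, 53, 49, 27, 27]
variance = slope_variance(shots_against, wins)
print(round(variance, 2))
```

The result should be close to the 65.09 given in the question.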
