Topic 1
S&W
Chapters 2 and 3
Introduction & Statistics
Review
• In 2013, economists were asked whether they agreed that “Raising the federal
minimum wage to $9 per hour would make it noticeably harder for low-skilled
workers to find employment.”
– Only 34% agreed (see https://www.igmchicago.org/igm-economic-experts-panel/)
• In-class quizzes
• Midterms:
– Possibly September 13th/16th morning
• Finals:
– Date TBC
y                     x
Dependent Variable    Independent Variable
Explained Variable    Explanatory Variable
Response Variable     Control Variable
Predicted Variable    Predictor Variable
Regressand            Regressor
LHS                   RHS
– Did Lee Kuan Yew (X) have an effect on Singapore’s development (Y)?
– Does having only one child (X) affect a child’s development (Y)?
• This isolates the impact of the treatment from the impact of other
factors.
– Only difference between groups should be the treatment
– Any difference in outcomes must be due to the treatment
$\text{standard deviation} = \sqrt{\text{variance}} = \sigma_Y$
$\text{skewness} = \dfrac{E\left[(Y - \mu_Y)^3\right]}{\sigma_Y^3}$
= measure of asymmetry of a distribution
• skewness = 0: distribution is symmetric
• skewness > (<) 0: distribution has long right (left) tail
$\text{kurtosis} = \dfrac{E\left[(Y - \mu_Y)^4\right]}{\sigma_Y^4}$
= measure of mass in tails
= measure of probability of large values
• kurtosis = 3: normal distribution
• kurtosis > 3: heavy tails (“leptokurtic”)
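The skewness and kurtosis definitions can be checked numerically. The sketch below is only an illustration: it replaces the population expectations with sample averages, and the function name `moments` and the two small data sets are made up for this example.

```python
import math

def moments(values):
    """Return (mean, std, skewness, kurtosis), using averages in place of E[.]."""
    n = len(values)
    mean = sum(values) / n
    # k-th central moment: average of (y - mean)^k
    m2 = sum((y - mean) ** 2 for y in values) / n
    m3 = sum((y - mean) ** 3 for y in values) / n
    m4 = sum((y - mean) ** 4 for y in values) / n
    std = math.sqrt(m2)
    return mean, std, m3 / std ** 3, m4 / std ** 4

# Hypothetical data: one symmetric set, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]

mean_s, std_s, skew_s, kurt_s = moments(symmetric)
mean_r, std_r, skew_r, kurt_r = moments(right_tailed)

print("symmetric:    skewness =", skew_s, " kurtosis =", kurt_s)
print("right-tailed: skewness =", skew_r, " kurtosis =", kurt_r)
```

The symmetric set has skewness 0, while the set with one large value has positive skewness, matching the "long right tail" reading above.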
Copyright © 2019 Pearson Education Ltd. All Rights Reserved.
(b) Moments of a population distribution: mean,
variance, standard deviation, covariance, correlation
(3 of 3)
So is the correlation…
The correlation coefficient is defined in
terms of the covariance:
$\mathrm{corr}(X,Z) = \dfrac{\mathrm{cov}(X,Z)}{\sqrt{\mathrm{var}(X)\,\mathrm{var}(Z)}} = \dfrac{\sigma_{XZ}}{\sigma_X \sigma_Z} = r_{XZ}$
• –1 ≤ corr(X,Z) ≤ 1
• corr(X,Z) = 1 means perfect positive linear association
• corr(X,Z) = –1 means perfect negative linear association
• corr(X,Z) = 0 means no linear association
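To make the three benchmark cases concrete, here is a minimal sketch that computes the correlation directly from the covariance. Averages stand in for the population moments, and the helper names and data are hypothetical.

```python
import math

def covariance(xs, zs):
    """Average of (x - mean_x) * (z - mean_z)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    return sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs)) / n

def correlation(xs, zs):
    """corr(X, Z) = cov(X, Z) / sqrt(var(X) * var(Z)); var(X) is cov(X, X)."""
    return covariance(xs, zs) / math.sqrt(covariance(xs, xs) * covariance(zs, zs))

x = [1.0, 2.0, 3.0, 4.0]
perfect_pos = correlation(x, [2 * v + 1 for v in x])   # Z rises linearly with X
perfect_neg = correlation(x, [-3 * v for v in x])      # Z falls linearly with X

# Z = X^2 on a symmetric X: Z depends on X, but not linearly.
x_sym = [-2.0, -1.0, 1.0, 2.0]
no_linear = correlation(x_sym, [v ** 2 for v in x_sym])

print(perfect_pos, perfect_neg, no_linear)
```

The last case shows why the wording is "no linear association": Z is an exact function of X, yet the correlation is zero.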
$\mathrm{var}(Y) = E\left[(Y - \mu_Y)^2\right] = \sigma_Y^2$
= measure of the squared spread of the distribution
$\text{standard deviation} = \sqrt{\text{variance}} = \sigma_Y$
Ῡ is a random variable:
– The individuals in the sample are drawn at random. Thus the values of
(Y1, …, Yn) are random
– Thus functions of (Y1, …, Yn), such as Ῡ , are random: had a different
sample been drawn, they would have taken on a different value
– I.e. Ῡ has a probability distribution
Properties are determined by the sampling distribution of Ῡ
– The probability distribution associated with possible values of Ῡ for
different possible samples
– The mean and variance of Ῡ are the mean and variance of its sampling
distribution, E(Ῡ ) and var(Ῡ ).
– The concept of the sampling distribution underpins all of econometrics.
Implications:
1. Ῡ is an unbiased estimator of μY (that is, E(Ῡ ) = μY)
2. var(Ῡ ) is inversely proportional to n
– The spread of the sampling distribution is proportional to 1/√n
– Thus the sampling uncertainty associated with Ῡ is proportional
to 1/√n (larger samples, less uncertainty, but square-root law)
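A small simulation can illustrate both implications. The sketch below assumes Uniform(0, 1) draws (so μY = 0.5 and var(Y) = 1/12); the distribution, sample sizes, seed, and names are choices made for this illustration only.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, rng):
    """Draw `reps` samples of size n from Uniform(0, 1); return their means."""
    return [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(0)  # fixed seed so the run is repeatable
means_25 = simulate_sample_means(25, 2000, rng)
means_100 = simulate_sample_means(100, 2000, rng)

center_100 = sum(means_100) / len(means_100)  # should be near mu_Y = 0.5
spread_25 = statistics.pstdev(means_25)       # std. dev. of the sample means, n = 25
spread_100 = statistics.pstdev(means_100)     # std. dev. of the sample means, n = 100
spread_ratio = spread_25 / spread_100         # square-root law predicts about 2
theory_100 = math.sqrt((1 / 12) / 100)        # sigma_Y / sqrt(n) for n = 100

print("mean of sample means (n=100):", center_100)
print("spread n=25:", spread_25, " spread n=100:", spread_100)
print("ratio:", spread_ratio, " theory for n=100:", theory_100)
```

Quadrupling n from 25 to 100 should roughly halve the spread of the sample mean, not quarter it.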
• For this class, LLN and CLT are auxiliary results that you won’t
need to prove but need to understand the intuition for:
– LLN helps prove/show that our estimators are consistent
– CLT allows us to construct confidence intervals and do hypothesis
testing with our estimator
…
$\bar{Y}_n = \dfrac{1}{n}\,(Y_1 + Y_2 + \cdots + Y_n)$
$\dfrac{\bar{Y} - E(\bar{Y})}{\sqrt{\mathrm{var}(\bar{Y})}}$ is approximately N(0,1) (converges in distribution to the normal distribution, by the CLT)
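The CLT statement can be illustrated by standardizing many simulated sample means and checking how often they land within ±1.96, which a N(0,1) variable does about 95% of the time. This sketch assumes Exponential(1) draws (a skewed distribution with mean 1 and standard deviation 1); the sample size, number of replications, and seed are arbitrary choices.

```python
import math
import random

rng = random.Random(1)    # fixed seed so the run is repeatable
n, reps = 100, 3000
mu, sigma = 1.0, 1.0      # mean and std. dev. of one Exponential(1) draw

inside = 0
for _ in range(reps):
    y_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    # Standardize: subtract E(Y_bar) = mu, divide by sqrt(var(Y_bar)) = sigma / sqrt(n)
    z = (y_bar - mu) / (sigma / math.sqrt(n))
    if abs(z) < 1.96:
        inside += 1

share_inside = inside / reps
print("share of standardized means within +/-1.96:", share_inside)
```

Even though the individual draws are far from normal, the share should be close to 0.95.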
Hypothesis Testing
The hypothesis testing problem (for the mean): based on evidence
we have, test whether a null hypothesis is true versus whether an
alternative hypothesis is true. That is, test
– H0: E(Y ) = μY,0 vs. H1: E(Y ) > μY,0 (1-sided, >)
– H0: E(Y ) = μY,0 vs. H1: E(Y ) < μY,0 (1-sided, <)
– H0: E(Y ) = μY,0 vs. H1: E(Y ) ≠ μY,0 (2-sided)
$p\text{-value} = \Pr\left[\,|\bar{Y} - \mu_{Y,0}| > |\bar{Y}^{act} - \mu_{Y,0}|\,\right]$
Where Ῡ act is the value of Ῡ actually observed (nonrandom)
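One common way to evaluate this probability in large samples is to use the CLT: standardize the observed gap by the standard error s/√n and read the tail area off the standard normal. The sketch below takes that approach as an assumption (it is not spelled out in the text above); the function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_p_value(y_bar_act, mu_0, s, n):
    """Large-sample two-sided p-value for H0: E(Y) = mu_0.

    Assumes the CLT normal approximation, with the sample standard
    deviation s standing in for sigma_Y.
    """
    standard_error = s / math.sqrt(n)
    t = (y_bar_act - mu_0) / standard_error
    # Pr(|Z| > |t|) for Z ~ N(0, 1): twice the lower-tail area at -|t|
    return 2 * NormalDist().cdf(-abs(t))

# Hypothetical numbers: observed mean 10.5, null value 10, s = 2, n = 64.
p = two_sided_p_value(y_bar_act=10.5, mu_0=10.0, s=2.0, n=64)
print("p-value:", round(p, 4))
```

Here the standard error is 0.25, so the observed mean lies 2 standard errors from the null value.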
Fact:
If (Y1,…,Yn) are i.i.d. and E(Y^4) < ∞, then
$s_Y^2 \xrightarrow{p} \sigma_Y^2$
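The convergence of the sample variance can be seen in a quick simulation. This sketch assumes normal draws with standard deviation 2 (so the true variance is 4); the sample sizes, seed, and helper name are illustrative choices.

```python
import random

def sample_variance(values):
    """s^2 = sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)

rng = random.Random(2)   # fixed seed so the run is repeatable
true_variance = 4.0      # N(0, 2^2) draws, an assumption for this sketch

estimates = {}
for n in (10, 1000, 50000):
    draws = [rng.gauss(0.0, 2.0) for _ in range(n)]
    estimates[n] = sample_variance(draws)
    print(f"n = {n:>6}: s^2 = {estimates[n]:.3f} (true value {true_variance})")
```

As n grows, the estimate should settle near 4.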
• This rule is an estimator just like the sample mean but it will produce
two values instead of one: upper and lower values of the intervals
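As a sketch of how the two endpoints might be computed, the function below builds a large-sample interval as the sample mean plus or minus a normal critical value times the standard error. The 95% default, the normal approximation, the function name, and the example numbers are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

def confidence_interval(y_bar, s, n, level=0.95):
    """Return (lower, upper) for the mean, using the normal approximation."""
    # Critical value: about 1.96 when level = 0.95
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * s / math.sqrt(n)
    return y_bar - half_width, y_bar + half_width

# Hypothetical sample: mean 100, standard deviation 15, n = 225 (standard error 1).
lower, upper = confidence_interval(y_bar=100.0, s=15.0, n=225)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

The rule takes the same inputs as the sample mean but returns a pair of values, and both endpoints would change if a different sample were drawn.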
Size    Ῡ       sY      n
small   657.4   19.4    238
large   650.0   17.9    182

Ῡsmall − Ῡlarge = 657.4 − 650.0
= 7.4
|t| > 1.96, so reject (at the 5% significance level) the null
hypothesis that the two means are the same.
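The t-statistic behind this conclusion can be reproduced from the table. The sketch assumes that 19.4 and 17.9 are sample standard deviations (as in the Stock & Watson test-score example) rather than variances, and that the standard error of the difference is the square root of the sum of each s²/n; the function name is hypothetical.

```python
import math

def diff_in_means_t(mean_a, s_a, n_a, mean_b, s_b, n_b):
    """Return (t, standard error) for the difference between two sample means."""
    # SE of (mean_a - mean_b), treating the two samples as independent
    se = math.sqrt(s_a ** 2 / n_a + s_b ** 2 / n_b)
    return (mean_a - mean_b) / se, se

# Table values: small classes vs. large classes.
t_stat, se_diff = diff_in_means_t(657.4, 19.4, 238, 650.0, 17.9, 182)
reject_at_5_percent = abs(t_stat) > 1.96

print(f"SE = {se_diff:.2f}, t = {t_stat:.2f}, reject at 5%: {reject_at_5_percent}")
```

Under these assumptions the difference of 7.4 is about four standard errors from zero, well past the 1.96 cutoff.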