STAT 2006 Chapter 2 - 2022
Presented by
Simon Cheung
Email: [email protected]
We denote a random sample of size $n$ as $X_1, X_2, \ldots, X_n$ and denote the corresponding observed values of the random sample as $x_1, x_2, \ldots, x_n$.
The pdf $f(x; \theta)$ of the random variable $X$ depends on an unknown parameter $\theta$ taking values in a set $\Omega$. $\Omega$ is called the parameter space.
For example, if $\mu$ is the mean GPA of all college students, then the parameter space is $\Omega = \{\mu : 0 \le \mu \le 4.3\}$, and, if $p$ denotes the proportion of students who work part-time, then the parameter space is $\Omega = \{p : 0 \le p \le 1\}$.
Definition. A statistic is a function of $X_1, X_2, \ldots, X_n$, written $u(X_1, X_2, \ldots, X_n)$. A point estimator $\hat{\theta}$ of $\theta$ is a statistic of the random sample, $\hat{\theta} = \hat{\theta}(X_1, X_2, \ldots, X_n)$.
For example,
• The function $\bar{X} = \frac{1}{n}\sum_{j=1}^{n} X_j$ is a point estimator of the population mean $\mu$.
• Let $X_i = 0$ or $1$. The function $\hat{p} = \frac{1}{n}\sum_{j=1}^{n} X_j$ is a point estimator of the population proportion $p$.
• The function $S^2 = \frac{1}{n-1}\sum_{j=1}^{n} (X_j - \bar{X})^2$ is a point estimator of the population variance $\sigma^2$.
Definition. A point estimate of $\theta$ is the value $u(x_1, x_2, \ldots, x_n)$ computed from the observed sample, where $u(X_1, X_2, \ldots, X_n)$ is a point estimator of $\theta$.
For example, if the $x_i$'s are the observed GPAs of a random sample of 88 students, then
$$\bar{x} = \frac{1}{88}\sum_{j=1}^{88} x_j$$
is a point estimate of the mean GPA $\mu$ of all students in the population. Define $x_i = 0$ if a student does not work part-time and $x_i = 1$ if a student works part-time; then $\hat{p} = 0.11$ is a point estimate of $p$, the proportion of all students in the population who work part-time.
Basic idea. A good estimate of the unknown parameter $\theta$ is the value of $\theta$ that maximizes the likelihood of getting the data we observed.
Suppose that $X_1, X_2, \ldots, X_n$ is a random sample for which the pdf of each $X_i$ is $f(x_i; \theta)$. Then, the joint pdf of $X_1, X_2, \ldots, X_n$ is, as a function of $\theta$,
$$L(\theta) = L(\theta; x_1, x_2, \ldots, x_n) = \prod_{j=1}^{n} f(x_j; \theta),$$
where, by the definition of a random sample, $X_1, X_2, \ldots, X_n$ are independent. Therefore, our objective is to find $\hat{\theta}$ such that
$$\hat{\theta} = \operatorname*{argmax}_{\theta \in \Omega} L(\theta).$$
We call $L(\theta)$ the likelihood function of $\theta$. For instance, for a random sample from a Bernoulli($p$) population, the log-likelihood is
$$\ln L(p) = \left(\sum_{j=1}^{n} x_j\right) \ln p + \left(n - \sum_{j=1}^{n} x_j\right) \ln(1 - p).$$
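To make the maximization step concrete, here is a minimal Python sketch (not from the notes; it assumes numpy is available and uses a made-up 0/1 sample) that evaluates the Bernoulli log-likelihood $\ln L(p)$ above on a grid and checks that the maximizer agrees with the closed-form mle $\hat{p} = \bar{x}$.

```python
import numpy as np

# Hypothetical Bernoulli sample (0/1 outcomes); illustrative only.
x = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 1])
n = x.size

def log_likelihood(p):
    # ln L(p) = (sum x_j) ln p + (n - sum x_j) ln(1 - p)
    return x.sum() * np.log(p) + (n - x.sum()) * np.log(1 - p)

# Evaluate ln L(p) on a fine grid over the parameter space (0, 1).
grid = np.linspace(0.001, 0.999, 9999)
p_grid = grid[np.argmax(log_likelihood(grid))]

# Closed-form maximizer: the sample mean.
p_closed = x.mean()

print(f"grid maximizer = {p_grid:.3f}, closed-form mle = {p_closed:.3f}")
```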
Example. Suppose the weights of randomly selected female college students are normally distributed with unknown mean $\mu$ and variance $\sigma^2$. A random sample of 10 female college students has the following weights (in pounds): 115, 122, 130, 127, 149, 160, 152, 138, 149, 180. Identify the likelihood function and the mle of $\mu$, the mean weight of all female college students, and that of $\sigma^2$. Using the given sample, find the maximum likelihood estimates of $\mu$ and $\sigma^2$.
Answer. Let $X$ be the weight in pounds of a randomly selected female college student. The pdf of $X$ is
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad x \in \mathbb{R}.$$
The parameter space is $\Omega = \{(\mu, \sigma) : \mu \in \mathbb{R}, \sigma > 0\}$. Thus, the likelihood function of $\mu$ and $\sigma^2$ is given by
$$L(\mu, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2}\sum_{j=1}^{n} (x_j - \mu)^2\right).$$
Maximizing the log-likelihood gives the mle $\hat{\mu} = \bar{X}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_{j=1}^{n} (X_j - \bar{X})^2$. From the given sample, the maximum likelihood estimate of $\mu$ is $\bar{x} = \frac{115 + \cdots + 180}{10} = 142.2$, and that of $\sigma^2$ is
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{j=1}^{n} (x_j - \bar{x})^2 = \frac{1}{n}\sum_{j=1}^{n} x_j^2 - \bar{x}^2 = \frac{115^2 + \cdots + 180^2}{10} - 142.2^2 = 347.96.$$
Note that
• the estimator is defined using capital letters to reflect that it is a random variable, whereas an estimate is defined using lowercase letters to reflect that its value is fixed, based on the given random sample;
• the mle of $\sigma^2$ is different from the sample variance $S^2$.
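A quick way to reproduce the point estimates in this example is to compute $\bar{x}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_j (x_j - \bar{x})^2$ directly from the ten weights; the sketch below (numpy assumed available) also prints $S^2$ to highlight the second note above.

```python
import numpy as np

# Observed weights (in pounds) from the example.
weights = np.array([115, 122, 130, 127, 149, 160, 152, 138, 149, 180], dtype=float)
n = weights.size

mu_hat = weights.mean()                            # mle of mu: the sample mean
sigma2_hat = ((weights - mu_hat) ** 2).sum() / n   # mle of sigma^2: divide by n
s2 = weights.var(ddof=1)                           # sample variance S^2: divide by n - 1

print(f"mu_hat = {mu_hat:.1f}, sigma2_hat = {sigma2_hat:.2f}, S^2 = {s2:.2f}")
# Expect mu_hat = 142.2 and sigma2_hat = 347.96, matching the computation above.
```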
Example. If $X \sim n(\mu, \sigma^2)$, are the mle of $\mu$ and $\sigma^2$, that is, $\hat{\mu} = \bar{X}$ and $\hat{\sigma}^2 = \frac{1}{n}\sum_{j=1}^{n} (X_j - \bar{X})^2$, unbiased?
Answer.
$$E(\hat{\mu}) = \frac{1}{n}\sum_{j=1}^{n} E(X_j) = \mu,$$
$$E(\hat{\sigma}^2) = \frac{1}{n}\sum_{j=1}^{n} E(X_j^2) - E(\bar{X}^2) = \frac{1}{n}\sum_{j=1}^{n} (\sigma^2 + \mu^2) - \left(\frac{\sigma^2}{n} + \mu^2\right) = \left(1 - \frac{1}{n}\right)\sigma^2 \ne \sigma^2.$$
Hence, $\hat{\mu}$ is an unbiased estimator of $\mu$ but $\hat{\sigma}^2$ is a biased estimator of $\sigma^2$. Note also that $Var(\bar{X}) = \frac{\sigma^2}{n}$ and $E(X^2) = Var(X) + [E(X)]^2$.
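The bias factor $(1 - \frac{1}{n})$ can also be seen empirically. The following sketch (illustrative parameter values chosen here, not from the notes; numpy assumed available) simulates many normal samples and compares the averages of $\hat{\sigma}^2$ and $S^2$.

```python
import numpy as np

rng = np.random.default_rng(2006)
mu, sigma2, n, reps = 5.0, 4.0, 10, 100_000          # illustrative values

samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
xbar = samples.mean(axis=1, keepdims=True)

sigma2_mle = ((samples - xbar) ** 2).sum(axis=1) / n        # divide by n
s2 = ((samples - xbar) ** 2).sum(axis=1) / (n - 1)          # divide by n - 1

print(f"mean of sigma2_mle = {sigma2_mle.mean():.3f}  (theory: (1 - 1/n) sigma^2 = {(1 - 1/n) * sigma2:.3f})")
print(f"mean of S^2        = {s2.mean():.3f}  (theory: sigma^2 = {sigma2:.3f})")
```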
Example. Let $X$ be a Bernoulli random variable with parameter $p$. Find the moment estimator of $p$.
Answer. Since $E(X) = p$, equating the first sample moment $\bar{X}$ to the first theoretical moment $p$ gives $\tilde{p} = \bar{X}$. Hence the moment estimator of $p$ is $\bar{X}$.
Example. Let $X \sim n(\mu, \sigma^2)$. Find the moment estimators of $\mu$ and $\sigma^2$.
Answer. Since $E(X) = \mu$ and $E(X^2) = \mu^2 + \sigma^2$, equate the first and second sample moments to the corresponding theoretical moments to obtain $\tilde{\mu} = \bar{X}$ and $\tilde{\mu}^2 + \tilde{\sigma}^2 = \frac{1}{n}\sum_{j=1}^{n} X_j^2$. Solving for $\tilde{\mu}$ and $\tilde{\sigma}^2$ gives $\tilde{\mu} = \bar{X}$ and $\tilde{\sigma}^2 = \frac{1}{n}\sum_{j=1}^{n} X_j^2 - \bar{X}^2$.
Example. Let $X$ be a Gamma random variable with parameters $\alpha$ and $\theta$, with pdf
$$f(x) = \frac{1}{\Gamma(\alpha)\,\theta^{\alpha}}\, x^{\alpha-1} e^{-x/\theta}, \quad x > 0.$$
Find the moment estimators of $\alpha$ and $\theta$.
Answer. The first theoretical moment about the origin is $E(X) = \alpha\theta$, and the second theoretical moment about the mean is $Var(X) = \alpha\theta^2$. By equating the first sample moment about the origin and the second sample moment about the mean to the corresponding theoretical moments, we have $\tilde{\alpha}\tilde{\theta} = \bar{X}$ and $\tilde{\alpha}\tilde{\theta}^2 = \frac{1}{n}\sum_{j=1}^{n} (X_j - \bar{X})^2$. It follows that
$$\tilde{\theta} = \frac{1}{n\bar{X}}\sum_{j=1}^{n} (X_j - \bar{X})^2 \quad \text{and} \quad \tilde{\alpha} = \frac{\bar{X}}{\tilde{\theta}} = \frac{n\bar{X}^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}.$$
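Given data, the Gamma moment estimators are simple plug-in formulas. A sketch (simulated data with illustrative true values $\alpha = 2$, $\theta = 3$, not from the notes; numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha_true, theta_true = 2.0, 3.0                    # illustrative true values
x = rng.gamma(shape=alpha_true, scale=theta_true, size=500)

n = x.size
xbar = x.mean()
v = ((x - xbar) ** 2).sum() / n                      # second sample moment about the mean

theta_mom = v / xbar                                 # theta_tilde = (1 / (n xbar)) sum (x_j - xbar)^2
alpha_mom = xbar / theta_mom                         # alpha_tilde = xbar / theta_tilde

print(f"alpha_mom = {alpha_mom:.2f}, theta_mom = {theta_mom:.2f}")
```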
Example. Let $X \sim n(\mu, \sigma^2)$. Find the moment estimators of $\mu$ and $\sigma^2$.
Answer. The first theoretical moment about the origin is $E(X) = \mu$, and the second theoretical moment about the mean is $Var(X) = \sigma^2$. By equating the first sample moment about the origin and the second sample moment about the mean to the corresponding theoretical moments, we have $\tilde{\mu} = \bar{X}$ and $\tilde{\sigma}^2 = \frac{1}{n}\sum_{j=1}^{n} (X_j - \bar{X})^2$.
Theorem. Let $X_1, \ldots, X_n$ be a random sample from a population with pdf $f(x; \theta)$. Then, under certain regularity conditions, we have the following results.
• $I_n(\theta) = nI(\theta)$, where
$$I(\theta) = E\left[\left(\frac{\partial \log f(X; \theta)}{\partial \theta}\right)^2\right]$$
is the Fisher information of a sample of size 1 about $\theta$.
• $I(\theta) = E\left[-\frac{\partial^2 \log f(X; \theta)}{\partial \theta^2}\right]$.
• For an unbiased estimator $\hat{\theta}$, the Cramér-Rao inequality is $Var(\hat{\theta}) \ge I_n(\theta)^{-1}$.
Example. Show that $\bar{X}_n$ is the UMVUE of the parameter $\theta$ of a Bernoulli population.
$$f(x; \theta) = \theta^x (1-\theta)^{1-x}, \quad x \in \{0, 1\}$$
$$\frac{\partial \log f(x; \theta)}{\partial \theta} = \frac{\partial}{\partial \theta}\left[x \log \theta + (1-x)\log(1-\theta)\right] = \frac{x}{\theta} - \frac{1-x}{1-\theta} = \frac{x}{\theta(1-\theta)} - \frac{1}{1-\theta}$$
$$I(\theta) = E\left[\left(\frac{\partial \log f(X; \theta)}{\partial \theta}\right)^2\right] = Var\left(\frac{X}{\theta(1-\theta)}\right) = \frac{1}{\theta(1-\theta)}.$$
The Cramér-Rao lower bound is
$$Var(\hat{\theta}) \ge I_n(\theta)^{-1} = \frac{1}{nI(\theta)} = \frac{\theta(1-\theta)}{n}.$$
Since $E(\bar{X}_n) = \theta$ and $Var(\bar{X}_n) = \frac{\theta(1-\theta)}{n}$ attains this bound, $\bar{X}_n$ is the UMVUE of $\theta$.
Definition. $\hat{\theta}$ is a consistent estimator of $\theta$ if $\hat{\theta} \xrightarrow{p} \theta$, that is, for any $\varepsilon > 0$,
$$\lim_{n \to \infty} P\left(\left|\hat{\theta} - \theta\right| > \varepsilon\right) = 0.$$
Property. If $\hat{\theta}_n$ is a sequence of estimators of a parameter $\theta$ satisfying
• $\lim_{n \to \infty} Var(\hat{\theta}_n) = 0$,
• $\lim_{n \to \infty} Bias(\hat{\theta}_n) = 0$,
then $\hat{\theta}_n$ is a consistent sequence of estimators of $\theta$.
Proof. $E\left[(\hat{\theta}_n - \theta)^2\right] = Var(\hat{\theta}_n) + \left[Bias(\hat{\theta}_n)\right]^2 \to 0$, and consistency follows from Markov's inequality, since $P\left(\left|\hat{\theta}_n - \theta\right| > \varepsilon\right) \le E\left[(\hat{\theta}_n - \theta)^2\right]/\varepsilon^2$.
Example. Suppose that $X_1, \ldots, X_n$ are independent random variables having the same finite mean $\mu = E(X_1)$, finite variance $\sigma^2 = Var(X_1)$, and finite fourth moment $\mu_4 = E(X_1^4)$. Show that $\bar{X}$ is a consistent estimator of $\mu$ and $S^2$ is a consistent estimator of $\sigma^2$.
Answer. Note that $E(\bar{X}) = \mu$ (unbiased) and $Var(\bar{X}) = \sigma^2/n \to 0$ as $n \to \infty$. Hence, $\bar{X}$ is a consistent estimator of $\mu$. (We can also prove this using the weak law of large numbers.)
For $S^2$, we have
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{n}{n-1}\left(\frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}^2\right).$$
By the weak law of large numbers, $\frac{1}{n}\sum_{i=1}^{n} X_i^2 \xrightarrow{p} \mu_2 = E(X_1^2)$ and $\bar{X} \xrightarrow{p} \mu$. Since $g(x) = x^2$ is continuous, we have $\bar{X}^2 \xrightarrow{p} \mu^2$. Hence, $S^2 \xrightarrow{p} \mu_2 - \mu^2 = \sigma^2$.
Note that when $X_1, \ldots, X_n$ are independent $n(\mu, \sigma^2)$ random variables, we know that
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1},$$
where $\chi^2_{n-1}$ is the chi-square distribution with $n-1$ degrees of freedom. Therefore,
$$E\left[\frac{(n-1)S^2}{\sigma^2}\right] = n-1 \implies E(S^2) = \sigma^2 \ (\text{unbiased}).$$
Since $Var\left[\frac{(n-1)S^2}{\sigma^2}\right] = 2(n-1)$,
$$Var(S^2) = \frac{2\sigma^4}{n-1} \to 0$$
as $n \to \infty$. Thus, $S^2$ is a consistent estimator of $\sigma^2$.
Theorem. Let $X_1, \ldots, X_n$ be an independent random sample from a population with pdf $f(x; \theta)$. Suppose that $\hat{\theta}$ is the MLE of the true parameter $\theta_0$. Then, under certain regularity conditions, as $n \to \infty$,
$$\sqrt{n}\left(\hat{\theta} - \theta_0\right) \xrightarrow{d} n\left(0, I(\theta_0)^{-1}\right).$$
In other words, the large-sample distribution of the mle $\hat{\theta}$ is $n\left(\theta_0, \frac{1}{n} I(\theta_0)^{-1}\right)$.
Example. Suppose that $X_1, \ldots, X_n$ is an independent random sample from a Poisson distribution with pdf
$$f(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots; \ \lambda > 0.$$
Then, the maximum likelihood estimator of $\lambda$ is $\hat{\lambda} = \bar{X}$. The log-likelihood function is given (up to an additive constant not involving $\lambda$) by
$$\log L(\lambda; X_1, \ldots, X_n) = n\bar{X}\log\lambda - n\lambda.$$
Thus,
$$\frac{\partial \log L(\lambda)}{\partial \lambda} = \frac{n\bar{X}}{\lambda} - n; \quad \frac{\partial^2 \log L(\lambda)}{\partial \lambda^2} = -\frac{n\bar{X}}{\lambda^2} \implies I_n(\lambda) = -E\left[\frac{\partial^2 \log L(\lambda)}{\partial \lambda^2}\right] = \frac{n}{\lambda}.$$
The large-sample distribution of $\hat{\lambda} = \bar{X}$ is $n\left(\lambda, \frac{\lambda}{n}\right)$.
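The large-sample result yields an approximate Wald interval $\hat{\lambda} \pm z_{\alpha/2}\sqrt{\hat{\lambda}/n}$. A minimal sketch (simulated Poisson data with an illustrative $\lambda$, not from the notes; numpy and scipy assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
lam_true, n = 4.0, 200                        # illustrative values
x = rng.poisson(lam_true, size=n)

lam_hat = x.mean()                            # mle of lambda is the sample mean
se = np.sqrt(lam_hat / n)                     # sqrt of I_n(lambda)^{-1} evaluated at the mle
z = stats.norm.ppf(0.975)

print(f"lambda_hat = {lam_hat:.3f}, approx 95% CI = ({lam_hat - z * se:.3f}, {lam_hat + z * se:.3f})")
```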
Point estimates, such as the sample proportion $\hat{p}$, the sample mean $\bar{x}$, or the sample variance $\hat{\sigma}^2$, depend on the particular sample. When we use the sample mean $\bar{x}$ to estimate the population mean $\mu$, can we be confident that $\bar{x}$ is close to $\mu$? Can we have a measure of how close the sample estimator $\hat{\theta}$ is to the population parameter $\theta$?
One approach is to find an upper bound $U$ and a lower bound $L$ such that the value of the population parameter $\theta$ is between $L$ and $U$ with probability $1 - \alpha$, that is,
$$P(L < \theta < U) = 1 - \alpha,$$
where $1 - \alpha \in (0, 1)$ is called the confidence coefficient or the confidence level of the interval $(L, U)$.
Typical confidence coefficients are 90%, 95%, and 99%. For example, we are 95% confident that the population mean is between $L$ and $U$.
Example. For an independent random sample $X_1, X_2, X_3, X_4$ from $n(\mu, 1)$, consider the interval estimator $(\bar{X} - 1, \bar{X} + 1)$ of $\mu$. Then, the probability that $\mu \in (\bar{X} - 1, \bar{X} + 1)$ is given by
$$P\left(\mu \in (\bar{X} - 1, \bar{X} + 1)\right) = P\left(\bar{X} - 1 \le \mu \le \bar{X} + 1\right) = P\left(-1 \le \bar{X} - \mu \le 1\right) = P\left(-2 \le \frac{\bar{X} - \mu}{\sqrt{1/4}} \le 2\right) = P(-2 \le Z \le 2) \approx 0.9544,$$
where $Z \sim n(0, 1)$. Note that we have over a 95% chance of covering $\mu$.
Definition. Let $X_1, X_2, \ldots, X_n$ be an independent sample from a distribution with pdf $f_\theta$. The confidence coefficient $1 - \alpha$ of an interval estimator $\left(L(X_1, X_2, \ldots, X_n), U(X_1, X_2, \ldots, X_n)\right)$ of $\theta$ is given by
$$1 - \alpha = P\left(\theta \in \left(L(X_1, X_2, \ldots, X_n), U(X_1, X_2, \ldots, X_n)\right)\right).$$
Definition. The value $z_{\alpha/2}$ is the $Z$-value (obtained from a standard normal table) such that the area to the right of it under the standard normal curve is $\frac{\alpha}{2}$, that is, $P(Z \ge z_{\alpha/2}) = \frac{\alpha}{2}$. By symmetry of the normal distribution, $P(Z \le -z_{\alpha/2}) = \frac{\alpha}{2}$.
Example. A publishing company has just published a new college textbook. Before the
company decides the price of the book, it wants to know the average price of all such
textbooks in the market. The research department at the company took a sample of 36
such textbooks and collected information on their prices. This information produced a
mean price of $48.40 for this sample. It is known that the standard deviation of the
prices of all such textbooks is $4.50. Construct a 90% confidence interval for the mean
price of all such college textbooks assuming that the underlying population is normal.
Answer. Given $n = 36$, $\bar{x} = 48.40$ and $\sigma = 4.50$. The 90% confidence interval for the mean price of all such college textbooks is given by
$$\left(\bar{x} - 1.645 \times \frac{4.5}{\sqrt{36}}, \ \bar{x} + 1.645 \times \frac{4.5}{\sqrt{36}}\right) = (47.1662, 49.6338).$$
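The same interval can be reproduced numerically; a sketch using scipy (assumed available) for the critical value:

```python
from scipy import stats

n, xbar, sigma = 36, 48.40, 4.50
z = stats.norm.ppf(0.95)                      # z_{0.05} = 1.645 for a 90% interval
half_width = z * sigma / n ** 0.5

print(f"90% CI: ({xbar - half_width:.4f}, {xbar + half_width:.4f})")   # about (47.17, 49.63)
```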
The statement $P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$ cannot be interpreted to say that the probability that the population mean $\mu$ falls inside a particular computed interval $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ is $1 - \alpha$. The correct interpretation is the following:
• Suppose we take a large number of samples.
• We calculate a 95% confidence interval for each sample.
• Then, we expect that 95% of the intervals contain the actual unknown $\mu$.
If a confidence interval for a parameter $\theta$ is $(L, U)$, then the length of the interval is $U - L$.
We are interested in obtaining intervals that are as narrow as possible. For example,
consider the following two cases:
• We can be 95% confident that the average amount of money spent monthly on housing
in the U.S. is between $300 and $3300.
• We can be 95% confident that the average amount of money spent monthly on housing
in the U.S. is between $1100 and $1300.
In the first statement, the average amount of money spent monthly can be anywhere
between $300 and $3300, whereas, for the second statement, the average amount has
been narrowed down to somewhere between $1100 and $1300. So, of course, we would
prefer to make the second statement, because it gives us a more specific range of the
magnitude of the population mean.
Theorem. If $X_1, X_2, \ldots, X_n$ is a random sample from a $n(\mu, \sigma^2)$ distribution, then a $1 - \alpha$ confidence interval for the population mean $\mu$ is $\bar{X} \pm t_{\alpha/2, n-1}\frac{S}{\sqrt{n}}$.
Note that $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ is a pivotal quantity.
Non-normal data
By the Central Limit Theorem, the large-sample distribution of $\bar{X}$ is $n\left(\mu, \frac{\sigma^2}{n}\right)$, irrespective of whether the data $X_1, X_2, \ldots, X_n$ are normally distributed. Since $t_n \to n(0, 1)$ as $n \to \infty$, the large-sample distribution of $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ can be approximated by the standard normal distribution. Thus, the two intervals $\bar{x} \pm t_{\alpha/2, n-1}\frac{s}{\sqrt{n}}$ and $\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$ give similar results for large samples.
Therefore, a rule of thumb is that we should use the t-interval for the mean, $\bar{x} \pm t_{\alpha/2, n-1}\frac{s}{\sqrt{n}}$, if the sample size is large enough.
Example. A random sample of 64 guinea pigs yielded the following survival times (in
days):
36,18,91,89,87,86,52,50,149,120,
119,118,115,114,114,108,102,189,178,173,
167,167,166,165,160,216,212,209,292,279,
278,273,341,382,380,367,355,446,432,421,
641,638,637,634,621,608,607,603,688,685,
663,650,735,725
What is the mean survival time (in days) of the population of guinea pigs?
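The question calls for a one-sample t-interval for the mean. A sketch that computes it from the survival times as listed above (scipy assumed available):

```python
import numpy as np
from scipy import stats

# Survival times (in days) as listed on the slide.
times = np.array([
    36, 18, 91, 89, 87, 86, 52, 50, 149, 120,
    119, 118, 115, 114, 114, 108, 102, 189, 178, 173,
    167, 167, 166, 165, 160, 216, 212, 209, 292, 279,
    278, 273, 341, 382, 380, 367, 355, 446, 432, 421,
    641, 638, 637, 634, 621, 608, 607, 603, 688, 685,
    663, 650, 735, 725,
], dtype=float)

n = times.size
xbar, s = times.mean(), times.std(ddof=1)
t_crit = stats.t.ppf(0.975, df=n - 1)
half_width = t_crit * s / np.sqrt(n)

print(f"n = {n}, sample mean = {xbar:.1f}")
print(f"95% t-interval for the mean survival time: ({xbar - half_width:.1f}, {xbar + half_width:.1f})")
```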
When $\sigma$ is known, a $1 - \alpha$ confidence interval for $\mu_X - \mu_Y$ is $\bar{X} - \bar{Y} \pm z_{\alpha/2}\,\sigma\sqrt{\frac{1}{m} + \frac{1}{n}}$.
Recall that if $U \sim \chi^2_k$ and $V \sim \chi^2_h$ are independent, then
$$U + V \sim \chi^2_{k+h} = \Gamma\left(\frac{k+h}{2}, \frac{1}{2}\right).$$
Confidence intervals for means – Two samples
The Two-sample t Test: The case of Equal Variances
By the definition of the t-distribution, the random variable
$$T = \frac{\dfrac{\bar{X} - \bar{Y} - (\mu_X - \mu_Y)}{\sigma\sqrt{\frac{1}{n} + \frac{1}{m}}}}{\sqrt{\dfrac{(m-1)S_X^2 + (n-1)S_Y^2}{\sigma^2(m + n - 2)}}} = \frac{\bar{X} - \bar{Y} - (\mu_X - \mu_Y)}{S_p\sqrt{\frac{1}{n} + \frac{1}{m}}} \sim t_{m+n-2},$$
where $S_p^2 = \frac{(m-1)S_X^2 + (n-1)S_Y^2}{m + n - 2}$ is the pooled sample variance.
Thus, a $1 - \alpha$ confidence interval for $\mu_X - \mu_Y$ is $\bar{X} - \bar{Y} \pm t_{\alpha/2, m+n-2}\, S_p\sqrt{\frac{1}{m} + \frac{1}{n}}$.
dinopis 12.9, 10.2, 7.4, 7.0, 10.5, 11.9, 7.1, 9.9, 14.4, 11.3
menneus 11.2, 6.5, 10.9, 13.0, 10.1, 5.3, 7.5, 10.3, 9.2, 8.8
What is the difference, if any, in the mean size of the prey (of the entire populations) of
the two species?
The standard deviation of dinopis is 2.5136 and that of menneus is 2.3294. Since the two
standard deviations are close, we can assume that the variances of the two populations
are similar.
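Under the equal-variance assumption, the pooled two-sample t-interval above can be computed directly from the prey-size data; a sketch with scipy (assumed available):

```python
import numpy as np
from scipy import stats

dinopis = np.array([12.9, 10.2, 7.4, 7.0, 10.5, 11.9, 7.1, 9.9, 14.4, 11.3])
menneus = np.array([11.2, 6.5, 10.9, 13.0, 10.1, 5.3, 7.5, 10.3, 9.2, 8.8])

m, n = dinopis.size, menneus.size
diff = dinopis.mean() - menneus.mean()

# Pooled sample variance S_p^2 = [(m-1)S_X^2 + (n-1)S_Y^2] / (m + n - 2).
sp2 = ((m - 1) * dinopis.var(ddof=1) + (n - 1) * menneus.var(ddof=1)) / (m + n - 2)
se = np.sqrt(sp2 * (1 / m + 1 / n))

t_crit = stats.t.ppf(0.975, df=m + n - 2)
print(f"difference of sample means = {diff:.2f}")
print(f"95% CI for mu_X - mu_Y: ({diff - t_crit * se:.2f}, {diff + t_crit * se:.2f})")
```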
Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be $n$ pairs of dependent measurements, for example, the weight before and after an exercise program. The objective is to construct a $1 - \alpha$ confidence interval for $\mu_X - \mu_Y$.
The differences $D_i = X_i - Y_i$, $i = 1, 2, \ldots, n$, form a random sample from $n(\mu_X - \mu_Y, \sigma_D^2)$, and
$$T = \frac{\bar{D} - (\mu_X - \mu_Y)}{S_D/\sqrt{n}} \sim t_{n-1}.$$
Hence, a $1 - \alpha$ confidence interval for $\mu_X - \mu_Y$ is given by
$$\left(\bar{D} - t_{\alpha/2, n-1}\frac{S_D}{\sqrt{n}}, \ \bar{D} + t_{\alpha/2, n-1}\frac{S_D}{\sqrt{n}}\right).$$
Example. An experiment was conducted to compare people's reaction times to a red light
versus a green light. When signaled with either the red or the green light, the subject was
asked to hit a switch to turn off the light. When the switch was hit, a clock was turned off
and the reaction time in seconds was recorded. The following results give the reaction
times for eight subjects.
From the data, $\bar{D} = -0.0675$ and $S_D = 0.0798$. Hence, a 95% confidence interval for $\mu_D = \mu_X - \mu_Y$ is
$$\left(-0.0675 - 2.365\,\frac{0.0798}{\sqrt{8}}, \ -0.0675 + 2.365\,\frac{0.0798}{\sqrt{8}}\right) = (-0.1342, -0.00081).$$
Since the entire 95% confidence interval is negative, we are 95% confident that people react faster to a red light.
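Because only the summary statistics $\bar{D}$, $S_D$, and $n$ are needed, the interval in this example can be reproduced in a few lines; a sketch with scipy (assumed available):

```python
from scipy import stats

dbar, s_d, n = -0.0675, 0.0798, 8             # summary statistics from the example
t_crit = stats.t.ppf(0.975, df=n - 1)         # t_{0.025, 7} is about 2.365
half_width = t_crit * s_d / n ** 0.5

print(f"95% CI for mu_D: ({dbar - half_width:.4f}, {dbar + half_width:.4f})")
# About (-0.134, -0.001): entirely negative, so reactions to the red light appear faster.
```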
Pair        1     2     3     4     5     6     7     8     9     10    11    12    13    14    15
Unaffected  1.94  1.44  1.56  1.58  2.06  1.66  1.75  1.77  1.78  1.92  1.25  1.93  2.04  1.62  2.08
Affected    1.27  1.63  1.47  1.39  1.93  1.26  1.71  1.67  1.28  1.85  1.02  1.34  2.02  1.59  1.97
A large candy manufacturer produces, packages, and sells packs of candy targeted to weigh 52 grams. A quality control manager working for the company was concerned that the variation in the actual weights of the targeted 52-gram packs was larger than acceptable. That is, he was concerned that some packs weighed significantly less than 52 grams and some weighed significantly more than 52 grams. In an attempt to estimate $\sigma$, the standard deviation of the weights of all of the 52-gram packs the manufacturer makes, he took a random sample of $n = 10$ packs off of the factory line. The random sample yielded a sample variance of 4.2 grams². Use the random sample to derive a 95% confidence interval for $\sigma$.
Answer. $S^2 = 4.2$, $n = 10$, $\chi^2_{0.025, n-1} = 19.02$, and $\chi^2_{0.975, n-1} = 2.70$. A 95% confidence interval for $\sigma$ is
$$\left(\sqrt{\frac{(n-1)S^2}{\chi^2_{\alpha/2, n-1}}}, \ \sqrt{\frac{(n-1)S^2}{\chi^2_{1-\alpha/2, n-1}}}\right) = (1.41, 3.74).$$
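The chi-square quantiles and the interval can be computed with scipy (assumed available); a sketch for this example:

```python
import numpy as np
from scipy import stats

n, s2 = 10, 4.2                               # sample size and sample variance from the example
alpha = 0.05

chi2_upper = stats.chi2.ppf(1 - alpha / 2, df=n - 1)   # chi^2_{0.025, 9}, about 19.02
chi2_lower = stats.chi2.ppf(alpha / 2, df=n - 1)       # chi^2_{0.975, 9}, about 2.70

lower = np.sqrt((n - 1) * s2 / chi2_upper)
upper = np.sqrt((n - 1) * s2 / chi2_lower)
print(f"95% CI for sigma: ({lower:.2f}, {upper:.2f})")  # about (1.41, 3.74)
```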
Note that
$$\frac{(n-1)S_X^2}{\sigma_X^2} \sim \chi^2_{n-1} \quad \text{and} \quad \frac{(m-1)S_Y^2}{\sigma_Y^2} \sim \chi^2_{m-1}.$$
Since $S_X^2$ and $S_Y^2$ are independent,
$$\frac{\left.\dfrac{(m-1)S_Y^2}{\sigma_Y^2}\right/ (m-1)}{\left.\dfrac{(n-1)S_X^2}{\sigma_X^2}\right/ (n-1)} = \frac{S_Y^2}{S_X^2} \cdot \frac{\sigma_X^2}{\sigma_Y^2} \sim F_{m-1, n-1}.$$
Hence, $\frac{S_Y^2}{S_X^2} \cdot \frac{\sigma_X^2}{\sigma_Y^2}$ is a pivotal quantity.
$$1 - \alpha = P\left(F_{1-\frac{\alpha}{2}, m-1, n-1} \le \frac{S_Y^2}{S_X^2} \cdot \frac{\sigma_X^2}{\sigma_Y^2} \le F_{\frac{\alpha}{2}, m-1, n-1}\right) = P\left(F_{1-\frac{\alpha}{2}, m-1, n-1}\,\frac{S_X^2}{S_Y^2} \le \frac{\sigma_X^2}{\sigma_Y^2} \le F_{\frac{\alpha}{2}, m-1, n-1}\,\frac{S_X^2}{S_Y^2}\right) = P\left(\frac{1}{F_{\frac{\alpha}{2}, n-1, m-1}}\,\frac{S_X^2}{S_Y^2} \le \frac{\sigma_X^2}{\sigma_Y^2} \le F_{\frac{\alpha}{2}, m-1, n-1}\,\frac{S_X^2}{S_Y^2}\right)$$
A $1 - \alpha$ confidence interval for $\frac{\sigma_X^2}{\sigma_Y^2}$ is therefore given by
$$\left(\frac{1}{F_{\frac{\alpha}{2}, n-1, m-1}} \cdot \frac{S_X^2}{S_Y^2}, \ F_{\frac{\alpha}{2}, m-1, n-1} \cdot \frac{S_X^2}{S_Y^2}\right).$$
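A sketch of the variance-ratio interval using scipy's F quantiles (the sample variances and sizes below are illustrative, not from the notes):

```python
from scipy import stats

# Illustrative inputs: sample variances and sample sizes for the X- and Y-samples.
s2_x, n = 6.2, 16
s2_y, m = 3.1, 12
alpha = 0.05

ratio = s2_x / s2_y
lower = ratio / stats.f.ppf(1 - alpha / 2, dfn=n - 1, dfd=m - 1)   # divide by F_{alpha/2, n-1, m-1}
upper = ratio * stats.f.ppf(1 - alpha / 2, dfn=m - 1, dfd=n - 1)   # multiply by F_{alpha/2, m-1, n-1}

print(f"95% CI for sigma_X^2 / sigma_Y^2: ({lower:.2f}, {upper:.2f})")
```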
Example. We surveyed $n = 418$ Hong Kong citizens about their opinions on insurance rates. Of the 418 surveyed, $Y = 280$ blamed rising private health premiums. The sample proportion is
$$\hat{p} = \frac{280}{418} = 0.67.$$
Estimate, with 95% confidence, the proportion of all Hong Kong citizens who blame rising health insurance premiums.
Answer. With $z_{0.025} = 1.96$, a 95% confidence interval for $p$ is
$$\left(\hat{p} - z_{0.025}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}, \ \hat{p} + z_{0.025}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right) = (0.625, 0.715).$$
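A sketch reproducing this interval with scipy (assumed available):

```python
from scipy import stats

n, y = 418, 280                               # sample size and number who blamed premiums
p_hat = y / n
z = stats.norm.ppf(0.975)
half_width = z * (p_hat * (1 - p_hat) / n) ** 0.5

print(f"p_hat = {p_hat:.2f}, 95% CI: ({p_hat - half_width:.3f}, {p_hat + half_width:.3f})")
# About (0.625, 0.715), as in the answer above.
```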
Suppose that $Y_1 \sim \text{Binomial}(n_1, p_1)$, $Y_2 \sim \text{Binomial}(n_2, p_2)$, and $Y_1$ and $Y_2$ are independent. Define $\delta = p_1 - p_2$. Then
$$\hat{\delta} = \hat{p}_1 - \hat{p}_2 = \frac{Y_1}{n_1} - \frac{Y_2}{n_2}.$$
The normal approximation to the Binomial implies that
$$\hat{\delta} \sim n\left(\delta, \ \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right).$$
A $1 - \alpha$ confidence interval for $\delta$ is given by
$$\hat{\delta} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}.$$
Example. Two detergents were tested for their ability to remove stains of a certain type. An inspector judged the first one to be successful on 63 out of 91 independent trials and the second one to be successful on 42 out of 79 independent trials. The respective relative frequencies of success are 0.692 and 0.532. An approximate 90% confidence interval for the difference $p_1 - p_2$ of the two detergents is
$$0.692 - 0.532 \pm 1.645\sqrt{\frac{0.692 \times 0.308}{91} + \frac{0.532 \times 0.468}{79}} \approx (0.038, 0.282).$$
Example. Find a 95% confidence interval for the difference in proportions of women with anemia in developing countries and women with anemia in developed countries.
Answer. A 95% confidence interval for the difference in proportions is
$$0.40 - 0.17 \pm 1.96\sqrt{\frac{0.4 \times 0.6}{2100} + \frac{0.17 \times 0.83}{1900}} = (0.203, 0.257).$$
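Both two-proportion examples use the same formula; a sketch of a small helper (a hypothetical function name, scipy assumed available) that computes the interval from the proportions and sample sizes:

```python
from scipy import stats

def two_prop_ci(p1, n1, p2, n2, conf=0.95):
    """Approximate CI for p1 - p2 based on the normal approximation."""
    z = stats.norm.ppf(1 - (1 - conf) / 2)
    se = (p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2) ** 0.5
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Detergent example: 63/91 vs 42/79 successes, 90% confidence.
print(two_prop_ci(63 / 91, 91, 42 / 79, 79, conf=0.90))
# Anemia example: 0.40 (n = 2100) vs 0.17 (n = 1900), 95% confidence -> about (0.203, 0.257).
print(two_prop_ci(0.40, 2100, 0.17, 1900, conf=0.95))
```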
To estimate $\mu$ with a maximum error $\varepsilon > 0$, that is, $|\bar{x} - \mu| < \varepsilon$: since a $1 - \alpha$ confidence interval for $\mu$ is $\bar{x} \pm t_{\alpha/2, n-1}\frac{s}{\sqrt{n}}$, it follows that we need $t_{\alpha/2, n-1}\frac{s}{\sqrt{n}} < \varepsilon$. Thus,
$$n \ge t^2_{\alpha/2, n-1}\,\frac{s^2}{\varepsilon^2}.$$
If $n$ is large, $t_{\alpha/2, n-1} \approx z_{\alpha/2}$, and we have
$$n \ge z^2_{\alpha/2}\,\frac{s^2}{\varepsilon^2}.$$
To estimate $p$ with a maximum error $\varepsilon > 0$, that is, $|\hat{p} - p| < \varepsilon$: since a $1 - \alpha$ confidence interval for $p$ is $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$, it follows that we need $z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} < \varepsilon$. Thus,
$$n \ge z^2_{\alpha/2}\,\frac{\hat{p}(1 - \hat{p})}{\varepsilon^2}.$$
Example. We want to construct a 95% confidence interval for $p$ with a margin of error equal to 4%. Because there is no estimate of the proportion given, we use $\hat{p} = 0.5$ as a conservative estimate.
$$n \ge z^2_{0.025}\,\frac{\hat{p}(1 - \hat{p})}{\varepsilon^2} = 1.96^2 \times \frac{0.5 \times 0.5}{0.04^2} \approx 600.25.$$
We should obtain a sample of at least $n = 601$.
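The sample-size calculation above is a one-liner; a sketch with scipy (assumed available):

```python
import math
from scipy import stats

eps, conf = 0.04, 0.95
p_hat = 0.5                                   # conservative choice when no estimate is available
z = stats.norm.ppf(1 - (1 - conf) / 2)

n_required = z ** 2 * p_hat * (1 - p_hat) / eps ** 2
print(f"n >= {n_required:.2f}, so take n = {math.ceil(n_required)}")   # about 600.2, so n = 601
```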
Consider a population of size $N$. Suppose that there are $N_1$ respondents ($N_1$ is unknown) in the population who would answer yes to a particular question. The true proportion of yes respondents is
$$p = \frac{N_1}{N}.$$
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ drawn without replacement from the population. Define $X_i = 1$ if respondent $i$ answers yes to the particular question. Then, $X = \sum_{j=1}^{n} X_j$ is the number of respondents in the sample who answer yes to the question. It is known that $X$ has a hypergeometric distribution with mean $E(X) = np$ and variance $Var(X) = np(1 - p)\frac{N - n}{N - 1}$.
Hence, using the Central Limit Theorem, an approximate $1 - \alpha$ confidence interval for $p$ is
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n} \cdot \frac{N - n}{N - 1}},$$
where $\hat{p} = \bar{X}$.
If we want to determine the sample size $n$ so that the error in estimating $p$ is no larger than $\varepsilon > 0$, we require
$$z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n} \cdot \frac{N - n}{N - 1}} \le \varepsilon \implies z^2_{\alpha/2}\,\frac{\hat{p}(1 - \hat{p})}{n} \cdot \frac{N - n}{N - 1} \le \varepsilon^2.$$
Let $m = z^2_{\alpha/2}\,\frac{\hat{p}(1 - \hat{p})}{\varepsilon^2}$. Solving for $n$, we get
$$n \ge \frac{mN}{N - 1 + m}.$$
Example. A researcher is studying the population of a small town in India of $N = 2000$ people. She is interested in estimating $p$ for several yes/no questions on a survey. How many people $n$ does she have to randomly sample (without replacement) to ensure that her estimates $\hat{p}$ are within $\varepsilon = 0.04$ of the true proportion $p$?
Answer. Since $\hat{p}(1 - \hat{p}) \le \frac{1}{4}$, we calculate $m$ conservatively as
$$m = z^2_{0.025}\,\frac{\hat{p}(1 - \hat{p})}{\varepsilon^2} = 1.96^2 \times \frac{0.25}{0.04^2} = 600.25 \approx 601.$$
For 95% confidence, the required sample size $n$ satisfies
$$n \ge \frac{mN}{N - 1 + m} = \frac{601 \times 2000}{2000 - 1 + 601} = 462.3.$$
Thus, we require 463 people to estimate $p$ within 0.04 with 95% confidence.
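A sketch of the finite-population calculation (scipy assumed available), reproducing the numbers in this example:

```python
import math
from scipy import stats

N, eps, conf = 2000, 0.04, 0.95
p_hat = 0.5                                   # conservative: p_hat (1 - p_hat) <= 1/4
z = stats.norm.ppf(1 - (1 - conf) / 2)

m = math.ceil(z ** 2 * p_hat * (1 - p_hat) / eps ** 2)   # about 601
n_required = m * N / (N - 1 + m)
print(f"m = {m}, n >= {n_required:.1f}, so sample {math.ceil(n_required)} people")   # 462.3 -> 463
```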