Estimation
INTERVAL ESTIMATION
By
DR. G.O. NWAFOR
4.1 INTRODUCTION
Inference methods are developed for studying the behaviour of the random variables under study. In
point estimation, the inference is a guess of a single value as the value of a parameter 𝜃. In
interval estimation, an interval estimate of a real-valued parameter 𝜃 is any pair of functions,
L(x1, x2, …, xn) and U(x1, x2, …, xn), of a sample that satisfy L(x) ≤ U(x) for all x in the
sample space.
If the random sample x is observed, the inference L(x) ≤ 𝜃 ≤ U(x) is made, and the random interval
[L(X), U(X)] is called an interval estimator.
It is important to keep in mind that the interval, not the parameter, is the random quantity.
The end points a and b are chosen so that P(a ≤ 𝜃 ≤ b) = 1 − α, where α is the level of
significance. The interval [a, b] within which the unknown value of the parameter 𝜃 is expected
to lie is known as a confidence interval.
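Under simple random sampling with replacement (SRSWR), when 𝜇 (the mean) is unknown and the
standard deviation is estimated by S, the 100(1 − α)% confidence interval for 𝜇 is
$$\bar{x} - t_{\alpha/2}\,\frac{S}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\,\frac{S}{\sqrt{n}},$$
where $t_{\alpha/2}$ is the value of the 𝑡 distribution with 𝑛 − 1 degrees of freedom.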
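In the case of simple random sampling without replacement (SRSWOR) from a finite population of
size N, the 100(1 − α)% confidence interval for 𝜇 is given by
$$\bar{X} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \;\le\; \mu \;\le\; \bar{X} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$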
It should be noted that, for either measurement or attribute data, if the sample size exceeds
5 percent of the population, we should use what is referred to as the finite population
correction factor; that is, we should multiply the appropriate standard error formula by
$\sqrt{\frac{N-n}{N-1}}$ (Dick, 1968). Where the sample size is small relative to the population,
this factor is close to 1 and changes the standard error only slightly. However, where the sample
size exceeds 5 percent of the population, the correction factor noticeably reduces the standard
error.
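The effect of the correction factor can be checked numerically. The sketch below is only an illustration: the function name `fpc` and the population size of 1000 are assumptions made for the example, not part of the text.

```python
import math

def fpc(N: int, n: int) -> float:
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    if N < 2 or not 1 <= n <= N:
        raise ValueError("need N >= 2 and 1 <= n <= N")
    return math.sqrt((N - n) / (N - 1))

# Hypothetical population of N = 1000 units sampled at increasing fractions.
# The factor stays close to 1 for small fractions and shrinks as n/N grows.
for n in (10, 50, 200, 500):
    print(f"n/N = {n / 1000:.0%}   factor = {fpc(1000, n):.4f}")
```

The factor never exceeds 1, so applying it can only leave the standard error unchanged or make it smaller.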
Example 1
In the 2019/2021 session, the Department of Statistics carried out a survey on the following human
characteristics. The study provided the information below:
Table 1: Students' heights, weights and ages
Students' height    Students' weight    Students' age
Male    Female      Male    Female      Male    Female
5.7     7.3         47.5    60.7        19      20
6.5     3.4         60.5    50.3        21      17
4.5     5.2         30.4    60.1        24      18
7.3     5.3         69.1    47.5        23      21
6.5     4.7         70.1    50.7        21      23
7.1     4.8         80.4    39.4        18      27
4.5     6.5         61.5    60.5        19      31
4.7     7.1         35.2    70.1        23      18
7.4     4.8         90.4    80.8        25      19
6.7     5.1         80.7    91.7        27      21
5.6     5.8         50.4    49.6        29      23
5.4     5.4         56.2    55.7        31      24
Find a 95% confidence interval for the mean of male students’ height.
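Solution
$$\bar{x} = \frac{\sum x}{n} = \frac{71.9}{12} = 5.992, \qquad t_{0.025,\,11} = 2.201$$
$$S^2 = \frac{\sum (x - \bar{x})^2}{n-1} = 1.1499, \qquad S = \sqrt{1.1499} = 1.0723, \qquad df = n - 1 = 12 - 1 = 11$$
$$S.E = \frac{S}{\sqrt{n}} = \frac{1.0723}{\sqrt{12}} = 0.3096,$$
where S.E is the standard error.
$$\bar{x} - t_{\alpha/2}\,\frac{S}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\,\frac{S}{\sqrt{n}}$$
$$5.992 - 2.201\,(0.3096) < \mu < 5.992 + 2.201\,(0.3096)$$
$$5.311 < \mu < 6.673$$
The 95% confidence interval for the mean height of male students is 5.311 < 𝜇 < 6.673.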
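The arithmetic in this solution can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the critical value 2.201 is the table value quoted in the solution and is entered by hand, and the variable names are illustrative.

```python
import math
import statistics

# Male students' heights from Table 1.
male_heights = [5.7, 6.5, 4.5, 7.3, 6.5, 7.1, 4.5, 4.7, 7.4, 6.7, 5.6, 5.4]
T_CRIT = 2.201  # t value for alpha/2 = 0.025 with 11 degrees of freedom (from tables)

n = len(male_heights)
mean = statistics.mean(male_heights)
s = statistics.stdev(male_heights)   # sample standard deviation, divisor n - 1
se = s / math.sqrt(n)                # standard error of the mean

lower = mean - T_CRIT * se
upper = mean + T_CRIT * se
print(f"mean = {mean:.3f}, S = {s:.4f}, S.E = {se:.4f}")
print(f"{lower:.3f} < mu < {upper:.3f}")
```

Note that `statistics.stdev` returns the standard deviation S, so S squared gives the sample variance.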
Example 2
A random sample of the grades of 40 statistics students, out of a total of 300, showed a mean of
84 and a standard deviation of 16. Find the 95% confidence interval for the mean of the 300
grades under simple random sampling without replacement.
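Solution
$$n = 40, \qquad \bar{x} = 84, \qquad N = 300, \qquad S = 16$$
With 𝜎 estimated by S,
$$\bar{x} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \;\le\; \mu \;\le\; \bar{x} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$
$$84 - 1.96\,\frac{16}{\sqrt{40}}\sqrt{\frac{300-40}{300-1}} \;\le\; \mu \;\le\; 84 + 1.96\,\frac{16}{\sqrt{40}}\sqrt{\frac{300-40}{300-1}}$$
$$84 - 1.96\,(2.5298)(0.9325) \;\le\; \mu \;\le\; 84 + 1.96\,(2.5298)(0.9325)$$
$$79.376 \le \mu \le 88.624$$
is the 95% confidence interval for the mean of the 300 grades.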
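As a rough check, the z-based interval with the finite population correction can be coded as follows. The function name `mean_ci_z` and its arguments are assumptions made for this sketch; the critical value comes from `statistics.NormalDist`, so it is 1.95996 rather than the rounded 1.96.

```python
import math
from statistics import NormalDist

def mean_ci_z(xbar, sd, n, conf=0.95, N=None):
    """z-based confidence interval for a mean.

    If a finite population size N is supplied, the standard error is
    multiplied by sqrt((N - n) / (N - 1)) (sampling without replacement).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided critical value
    se = sd / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return xbar - z * se, xbar + z * se

# Example 2: 40 grades out of 300, mean 84, standard deviation 16.
lower, upper = mean_ci_z(84, 16, 40, conf=0.95, N=300)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```

Leaving out `N` gives the ordinary interval for sampling with replacement or from a large population.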
Solution
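Case 1: where 𝜎 is known (n large or small), the confidence interval is given as
$$\bar{x} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
Case 2: where 𝜎 is unknown and n is large (i.e. n ≥ 30), the confidence interval is given as
$$\bar{x} - Z_{\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$
Case 3: where 𝜎 is unknown and n is small (i.e. n < 30), the confidence interval is given as
$$\bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$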
$$18 - 1.65\left(\frac{3}{\sqrt{36}}\right) \le \mu \le 18 + 1.65\left(\frac{3}{\sqrt{36}}\right)$$
$$17.18 \le \mu \le 18.83$$
Example 4
If a random sample of size 𝑛 = 20 from a normal population with σ2 = 225 has a mean of 64.3,
construct a 95% confidence interval for the population mean.
Solution
Case 1 is applied, since 𝜎 is known (n large or small); the confidence interval is given as
$$\bar{x} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
$$n = 20, \qquad \bar{x} = 64.3, \qquad \sigma^2 = 225, \qquad \sigma = 15$$
$$64.3 - 1.96\left(\frac{15}{\sqrt{20}}\right) \le \mu \le 64.3 + 1.96\left(\frac{15}{\sqrt{20}}\right)$$
$$57.7 \le \mu \le 70.9$$
[57.7, 70.9] is the 95% confidence interval for the population mean.
4.3.2 Interval Estimation for Difference between Two Means for 𝒏𝟏 and 𝒏𝟐 less than 30
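Considering the sampling distribution of two population characteristics, if $\sigma_1^2$ and
$\sigma_2^2$ are not known but can be estimated from the corresponding samples, i.e. the
estimates $S_1^2$ and $S_2^2$ are used, then $\hat{\sigma}_1^2 = S_1^2$ and
$\hat{\sigma}_2^2 = S_2^2$, and the 100(1 − α)% confidence interval for $\mu_1 - \mu_2$ when both
sample sizes $n_1$ and $n_2$ are less than 30 is given by
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}^{\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}},$$
where $\bar{x}_1, \bar{x}_2$ and $\mu_1, \mu_2$ are the usual sample and population means, with
α as the level of significance. Note that $s_1^2$ and $s_2^2$ are the usual variances of the
first and second random samples, $s_p$ is the pooled standard deviation, and
$t_{\alpha/2}^{\,n_1+n_2-2}$ is the critical value with $n_1 + n_2 - 2$ degrees of freedom.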
Example 5
Nwafor Consultancy and Educational Services carried out a survey on the test performances of
students in two different courses. The findings are reported in Table 2. Obtain the 95%
confidence interval for the difference between the two means, with sample sizes 𝑛1 = 𝑛2 = 15.
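Solution
$$\bar{x}_1 = 21.73, \qquad \bar{x}_2 = 24.67$$
$$n_1 = 15, \qquad n_2 = 15$$
$$S_1 = 3.751, \qquad S_2 = 4.577$$
$$S_1^2 = 14.07, \qquad S_2^2 = 20.95$$
$$d.f = n_1 + n_2 - 2 = 28$$
$$s_p = \sqrt{\frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}} = \sqrt{\frac{14(14.07) + 14(20.95)}{28}} = 4.18$$
$$C.I = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}^{\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$= (21.73 - 24.67) \pm t_{0.05/2}^{\,28}\,(4.18)\sqrt{\frac{1}{15} + \frac{1}{15}}$$
$$= -2.94 \pm 2.048\,(4.18)(0.3651) = -2.94 \pm 3.126$$
$$-6.066 \le \mu_1 - \mu_2 \le 0.186$$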
Example 6
In a study of the nicotine content of two brands of cigarettes, 10 cigarettes of brand A had an
average nicotine content of 3.1 mg with a standard deviation of 0.5 mg, while 8 cigarettes of
brand B had an average nicotine content of 2.7 mg with a standard deviation of 0.7 mg. Assume
that the two sets of data are independent random samples from normal populations with equal
variances. Construct a 95% confidence interval for the difference between the mean nicotine
contents of the two brands.
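Solution
$$n_1 = 10, \qquad n_2 = 8$$
$$S_1 = 0.5, \qquad S_2 = 0.7$$
$$\bar{x}_1 = 3.1, \qquad \bar{x}_2 = 2.7$$
$$S_p = \sqrt{\frac{(10-1)(0.5)^2 + (8-1)(0.7)^2}{10+8-2}} = 0.596$$
$$C.I = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}^{\,n_1+n_2-2}\; S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
$$= (3.1 - 2.7) \pm t_{0.05/2}^{\,10+8-2}\,(0.596)\sqrt{\frac{1}{10} + \frac{1}{8}}$$
$$= 0.4 \pm t_{0.025}^{\,16}\,(0.2827) = 0.4 \pm 2.120\,(0.2827)$$
$$= 0.4 \pm 0.599$$
$$-0.199 \le \mu_1 - \mu_2 \le 0.999$$
is the 95% confidence interval for the difference between the mean nicotine contents of the two
brands.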
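The pooled-variance interval used in this section can be sketched in Python as below. The function name `pooled_t_interval` is an assumption made for illustration, and the t critical value (2.120 for 16 degrees of freedom) is taken from tables and passed in by hand, because the standard library has no t quantile function.

```python
import math

def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2 assuming equal population variances.

    t_crit is the tabulated t value for alpha/2 with n1 + n2 - 2 degrees of freedom.
    Returns (pooled standard deviation, lower limit, upper limit).
    """
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return sp, diff - margin, diff + margin

# Nicotine data: brand A (n = 10, mean 3.1, SD 0.5), second sample (n = 8, mean 2.7, SD 0.7).
sp, lower, upper = pooled_t_interval(3.1, 0.5, 10, 2.7, 0.7, 8, t_crit=2.120)
print(f"sp = {sp:.3f}")
print(f"{lower:.3f} <= mu1 - mu2 <= {upper:.3f}")
```

Because the interval contains zero, these data do not rule out equal mean nicotine contents at the 95% level.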
4.3.3 Interval Estimation of Difference between Two Means for 𝑛1 𝑎𝑛𝑑 𝑛2 greater than 30
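For independent random samples from normal populations,
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
so, if $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of sizes $n_1$
and $n_2$ from normal populations with known variances $\sigma_1^2$ and $\sigma_2^2$, then the
confidence interval is given by:
$$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
If $\sigma_1^2$ and $\sigma_2^2$ are unknown, they are estimated by the corresponding sample
variances $s_1^2$ and $s_2^2$; for large samples ($n_1$ and $n_2 \ge 30$) the 100(1 − α)%
confidence interval for $\mu_1 - \mu_2$ is given by
$$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$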
Example 6
Construct a 94% confidence interval for the difference between the mean lifetimes of two kinds of
light bulbs, given that a random sample of 40 light bulbs of the first kind lasted on average 418
hours of continuous use, and 50 light bulbs of the second kind lasted on average 402 hours of
continuous use. The population standard deviations are given as 𝜎1 = 26 and 𝜎2 = 22.
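Solution
$$\bar{x}_1 = 418, \qquad \bar{x}_2 = 402, \qquad \alpha = 0.06$$
$$\sigma_1 = 26, \qquad \sigma_2 = 22$$
$$n_1 = 40, \qquad n_2 = 50$$
$$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$C.I = (418 - 402) \pm Z_{0.06/2}\sqrt{\frac{(26)^2}{40} + \frac{(22)^2}{50}}$$
$$C.I = 16 \pm 1.88\sqrt{\frac{(26)^2}{40} + \frac{(22)^2}{50}}$$
$$6.3 \le \mu_1 - \mu_2 \le 25.7$$
We are 94% confident that the interval from 6.3 to 25.7 hours contains the actual difference
between the mean lifetimes of the two kinds of bulbs.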
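For an unusual confidence level such as 94%, the critical value is easy to get wrong by hand, so a short check may help. The sketch below assumes known population standard deviations; the function name `diff_means_ci_z` is hypothetical, and `NormalDist().inv_cdf` supplies the normal quantile.

```python
import math
from statistics import NormalDist

def diff_means_ci_z(x1, sigma1, n1, x2, sigma2, n2, conf):
    """z-based interval for mu1 - mu2 with known standard deviations.

    Returns (critical value, lower limit, upper limit).
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = math.sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = x1 - x2
    return z, diff - z * se, diff + z * se

# Light bulb data: 40 bulbs averaging 418 hours, 50 bulbs averaging 402 hours.
z, lower, upper = diff_means_ci_z(418, 26, 40, 402, 22, 50, conf=0.94)
print(f"z = {z:.2f}")
print(f"{lower:.1f} <= mu1 - mu2 <= {upper:.1f}")
```

The same function covers the large-sample case with unknown variances if the sample standard deviations are passed in place of `sigma1` and `sigma2`.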
Example 7
From Table 2, construct the 90% confidence interval for difference in the two means for the study.
Solution
$$n_1 = 40, \qquad n_2 = 40$$
$$\bar{x}_1 = 22.5, \qquad \bar{x}_2 = 22.88$$
$$S_1 = 3.87, \qquad S_2 = 3.92$$
The 100(1 − α)% confidence interval for $\mu_1 - \mu_2$ is given by:
$$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$
$$= (22.5 - 22.88) \pm Z_{0.10/2}\sqrt{\frac{(3.87)^2}{40} + \frac{(3.92)^2}{40}}$$
$$= -0.38 \pm 1.64\,(0.871)$$
$$-1.808 < \mu_1 - \mu_2 < 1.048$$
Hence [−1.808, 1.048] is the 90% confidence interval for the study.
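$$\hat{\theta} - Z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1-\hat{\theta})}{n}} < \theta < \hat{\theta} + Z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1-\hat{\theta})}{n}}$$
$$\hat{\theta} - Z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1-\hat{\theta})}{n}\left(\frac{N-n}{N-1}\right)} \le \theta \le \hat{\theta} + Z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1-\hat{\theta})}{n}\left(\frac{N-n}{N-1}\right)}$$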
Example 1
A random sample of 400 persons given fluvicine showed that 136 persons expressed some
discomfort. Construct a 95% confidence interval for the true proportion of persons who express
discomfort.
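Solution
$$\hat{\theta} = \frac{X}{n} = \frac{136}{400} = 0.34$$
$$Z_{\alpha/2} = Z_{0.05/2} = 1.96$$
$$\hat{\theta} - Z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1-\hat{\theta})}{n}} < \theta < \hat{\theta} + Z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1-\hat{\theta})}{n}}$$
$$= 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{400}}$$
$$= 0.34 \pm 0.046$$
$$0.294 < \theta < 0.386$$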
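The large-sample interval for a single proportion can be evaluated with the short sketch below. The function name `proportion_ci` is an assumption made for this illustration, and the critical value comes from `statistics.NormalDist` rather than a table.

```python
import math
from statistics import NormalDist

def proportion_ci(x, n, conf=0.95):
    """Large-sample confidence interval for a proportion from x successes in n trials.

    Returns (sample proportion, lower limit, upper limit).
    """
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin

# 136 of 400 persons expressed some discomfort.
p_hat, lower, upper = proportion_ci(136, 400)
print(f"p_hat = {p_hat:.2f}")
print(f"{lower:.3f} < theta < {upper:.3f}")
```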
Example 2
A random sample of 100 items taken from a large batch of articles contains 5 defective items.
(a) Set up 96 percent confidence limits for the proportion of defective items in the batch.
(b) If the batch contains 2669 items, set up a 95% confidence interval, under simple random
sampling without replacement, for the proportion of defective items.
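Solution
(a) With p = 5/100 = 0.05 and q = 1 − p = 0.95,
$$SE(p) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.05 \times 0.95}{100}} = 0.022$$
$$Z_{\alpha/2} = Z_{0.04/2} = 2.05$$
$$p \pm Z_{\alpha/2}\sqrt{\frac{pq}{n}} = 0.05 \pm 2.05 \times 0.022 = 0.05 \pm 0.045$$
so the 96% confidence limits are [0.005, 0.095].
(b)
$$p \pm Z_{\alpha/2}\sqrt{\frac{(N-n)\,pq}{N(n-1)}} = 0.050 \pm 1.96\sqrt{\frac{(2669-100)(0.05)(0.95)}{(2669)(99)}}$$
$$= 0.050 \pm 1.96\sqrt{0.00046}$$
$$= 0.05 \pm 0.042 = [0.008,\ 0.092]$$
is the 95% confidence interval under simple random sampling without replacement (SRSWOR).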
Alternatively
$$\hat{\theta}_1 - \hat{\theta}_2 \pm Z_{\alpha/2}\sqrt{\frac{\hat{\theta}_1(1-\hat{\theta}_1)}{n_1} + \frac{\hat{\theta}_2(1-\hat{\theta}_2)}{n_2}}$$
Example
If 132 of 200 male voters and 90 of 150 female voters favour a certain candidate contesting the
FUTO Council election in the 2020 Academic Session, find the 99% confidence interval for the
difference between the actual proportions of male and female voters who favour the candidate.
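Solution
$$\hat{\theta}_1 = \frac{132}{200} = 0.66, \qquad \hat{\theta}_2 = \frac{90}{150} = 0.60$$
$$Z_{\alpha/2} = Z_{0.01/2} = 2.575$$
$$C.I = \hat{\theta}_1 - \hat{\theta}_2 \pm Z_{\alpha/2}\sqrt{\frac{\hat{\theta}_1(1-\hat{\theta}_1)}{n_1} + \frac{\hat{\theta}_2(1-\hat{\theta}_2)}{n_2}}$$
$$(0.66 - 0.60) - 2.575\sqrt{\frac{(0.66)(0.34)}{200} + \frac{(0.60)(0.40)}{150}} \;\le\; \theta_1 - \theta_2 \;\le\; (0.66 - 0.60) + 2.575\sqrt{\frac{(0.66)(0.34)}{200} + \frac{(0.60)(0.40)}{150}}$$
$$-0.074 \le \theta_1 - \theta_2 \le 0.194$$
Thus, we are 99% confident that the interval from −0.074 to 0.194 contains the difference between
the actual proportions of male and female voters who favour the candidate.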
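The two-proportion interval can be checked in the same way. This sketch assumes the female sample size is 150, the value consistent with the estimate 0.60 used in the solution; the function name `diff_proportions_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def diff_proportions_ci(x1, n1, x2, n2, conf):
    """Large-sample confidence interval for theta1 - theta2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# 132 of 200 male voters and 90 of 150 female voters (150 is an assumption, see above).
lower, upper = diff_proportions_ci(132, 200, 90, 150, conf=0.99)
print(f"{lower:.3f} <= theta1 - theta2 <= {upper:.3f}")
```

Since the interval includes zero, the data do not show a clear difference between the two proportions at the 99% level.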
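$$\chi^2_{0.025} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{0.975}$$
Taking reciprocals reverses the inequalities:
$$\frac{1}{\chi^2_{0.975}} \le \frac{\sigma^2}{(n-1)S^2} \le \frac{1}{\chi^2_{0.025}}$$
Theoretically, we set
$$\frac{(n-1)S^2}{\chi^2_{0.975}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{0.025}}$$
so the confidence limits are
$$\left[\frac{(n-1)S^2}{\chi^2_{0.975}},\ \frac{(n-1)S^2}{\chi^2_{0.025}}\right]$$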
Example
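For 𝑛 = 8 and $S^2 = 23.78$, construct a 95% confidence interval for the variance.
$$\chi^2_{(1-\alpha/2,\ n-1)} = \chi^2_{(0.975),\,7} = 16.01, \qquad \chi^2_{(\alpha/2,\ n-1)} = \chi^2_{(0.025),\,7} = 1.69$$
$$\frac{7\,(23.78)}{16.01} \le \sigma^2 \le \frac{7\,(23.78)}{1.69}$$
$$10.4 \le \sigma^2 \le 98.5$$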
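A minimal sketch of the variance interval follows. The chi-square critical values are the table values used in the example and are passed in by hand, because the Python standard library does not provide chi-square quantiles; the function name `variance_ci` is an assumption.

```python
def variance_ci(s2, n, chi2_lower, chi2_upper):
    """Confidence interval for a population variance.

    chi2_lower and chi2_upper are the tabulated chi-square values with n - 1
    degrees of freedom that cut off alpha/2 in the lower and upper tails.
    """
    df = n - 1
    return df * s2 / chi2_upper, df * s2 / chi2_lower

# n = 8, S^2 = 23.78, with table values 1.69 and 16.01 for 7 degrees of freedom.
lower, upper = variance_ci(23.78, 8, chi2_lower=1.69, chi2_upper=16.01)
print(f"{lower:.1f} <= sigma^2 <= {upper:.1f}")
```

The larger critical value gives the lower limit, which is why the two values swap places in the formula.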
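but $\dfrac{(n_1-1)S_1^2}{\sigma_1^2} \sim \chi^2_{(n_1-1)}$ and
$\dfrac{(n_2-1)S_2^2}{\sigma_2^2} \sim \chi^2_{(n_2-1)}$,
then
$$\frac{\left.\dfrac{(n_1-1)S_1^2}{\sigma_1^2}\right/(n_1-1)}{\left.\dfrac{(n_2-1)S_2^2}{\sigma_2^2}\right/(n_2-1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{(n_1-1),\,(n_2-1)}$$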
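This is the ratio of each sample variance to its population variance. The 99% confidence interval
is then given by
$$P\left[F_{0.005} \le \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \le F_{0.995}\right] = 0.99$$
From $F_{0.005} \le \dfrac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$, we obtain
$$\frac{S_2^2}{\sigma_2^2}\,F_{0.005} \le \frac{S_1^2}{\sigma_1^2}$$
$$\therefore\ \frac{S_2^2}{S_1^2}\,F_{0.005} \le \frac{\sigma_2^2}{\sigma_1^2}$$
Hence
$$\frac{S_1^2}{S_2^2\,F_{0.005}} \ge \frac{\sigma_1^2}{\sigma_2^2}$$
Similarly, $\dfrac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \le F_{0.995}$ will give us
$$\frac{\sigma_1^2}{\sigma_2^2} \ge \frac{S_1^2}{S_2^2\,F_{0.995}}$$
so that
$$\frac{1}{F_{0.995}}\left(\frac{S_1^2}{S_2^2}\right) \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{1}{F_{0.005}}\left(\frac{S_1^2}{S_2^2}\right)$$
Generally, the 100(1 − α)% confidence interval for the variance ratio
$\dfrac{\sigma_1^2}{\sigma_2^2}$ is given by
$$\left[\frac{1}{F_{1-\alpha/2}}\left(\frac{S_1^2}{S_2^2}\right),\ \frac{1}{F_{\alpha/2}}\left(\frac{S_1^2}{S_2^2}\right)\right]$$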
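The general interval for the variance ratio can be sketched as follows. Everything numeric here is hypothetical and chosen only for illustration: two samples of size 11 with sample variances 8.0 and 4.0, and a 90% level. The F values are passed in by hand; 2.98 is the tabulated upper 5% point of F with (10, 10) degrees of freedom, and with equal degrees of freedom the lower 5% point is its reciprocal. The function name `variance_ratio_ci` is an assumption.

```python
def variance_ratio_ci(s1_sq, s2_sq, f_lower, f_upper):
    """Confidence interval for sigma1^2 / sigma2^2.

    f_lower and f_upper are the alpha/2 and 1 - alpha/2 quantiles of the
    F distribution with (n1 - 1, n2 - 1) degrees of freedom, read from tables.
    """
    ratio = s1_sq / s2_sq
    return ratio / f_upper, ratio / f_lower

# Hypothetical data: n1 = n2 = 11, S1^2 = 8.0, S2^2 = 4.0, 90% confidence.
F_UPPER = 2.98          # table value, F quantile 0.95 with (10, 10) df
F_LOWER = 1 / F_UPPER   # quantile 0.05; reciprocal because both df are equal
lower, upper = variance_ratio_ci(8.0, 4.0, F_LOWER, F_UPPER)
print(f"{lower:.3f} <= sigma1^2 / sigma2^2 <= {upper:.3f}")
```

An interval that contains 1 is consistent with equal population variances.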
Exercises
1. Using Table 1, construct a 99 percent confidence interval for each of the following:
a. the mean height of female students
b. the mean age of male students
c. the difference between the mean weights of male and female students
2. The data below are weights of random sample of ten Happy Bake Wheat of Flour:
50.3 50.4 49.3 48.7 59.3
49.7 39.5 60.7 50.9 47.9 (in Kg)
Construct a 95% confidence interval for the population mean weight, assuming the weights
are approximately normally distributed.
3. Nwafor Consultancy and Educational Services carried out a survey on the test performance
of students in two different courses STA 311 and STA 321. The findings are reported in
Table 2 below.
        STA 311    STA 321
x̅       283        254
𝜎       7          6
𝑛       27         24
4. A random sample of 800 units from a large consignment showed that 300 were damaged.
Find:
a. 95% and
b. 99% confidence limits for the subject matter for mean
c. 96% confidence limits for a finite population mean
5. The mean and variance of a random sample of 64 observations were computed as 160 and
100 respectively. Compute the 95% confidence interval for the population mean.
6. A census is taken among the residents of FUTO Community and the surrounding community to
determine the feasibility of a proposal to construct a University quarter. If 2450 of 6000
staff favour the proposal and 1570 of 3000 non-staff favour it, construct a 90% confidence
interval for the true difference in the fractions favouring the proposal to construct the
University quarter.