Estimation

Chapter Four discusses interval estimation methods for random variables, distinguishing between point and interval estimation. It details techniques for finding interval estimators, including confidence intervals for both small and large samples, and provides examples to illustrate the calculations involved. The chapter emphasizes the importance of sample size and the known or unknown nature of standard deviation in determining the appropriate statistical methods.

Uploaded by

jeygerome
Copyright
© © All Rights Reserved

CHAPTER FOUR

INTERVAL ESTIMATION
By
DR. G.O. NWAFOR

4.1 INTRODUCTION
An inference method is needed to study the behaviour of the random variables under study. In
point estimation, a single value is put forward as a guess for the value of a parameter θ. In
interval estimation, an interval estimate of a real-valued parameter θ is any pair of functions,
L(x1, x2, …, xn) and U(x1, x2, …, xn), of a sample that satisfy L(x) ≤ U(x) for all x in the
sample space.
If the sample X = x is observed, the inference L(x) ≤ θ ≤ U(x) is made, and the random interval
[L(X), U(X)] is called an interval estimator.
It is important to keep in mind that the interval, not the parameter, is the random quantity.

4.2 Methods of Finding Interval Estimators


The technique of interval estimation pioneered by Neyman consists in the determination of two
constants a and b such that:
$$ P[a < \theta < b] = 1 - \alpha $$

where α is the level of significance. The interval [a, b] within which the unknown value of the
parameter θ is expected to lie is known as a confidence interval.

4.2.1 Inverting a Test Statistic


Inverting a test statistic rests on the strong correspondence between hypothesis testing
and interval estimation. The approach is useful for both small and large samples, but it is
important to note the conditions under which a particular test statistic applies.
Inverting a normal test gives the interval estimate for large samples. For a fixed level α, the
most powerful unbiased test has as its acceptance region the sample points with
$$ |\bar{x} - \mu_0| \le z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}, $$
which is equivalent to
$$ \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu_0 \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}. $$
Similarly, inverting the test statistic for small samples at a fixed α gives
$$ \bar{x} - t_{\alpha/2}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\,\frac{S}{\sqrt{n}}. $$
The choice of test statistic depends on the sample size and on whether the standard deviation is known.
Case 1: when n < 30 (for example n = 2, 3, 4, …, 29) and the standard deviation is unknown, apply
Student's t test.
Case 2: when n ≥ 30 (for example n = 30, 31, 32, …) and the standard deviation is known, apply
the asymptotically normal statistic Z ~ N(0, 1).
Cases 1 and 2 depend on which measures of central tendency and dispersion are known and which
are unknown. Both cases can be discussed further according to whether the variance is known or
unknown.
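
The correspondence between the test and the interval can be checked numerically. The sketch below is a minimal illustration in Python and is not part of the original notes: the function names `z_interval` and `z_test_accepts` and the sample figures are hypothetical, and σ is assumed known. A value μ0 is accepted by the two-sided normal test exactly when it falls inside the interval.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, alpha):
    """Two-sided 100(1 - alpha)% interval for the mean, sigma assumed known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

def z_test_accepts(mu0, xbar, sigma, n, alpha):
    """True when the two-sided normal test does not reject mu = mu0."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(xbar - mu0) <= z * sigma / sqrt(n)

# Hypothetical figures chosen only for illustration.
xbar, sigma, n, alpha = 50.0, 10.0, 100, 0.05
lower, upper = z_interval(xbar, sigma, n, alpha)

# Each candidate mu0 is accepted by the test exactly when it lies in the interval.
candidates = [47.0, 48.5, 50.0, 51.9, 52.5]
agreement = [z_test_accepts(m, xbar, sigma, n, alpha) == (lower <= m <= upper)
             for m in candidates]
print(round(lower, 2), round(upper, 2), all(agreement))
```

The interval is simply the set of hypothesised means that the test would not reject, which is why the two functions share the same critical value.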

4.3 Interval Estimation of Mean for Small Sample


The interval estimate of the mean for the one-population case when n is less than 30 is stated below:
$$ \bar{x} - t_{\alpha/2}\,\frac{S}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\,\frac{S}{\sqrt{n}} $$
where x̄ and S are the mean and standard deviation, respectively, of a sample of size n < 30, and
$t_{\alpha/2}$ is the value of the t distribution with n − 1 degrees of freedom. This is the confidence
interval for an unknown mean μ when the standard deviation is estimated by S and the sample is drawn
by simple random sampling with replacement (SRSWR).
In the case of simple random sampling without replacement from a finite population of size N, the
100(1−α)% confidence interval for μ is given by
$$ \bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} $$

It should be noted that, for either measurement or attribute data, if the sample size exceeds
5 percent of the population we should use what is referred to as the finite correction factor; that
is, we should multiply the appropriate standard error formula by the finite correction
$$ \sqrt{\frac{N-n}{N-1}} $$
(Dick, 1968). Where the sample size is small relative to the population, this factor is close to 1
and changes the standard error very little. However, where the sample size exceeds 5 percent
of the population, the factor noticeably reduces the size of the standard error.
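
The behaviour of the correction factor can be seen by evaluating it for a few sampling fractions. This is a small Python sketch under assumed values: the helper name `fpc` and the population size N = 1000 are hypothetical and chosen only for illustration.

```python
from math import sqrt

def fpc(N, n):
    """Finite correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

N = 1000  # hypothetical population size
factors = {n: fpc(N, n) for n in (10, 50, 200)}
for n, factor in factors.items():
    print(f"n = {n}: sampling fraction {n / N:.0%}, correction factor {factor:.4f}")
```

The factor never exceeds 1, stays close to 1 while the sampling fraction is small, and falls further below 1 as the sample takes up a larger share of the population.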

Example 1
The Department of Statistics in 2019/2021 Session carried out a survey, on the following human
characteristics. The study provided the information below:
Table 1:
Student's Heights     Student's Weight      Student's Age
Male      Female      Male      Female      Age       Age
5.7       7.3         47.5      60.7        19        20
6.5       3.4         60.5      50.3        21        17
4.5       5.2         30.4      60.1        24        18
7.3       5.3         69.1      47.5        23        21
6.5       4.7         70.1      50.7        21        23
7.1       4.8         80.4      39.4        18        27
4.5       6.5         61.5      60.5        19        31
4.7       7.1         35.2      70.1        23        18
7.4       4.8         90.4      80.8        25        19
6.7       5.1         80.7      91.7        27        21
5.6       5.8         50.4      49.6        29        23
5.4       5.4         56.2      55.7        31        24

Find a 95% confidence interval for the mean height of the male students.
Solution
$$ \bar{X} = \frac{\sum x}{n} = 5.992, \qquad t_{0.025,\,11} = 2.201 $$
$$ S^2 = \frac{\sum (x - \bar{x})^2}{n-1} = 1.1499, \qquad S = 1.0723, \qquad df = n - 1 = 12 - 1 = 11 $$
$$ S.E. = \frac{S}{\sqrt{n}} = \frac{1.0723}{\sqrt{12}} = 0.3096, $$
where S.E. is the standard error.
$$ \bar{x} - t_{\alpha/2}\,\frac{S}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\,\frac{S}{\sqrt{n}} $$
$$ 5.992 - 2.201\,(0.3096) < \mu < 5.992 + 2.201\,(0.3096) $$
$$ 5.311 < \mu < 6.673 $$
The 95% confidence interval for the mean height of the male students is 5.311 < μ < 6.673.
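
The arithmetic of Example 1 can be reproduced from the male height column of Table 1. The following Python sketch uses only the standard library; the critical value 2.201 is taken from the worked solution rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Male heights from Table 1.
male_heights = [5.7, 6.5, 4.5, 7.3, 6.5, 7.1, 4.5, 4.7, 7.4, 6.7, 5.6, 5.4]
t_crit = 2.201  # t(0.025, 11), taken from the worked solution

n = len(male_heights)
xbar = mean(male_heights)
s = stdev(male_heights)        # sample standard deviation, divisor n - 1
se = s / sqrt(n)               # standard error of the mean
lower, upper = xbar - t_crit * se, xbar + t_crit * se
print(round(xbar, 3), round(s, 4), round(se, 4), round(lower, 3), round(upper, 3))
```

Note that 1.0723 is the sample standard deviation S; the sample variance is its square, about 1.150.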
Example 2
A random sample of the grades of 40 statistics students, out of a total of 300, showed a mean of 84 and
a standard deviation of 16. Find the 95% confidence interval for the estimate of the mean of the 300
grades under simple random sampling without replacement.
Solution
n = 40, x̄ = 84, N = 300, S = 16
$$ \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} $$
With σ estimated by S = 16:
$$ 84 - 1.96\,\frac{16}{\sqrt{40}}\sqrt{\frac{300-40}{300-1}} \le \mu \le 84 + 1.96\,\frac{16}{\sqrt{40}}\sqrt{\frac{300-40}{300-1}} $$
$$ 84 - 1.96\,(2.5298)(0.9325) \le \mu \le 84 + 1.96\,(2.5298)(0.9325) $$
79.376 ≤ μ ≤ 88.624 is the 95% confidence interval for the estimate of the mean of the 300 grades.
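
As a check on Example 2, the interval with the finite correction can be computed directly. This is a minimal Python sketch; the value 1.96 is the critical value used in the solution, and the variable names are illustrative.

```python
from math import sqrt

n, N, xbar, s = 40, 300, 84.0, 16.0
z = 1.96  # z(0.025), as used in the worked solution

# Standard error of the mean multiplied by the finite correction factor.
se = (s / sqrt(n)) * sqrt((N - n) / (N - 1))
lower, upper = xbar - z * se, xbar + z * se
print(round(lower, 3), round(upper, 3))
```

Here n/N is about 13 percent, well above the 5 percent threshold, so the correction shortens the interval visibly compared with the uncorrected standard error 16/√40.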

4.3.1 Interval Estimation of Mean for Large Samples


Here, the sampling distribution of the standardized test statistic is asymptotically normal, i.e.
$$ Z = \frac{X - E(X)}{S.E.(X)} \sim N(0, 1) \quad \text{as } n \to \infty, $$
and the 100(1−α)% confidence interval for μ, with known standard deviation, is given by:
$$ \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} $$
With the usual notation, n is greater than 30 and the variance is known.
The 100(1−α)% confidence interval for μ with unknown standard deviation/variance and n ≥ 30
is given by:
$$ \bar{x} - z_{\alpha/2}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{S}{\sqrt{n}} $$
where α is the level of significance.
Example 3
Thirty-six automobiles of the same model are driven and the gas usage in their operation is recorded as
follows:
x̄ = 18 miles per gallon
S = 3 miles per gallon.
Construct a 90% confidence interval for the mean gas usage of all automobiles of this
model.

Solution
Case 1, where σ is known (n large or small): the confidence interval is given as
$$ \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} $$
Case 2, where σ is unknown and n is large (n ≥ 30): the confidence interval is given as
$$ \bar{x} - z_{\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{s}{\sqrt{n}} $$
Case 3, where σ is unknown and n is small (n < 30): the confidence interval is given as
$$ \bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{n}} $$

In the above question, x̄ = 18, S = 3 and n = 36. Since n is greater than 30 and the standard
deviation is estimated from the sample, Case 2 applies with $z_{0.05} = 1.65$:
$$ 18 - 1.65\left(\frac{3}{\sqrt{36}}\right) \le \mu \le 18 + 1.65\left(\frac{3}{\sqrt{36}}\right) $$
$$ 17.175 \le \mu \le 18.825 $$
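
The large-sample interval of Example 3 can be verified with the standard normal quantile function. The sketch below is illustrative Python: it uses the more precise critical value z(0.05) ≈ 1.645 from `statistics.NormalDist` instead of the rounded 1.65, so the limits differ slightly in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

xbar, s, n = 18.0, 3.0, 36
confidence = 0.90

alpha = 1 - confidence
z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)
margin = z * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(round(z, 3), round(lower, 2), round(upper, 2))
```

Because the margin is symmetric about x̄ = 18, the two limits must be equally far from 18, which is a quick sanity check on any hand calculation.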

Example 4
If a random sample of size n = 20 from a normal population with σ² = 225 has a mean of 64.3,
construct a 95% confidence interval for the population mean.
Solution
Case 1 is applied, where σ is known (n large or small); the confidence interval is given as
$$ \bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} $$
n = 20, x̄ = 64.3, σ² = 225, σ = 15
$$ 64.3 - 1.96\left(\frac{15}{\sqrt{20}}\right) \le \mu \le 64.3 + 1.96\left(\frac{15}{\sqrt{20}}\right) $$
$$ 57.7 \le \mu \le 70.9 $$
[57.7, 70.9] is the 95% confidence interval for the population mean.

4.3.2 Interval Estimation for Difference between Two Means for 𝒏𝟏 and 𝒏𝟐 less than 30
Considering the sampling distributions of two population characteristics, if σ1² and σ2² are not
known but can be estimated from the corresponding samples, the estimates S1² and S2² are used:
$$ \hat{\sigma}_1^2 = S_1^2 \quad \text{and} \quad \hat{\sigma}_2^2 = S_2^2, $$
and the 100(1−α)% confidence interval for μ1 − μ2 when both sample sizes n1 and n2 are less than 30
is given by
$$ (\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
where
$$ s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}} $$
is the pooled sample standard deviation of the corresponding samples, with the usual α as the
level of significance. Note that s1² and s2² are the usual variances of the first and second random
samples, and $t_{\alpha/2,\,n_1+n_2-2}$ is the critical value of the t distribution with n1 + n2 − 2
degrees of freedom.

Example 5
Nwafor Consultancy and Educational Services carried out a survey on the test performances of
students in two different courses. The findings are reported in Table 2. Obtain the 95% confidence
interval for the difference between the two means, for sample sizes n1 = n2 = 15.

Solution
x̄1 = 21.73      x̄2 = 24.67
n1 = 15         n2 = 15
S1 = 3.751      S2 = 4.577
S1² = 14.07     S2² = 20.95
d.f. = n1 + n2 − 2 = 28

$$ C.I. = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
where
$$ s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{14\,(14.07) + 14\,(20.95)}{28}} = 4.18. $$
$$ C.I. = (21.73 - 24.67) \pm t_{0.025,\,28}\,(4.18)\sqrt{\frac{1}{15} + \frac{1}{15}} $$
$$ = -2.94 \pm (2.048)(4.18)(0.3651) $$
$$ = -2.94 \pm 3.13 $$
The confidence interval for the difference of the two means is given by
$$ -6.07 \le \mu_1 - \mu_2 \le 0.19 $$
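
The pooled-variance calculation in Example 5 can be checked with a short function. This is an illustrative Python sketch: the name `pooled_t_interval` is hypothetical, and the critical value t(0.025, 28) = 2.048 is the table value used in the solution rather than something the code computes.

```python
from math import sqrt

def pooled_t_interval(x1, x2, s1_sq, s2_sq, n1, n2, t_crit):
    """Interval for mu1 - mu2 using the pooled sample standard deviation."""
    sp = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# Summary figures from Example 5.
lower, upper = pooled_t_interval(21.73, 24.67, 14.07, 20.95, 15, 15, t_crit=2.048)
root_term = sqrt(1 / 15 + 1 / 15)
print(round(root_term, 4), round(lower, 2), round(upper, 2))
```

The term √(1/15 + 1/15) evaluates to about 0.365, which gives a margin of about 3.13 around the difference of −2.94.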

Example 6
In a study of the nicotine content of two brands of cigarettes, 10 cigarettes of brand A had an
average nicotine content of 3.1 mg with a standard deviation of 0.5 mg, while 8 cigarettes of brand B
had an average nicotine content of 2.7 mg with a standard deviation of 0.7 mg. Assume
that the two sets of data are independent random samples from normal populations with equal
variances. Construct a 95% confidence interval for the difference between the two mean nicotine contents.
Solution
n1 = 10      n2 = 8
S1 = 0.5     S2 = 0.7
x̄1 = 3.1     x̄2 = 2.7
$$ S_p = \sqrt{\frac{(10 - 1)(0.5)^2 + (8 - 1)(0.7)^2}{10 + 8 - 2}} = 0.596 $$
$$ CI = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\; S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
$$ = (3.1 - 2.7) \pm t_{0.025,\,16}\,(0.596)\sqrt{\frac{1}{10} + \frac{1}{8}} $$
$$ = 0.4 \pm 2.120\,(0.2827) $$
$$ = 0.4 \pm 0.599 $$
Hence −0.199 ≤ μ1 − μ2 ≤ 0.999 is the 95% confidence interval for the difference of the two
mean nicotine contents.

4.3.3 Interval Estimation of Difference between Two Means for n1 and n2 greater than 30
For independent random samples from normal populations,
$$ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$
has a standard normal distribution, i.e.
$$ P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1 - \alpha. $$

So, if x̄1 and x̄2 are the means of independent random samples of sizes n1 and n2 from normal
populations with known variances σ1² and σ2², then the confidence interval is given by:
$$ (\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} $$

If σ1² and σ2² are unknown, they are estimated by the corresponding sample variances
s1² and s2²; for large samples (n1 and n2 ≥ 30) the 100(1−α)% confidence interval for
μ1 − μ2 is then given by
$$ (\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$

Example 6
Construct a 94% confidence interval for the difference between the mean lifetimes of two kinds of
light bulbs, given that a random sample of 40 light bulbs of the first kind lasted on average 418 hours
of continuous use, and 50 light bulbs of the second kind lasted on average 402 hours of continuous
use. The population standard deviations are given as σ1 = 26 and σ2 = 22.

Solution
x̄1 = 418     x̄2 = 402     α = 0.06
σ1 = 26      σ2 = 22
n1 = 40      n2 = 50

$$ (\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} $$
$$ CI = (418 - 402) \pm z_{0.03}\sqrt{\frac{(26)^2}{40} + \frac{(22)^2}{50}} $$
$$ CI = 16 \pm 1.88\sqrt{\frac{(26)^2}{40} + \frac{(22)^2}{50}} $$
$$ 6.3 \le \mu_1 - \mu_2 \le 25.7 $$
We are 94% confident that the interval from 6.3 to 25.7 hours contains the actual difference between
the mean lifetimes of the two kinds of bulbs.
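
The light bulb interval can be reproduced with known population standard deviations. The following is a minimal Python sketch; the function name `two_sample_z_interval` is hypothetical, and the critical value is computed from the standard normal distribution (z(0.03) ≈ 1.88).

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_interval(x1, x2, sd1, sd2, n1, n2, confidence):
    """Interval for mu1 - mu2 with known (or large-sample) standard deviations."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin

# Figures from the light bulb example.
lower, upper = two_sample_z_interval(418, 402, 26, 22, 40, 50, confidence=0.94)
z_94 = NormalDist().inv_cdf(0.97)
print(round(z_94, 2), round(lower, 1), round(upper, 1))
```

The same function covers the large-sample case with unknown variances by passing the sample standard deviations in place of σ1 and σ2.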
Example 7
From Table 2, construct the 90% confidence interval for the difference in the two means for the study.
Solution
n1 = 40       n2 = 40
x̄1 = 22.5     x̄2 = 22.88
S1 = 3.87     S2 = 3.92
The 100(1−α)% confidence interval for μ1 − μ2 is given by:
$$ (\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} $$
$$ = (22.5 - 22.88) \pm z_{0.05}\sqrt{\frac{(3.87)^2}{40} + \frac{(3.92)^2}{40}} $$
$$ = -0.38 \pm 1.64\,(0.871) $$
$$ -1.808 < \mu_1 - \mu_2 < 1.048 $$
Hence [−1.808, 1.048] is the 90% confidence interval for the study.

4.4 INTERVAL ESTIMATION OF POPULATION PROPORTION


In estimating proportions, probabilities, percentages, etc., it is reasonable to assume that we are
sampling a binomial population, and hence that our problem is to estimate the binomial parameter
θ.
Thus, we can make use of the fact that for large n the binomial distribution can be approximated
by the normal distribution. With p = θ and q = 1 − θ,
$$ Z = \frac{X - np}{\sqrt{npq}} $$
can be treated as a random variable having approximately the standard normal distribution.
Substituting for Z in
$$ P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1 - \alpha $$
gives
$$ P\left(-z_{\alpha/2} < \frac{X - np}{\sqrt{npq}} < z_{\alpha/2}\right) = 1 - \alpha. $$
Then, if sampling is with replacement from a finite population, the confidence interval for θ is
given by:
$$ \hat{\theta} - z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}} < \theta < \hat{\theta} + z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}} $$
where θ̂ = X/n is the sample proportion, which also replaces the unknown θ in the standard error.

Similarly, in SRSWOR (Simple Random Sampling Without Replacement) from a finite population
the 100(1−α)% confidence interval is given by:
$$ \hat{\theta} - z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}\left(\frac{N - n}{N - 1}\right)} \le \theta \le \hat{\theta} + z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}\left(\frac{N - n}{N - 1}\right)} $$

Example 1
In a random sample of 400 persons given a flu vaccine, 136 persons expressed some
discomfort. Construct a 95% confidence interval for the true proportion of persons who express
discomfort.
Solution
$$ \hat{\theta} = \frac{X}{n} = \frac{136}{400} = 0.34 $$
$$ z_{\alpha/2} = z_{0.025} = 1.96 $$
$$ \hat{\theta} - z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}} < \theta < \hat{\theta} + z_{\alpha/2}\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}} $$
$$ = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{400}} $$
$$ = 0.34 \pm 1.96\,(0.02369) $$
$$ 0.294 < \theta < 0.386 $$
So, [0.294, 0.386] is the 95% confidence interval for the true proportion.
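
The proportion interval above can be computed directly from the counts. This is a short Python sketch under the normal approximation described in this section; the function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence):
    """Large-sample interval for a binomial proportion (sampling with replacement)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error of p_hat
    return p_hat - z * se, p_hat + z * se

# 136 of 400 persons expressed discomfort.
lower, upper = proportion_interval(136, 400, confidence=0.95)
print(round(lower, 3), round(upper, 3))
```

The approximation relies on n being large; with small n or a proportion very close to 0 or 1 the limits can be unreliable.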

Example 2
A random sample of 100 items taken from a large batch of articles contains 5 defective items.
(a) Set up 96 percent confidence limits for the proportion of defective items in the batch.
(b) If the batch contains 2669 items, set up a 95% confidence interval under simple random sampling
without replacement for the proportion of defective items.

Solution

a) n = 100, p = 5/100 = 0.05, q = 1 − p = 0.95
$$ SE(p) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.05 \times 0.95}{100}} = 0.022 $$
$$ z_{\alpha/2} = z_{0.02} = 2.05 $$
$$ p \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} = 0.05 \pm 2.05 \times 0.022 $$
[0.005, 0.095] is the 96% confidence interval.

b) N = 2669, n = 100, $z_{\alpha/2} = z_{0.025} = 1.96$
$$ p \pm z_{\alpha/2}\sqrt{\frac{pq\,(N - n)}{N\,(n - 1)}} $$
$$ = 0.050 \pm 1.96\sqrt{\frac{(2669 - 100)(0.05)(0.95)}{(2669)(99)}} $$
$$ = 0.050 \pm 1.96\sqrt{0.00046} $$
= 0.05 ± 0.042 = [0.008, 0.092] is the 95% confidence interval under SRSWOR.
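
Part (b) uses the variance estimate pq(N − n)/(N(n − 1)) for sampling without replacement. The Python sketch below evaluates that expression exactly as written in the solution; the function name `srswor_proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def srswor_proportion_interval(x, n, N, confidence):
    """Interval for a proportion under simple random sampling without replacement."""
    p = x / n
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance = p * q * (N - n) / (N * (n - 1))   # form used in part (b)
    margin = z * sqrt(variance)
    return p - margin, p + margin

# 5 defective items in a sample of 100 from a batch of 2669.
lower, upper = srswor_proportion_interval(5, 100, 2669, confidence=0.95)
print(round(lower, 3), round(upper, 3))
```

With n/N below 4 percent the correction has little effect here, so the limits are close to those obtained without it.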

4.4.1 Estimation of Difference between Proportions.


In many cases, we may want to study the difference between two binomial parameters, θ1 and θ2.
Suppose we want to estimate the difference between the proportions of male and female voters who
favour a particular candidate in the FUTO Governing Council 2021. Suppose the numbers of
successes are X1 and X2 respectively, with respective sample proportions
$$ \hat{\theta}_1 = \frac{X_1}{n_1} \quad \text{and} \quad \hat{\theta}_2 = \frac{X_2}{n_2}; $$
then θ̂1 − θ̂2 is an estimator of the difference between θ1 and θ2.

For large samples, where X1 and X2 are binomial random variables with sample sizes n1 and n2
and estimates θ̂1 = X1/n1 and θ̂2 = X2/n2, the 100(1−α)% confidence interval is given by:
$$ (\hat{\theta}_1 - \hat{\theta}_2) - z_{\alpha/2}\sqrt{\frac{\hat{\theta}_1(1 - \hat{\theta}_1)}{n_1} + \frac{\hat{\theta}_2(1 - \hat{\theta}_2)}{n_2}} < \theta_1 - \theta_2 < (\hat{\theta}_1 - \hat{\theta}_2) + z_{\alpha/2}\sqrt{\frac{\hat{\theta}_1(1 - \hat{\theta}_1)}{n_1} + \frac{\hat{\theta}_2(1 - \hat{\theta}_2)}{n_2}} $$
Alternatively,
$$ (\hat{\theta}_1 - \hat{\theta}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{\theta}_1(1 - \hat{\theta}_1)}{n_1} + \frac{\hat{\theta}_2(1 - \hat{\theta}_2)}{n_2}} $$
with α as the level of significance.

Note: θ = p and 1 − θ = q.

Example
Suppose 132 of 200 male voters and 90 of 150 female voters favour a certain candidate contesting the
FUTO Council election in the 2020 Academic Session. Find the 99% confidence interval for the
difference between the actual proportions of male and female voters who favour the candidate.
Solution
$$ \hat{\theta}_1 = \frac{132}{200} = 0.66, \qquad \hat{\theta}_2 = \frac{90}{150} = 0.60 $$
$$ z_{\alpha/2} = z_{0.005} = 2.575 $$
$$ CI = (\hat{\theta}_1 - \hat{\theta}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{\theta}_1(1 - \hat{\theta}_1)}{n_1} + \frac{\hat{\theta}_2(1 - \hat{\theta}_2)}{n_2}} $$
$$ (0.66 - 0.60) - 2.575\sqrt{\frac{(0.66)(0.34)}{200} + \frac{(0.60)(0.40)}{150}} \le \theta_1 - \theta_2 \le (0.66 - 0.60) + 2.575\sqrt{\frac{(0.66)(0.34)}{200} + \frac{(0.60)(0.40)}{150}} $$
$$ -0.074 \le \theta_1 - \theta_2 \le 0.194 $$
Thus, we are 99% confident that the interval from −0.074 to 0.194 contains the difference between
the actual proportions of male and female voters who favour the candidate.
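
The voter interval can be recomputed from the raw counts. The Python sketch below assumes 90 of 150 female voters, the sample size that is consistent with θ̂2 = 0.60 and with the limits −0.074 and 0.194 in the solution; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_difference_interval(x1, n1, x2, n2, confidence):
    """Large-sample interval for the difference of two binomial proportions."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# 132 of 200 male voters; 90 of 150 female voters (assumed sample size, see above).
lower, upper = proportion_difference_interval(132, 200, 90, 150, confidence=0.99)
print(round(lower, 3), round(upper, 3))
```

Since the interval contains zero, the data do not rule out equal proportions at this confidence level.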

4.0 INTERVAL ESTIMATION FOR VARIANCES

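If S² is the variance of a random sample of size n from a normal population, then
$$ \frac{(n - 1)S^2}{\sigma^2} \sim \chi^2_{(n-1)} $$
For a 5% level of significance (95% confidence interval),
$$ P\left(\chi^2_{0.025} \le \chi^2 \le \chi^2_{0.975}\right) = 0.95, $$
that is,
$$ \chi^2_{0.025} \le \frac{(n - 1)S^2}{\sigma^2} \le \chi^2_{0.975}. $$
Taking reciprocals reverses the inequalities:
$$ \frac{1}{\chi^2_{0.975}} \le \frac{\sigma^2}{(n - 1)S^2} \le \frac{1}{\chi^2_{0.025}}. $$
Multiplying through by (n − 1)S² gives the limits
$$ \frac{(n - 1)S^2}{\chi^2_{0.975}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{0.025}}. $$
Generally, the 100(1−α)% confidence interval for σ² is given by
$$ \frac{(n - 1)S^2}{\chi^2_{1-\alpha/2}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{\alpha/2}}, $$
where $\chi^2_{p}$ denotes the point of the χ² distribution with n − 1 degrees of freedom below which
a proportion p of the distribution lies.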

Example
For n = 8 and S² = 23.78, construct a 95% confidence interval for the variance.
$$ \chi^2_{(1-\alpha/2,\;n-1)} = \chi^2_{(0.975,\;7)} = 16.01 $$
$$ \chi^2_{(\alpha/2,\;n-1)} = \chi^2_{(0.025,\;7)} = 1.69 $$

So the confidence interval for σ² is given by
$$ \frac{(n - 1)S^2}{\chi^2_{(1-\alpha/2,\;n-1)}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{(\alpha/2,\;n-1)}} $$
$$ \frac{7\,(23.78)}{16.01} \le \sigma^2 \le \frac{7\,(23.78)}{1.69} $$
$$ 10.4 \le \sigma^2 \le 98.5 $$

Hence, the 95% confidence interval for σ² is 10.4 ≤ σ² ≤ 98.5.
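
The variance interval is not symmetric about S², which is easy to see numerically. The Python sketch below hard-codes the two χ² table values quoted in the example (16.01 and 1.69 for 7 degrees of freedom) because the standard library has no χ² quantile function; the function name is hypothetical.

```python
def variance_interval(s_sq, n, chi2_upper, chi2_lower):
    """Interval for sigma^2 given the two chi-square critical values for n - 1 df."""
    numerator = (n - 1) * s_sq
    # The larger critical value gives the lower limit, and vice versa.
    return numerator / chi2_upper, numerator / chi2_lower

# Table values from the example: chi2(0.975, 7) = 16.01 and chi2(0.025, 7) = 1.69.
lower, upper = variance_interval(23.78, 8, chi2_upper=16.01, chi2_lower=1.69)
print(round(lower, 1), round(upper, 1))
```

The upper limit lies much further from S² = 23.78 than the lower limit does, reflecting the skewness of the χ² distribution.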

4.4.2 Confidence Interval for Ratio of Variances


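Another non-symmetric distribution is the F-distribution, defined by the variance ratio
$$ \frac{\chi^2_v / v}{\chi^2_w / w} \sim F_{v,\,w}. $$
But
$$ \frac{(n_1 - 1)S_1^2}{\sigma_1^2} \sim \chi^2_{(n_1 - 1)} \quad \text{and} \quad \frac{(n_2 - 1)S_2^2}{\sigma_2^2} \sim \chi^2_{(n_2 - 1)}, $$
then
$$ \frac{\dfrac{(n_1 - 1)S_1^2}{\sigma_1^2} \Big/ (n_1 - 1)}{\dfrac{(n_2 - 1)S_2^2}{\sigma_2^2} \Big/ (n_2 - 1)} = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \sim F_{(n_1 - 1),\,(n_2 - 1)}, $$
which is the ratio of each sample variance to its population variance. The 99% confidence interval
is then obtained from
$$ P\left[F_{0.005} \le \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \le F_{0.995}\right] = 0.99. $$
From
$$ F_{0.005} \le \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} $$
we obtain
$$ \frac{S_2^2}{\sigma_2^2}\,F_{0.005} \le \frac{S_1^2}{\sigma_1^2} $$
$$ \therefore \; \frac{S_2^2}{S_1^2}\,F_{0.005} \le \frac{\sigma_2^2}{\sigma_1^2}. $$
Hence
$$ \frac{S_1^2}{S_2^2\,F_{0.005}} \ge \frac{\sigma_1^2}{\sigma_2^2}. $$
Similarly,
$$ \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \le F_{0.995} $$
will give us
$$ \frac{\sigma_1^2}{\sigma_2^2} \ge \frac{S_1^2}{S_2^2\,F_{0.995}}. $$
Hence, the 99% confidence interval for σ1²/σ2² is
$$ \frac{1}{F_{0.995}}\left(\frac{S_1^2}{S_2^2}\right) \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{1}{F_{0.005}}\left(\frac{S_1^2}{S_2^2}\right). $$
Generally, the 100(1−α)% confidence interval for the variance ratio σ1²/σ2² is given by
$$ \left[\frac{1}{F_{1-\alpha/2}}\left(\frac{S_1^2}{S_2^2}\right),\; \frac{1}{F_{\alpha/2}}\left(\frac{S_1^2}{S_2^2}\right)\right], $$
where $F_p$ denotes the point of the F distribution with (n1 − 1, n2 − 1) degrees of freedom below
which a proportion p of the distribution lies.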

Exercises
1. Construct a 99 percent confidence interval for the mean for the following using Table 1.
a. the mean height of female students
b. the mean age of male students
c. the difference in mean of students weight

2. The data below are weights of random sample of ten Happy Bake Wheat of Flour:
50.3 50.4 49.3 48.7 59.3
49.7 39.5 60.7 50.9 47.9 (in Kg)
Construct a 95% confidence interval for the population mean weight, assuming the weights
are approximately normally distributed.

3. Nwafor Consultancy and Educational Services carried out a survey on the test performance
of students in two different courses, STA 311 and STA 321. The findings are reported in
Table 2 below.
Table 2:
        STA 311    STA 321
x̄       283        254
σ       7          6
n       27         24

Construct a 99 percent confidence interval for the mean.

4. A random sample of 800 units from a large consignment showed that 300 were damaged.
Find:
a. 95% and
b. 99% confidence limits for the subject matter for mean
c. 96% confidence units for a finite population mean
5. The mean and variance of a random sample of 64 observations were computed as 160 and
100 respectively. Compute the 95% confidence interval for the population mean.

6. A census is taken among the residents of the FUTO Community and the surrounding
community to determine the feasibility of a proposal to construct a University quarter.
2450 of 6000 staff favour the proposal and 1570 of 3000 non-staff favour it. Construct a
90% confidence interval for the true difference in the fractions favouring the proposed University
quarter.

7. Using Table 1, construct the following:


a. a 95% confidence interval for the variance of height of the male students.
b. a 99% confidence interval for the variance of weight of the male students.
8. Construct a 95% confidence interval for the variance ratio on the information in Question
3.
9. A final year student of Statistics reported the number of cases of Corona Virus in
two different states of the country, namely Imo and Anambra. Using the table below,
construct a 96% confidence interval for:
a. The population mean of Imo cases.
b. The difference in the population means.
Imo        40   30   37   50   94  100   67   54   37   81
          103   54   31   49   72   92   35   76   87   95
Anambra    47   50   62   73   59   80  111  121   37   67
           53   49   92   37   81   33   47   81   39   43
           54   59   60   31   25   91   71   11   18   23
