Ch3 Prob II Anu Fall24 1
INTERVAL ESTIMATION
3.1 Introduction
In many situations, a point estimate does not provide enough information about the
parameter of interest. For example, if we are interested in estimating the mean
compression strength of concrete, a single number may not be very meaningful. It would
be more desirable to determine an interval within which we would expect to find the value
of the parameter. An interval of the form
L≤μ≤U
might be more useful.
Such an interval is called an interval estimate. The end points of this interval will be
random quantities, since they are functions of the sample data.
In general, to construct an interval estimator of the unknown parameter θ, we must
find two statistics L and U such that
$$P(L \le \theta \le U) = 1 - \alpha$$
where $0 < \alpha < 1$. The resulting interval
$$L \le \theta \le U$$
is called a $100(1-\alpha)\%$ confidence interval for θ. L and U are called the lower and upper
confidence limits, respectively, and the probability $1-\alpha$ is called the confidence
coefficient (or the confidence level). When $1-\alpha = 0.95$, the interval is called a 95%
confidence interval for θ.
The most common interpretation of the above confidence interval for θ is:
"We are $100(1-\alpha)\%$ confident that the single computed interval (L, U) contains the
parameter θ", or
"In repeated sampling from a normally distributed population, $100(1-\alpha)\%$ of all intervals
of the form (L, U) will in the long run include the population parameter θ".
(A) Confidence Interval for μ (case (i): σ² is known)
Let us now consider the interval estimate of μ. Let $X_1, X_2, \ldots, X_n$ be a random
sample of size n taken from a normal population with mean μ and variance σ² (or from
a non-normal population with n sufficiently large). We can establish a confidence
interval for μ by considering the sampling distribution of $\bar{X}$.
According to the central limit theorem, we can expect the sampling distribution of
$\bar{X}$ to be approximately normal with mean $\mu_{\bar{X}} = \mu$ and standard deviation
$\sigma_{\bar{X}} = \sigma/\sqrt{n}$. Writing $z_{\alpha/2}$ for the z-value above which we find an area of $\alpha/2$, we
can see from Figure 3.1 that:
$$P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$$
where
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1).$$
Figure 3.1: $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$
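Hence
$$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$$
Multiplying each term in the inequality by $\sigma/\sqrt{n}$, then subtracting $\bar{X}$ from each
term and multiplying by -1 (reversing the sense of the inequalities), we obtain
$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$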
If $\bar{x}$ is the mean of a random sample of size n from a normal population with known
variance σ² (or, approximately, from any population when n is large enough), a 100(1-α)%
confidence interval for μ is given by
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
where $z_{\alpha/2}$ is the z-value leaving an area of α/2 to the right (i.e. $\Phi(z_{\alpha/2}) = 1 - \alpha/2$).
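The interval above is straightforward to compute. The following is a minimal Python sketch, assuming only the standard library; the function name `z_interval` and the illustration values are hypothetical and not part of these notes.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf=0.95):
    """Two-sided confidence interval for a mean when sigma is known."""
    alpha = 1.0 - conf
    # z_{alpha/2}: the point with area alpha/2 to its right,
    # i.e. Phi(z) = 1 - alpha/2
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    # Hypothetical illustration values, not taken from any data set
    low, high = z_interval(xbar=50.0, sigma=4.0, n=25, conf=0.95)
    print(f"{low:.3f} <= mu <= {high:.3f}")
```

Because the critical value is computed rather than read from a table, results may differ from hand calculations in the third or fourth decimal place.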
Example 3.1
A civil engineer is analyzing the compressive strength of concrete. Compressive
strength is approximately normally distributed with standard deviation 32 psi. A random
sample of 16 specimens has a mean compressive strength of 3250 psi. Construct 95%
and 99% confidence intervals on the mean compressive strength. Compare the
widths of the two intervals.
Solution
The population is normal with known variance $\sigma^2 = 32^2$. The point estimate of μ is
$\bar{x} = 3250$.
To find a 95% C.I. for μ, put $1-\alpha = 0.95 \Rightarrow \alpha = 0.05 \Rightarrow \Phi(z_{\alpha/2}) = 1 - \alpha/2 = 0.975$, and
from the standard normal distribution we have $\Phi(1.96) = 0.975$, implying that $z_{0.025} = 1.96$.
Hence the 95% confidence interval is
$$3250 - (1.96)\frac{32}{\sqrt{16}} \le \mu \le 3250 + (1.96)\frac{32}{\sqrt{16}}$$
which reduces to
3234.32 < μ < 3265.68.
To find a 99% confidence interval, we have $\alpha = 0.01 \Rightarrow \alpha/2 = 0.005 \Rightarrow \Phi(z_{\alpha/2}) = 1 - \alpha/2 = 0.995$.
Therefore, using the standard normal distribution table again, $z_{0.005} = 2.576$, and the 99%
confidence interval is
$$3250 - (2.576)\frac{32}{\sqrt{16}} \le \mu \le 3250 + (2.576)\frac{32}{\sqrt{16}}$$
or simply
3229.4 < μ < 3270.6.
We now see that the CI with confidence level 0.99 is wider than that with the 0.95
confidence level.
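The comparison can be made explicit with a short worked calculation. It uses only the quantities given in Example 3.1; the symbol w for the interval width is introduced here for illustration.

$$w = 2\,z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \qquad w_{0.95} = 2(1.96)\frac{32}{\sqrt{16}} = 31.36, \qquad w_{0.99} = 2(2.576)\frac{32}{\sqrt{16}} = 41.216$$

Raising the confidence level from 95% to 99% therefore widens the interval by roughly 31% for the same sample.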
Using Minitab:
Press Stat ⇨ Basic Statistics ⇨ 1 Sample Z
Press on Summarized data ⇨ Insert the sample size, Mean and Standard deviation
then press Options. In the options dialog box, type 99.0 for the confidence level.
Press OK to display the results in the Session window.
Interpretation
The above confidence interval for μ is interpreted as:
"We are 100(1-α)% confident that the single computed interval $\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{X}}$
contains the population mean μ"
or
"In repeated sampling from a normally distributed population, 100(1-α)% of all
intervals of the form $\bar{X} \pm z_{\alpha/2}\,\sigma_{\bar{X}}$ will in the long run include the population
mean μ".
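The repeated-sampling interpretation can be checked by simulation. The sketch below is only an illustration under stated assumptions: the function name `coverage`, the number of trials, the seed, and the population values in the usage line are all hypothetical choices. It draws many normal samples, builds the interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ for each one, and reports the fraction of intervals that contain the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, conf=0.95, trials=2000, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # One random sample of size n from N(mu, sigma^2)
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical population: mean 100, standard deviation 15, samples of 16
    print(coverage(mu=100.0, sigma=15.0, n=16))
```

The printed fraction should be close to the chosen confidence level, although it will not equal it exactly for a finite number of trials.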
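$z_{\alpha/2}\,\sigma/\sqrt{n}$. In other words, if we are going to use $\bar{x}$ as an estimate of μ, then the error in
estimating the mean μ is $|\bar{X} - \mu|$ and from above we have
$$P\left(|\bar{X} - \mu| \le z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
The quantity
$$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
is called the maximum error of estimate: with probability (1-α), the error made in estimating
the population mean μ by $\bar{x}$ does not exceed E.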
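The formula for E can also be used to determine the sample size that is needed to
attain a desired degree of precision. Solving the above equation for n, we obtain
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$
If the result is not a whole number, it is rounded up to the next integer so that the error does not exceed E.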
Note that if σ is unknown and the population from which the sample is to be drawn is thought to be
approximately normally distributed, one may use the fact that the range R is approximately
equal to four standard deviations and take σ ≈ R/4. This method requires some
knowledge of the smallest and largest values of the variable in the population.
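The sample-size formula and the range rule can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names `sample_size_for_mean` and `sigma_from_range` are hypothetical, and rounding up with `ceil` is an assumption made so that the computed n never gives an error larger than E.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, max_error, conf=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= max_error."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    # Round up: a fractional sample size is not possible (assumption)
    return ceil((z * sigma / max_error) ** 2)


def sigma_from_range(value_range):
    """Rough estimate sigma ~ R/4 for an approximately normal population."""
    return value_range / 4.0


if __name__ == "__main__":
    # Hypothetical illustration: range about 12 units, target error 0.5
    sigma = sigma_from_range(12.0)
    print(sample_size_for_mean(sigma, max_error=0.5, conf=0.95))
```

Halving the target error roughly quadruples the required sample size, because n grows with the square of 1/E.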
Example 3.2
The life, in hours, of a 150-watt light bulb is known to be approximately normally
distributed with standard deviation 25 hours. What sample size should be taken in order to
be 95% confident that the error in estimating the mean life is less than 5 hours?
Solution
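Since σ = 25, α = 0.05 ⇒ $z_{\alpha/2} = z_{0.025} = 1.96$, and E = 5, we may find the required sample
size from the formula for n as
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{1.96 \times 25}{5}\right)^2 = 96.04$$
Rounding up so that the error stays below 5 hours, a sample of n = 97 bulbs is required.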
Example 3.3
An engineer intends to use the mean of a random sample of size 100 to estimate the
mean duration of telephone calls directed by a local telephone company. If, based on
experience, he knows that the range of call durations is approximately 10 minutes,
what can he assert with probability 0.96 about the maximum error?
Solution
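σ ≈ R/4 = 10/4 = 2.5, n = 100, and $1-\alpha = 0.96 \Rightarrow \alpha = 0.04 \Rightarrow \Phi(z_{\alpha/2}) = 1 - \alpha/2 = 0.98 \Rightarrow z_{0.02} = 2.054$. Then
$$E = z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = \frac{2.054 \times 2.5}{10} = 0.5135$$
so with probability 0.96 the error is at most about 0.51 minutes.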
Example 3.4
An article in Nuclear Engineering International describes several characteristics of
fuel rods used in a reactor owned by an electric utility in Norway. Measurements on the
percentage of enrichment of 10 rods were reported as follows:
2.94 3.00 2.90 2.75 2.90 2.75 2.95 2.82 2.81 3.05
Find a 98% confidence interval on the mean percentage of enrichment.
Solution
Since n = 10 is small and the variance is unknown, and assuming that the population is
normal, we use the t-distribution, and the CI is given by:
$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
The sample mean and standard deviation for the given data are
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = 2.887 \quad\text{and}\quad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = 0.10242$$
For ν = n - 1 = 9 and α = 0.02, $t_{0.01;9} = 2.82$. Hence the 98% confidence limits for μ are
$$2.887 - 2.82\,\frac{0.10242}{\sqrt{10}} \le \mu \le 2.887 + 2.82\,\frac{0.10242}{\sqrt{10}}$$
which reduces to
$$2.796 \le \mu \le 2.978$$
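The same calculation can be sketched in Python from the raw enrichment data. The function name `t_interval` is hypothetical. Since the Python standard library has no t quantile function, the critical value is passed in as read from a t-table, which is a design choice of this sketch and not a requirement of the method.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_crit):
    """CI for the mean of a normal sample with unknown variance.

    t_crit is t_{alpha/2} with len(data) - 1 degrees of freedom,
    read from a t-table.
    """
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation, divisor n - 1
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Enrichment percentages of the 10 rods listed in the example
enrichment = [2.94, 3.00, 2.90, 2.75, 2.90, 2.75, 2.95, 2.82, 2.81, 3.05]
low, high = t_interval(enrichment, t_crit=2.82)  # t_{0.01;9} from the table
print(f"{low:.3f} <= mu <= {high:.3f}")
```

A statistics package that computes the critical value itself may report limits that differ slightly in the last digit, because table values are rounded.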
Using Minitab:
Press Stat ⇨ Basic Statistics ⇨ 1 Sample t
Insert the data into column C1 and call it enrichment then press Options
In the options dialog box, type 98.0 for the confidence level.
3.8 Confidence Interval for the Difference of
Means of Two Populations (μ1-μ2)
Many problems arise where we wish to estimate the difference between two
population means by a point estimate or by a confidence interval. For example, a farmer
may investigate a new variety of wheat by estimating the difference between the average yield of
the new variety and that of the variety he has planted in the past.
Case 1:
Suppose that our two populations have means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$. To
obtain a point estimate of $(\mu_1 - \mu_2)$, we select two independent random samples, one from
each population, of sizes $n_1$ and $n_2$, and compute the difference $(\bar{X}_1 - \bar{X}_2)$ of the
sample means. If our independent samples are selected from populations that are
approximately normally distributed or, failing this, if $n_1$ and $n_2$ are both greater than 30,
we can use the sampling distribution of $(\bar{X}_1 - \bar{X}_2)$ to establish a 100(1 - α)% confidence
interval for $(\mu_1 - \mu_2)$ of the form
$$\left( (\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}},\;\; (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \right)$$
where $\bar{X}_1$ and $\bar{X}_2$ are the means of independent random samples of sizes $n_1$ and $n_2$ from
populations with known variances $\sigma_1^2$ and $\sigma_2^2$, respectively, and $z_{\alpha/2}$ is defined by
$$\Phi(z_{\alpha/2}) = 1 - \alpha/2$$
If $\sigma_1^2$ and $\sigma_2^2$ are unknown but $n_1 > 30$ and $n_2 > 30$, we may replace $\sigma_1^2$ by $S_1^2$ and $\sigma_2^2$ by
$S_2^2$ without appreciably affecting the confidence interval.
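The two-sample interval of Case 1 can be sketched as follows. The function name `two_sample_z_interval` and the illustration values are hypothetical. The variance arguments may be the known population variances or, for large samples, the sample variances, as described above.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(xbar1, xbar2, var1, var2, n1, n2, conf=0.95):
    """CI for mu1 - mu2 with known (or large-sample estimated) variances."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    # Standard error of the difference of two independent sample means
    se = sqrt(var1 / n1 + var2 / n2)
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se


if __name__ == "__main__":
    # Hypothetical illustration values
    low, high = two_sample_z_interval(52.0, 48.5, 9.0, 16.0, 40, 50)
    print(f"({low:.3f}, {high:.3f})")
```

If the resulting interval contains zero, the data are consistent with equal population means at the chosen confidence level.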
Example 3.5
A sample of 150 brand A light bulbs showed a mean lifetime of 1400 hours and
standard deviation of 120 hours. A sample of 200 brand B light bulbs showed a mean
lifetime of 1200 hours and standard deviation of 80 hours. Find 95% confidence limits for
the difference of the mean lifetime of the populations of brand A and brand B.
Solution
Here we have
$n_1 = 150$, $\bar{x}_1 = 1400$, $S_1 = 120$ (for brand A),
$n_2 = 200$, $\bar{x}_2 = 1200$, $S_2 = 80$ (for brand B).
Since $n_1$ and $n_2$ are large, the confidence interval for the difference $(\mu_1 - \mu_2)$ is given by the
above formula, with the sample variances in place of the unknown population variances.
Let 1 - α = 0.95; then α = 0.05, so
$$\Phi(z_{\alpha/2}) = 1 - \alpha/2 = 0.975 \Rightarrow z_{\alpha/2} = 1.96$$
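Thus
$$\begin{aligned}
\text{Lower limit} &= (\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \\
&= (1400 - 1200) - 1.96\sqrt{\frac{120^2}{150} + \frac{80^2}{200}} \\
&= 200 - 22.175 = 177.825 \\
\text{Upper limit} &= (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \\
&= 200 + 22.175 = 222.175
\end{aligned}$$
Therefore (177.825, 222.175) is a 95% confidence interval for the difference $(\mu_1 - \mu_2)$.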
Case 2:
The above procedure for estimating the difference between two means is applicable
if $\sigma_1^2$ and $\sigma_2^2$ are known or can be estimated from large samples. If the sample sizes are
small and $\sigma_1^2$ and $\sigma_2^2$ are unknown, we can establish confidence intervals for $(\mu_1 - \mu_2)$ by
using the t-distribution, provided that the unknown variances are equal. Thus, if
$$\sigma_1^2 = \sigma_2^2 = \sigma^2,$$
we estimate $\sigma^2$ by $S_p^2$, where $S_p^2$ is obtained by combining, or pooling, the sample variances
according to the formula
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
Therefore, a 100(1-α)% confidence interval for small samples is established in the same
way as for large samples, except that the quantity
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
is replaced by
$$S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
and we use $t_{\alpha/2}$ with $\nu = n_1 + n_2 - 2$ degrees of freedom in place of $z_{\alpha/2}$. The degrees of
freedom for $t_{\alpha/2}$ correspond to the divisor in the formula for $S_p^2$.
Therefore, a 100(1-α)% confidence interval for the difference $(\mu_1 - \mu_2)$ is
$$\left( (\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}},\;\; (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \right)$$
where $\bar{x}_1$ and $\bar{x}_2$ are the means of small independent samples of sizes $n_1$ and $n_2$,
respectively, from approximately normal distributions, $S_p$ is the pooled standard deviation,
and $t_{\alpha/2}$ is the value of the t-distribution with $\nu = n_1 + n_2 - 2$ degrees of freedom, leaving
an area of (1-α/2) to the left.
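The pooled interval of Case 2 can be sketched from summary statistics. The function name `pooled_t_interval` and the illustration values are hypothetical. As an assumption of this sketch, the critical value $t_{\alpha/2}$ is supplied by the caller from a t-table, because the Python standard library does not provide t quantiles.

```python
from math import sqrt


def pooled_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """CI for mu1 - mu2 from two small normal samples with equal variances.

    t_crit is t_{alpha/2} with n1 + n2 - 2 degrees of freedom,
    taken from a t-table.
    """
    # Pooled variance: the sample variances weighted by their
    # degrees of freedom
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    half_width = t_crit * sqrt(sp2) * sqrt(1.0 / n1 + 1.0 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


if __name__ == "__main__":
    # Hypothetical illustration: two samples of sizes 10 and 12,
    # t_{0.025} with 20 degrees of freedom is 2.086
    low, high = pooled_t_interval(20.4, 1.8, 10, 18.9, 2.1, 12, t_crit=2.086)
    print(f"({low:.3f}, {high:.3f})")
```

The equal-variance assumption matters here: if the two sample standard deviations differ greatly, the pooled estimate is not appropriate.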
Example 3.6
Two catalysts are being analyzed to determine how they affect the mean yield of
a chemical process. Specifically, catalyst 1 is currently in use, but catalyst 2 is
acceptable. Since catalyst 2 is cheaper, it should be adopted, provided it does not
change the process yield. A test is run in the pilot plant and results in the data shown in
the following table.
Observation Number    Catalyst 1    Catalyst 2
1                     91.50         89.19
2                     94.18         90.95
3                     92.18         90.46
4                     95.39         93.21
5                     91.79         97.19
6                     89.07         97.04
7                     94.72         91.07
8                     89.21         92.75
Mean                  92.255        92.733
S.D.                  2.39          2.98
Find a 95% confidence interval on the difference in means, μ1-μ2, for the two catalysts.
Assume equal variances.
Solution
Let $\mu_1$ and $\mu_2$ represent the mean yield of the chemical process using catalyst 1 and
catalyst 2, respectively. We wish to find a 95% confidence interval for $\mu_1 - \mu_2$. We will
assume that the yield of the chemical process is normally distributed. Furthermore, we
will assume that both normal populations have the same standard deviation.
Thus the pooled variance is
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{7(2.39)^2 + 7(2.98)^2}{8 + 8 - 2} = 7.3 \;\Rightarrow\; s_p \approx 2.7$$
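Since α = 0.05 and $\nu = n_1 + n_2 - 2 = 14$, we find from the t-distribution table that
$$t_{\alpha/2,14} = t_{0.025,14} = 2.145$$
Hence
$$\begin{aligned}
\text{Lower limit} &= (\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = -0.478 - 2.145(2.7)\sqrt{\frac{1}{8} + \frac{1}{8}} = -3.37 \\
\text{Upper limit} &= (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\,S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = -0.478 + 2.145(2.7)\sqrt{\frac{1}{8} + \frac{1}{8}} = 2.42
\end{aligned}$$
Therefore we are 95% confident that the interval (-3.37, 2.42) contains the true difference
between the mean yields of the chemical process using catalyst 1 and catalyst 2.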
Using Minitab:
Insert the data into columns C1 & C2 and call them Catalyst 1 &2.
Press Stat ⇨ Basic Statistics ⇨ 2 Sample t ⇨ press "samples in different
columns" ⇨ Insert the data into First & Second ⇨ Click on "Assume equal
variances" ⇨ press Options. In the options dialog box, type 95.0 for the confidence
level.
3.9 Confidence Interval for The Proportion p
Suppose there is a population of interest, a particular trait is being studied, and each
member of the population can be classed as either having or failing to have the trait (e.g.
success or failure). Confidence limits are to be found for the parameter p, the proportion of
the population with the trait. For this purpose, we should draw a random sample from the
population of interest, determine the proportion of objects in the sample with the trait, and
use this sample proportion as a point estimate of the population proportion p. That is,
$$\hat{p} = \frac{X}{n} = \frac{\text{number of objects in sample with trait}}{\text{sample size}} = \text{sample proportion}$$
Note that X is a binomial random variable with mean np and variance np(1-p).
As the sample size n increases, the sampling distribution of $\hat{p}$ approaches a
normal distribution with mean and variance
$$\mu_{\hat{p}} = p \quad\text{and}\quad \sigma_{\hat{p}}^2 = \frac{p(1-p)}{n}$$
i.e.
$$Z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}} \sim N(0,1) \quad \text{approximately, for n large enough}$$
In order to use the normal approximation, n should be large enough so that
np ≥ 5 and n(1-p) ≥ 5,
although some statisticians recommend that both np and n(1-p) be greater than 9 or even 10.
Hence 100(1 - α)% confidence limits for p are given by
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$
Since p, the parameter we are trying to estimate, is unknown, we must use $\hat{p}$ as an estimate, and
thus our confidence limits for p become
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
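The large-sample limits for p can be sketched in Python as follows. The function name `proportion_interval` and the illustration values are hypothetical. Checking the rule of thumb with $\hat{p}$ in place of the unknown p is an assumption of this sketch.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, conf=0.95):
    """Large-sample (normal approximation) CI for a population proportion."""
    p_hat = x / n
    # Rule of thumb for the normal approximation, applied with p_hat
    # standing in for the unknown p (assumption)
    if n * p_hat < 5 or n * (1.0 - p_hat) < 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    # Hypothetical illustration: 45 items with the trait out of 300
    low, high = proportion_interval(45, 300, conf=0.95)
    print(f"({low:.4f}, {high:.4f})")
```

Raising an error for small counts is one possible design choice; a caller could instead switch to an exact binomial method in that case.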
Example 3.7
The fraction of defective integrated circuits produced in a photolithography process
is being studied. A random sample of 400 circuits is tested, revealing 16 defectives. Find a
96% confidence interval on the fraction of defective circuits produced by this particular
tool.
Solution
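Here we have n = 400 and x = 16, so the sample proportion is
$$\hat{p} = \frac{16}{400} = 0.04$$
Since 1 - α = 0.96, α = 0.04 ⇒ $z_{\alpha/2} = z_{0.02} = 2.054$. A 96% confidence interval for p is:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.04 \pm 2.054\sqrt{\frac{(0.04)(0.96)}{400}} \;\Rightarrow\; (0.0199,\ 0.0601)$$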
Using Minitab:
Press Stat ⇨ Basic Statistics ⇨ 1 Proportion ⇨ Insert the Number of trials
(400) and Number of events (16).
Press Options. In the options dialog box, type 96.0 for the confidence level and click
on "Use test and interval based on normal distribution".
Press OK to display the results in the Session window.
<><><><><><><><><><><><><><><><>
EXERCISES
[1] Find the critical value that corresponds to each given confidence level.
a- 98% b- 95% c- 92% d- 90%
[2] The mean and standard deviation of serum iron values for healthy men are 120 and
15 micrograms per 100 ml, respectively. A random sample of 36 normal men is
taken. Consider the distribution of $\bar{X}$, the sample mean of serum iron.
(a) What is the probability that the sample mean falls between 115 and 130
micrograms per 100 ml?
(b) What sample size n would be necessary in order to have $P(110 < \bar{X} < 130)$ =
0.99?
[3] An industrial engineer is interested in estimating the mean time required to assemble a printed
circuit board. How large a sample is required if the engineer wishes to be 95% confident that
the error in estimating the mean is less than 0.25 minutes? The standard deviation of
assembly time is 0.45 minutes.
[4] The life in hours of a 75-watt light bulb is known to be normally distributed with
standard deviation 25 hours. A random sample of 25 bulbs has a mean life of 1015
hours.
(a) Construct a 95% confidence interval on the mean life.
(b) Construct a 99% confidence bound on the mean life.
[5] Suppose that in Exercise [4] we wanted to be 95% confident that the error in
estimating the mean life is less than five hours. What sample size should be used?
[6] Suppose that in Exercise [4] we wanted the total width of the two-sided confidence
interval on mean life to be six hours at 95% confidence. What sample size should be
used?
[7] A machine produces metal rods used in an automobile suspension system. A random
sample of 15 rods is selected, and the diameter is measured. The resulting data (in
millimeters) are as follows:
8.24 8.25 8.20 8.23 8.24 8.21 8.26 8.26 8.20 8.25 8.23 8.23 8.19 8.28 8.24
Assuming normality for the rod diameter, find a 98% confidence interval on mean
rod diameter.
[8] The heights of female students in Alex University, in inches, are normally distributed
with mean μ and standard deviation σ=2.4. We select a simple random sample of 4
students and measure their heights. The four heights, in inches, are: 63, 69, 62 and 66.
If we wanted the margin of error for the 99% confidence interval to be 1 inch, we
should select a simple random sample of size
a- 2 b- 7 c- 16 d- 39.
[9] The brightness of a television picture tube can be evaluated by measuring the amount
of current required to achieve a particular brightness level. A sample of 10 tubes
results in mean 317.2 and standard deviation 15.7. Find (in microamps) a 99%
confidence interval on mean current required. State any necessary assumptions
about the underlying distribution of the data.
[10] The compressive strength of concrete is being tested by a civil engineer. He tests 16
specimens and obtains the following data.
2216 2237 2249 2204 2225 2301 2281 2263 2318 2255 2275
2295 2250 2238 2300 2217
Assuming that the compressive strength of this concrete is normally distributed,
construct a 92% confidence interval on the mean strength.
[11] A sample of 12 measurements of the breaking strengths of cotton threads gave a mean
of 7.38 ounces and a standard deviation of 1.24 ounces. Find (a) 95% and (b) 99%
confidence limits for the actual breaking strength.
[16] In a simple random sample of 150 of the employees of a large firm, 93 were absent due to
sickness three or more days this year. Construct 95% C.I. for the true proportion of these
employees who are absent three or more days yearly due to sickness.
[17] A random sample of 200 voters in a town is selected, and 114 are found to support an
annexation suit.
a- Find the 96% confidence interval for the fraction of the voting population
favoring the suit.
b- What can we assert with 96% confidence about the possible size of our error if we
estimate the fraction of voters favoring the annexation suit to be 0.57?
c- How large a sample is needed if we wish to be 96% confident that our sample
proportion will be within 0.02 of the true fraction of the voting population?
[18] Circle the correct answer for each of the following multiple choice questions.
1. In a test for acid rain, an SRS of 49 water samples showed a mean pH level of 4.4 with
standard deviation of 0.35. A 95% confidence interval estimate for the mean pH level is
given by
a. 4.4 ± 0.98 b. 4.4 ± 0.082 c. 4.4 ± 0.098
2. A random sample of 64 drivers on a highway were stopped and checked for fastening seat
belts. It was found that 48 drivers had their seat belts on. A 92% confidence interval for the
proportion of drivers with seat belt on is given by
a. 0.75±.005 b. 0.75±.076 c. 0.75±.095
4. A sample of 36 measurements of the breaking strengths of cotton threads gave a mean of
7.38 ounces and a standard deviation of 1.5 ounces. A 95% confidence interval for the
actual breaking strength is
a. 7.38±0.49 b. 7.38±0.98 c. 7.38±0.41 d. None of the above.
5. The width of a 100(1-α)% confidence interval for the mean μ increases when
a. α increases b. α decreases c. n decreases
d. both a & c e. both b &c
6. The critical value $z_{\alpha/2}$ that corresponds to a degree of confidence of 91% is
a. 1.34 b. 1.75 c. 1.70 d. 1.645
7. The margin of error E in estimating the duration of telephone calls directed by a local telephone
company with σ = 3.0 minutes, n = 580, and confidence level 97% is
a. 0.270 min. b. 0.057 min. c. 0.011 min. d. 0.006 min.
8. The minimum sample size you should use to ensure that your estimate $\hat{p}$ will be within the
required margin of error E = 0.006 of the population proportion p, at confidence level 95%, is:
a. 161 b. 38,415 c. 82 d. 38416
9. A 90% confidence interval estimate for the mean µ with n = 30, $\bar{x} = 79.1$ and s = 16.8,
assuming that the population has a normal distribution, is:
a. 72.83 < µ < 85.37 b. 73.89 < µ < 84.31
c. 73.92 < µ < 84.28 d. 70.65 < µ < 87.5
>+<>+<>+<>+<>+<>+<>+<>+<