Statistical Intervals For A Single Sample

This document discusses statistical intervals for a single sample. It provides methods for constructing confidence intervals on the mean, variance, and proportion of a population based on a random sample. These include using the normal, t, chi-squared, and binomial distributions depending on whether the population distribution and variance are known or unknown. Examples are provided to demonstrate how to calculate confidence intervals on the mean and variance of a normal distribution in various situations.

Uploaded by

Bui Tien Dat

8 Statistical Intervals for a Single Sample
Learning Objectives

After careful study of this chapter, you should be able to do the following:
1. Construct confidence intervals on the mean of a normal distribution, using the normal distribution or t distribution method
2. Construct confidence intervals on the variance and standard deviation of a normal distribution
3. Construct confidence intervals on a population proportion
4. Use a general method for constructing an approximate confidence interval on a parameter
Introduction

How do we compute l and u from sample data?

[Figure: a random sample is drawn from a population with unknown parameter θ.]

l ≤ θ ≤ u, where P(l ≤ θ ≤ u) = 1 − α
l: lower confidence limit (bound)
u: upper confidence limit (bound)
1 − α: confidence coefficient
Confidence Interval On The Mean Of A Normal Distribution, σ² Known
• Problem. Suppose that X1, X2, …, Xn is a random sample from a normal distribution with unknown mean μ and known variance σ².
• A confidence interval estimate for μ is an interval of the form l ≤ μ ≤ u.
• If P(L ≤ μ ≤ U) = 1 − α (0 ≤ α ≤ 1), then
• [l, u] is called a confidence interval
• 1 − α is called the confidence coefficient
100(1 − α)% CI of a Standard Normal Distribution

[Figure: standard normal density. The central area equals 1 − α and each tail area equals α/2.]

For a standard normal distribution,

P(−z_{α/2} ≤ Z ≤ z_{α/2}) = 1 − α
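The percentage point in the probability statement above can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `z_upper` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist


def z_upper(alpha: float) -> float:
    """Upper 100*alpha percentage point of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - alpha)


alpha = 0.05
z = z_upper(alpha / 2)  # z_{alpha/2}

# Area between -z and +z under the standard normal density.
std_normal = NormalDist()
central_area = std_normal.cdf(z) - std_normal.cdf(-z)

print(f"z_(alpha/2) = {z:.3f}, central area = {central_area:.3f}")
```

With α = 0.05 the point is close to the familiar table value 1.96, and the central area is 1 − α.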
How do we compute l and u from sample data?

[Figure: a random sample is drawn from a population with unknown mean μ.]

L ≤ μ ≤ U, where P(L ≤ μ ≤ U) = 1 − α

CLT ➔ X̄ is normally distributed with mean μ and variance σ²/n
Confidence Interval on the Mean, Variance Known
If x̄ is the sample mean of a random sample of size n from a normal population with known variance σ², a 100(1 − α)% CI on μ is given by

x̄ − z_{α/2} σ/√n ≤ μ ≤ x̄ + z_{α/2} σ/√n

where z_{α/2} is the upper 100α/2 percentage point of the standard normal distribution.
Confidence Interval on the Mean, Variance Known - Example
Metallic Material Transition. ASTM Standard E23 defines standard test methods for notched
bar impact testing of metallic materials. The Charpy V-notch (CVN) technique measures
impact energy and is often used to determine whether or not a material experiences a
ductile-to-brittle transition with decreasing temperature. Ten measurements of impact
energy (J) on specimens of A238 steel cut at 60°C are as follows: 64.1, 64.7, 64.5, 64.6,
64.5, 64.3, 64.6, 64.8, 64.2, and 64.3. Assume that impact energy is normally distributed
with σ = 1 J. We want to find a 95% CI for μ, the mean impact energy. The required
quantities are z_{α/2} = z_{0.025} = 1.96, n = 10, σ = 1, and x̄ = 64.46. The resulting 95% CI is found
as follows:

x̄ − z_{α/2} σ/√n ≤ μ ≤ x̄ + z_{α/2} σ/√n
64.46 − 1.96(1)/√10 ≤ μ ≤ 64.46 + 1.96(1)/√10
63.84 ≤ μ ≤ 65.08

Practical Interpretation: Based on the sample data, a range of highly plausible values for
mean impact energy for A238 steel at 60°C is 63.84 J ≤ μ ≤ 65.08 J.
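The arithmetic of this example can be reproduced with a short script. This is a sketch using only the standard library; the function name `z_interval` is hypothetical, and it computes z from the normal quantile function rather than using the rounded table value 1.96.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_interval(data, sigma, confidence=0.95):
    """Two-sided CI on the mean of a normal population with known sigma."""
    n = len(data)
    xbar = mean(data)
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2)  # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Impact energy measurements (J) from the example; sigma = 1 J is assumed known.
impact_energy = [64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3]

lower, upper = z_interval(impact_energy, sigma=1.0)
print(f"{lower:.2f} <= mu <= {upper:.2f}")
```

Rounded to two decimals, the limits agree with the interval reported in the example.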
Interpreting a Confidence Interval

• If an infinite number of random samples are collected and a 100(1 − α)% confidence interval for μ is computed from each sample, 100(1 − α)% of these intervals will contain the true value of μ.
• We don’t know if the statement is true for this specific sample, but the method used to obtain the interval [l, u] yields correct statements 100(1 − α)% of the time.
Simulated confidence intervals
100 samples of size 25 were generated from
a norm(mean = 50, sd = 4) distribution, and
each sample was used to find a 95%
confidence interval for the population mean.
The 100 confidence intervals are represented
above by horizontal lines, and the respective
sample means are denoted by vertical
slashes. Confidence intervals that “cover” the
true mean µ = 50 are plotted in black; those
that fail to cover are plotted in a lighter color.
In the plot we see that 7 of the simulated
intervals out of the 100 failed to cover µ = 50,
which is a success rate of 93%. If the number
of generated samples were to increase from
100 to 1000 to 10000, . . . , then we would
expect our success rate to approach the
exact value of 95%.
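A simulation of this kind is easy to sketch. The version below assumes the known-σ interval x̄ ± z_{α/2} σ/√n is used for every sample (the slide does not say which interval was used), and the function name `coverage` and the seed are illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(num_samples, n=25, mu=50.0, sigma=4.0, confidence=0.95, seed=1):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(num_samples):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / num_samples


for m in (100, 1000, 10000):
    print(m, coverage(m))
```

With only 100 intervals the observed rate can easily be a few points away from 95%; as the number of intervals grows, it settles near the nominal level.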
Choice of sample size
• From the 100(1 − α)% CI
  x̄ − z_{α/2} σ/√n ≤ μ ≤ x̄ + z_{α/2} σ/√n
• We have E = error = |x̄ − μ| ≤ z_{α/2} σ/√n
➔ Choose n such that E = z_{α/2} σ/√n, that is, n = (z_{α/2} σ / E)²


Choice of sample size - Example
Suppose that we wanted to determine how many specimens
must be tested to ensure that the 95% CI on μ for A238 steel cut
at 60°C has a length of at most 1.0 J.
-----------------------
Since the bound on error in estimation E is one-half of the length of
the CI, that is, E = ½, to determine n we use n = (z_{α/2} σ / E)² with E = ½, σ = 1, and
z_{α/2} = 1.96:

n = ((1.96)(1) / 0.5)² = 15.37

Since n must be an integer, we round up. The required sample size is 16.
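The sample-size rule can be written as a small helper. This is a sketch under the example's assumptions (σ = 1 J known, E = 0.5 J); the function name `sample_size_for_mean` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence=0.95):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= error."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2)
    return ceil((z * sigma / error) ** 2)  # round up to the next integer


n_required = sample_size_for_mean(sigma=1.0, error=0.5)
print(n_required)
```

Rounding up rather than to the nearest integer keeps the error bound at or below E.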


One-Sided Confidence Bounds

[Figure: two standard normal densities. In each, a single tail has area α and the remaining area equals 1 − α.]

A 100(1 − α)% lower-confidence bound for μ is

x̄ − z_α σ/√n ≤ μ

A 100(1 − α)% upper-confidence bound for μ is

μ ≤ x̄ + z_α σ/√n
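As an illustration, the one-sided bounds can be computed for the impact-energy data used earlier (x̄ = 64.46, σ = 1, n = 10). This is a sketch only; the function name `one_sided_bounds` is hypothetical, and note that the one-sided bounds use z_α rather than z_{α/2}.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_bounds(xbar, sigma, n, confidence=0.95):
    """Return (lower bound, upper bound), each a separate one-sided statement."""
    z = NormalDist().inv_cdf(confidence)  # z_alpha, not z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


lower_bound, upper_bound = one_sided_bounds(xbar=64.46, sigma=1.0, n=10)
print(f"lower bound: {lower_bound:.2f} <= mu")
print(f"upper bound: mu <= {upper_bound:.2f}")
```

Each bound is a 95% statement on its own; taken together as an interval they would have confidence 90%, not 95%.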
Confidence Interval on the Mean, Unknown σ², Large Sample Size
• What if σ is unknown? We instead use the interval

x̄ − z_{α/2} s/√n ≤ μ ≤ x̄ + z_{α/2} s/√n

where s (used to estimate σ) is the sample standard deviation and n is large enough
• What if n is small?
t distribution

• Let X1, X2, …, Xn be a random sample from a normal distribution with unknown mean μ and unknown variance σ². The random variable

T = (X̄ − μ) / (S/√n)

has a t distribution with n − 1 degrees of freedom.
Probability density functions of several
t distributions
t Confidence Interval on μ

[Figure: percentage points of the t distribution.]

x̄ − t_{α/2,n−1} s/√n ≤ μ ≤ x̄ + t_{α/2,n−1} s/√n

Remark. One-sided confidence bounds on the mean are found by replacing t_{α/2,n−1} with t_{α,n−1}.
t Confidence Interval on μ - Example
An article in the journal Materials Engineering (1989, Vol. II, No. 4, pp. 275–281)
describes the results of tensile adhesion tests on 22 U-700 alloy specimens.
The load at specimen failure is as follows (in megapascals):

19.8 10.1 14.9 7.5 15.4 15.4
15.4 18.5 7.9 12.7 11.9
11.4 11.4 14.1 17.6 16.7
15.8 19.5 8.8 13.6 11.9 11.4

The sample mean is x̄ = 13.71, and the sample standard deviation is s = 3.55.
We want to find a 95% CI on μ. Since n = 22, we have n − 1 = 21 degrees of
freedom for t, so t_{0.025,21} = 2.080.

x̄ − t_{α/2,n−1} s/√n ≤ μ ≤ x̄ + t_{α/2,n−1} s/√n ➔ 12.14 ≤ μ ≤ 15.28
t Confidence Interval on μ - Example

12.14 ≤ μ ≤ 15.28
• Practical Interpretation: The CI is fairly wide because
there is a lot of variability in the tensile adhesion test
measurements. A larger sample size would have led to
a shorter interval.
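The t interval for the tensile adhesion data can be recomputed as follows. This is a sketch using only the standard library, which has no t quantile function, so the critical value t_{0.025,21} = 2.080 is taken from the example rather than computed; the function name `t_interval` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Load at specimen failure (MPa) for the 22 specimens in the example.
loads = [19.8, 10.1, 14.9, 7.5, 15.4, 15.4,
         15.4, 18.5, 7.9, 12.7, 11.9,
         11.4, 11.4, 14.1, 17.6, 16.7,
         15.8, 19.5, 8.8, 13.6, 11.9, 11.4]

T_CRIT = 2.080  # t_{0.025,21}, table value quoted in the example


def t_interval(data, t_crit):
    """Two-sided t CI on the mean; t_crit must match alpha/2 and n - 1 d.f."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    half_width = t_crit * s / sqrt(n)
    return xbar, s, xbar - half_width, xbar + half_width


xbar, s, t_lower, t_upper = t_interval(loads, T_CRIT)
print(f"xbar = {xbar:.2f}, s = {s:.2f}")
print(f"{t_lower:.2f} <= mu <= {t_upper:.2f}")
```

Because the script keeps full precision in x̄ and s, its limits can differ in the last printed digit from values obtained with the rounded inputs 13.71 and 3.55.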
Confidence Interval for σ² of a Normal Distribution
χ² Distribution. Let X1, X2, …, Xn be a random sample
from a normal distribution with mean μ and variance σ²,
and let S² be the sample variance. Then the random
variable

X² = (n − 1)S² / σ²

has a chi-square (χ²) distribution with n − 1 degrees of
freedom.
χ² distributions

[Figure: pdfs of several χ² distributions.]
For a χ² distribution with k degrees of freedom, mean = k and variance = 2k.
➔ skewed to the right
Percentage point of the χ² distribution.

(a) The percentage point χ²_{α,k}.
(b) The upper percentage point χ²_{0.05,10} = 18.31 and the lower percentage
point χ²_{0.95,10} = 3.94
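The sampling result for (n − 1)S²/σ² can be checked by simulation. The sketch below draws repeated normal samples of size n = 11 (so k = n − 1 = 10 degrees of freedom) and compares the simulated mean and variance of the statistic with k and 2k. The function name, seed, and number of replications are arbitrary illustrative choices.

```python
import random
from statistics import fmean, variance


def simulate_chi_square_stat(n, mu=0.0, sigma=1.0, reps=20000, seed=7):
    """Simulate (n - 1) * S^2 / sigma^2 for repeated normal samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = sum(sample) / n
        s2 = sum((x - m) ** 2 for x in sample) / (n - 1)  # sample variance
        values.append((n - 1) * s2 / sigma ** 2)
    return values


stats = simulate_chi_square_stat(n=11)
sim_mean = fmean(stats)
sim_var = variance(stats)
print(f"simulated mean = {sim_mean:.2f} (theory: 10)")
print(f"simulated variance = {sim_var:.2f} (theory: 20)")
```

The simulated values only approximate the theoretical ones; the agreement improves as the number of replications grows.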
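Confidence Interval on the Variance σ²

If s² is the sample variance from a random sample of n observations from a normal distribution with unknown variance σ², then a 100(1 − α)% CI on σ² is

(n − 1)s² / χ²_{α/2,n−1} ≤ σ² ≤ (n − 1)s² / χ²_{1−α/2,n−1}

where χ²_{α/2,n−1} and χ²_{1−α/2,n−1} are the upper and lower 100α/2 percentage points of the χ² distribution with n − 1 degrees of freedom.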
Challenge

Find the One-Sided Confidence Bounds on the Variance


CI on the Variance σ² - Example
An automatic filling machine is used to fill bottles with liquid detergent. A
random sample of 20 bottles results in a sample variance of fill volume of s² =
0.0153 (fluid ounce)². If the variance of fill volume is too large, an
unacceptable proportion of bottles will be under- or overfilled. We will
assume that the fill volume is approximately normally distributed. A 95%
upper confidence bound is

σ² ≤ (n − 1)s² / χ²_{0.95,19} = (19)(0.0153) / 10.117 = 0.0287 (fluid ounce)²

so that σ ≤ 0.17 fluid ounce.
Practical Interpretation: At the 95% level of
confidence, the data indicate that the process
standard deviation could be as large as 0.17 fluid
ounce. The process engineer or manager now
needs to determine if a standard deviation this large
could lead to an operational problem with under- or
overfilled bottles.
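The upper bound in this example can be reproduced with a few lines. The standard library has no χ² quantile function, so this sketch assumes the table value χ²_{0.95,19} = 10.117 (the lower 5% point for 19 degrees of freedom, taken from a standard χ² table); the function name `variance_upper_bound` is hypothetical.

```python
from math import sqrt

CHI2_095_19 = 10.117  # assumed table value: lower 5% point, 19 degrees of freedom


def variance_upper_bound(s2, n, chi2_lower):
    """One-sided upper confidence bound on sigma^2: (n - 1) s^2 / chi2_lower."""
    return (n - 1) * s2 / chi2_lower


var_bound = variance_upper_bound(s2=0.0153, n=20, chi2_lower=CHI2_095_19)
sd_bound = sqrt(var_bound)  # bound on sigma follows by taking the square root

print(f"sigma^2 <= {var_bound:.4f} (fluid ounce)^2")
print(f"sigma   <= {sd_bound:.2f} fluid ounce")
```

Note that the upper bound on σ² divides by the lower percentage point of the χ² distribution, which is what makes the bound large enough to hold with 95% confidence.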
Normal Approximation for a Binomial Proportion
• p: a population proportion
• P̂ = X/n: a point estimator of p
• When n is large enough, X/n ~ Normal(mean = p, variance = p(1 − p)/n), if p is not too close to either 0 or 1.
• Requirement for approximation: np ≥ 5 and n(1 − p) ≥ 5.
If n is large, the distribution of

Z = (X − np) / √(np(1 − p)) = (P̂ − p) / √(p(1 − p)/n)

is approximately standard normal.
Approximate Confidence Interval on a Binomial Proportion
If p̂ is the proportion of observations in a random sample of size n that belongs to a class of interest, an approximate 100(1 − α)% CI on the proportion p of the population that belongs to this class is

p̂ − z_{α/2} √(p̂(1 − p̂)/n) ≤ p ≤ p̂ + z_{α/2} √(p̂(1 − p̂)/n)

where z_{α/2} is the upper α/2 percentage point of the standard normal distribution.
Approximate Confidence Interval on a Binomial Proportion - Example
• In a random sample of 85 automobile engine crankshaft bearings, 10
have a surface finish that is rougher than the specifications allow.
Therefore, a point estimate of the proportion of bearings in the
population that exceeds the roughness specification is p̂ = x/n =
10/85 = 0.12. A 95% two-sided confidence interval for p is

0.12 − 1.96 √(0.12(0.88)/85) ≤ p ≤ 0.12 + 1.96 √(0.12(0.88)/85)

which is approximately 0.05 ≤ p ≤ 0.19.
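The interval for the bearing data can be computed directly from the counts. This is a sketch of the normal-approximation interval only; the function name `proportion_interval` is hypothetical, and the script uses the unrounded estimate 10/85 instead of 0.12.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, confidence=0.95):
    """Approximate (normal-theory) two-sided CI on a binomial proportion."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


x, n = 10, 85  # rough bearings, sample size

# Rule-of-thumb check for the approximation, using p_hat in place of p.
assert x >= 5 and n - x >= 5

p_lower, p_upper = proportion_interval(x, n)
print(f"{p_lower:.3f} <= p <= {p_upper:.3f}")
```

Because the point estimate is not rounded to 0.12 first, the limits differ slightly in the third decimal from a hand calculation with rounded inputs.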
Choice of Sample Size

• Error E := |p − P̂|. Note that p(1 − p) ≤ 0.25.
• E ≤ z_{α/2} √(p(1 − p)/n). So, n can be chosen using

n = (z_{α/2} / E)² p(1 − p)   ➔   n ≤ (z_{α/2} / E)² (0.25)

(In practice, use p̂ as an estimate of p in this formula)
Example. In a random sample of 85 automobile engine crankshaft bearings, 10
have a surface finish that is rougher than the specifications allow. How large a
sample is required if we want to be 95% confident that the error in using p̂ to
estimate p is less than 0.05?

n = (z_{α/2} / E)² p̂(1 − p̂) = (1.96 / 0.05)² (0.12)(1 − 0.12) ≈ 163 (rounded up)

Or, without using p̂, n = (z_{α/2} / E)² (0.25) = (1.96 / 0.05)² (0.25) ≈ 385 (rounded up)

Practical Interpretation: if we have information concerning the value of p, we could use a
smaller sample.
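Both sample-size calculations can be written as one helper. This is a sketch; the function name `sample_size_for_proportion` is hypothetical, and passing no value for `p` falls back to the conservative bound p(1 − p) ≤ 0.25.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(error, confidence=0.95, p=None):
    """Sample size so that z_{alpha/2} * sqrt(p(1-p)/n) <= error.

    If p is None, the worst case p(1 - p) = 0.25 is used.
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2)
    pq = 0.25 if p is None else p * (1.0 - p)
    return ceil((z / error) ** 2 * pq)  # round up to the next integer


n_with_estimate = sample_size_for_proportion(error=0.05, p=0.12)
n_conservative = sample_size_for_proportion(error=0.05)
print(n_with_estimate, n_conservative)
```

The conservative size is the larger of the two because it does not rely on any prior estimate of p.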
One-Sided Confidence Bounds
Summary

• CONFIDENCE INTERVAL ON THE MEAN OF A NORMAL DISTRIBUTION, VARIANCE KNOWN
• CONFIDENCE INTERVAL ON THE MEAN OF A NORMAL DISTRIBUTION, VARIANCE UNKNOWN
• CONFIDENCE INTERVAL ON THE VARIANCE AND STANDARD DEVIATION OF A NORMAL DISTRIBUTION
• LARGE-SAMPLE CONFIDENCE INTERVAL FOR A POPULATION PROPORTION