Lecture No. 11
Resource Person: Dr. Absar Ul Haq, Department of Mechanical Engineering (Narowal Campus).
The field of statistical inference is basically concerned with generalizations and predictions. For example, we
might claim, based on the opinions of several people interviewed on the street, that in a forthcoming election
60% of the eligible voters in the city of Detroit favor a certain candidate. In this case, we are dealing with a
random sample of opinions from a very large finite population. As a second illustration we might state that
the average cost to build a residence in Charleston, South Carolina, is between $330,000 and $335,000, based
on the estimates of 3 contractors selected at random from the 30 now building in this city. The population
being sampled here is again finite but very small. Finally, let us consider a soft-drink machine designed
to dispense, on average, 240 milliliters per drink. A company official who computes the mean of 40 drinks
obtains x̄ = 236 milliliters and, on the basis of this value, decides that the machine is still dispensing drinks
with an average content of µ = 240 milliliters. The 40 drinks represent a sample from the infinite population
of possible drinks that will be dispensed by this machine.
11.1 Sampling Distribution of S²
In the preceding lecture, we learned about the sampling distribution of X̄. The Central Limit Theorem
allowed us to make use of the fact that
(X̄ − µ)/(σ/√n)
tends toward N (0, 1) as the sample size grows large. Sampling distributions of important statistics allow
us to learn information about parameters. Usually, the parameters are the counterpart to the statistics in
question. For example, if an engineer is interested in the population mean resistance of a certain type of
resistor, the sampling distribution of X̄ will be exploited once the sample information is gathered. On the
other hand, if the variability in resistance is to be studied, clearly the sampling distribution of S² will be used in learning about the parametric counterpart, the population variance σ². If a random sample of size n is drawn from a normal population with mean µ and variance σ², and the sample variance is computed, we obtain a value of the statistic S².
Theorem 11.1 If S² is the variance of a random sample of size n taken from a normal population having the variance σ², then the statistic

χ² = (n − 1)S²/σ² = Σᵢ₌₁ⁿ (Xᵢ − X̄)²/σ²

has a chi-squared distribution with v = n − 1 degrees of freedom.
The values of the random variable χ² are calculated from each sample by the formula

χ² = (n − 1)S²/σ².
The probability that a random sample produces a χ² value greater than some specified value is equal to the area under the curve to the right of this value. It is customary to let χ²α represent the χ² value above which we find an area of α. This is illustrated by the shaded region in Figure 11.1.
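Upper-tail values such as χ²α can also be obtained numerically rather than from Table A.5. The following is a minimal sketch in Python, assuming SciPy is available; scipy.stats.chi2 is SciPy's chi-squared distribution object, and isf is its inverse survival function, which returns the value with a given area to its right.

from scipy.stats import chi2

# chi2.isf(alpha, v) returns the chi-squared value with area alpha to its right,
# i.e., the quantity denoted chi-squared_alpha in the text.
v = 4
print(chi2.isf(0.05, v))   # ~9.488: the value above which 5% of the area lies
print(chi2.isf(0.95, v))   # ~0.711: compare the 0.95 column for v = 4 in Table A.5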
A manufacturer of car batteries guarantees that the batteries will last, on average, 3 years with a
standard deviation of 1 year. If five of these batteries have lifetimes of 1.9, 2.4, 3.0, 3.5, and 4.2 years,
should the manufacturer still be convinced that the batteries have a standard deviation of 1 year?
Assume that the battery lifetime follows a normal distribution.
Solution: We first find the sample variance using Theorem 8.1:

s² = [nΣxᵢ² − (Σxᵢ)²]/[n(n − 1)] = [(5)(48.26) − (15)²]/[(5)(4)] = 0.815.

Then

χ² = (n − 1)s²/σ² = (4)(0.815)/1 = 3.26
is a value from a chi-squared distribution with 4 degrees of freedom. Since 95% of the χ² values with 4 degrees of freedom fall between 0.484 and 11.143, the computed value with σ² = 1 is reasonable, and therefore the manufacturer has no reason to suspect that the standard deviation is other than 1 year.
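The battery calculation is easy to reproduce numerically. Below is a small sketch, assuming NumPy and SciPy are installed; the interval computed by chi2.ppf matches the 0.484 and 11.143 quoted above.

import numpy as np
from scipy.stats import chi2

lifetimes = np.array([1.9, 2.4, 3.0, 3.5, 4.2])  # lifetimes in years
n = len(lifetimes)
s2 = lifetimes.var(ddof=1)                # sample variance s^2 = 0.815
chi2_stat = (n - 1) * s2 / 1.0            # (n - 1)s^2 / sigma^2 with sigma^2 = 1
lo, hi = chi2.ppf([0.025, 0.975], n - 1)  # 0.484 and 11.143 for v = 4
print(chi2_stat, lo < chi2_stat < hi)     # 3.26 True: sigma = 1 year is plausible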
11.1.1 t-Distribution
In the preceding discussion, we made use of the Central Limit Theorem. Its applications revolve around inferences on a population mean or the difference between two population means. Use of the Central Limit Theorem and the normal distribution is certainly helpful in this context. However, it was assumed that the population standard deviation is known. This assumption may not be unreasonable in situations where the engineer is quite familiar with the system or process. However, in many experimental scenarios, knowledge of σ is certainly no more reasonable than knowledge of the population mean µ. Often, in fact, an estimate of σ must be supplied by the same sample information that produced the sample average x̄. As a result, a
natural statistic to consider to deal with inferences on µ is
T = (X̄ − µ)/(S/√n)
since S is the sample analog to σ. If the sample size is small, the values of S² fluctuate considerably from sample to sample and the distribution of T deviates appreciably from that of a standard normal distribution. If the sample size is large enough, say n ≥ 30, the distribution of T does not differ considerably from the standard normal. However, for n < 30, it is useful to deal with the exact distribution of T. In developing the sampling distribution of T, we shall assume that our random sample was selected from a normal population. We can then write

T = [(X̄ − µ)/(σ/√n)] / √(S²/σ²) = Z / √(V/(n − 1)),
where
Z = (X̄ − µ)/(σ/√n)
has the standard normal distribution and
V = (n − 1)S²/σ²
has a chi-squared distribution with v = n − 1 degrees of freedom. In sampling from normal populations, we can show that X̄ and S² are independent, and consequently so are Z and V. The following theorem gives the definition of a random variable T as a function of Z (standard normal) and V (chi-squared). For completeness, the density function of the t-distribution is given.
Theorem 11.2 Let Z be a standard normal random variable and V a chi-squared random variable with v
degrees of freedom. If Z and V are independent, then the distribution of the random variable T , where
T = Z/√(V/v),

is given by the density function

h(t) = Γ((v + 1)/2) / [Γ(v/2)√(πv)] · (1 + t²/v)^(−(v+1)/2),   −∞ < t < ∞.

This is known as the t-distribution with v degrees of freedom.
From the foregoing and the theorem above we have the following corollary.
Corollary 11.3 Let X1, X2, . . . , Xn be independent random variables that are all normal with mean µ and standard deviation σ. Let

X̄ = (1/n) Σᵢ₌₁ⁿ Xᵢ   and   S² = [1/(n − 1)] Σᵢ₌₁ⁿ (Xᵢ − X̄)².

Then the random variable T = (X̄ − µ)/(S/√n)
has a t-distribution with v = n − 1 degrees of freedom.
The distribution of T is similar to the distribution of Z in that they both are symmetric about a mean of
zero. Both distributions are bell shaped, but the t-distribution is more variable, owing to the fact that the
T-values depend on the fluctuations of two quantities, X̄ and S², whereas the Z-values depend only on the changes in X̄ from sample to sample. The distribution of T differs from that of Z in that the variance of T depends on the sample size n and is always greater than 1. Only when the sample size n → ∞ will the two distributions become the same. In Figure 11.2, we show the relationship between a standard normal distribution (v = ∞) and t-distributions with 2 and 5 degrees of freedom. The percentage points of the t-distribution are given in Table A.4. It is customary to let tα represent the t-value above which we find an area equal to α. Hence, the t-value with 10 degrees of freedom leaving an area of 0.025 to the right is t0.025 = 2.228. Since the t-distribution is symmetric about a mean of zero, we have t1−α = −tα; that is, the t-value leaving an area of 1 − α to the right and therefore an area of α to the left is equal to the negative t-value that leaves an area of α in the right tail of the distribution (see Figure 11.3). That is, t0.95 = −t0.05, t0.99 = −t0.01, and so forth.
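The tabulated percentage points are straightforward to verify in software. A short sketch, assuming SciPy; since t.ppf is the inverse CDF, the upper-tail value tα is t.ppf(1 − α, v), and the symmetry relation t1−α = −tα falls out directly.

from scipy.stats import t

v = 10
print(t.ppf(1 - 0.025, v))              # t_0.025 with v = 10: ~2.228, as in Table A.4
print(t.ppf(0.05, v), -t.ppf(0.95, v))  # both ~ -1.812, illustrating t_0.95 = -t_0.05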
The t-value with v = 14 degrees of freedom that leaves an area of 0.025 to the left, and therefore an
area of 0.975 to the right, is
t0.975 = −t0.025 = −2.145.
Find P(−t0.025 < T < t0.05).
Solution: Since t0.05 leaves an area of 0.05 to the right, and −t0.025 leaves an area of 0.025 to the left, we find a total area of

1 − 0.05 − 0.025 = 0.925

between −t0.025 and t0.05. Hence

P(−t0.025 < T < t0.05) = 0.925.
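The same area can be checked with the t CDF. A minimal sketch, assuming SciPy; note that the answer is 0.925 for any choice of v, because only the two tail areas enter the calculation.

from scipy.stats import t

v = 14                        # any degrees of freedom gives the same area
upper = t.ppf(1 - 0.05, v)    # t_0.05
lower = -t.ppf(1 - 0.025, v)  # -t_0.025
print(t.cdf(upper, v) - t.cdf(lower, v))   # 0.925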
Find k such that P(k < T < −1.761) = 0.045 for a random sample of size 15 selected from a normal distribution, where T = (X̄ − µ)/(S/√n).
Solution: From Table A.4 we note that 1.761 corresponds to t0.05 when v = 14. Therefore, −t0.05 =
−1.761. Since k in the original probability statement is to the left of −t0.05 = −1.761, let k = −tα .
Then, from Figure 11.4, we have
k = −t0.005 = −2.977
and
P (−2.977 < T < −1.761) = 0.045.
A chemical engineer claims that the population mean yield of a certain batch process is 500 grams per
milliliter of raw material. To check this claim he samples 25 batches each month. If the computed
t-value falls between −t0.05 and t0.05 , he is satisfied with this claim. What conclusion should he draw
from a sample that has a mean x̄ = 518 grams per milliliter and a sample standard deviation s = 40
grams? Assume the distribution of yields to be approximately normal.
Solution: From Table A.4 we find that t0.05 = 1.711 for 24 degrees of freedom. Therefore, the engineer
can be satisfied with his claim if a sample of 25 batches yields a t-value between −1.711 and 1.711. If µ = 500, then

t = (518 − 500)/(40/√25) = 2.25,
a value well above 1.711. The probability of obtaining a t-value, with v = 24, equal to or greater than
2.25 is approximately 0.02. If µ > 500, the value of t computed from the sample is more reasonable.
Hence, the engineer is likely to conclude that the process produces a better product than he thought.
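This check is a standard one-sample t computation, sketched below under the assumption that SciPy is available; t.sf gives the upper-tail area, which reproduces the approximate 0.02 mentioned above.

import math
from scipy.stats import t

xbar, mu0, s, n = 518, 500, 40, 25
t_stat = (xbar - mu0) / (s / math.sqrt(n))   # 2.25
print(t_stat)
print(t.ppf(0.95, n - 1))   # t_0.05 with v = 24: ~1.711
print(t.sf(t_stat, n - 1))  # P(T >= 2.25) ~ 0.017, roughly the 0.02 quoted above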
11.1.2 F-Distribution
We have motivated the t-distribution in part by its application to problems in which there is comparative
sampling (i.e., a comparison between two sample means). For example, some of our examples in future chapters will take a more formal approach: a chemical engineer collects data on two catalysts, a biologist collects data on two growth media, or a chemist gathers data on two methods of coating material to inhibit corrosion.
While it is of interest to let sample information shed light on two population means, it is often the case that a
comparison of variability is equally important, if not more so. The F -distribution finds enormous application
in comparing sample variances. Applications of the F -distribution are found in problems involving two or
more samples. The statistic F is defined to be the ratio of two independent chi-squared random variables,
each divided by its number of degrees of freedom. Hence, we can write
F = (U/v1) / (V/v2),
where U and V are independent random variables having chi-squared distributions with v1 and v2 degrees
of freedom, respectively. We shall now state the sampling distribution of F .
Theorem 11.4 Let U and V be two independent random variables having chi-squared distributions with v1
and v2 degrees of freedom, respectively. Then the distribution of the random variable F = (U/v1)/(V/v2) is given by the density function

h(f) = [Γ((v1 + v2)/2)(v1/v2)^(v1/2) / (Γ(v1/2)Γ(v2/2))] · f^(v1/2 − 1) (1 + v1 f/v2)^(−(v1+v2)/2),   f > 0,

and h(f) = 0 elsewhere. This is known as the F-distribution with v1 and v2 degrees of freedom.
We will make considerable use of the random variable F in future chapters. However, the density function
will not be used and is given only for completeness. The curve of the F -distribution depends not only on the
two parameters v1 and v2 but also on the order in which we state them. Once these two values are given, we
can identify the curve. Typical F -distributions are shown in Figure 11.5. Let fα be the f -value above which
we find an area equal to α. This is illustrated by the shaded region in Figure 11.6. Table A.6 gives values
of fα only for α = 0.05 and α = 0.01 for various combinations of the degrees of freedom v1 and v2. Hence, the f-value with 6 and 10 degrees of freedom, leaving an area of 0.05 to the right, is f0.05(6, 10) = 3.22. By means of the following theorem, Table A.6 can also be used to find values of f0.95 and f0.99.

Writing fα(v1, v2) for the f-value with v1 and v2 degrees of freedom above which the area is α, we have

f1−α(v1, v2) = 1/fα(v2, v1).

Thus, the f-value with 6 and 10 degrees of freedom, leaving an area of 0.95 to the right, is f0.95(6, 10) = 1/f0.05(10, 6) = 1/4.06 = 0.246.
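The reciprocal relation is easy to confirm numerically. A sketch assuming SciPy, where f.ppf is the inverse CDF of the F-distribution, so fα(v1, v2) = f.ppf(1 − α, v1, v2):

from scipy.stats import f

print(f.ppf(0.95, 6, 10))      # f_0.05(6, 10): ~3.22, as in Table A.6
print(f.ppf(0.05, 6, 10))      # f_0.95(6, 10): ~0.246
print(1 / f.ppf(0.95, 10, 6))  # 1 / f_0.05(10, 6): the same ~0.246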
Suppose that random samples of size n1 and n2 are selected from two normal populations with variances σ1² and σ2², respectively. From Theorem 11.1, we know that

χ1² = (n1 − 1)S1²/σ1²

and

χ2² = (n2 − 1)S2²/σ2²

are random variables having chi-squared distributions with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom. Furthermore, since the samples are selected at random, we are dealing with independent random variables. Then, using Theorem 11.4 with χ1² = U and χ2² = V, we obtain the following result.
Theorem 11.5 If S1² and S2² are the variances of independent random samples of size n1 and n2 taken from normal populations with variances σ1² and σ2², respectively, then

F = (S1²/σ1²) / (S2²/σ2²) = (σ2² S1²) / (σ1² S2²)

has an F-distribution with v1 = n1 − 1 and v2 = n2 − 1 degrees of freedom.
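Theorem 11.5 can be illustrated with a quick Monte Carlo sketch, assuming NumPy and SciPy; the empirical 95th percentile of the simulated variance ratios should sit close to the tabulated F quantile.

import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)
n1, n2, sigma1, sigma2 = 8, 12, 2.0, 3.0
reps = 100_000

x = rng.normal(0.0, sigma1, size=(reps, n1))
y = rng.normal(0.0, sigma2, size=(reps, n2))
# F = (S1^2 / sigma1^2) / (S2^2 / sigma2^2), as in Theorem 11.5
F = (x.var(axis=1, ddof=1) / sigma1**2) / (y.var(axis=1, ddof=1) / sigma2**2)

# empirical upper 5% point vs. f_0.05(n1 - 1, n2 - 1)
print(np.quantile(F, 0.95), f.ppf(0.95, n1 - 1, n2 - 1))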
Table A.4 Critical Values of the t-Distribution (table entries are tα, the t-value with v degrees of freedom that leaves area α to the right)
v 0.40 0.30 0.20 0.15 0.10 0.05 0.025
1 0.325 0.727 1.376 1.963 3.078 6.314 12.706
2 0.289 0.617 1.061 1.386 1.886 2.920 4.303
3 0.277 0.584 0.978 1.250 1.638 2.353 3.182
4 0.271 0.569 0.941 1.190 1.533 2.132 2.776
5 0.267 0.559 0.920 1.156 1.476 2.015 2.571
6 0.265 0.553 0.906 1.134 1.440 1.943 2.447
7 0.263 0.549 0.896 1.119 1.415 1.895 2.365
8 0.262 0.546 0.889 1.108 1.397 1.860 2.306
9 0.261 0.543 0.883 1.100 1.383 1.833 2.262
10 0.260 0.542 0.879 1.093 1.372 1.812 2.228
11 0.260 0.540 0.876 1.088 1.363 1.796 2.201
12 0.259 0.539 0.873 1.083 1.356 1.782 2.179
13 0.259 0.538 0.870 1.079 1.350 1.771 2.160
14 0.258 0.537 0.868 1.076 1.345 1.761 2.145
15 0.258 0.536 0.866 1.074 1.341 1.753 2.131
16 0.258 0.535 0.865 1.071 1.337 1.746 2.120
17 0.257 0.534 0.863 1.069 1.333 1.740 2.110
18 0.257 0.534 0.862 1.067 1.330 1.734 2.101
19 0.257 0.533 0.861 1.066 1.328 1.729 2.093
20 0.257 0.533 0.860 1.064 1.325 1.725 2.086
21 0.257 0.532 0.859 1.063 1.323 1.721 2.080
22 0.256 0.532 0.858 1.061 1.321 1.717 2.074
23 0.256 0.532 0.858 1.060 1.319 1.714 2.069
24 0.256 0.531 0.857 1.059 1.318 1.711 2.064
25 0.256 0.531 0.856 1.058 1.316 1.708 2.060
26 0.256 0.531 0.856 1.058 1.315 1.706 2.056
27 0.256 0.531 0.855 1.057 1.314 1.703 2.052
28 0.256 0.530 0.855 1.056 1.313 1.701 2.048
29 0.256 0.530 0.854 1.055 1.311 1.699 2.045
30 0.256 0.530 0.854 1.055 1.310 1.697 2.042
40 0.255 0.529 0.851 1.050 1.303 1.684 2.021
60 0.254 0.527 0.848 1.045 1.296 1.671 2.000
120 0.254 0.526 0.845 1.041 1.289 1.658 1.980
∞ 0.253 0.524 0.842 1.036 1.282 1.645 1.960
Table A.5 Critical Values of the Chi-Squared Distribution (table entries are χ²α, the χ² value with v degrees of freedom that leaves area α to the right)
v 0.995 0.99 0.98 0.975 0.95 0.90 0.80 0.75 0.70 0.50
1 0.0000393 0.000157 0.000628 0.000982 0.00393 0.0158 0.0642 0.102 0.148 0.455
2 0.0100 0.0201 0.0404 0.0506 0.103 0.211 0.446 0.575 0.713 1.386
3 0.0717 0.115 0.185 0.216 0.352 0.584 1.005 1.213 1.424 2.366
4 0.207 0.297 0.429 0.484 0.711 1.064 1.649 1.923 2.195 3.357
5 0.412 0.554 0.752 0.831 1.145 1.610 2.343 2.675 3.000 4.351
6 0.676 0.872 1.134 1.237 1.635 2.204 3.070 3.455 3.828 5.348
7 0.989 1.239 1.564 1.690 2.167 2.833 3.822 4.255 4.671 6.346
8 1.344 1.647 2.032 2.180 2.733 3.490 4.594 5.071 5.527 7.344
9 1.735 2.088 2.532 2.700 3.325 4.168 5.380 5.899 6.393 8.343
10 2.156 2.558 3.059 3.247 3.940 4.865 6.179 6.737 7.267 9.342
11 2.603 3.053 3.609 3.816 4.575 5.578 6.989 7.584 8.148 10.341
12 3.074 3.571 4.178 4.404 5.226 6.304 7.807 8.438 9.034 11.340
13 3.565 4.107 4.765 5.009 5.892 7.041 8.634 9.299 9.926 12.340
14 4.075 4.660 5.368 5.629 6.571 7.790 9.467 10.165 10.821 13.339
15 4.601 5.229 5.985 6.262 7.261 8.547 10.307 11.037 11.721 14.339
16 5.142 5.812 6.614 6.908 7.962 9.312 11.152 11.912 12.624 15.338
17 5.697 6.408 7.255 7.564 8.672 10.085 12.002 12.792 13.531 16.338
18 6.265 7.015 7.906 8.231 9.390 10.865 12.857 13.675 14.440 17.338
19 6.844 7.633 8.567 8.907 10.117 11.651 13.716 14.562 15.352 18.338
20 7.434 8.260 9.237 9.591 10.851 12.443 14.578 15.452 16.266 19.337
21 8.034 8.897 9.915 10.283 11.591 13.240 15.445 16.344 17.182 20.337
22 8.643 9.542 10.600 10.982 12.338 14.041 16.314 17.240 18.101 21.337
23 9.260 10.196 11.293 11.689 13.091 14.848 17.187 18.137 19.021 22.337
24 9.886 10.856 11.992 12.401 13.848 15.659 18.062 19.037 19.943 23.337
25 10.520 11.524 12.697 13.120 14.611 16.473 18.940 19.939 20.867 24.337
26 11.160 12.198 13.409 13.844 15.379 17.292 19.820 20.843 21.792 25.336
27 11.808 12.878 14.125 14.573 16.151 18.114 20.703 21.749 22.719 26.336
28 12.461 13.565 14.847 15.308 16.928 18.939 21.588 22.657 23.647 27.336
29 13.121 14.256 15.574 16.047 17.708 19.768 22.475 23.567 24.577 28.336
30 13.787 14.953 16.306 16.791 18.493 20.599 23.364 24.478 25.508 29.336
40 20.707 22.164 23.838 24.433 26.509 29.051 32.345 33.660 34.872 39.335
50 27.991 29.707 31.664 32.357 34.764 37.689 41.449 42.942 44.313 49.335
60 35.534 37.485 39.699 40.482 43.188 46.459 50.641 52.294 53.809 59.335
Table A.6 Critical Values of the F-Distribution, f0.05(v1, v2) (table entries leave area 0.05 to the right; v1 is the numerator and v2 the denominator degrees of freedom)
v2 \ v1 1 2 3 4 5 6 7 8 9
1 161.45 199.50 215.71 224.58 230.16 233.99 236.77 238.88 240.54
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65
15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42
20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30
25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28
26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24
29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22
30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04
120 3.92 3.07 2.68 2.45 2.29 2.18 2.09 2.02 1.96
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88
Reproduced from Table 18 of Biometrika Tables for Statisticians, Vol. I, by permission of E.S.
Pearson and the Biometrika Trustees.