ENMA 311 Module 9

This document discusses statistical inference procedures for comparing two independent samples or populations. It covers hypothesis testing and confidence intervals for the difference in means of two normal distributions when variances are known or unknown. It also covers procedures for comparing variances or proportions of two populations.

Uploaded by

Camila Soriano
Copyright
© All Rights Reserved

Module 9

Statistical Inference of Two Samples

LEARNING OBJECTIVES

1. Structure comparative experiments involving two samples as hypothesis tests
2. Test hypotheses and construct confidence intervals on the difference in means of
two normal distributions
3. Test hypotheses and construct confidence intervals on the ratio of the variances or
standard deviations of two normal distributions
4. Test hypotheses and construct confidence intervals on the difference in two
population proportions
5. Use the P-value approach for making decisions in hypothesis tests

Lesson 1 Inference on the Difference in Means of two Normal
Distributions, Variances Known

In this section we consider statistical inferences on the difference in means µ1 - µ2 of
two normal distributions, where the variances σ1² and σ2² are known. The assumptions for this
section are summarized as follows:

1. X11, X12, . . . , X1n1 is a random sample from population 1.
2. X21, X22, . . . , X2n2 is a random sample from population 2.
3. The two populations represented by X1 and X2 are independent.
4. Both populations are normal.

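A logical point estimator of µ1 - µ2 is the difference in sample means X̄1 - X̄2. Based
on the properties of expected values,

$$E(\bar{X}_1 - \bar{X}_2) = E(\bar{X}_1) - E(\bar{X}_2) = \mu_1 - \mu_2$$

and, because the two samples are independent, the variance of X̄1 - X̄2 is

$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Based on the assumptions and the preceding results, we may state the following: the quantity

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \qquad (10\text{-}1)$$

has a standard normal distribution, N(0, 1).
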
We now consider hypothesis testing on the difference in the means µ1 - µ2 of two normal
populations. Suppose that we are interested in testing that the difference in means µ1
- µ2 is equal to a specified value Δ0. Thus, the null hypothesis will be stated as H0: µ1 -
µ2 = Δ0. In many cases, we will specify Δ0 = 0 so that we are testing the
equality of two means (i.e., H0: µ1 = µ2). The appropriate test statistic is found
by replacing µ1 - µ2 in Equation 10-1 by Δ0:

$$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

This test statistic has a standard normal distribution under H0. That is, the standard
normal distribution is the reference distribution for the test statistic.

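Critical regions for this test, where z0 denotes the computed value of the test
statistic, are as follows:

Alternative hypothesis      Rejection criterion
H1: µ1 - µ2 ≠ Δ0            z0 > zα/2 or z0 < -zα/2
H1: µ1 - µ2 > Δ0            z0 > zα
H1: µ1 - µ2 < Δ0            z0 < -zα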

Example:

A product developer is interested in reducing the drying time of a primer paint. Two
formulations of the paint are tested; formulation 1 is the standard chemistry, and
formulation 2 has a new drying ingredient that should reduce the drying time. From
experience, it is known that the standard deviation of drying time is 8 minutes, and this
inherent variability should be unaffected by the addition of the new ingredient. Ten
specimens are painted with formulation 1, and another 10 specimens are painted with
formulation 2; the 20 specimens are painted in random order. The two sample average
drying times are x̄1 = 121 minutes and x̄2 = 112 minutes, respectively. What
conclusions can the product developer draw about the effectiveness of the new
ingredient, using α = 0.05?

Solution:
We apply the eight-step procedure to this problem as follows:
1. The quantity of interest is the difference in mean drying times, µ1 - µ2, and Δ0 = 0.
2. H0: µ1 - µ2 = 0 or H0: µ1 = µ2
3. H1: µ1 > µ2
We want to reject H0 if the new ingredient reduces mean drying time.
4. α = 0.05
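5. The test statistic is

$$z_0 = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where σ1² = σ2² = (8)² = 64 and n1 = n2 = 10.
6. Reject H0: µ1 = µ2 if z0 > 1.645 = z0.05.
7. Computations: Since x̄1 = 121 minutes and x̄2 = 112 minutes, the test statistic is

$$z_0 = \frac{121 - 112}{\sqrt{\dfrac{(8)^2}{10} + \dfrac{(8)^2}{10}}} = 2.52$$
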
8. Conclusion: Since z0 = 2.52 > 1.645, we reject H0: µ1 = µ2 at the α = 0.05 level
and conclude that adding the new ingredient to the paint significantly reduces the
drying time. Alternatively, we can find the P-value for this test as

P-value = 1 – Φ(2.52) = 0.0059

Therefore, H0: µ1 = µ2 would be rejected at any significance level α ≥ 0.0059.
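
The arithmetic in this example can be checked with a short script. The following is a minimal sketch, not part of the original module: the function name two_sample_z is hypothetical, and the critical value and P-value are taken from Python's standard library rather than from a printed normal table.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Return z0 for H0: mu1 - mu2 = delta0 when both sigmas are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2 - delta0) / standard_error


# Values from the paint drying-time example.
z0 = two_sample_z(xbar1=121, xbar2=112, sigma1=8, sigma2=8, n1=10, n2=10)

alpha = 0.05
z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value z_0.05
p_value = 1 - NormalDist().cdf(z0)         # one-sided P-value for H1: mu1 > mu2

print(f"z0 = {z0:.2f}, critical value = {z_alpha:.3f}, reject H0: {z0 > z_alpha}")
print(f"P-value = {p_value:.4f}")
```

The script uses the unrounded value of z0, so its P-value may differ from the table-based figure in later decimal places; to four decimals it agrees with 0.0059.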


Lesson 2 Inference on the Difference in Means of two Normal
Distributions, Variances Unknown

We now consider tests of hypotheses on the difference in means µ1 - µ2 of two normal
distributions where the variances σ1² and σ2² are unknown. A t-statistic will be used to
test these hypotheses. The normality assumption is required to develop
the test procedure, but moderate departures from normality do not adversely affect
the procedure. Two different situations must be treated. In the first case, we assume
that the variances of the two normal distributions are unknown but equal; that is, σ1²
= σ2² = σ². In the second, we assume that σ1² and σ2² are unknown and not necessarily
equal.

Case 1. σ1² = σ2² = σ²

Suppose we have two independent normal populations with unknown means µ1 and µ2
and unknown but equal variances, σ1² = σ2² = σ². We wish to test

H0: µ1 - µ2 = Δ0
H1: µ1 - µ2 ≠ Δ0

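Let X11, X12, . . . , X1n1 be a random sample of n1 observations from the first population
and X21, X22, . . . , X2n2 be a random sample of n2 observations from the second population.
Let X̄1, X̄2, S1², and S2² be the sample means and sample variances, respectively. Now the
expected value of the difference in sample means is E(X̄1 - X̄2) = µ1 - µ2, so X̄1 - X̄2 is an
unbiased estimator of the difference in means. The variance of X̄1 - X̄2 is

$$V(\bar{X}_1 - \bar{X}_2) = \frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2} = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

The common variance σ² is estimated by combining the two sample variances into the
pooled estimator

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

Given the assumptions of this section, the quantity

$$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

has a t distribution with n1 + n2 - 2 degrees of freedom.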

The use of this information to test the hypotheses above is now straightforward:
simply replace µ1 - µ2 by Δ0 and the resulting test statistic has a t distribution with
n1 + n2 - 2 degrees of freedom under H0: µ1 - µ2 = Δ0. Therefore, the reference
distribution for the test statistic is the t distribution with n1 + n2 - 2 degrees of
freedom. The location of the critical region for both two- and one-sided alternatives
parallels those in the one-sample case. Because a pooled estimate of variance is used,
the procedure is often called the pooled t-test.
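
The pooled t-test just described can be sketched in a few lines, working from summary statistics. This is an illustration under stated assumptions, not part of the original module: the function name pooled_t is hypothetical, the pooled variance is assumed to be the usual degrees-of-freedom-weighted average of the two sample variances, and the input numbers are invented to keep the arithmetic easy to follow.

```python
from math import sqrt


def pooled_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Return (t0, degrees of freedom, sp) for the pooled t-test."""
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    sp = sqrt(sp2)
    t0 = (xbar1 - xbar2 - delta0) / (sp * sqrt(1 / n1 + 1 / n2))
    return t0, df, sp


# Hypothetical summary statistics (not data from this module).
t0, df, sp = pooled_t(xbar1=10.0, s1=2.0, n1=5, xbar2=8.0, s2=2.0, n2=5)
print(f"sp = {sp:.3f}, t0 = {t0:.3f}, df = {df}")
```

The computed t0 would then be compared with the critical value of the t distribution with df degrees of freedom, exactly as in the one-sample case.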

Example:
Two catalysts are being analyzed to determine how they affect the mean yield of a
chemical process. Specifically, catalyst 1 is currently in use, but catalyst 2 is
acceptable. Since catalyst 2 is cheaper, it should be adopted, providing it does not
change the process yield. A test is run in the pilot plant and results in the data shown
in Table 10-1. Is there any difference between the mean yields? Use α = 0.05, and
assume equal variances.
The solution using the eight-step hypothesis-testing procedure is as follows:
1. The parameters of interest are µ1 and µ2, the mean process yield using catalysts 1
and 2, respectively, and we want to know if µ1 - µ2 = 0.
2. H0: µ1 - µ2 = 0, or H0: µ1 = µ2
3. H1: µ1 ≠ µ2
4. α = 0.05
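5. The test statistic is

$$t_0 = \frac{\bar{x}_1 - \bar{x}_2 - 0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where sp is the pooled estimate of the common standard deviation.
6. Reject H0 if t0 > t0.025,14 = 2.145 or if t0 < -t0.025,14 = -2.145.
7. Computation: From the data in Table 10-1, the computed value of the test
statistic is t0 = -0.35.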
8. Conclusions: Since -2.145 < t0 = -0.35 < 2.145, the null hypothesis cannot be rejected.
That is, at the 0.05 level of significance, we do not have strong evidence to conclude
that catalyst 2 results in a mean yield that differs from the mean yield when
catalyst 1 is used.
A P-value could also be used for decision making in this example. From Appendix Table
IV we find that t0.40,14 = 0.258 and t0.25,14 = 0.692. Therefore, since 0.258 < 0.35 < 0.692,
we conclude that lower and upper bounds on the P-value are 0.50 < P < 0.80. Therefore,
since the P-value exceeds α = 0.05, the null hypothesis cannot be rejected.
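
The decision rule and the table-based P-value bounds above can be reproduced mechanically. The sketch below is illustrative only: both function names are hypothetical, and the only inputs are the critical value and the two table entries quoted in the text.

```python
def two_sided_decision(t0, t_crit):
    """Return True when H0 is rejected by a two-sided t-test."""
    return abs(t0) > t_crit


def bracket_two_sided_p(t0, table):
    """Bound the two-sided P-value from (upper-tail area, t value) table entries."""
    entries = sorted(table, key=lambda row: row[1])  # ascending t value
    for (area_at_lo, t_lo), (area_at_hi, t_hi) in zip(entries, entries[1:]):
        if t_lo < abs(t0) < t_hi:
            # A larger t value leaves a smaller tail area; double for two sides.
            return 2 * area_at_hi, 2 * area_at_lo
    return None  # |t0| is not bracketed by the supplied entries


t0 = -0.35
rejected = two_sided_decision(t0, t_crit=2.145)  # t_0.025,14 from the text
bounds = bracket_two_sided_p(t0, [(0.40, 0.258), (0.25, 0.692)])
print(f"reject H0: {rejected}, P-value bounds: {bounds}")
```

Doubling the tail areas reflects the two-sided alternative H1: µ1 ≠ µ2; for a one-sided test the tabulated areas would be used directly.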
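Case 2. σ1² ≠ σ2²

In some situations we cannot reasonably assume that the unknown variances σ1² and σ2²
are equal. If H0: µ1 - µ2 = Δ0 is true, the statistic

$$T_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

is distributed approximately as t with degrees of freedom given by

$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}$$

If ν is not an integer, round down to the nearest integer.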
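
A minimal sketch of the unequal-variance procedure follows. It assumes the standard Welch-Satterthwaite approximation for the degrees of freedom, which may differ in detail from the formula printed in the original module; the function name welch_t and the input numbers are hypothetical.

```python
from math import floor, sqrt


def welch_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Return (t0, nu) for the two-sample t-test without assuming equal variances."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    t0 = (xbar1 - xbar2 - delta0) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximate degrees of freedom (assumed form).
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, nu


# Hypothetical summary statistics (not data from this module).
t0, nu = welch_t(xbar1=10.0, s1=2.0, n1=5, xbar2=8.0, s2=4.0, n2=10)
print(f"t0 = {t0:.3f}, nu = {nu:.2f}, use df = {floor(nu)}")
```

Rounding ν down is the conservative choice, since fewer degrees of freedom give a larger critical value.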


Case 3. Paired t-Test

You can use a paired test when there is a natural pairing of observations in the
samples, such as when a sample group is tested twice, before and after an
experiment.
The paired two-sample Student's t-test determines whether observations that are
taken before a treatment and observations taken after a treatment are likely to have
come from distributions with equal population means.
Because the test is carried out on the differences within each pair, it does not
assume that the variances of both populations are equal.

Example:

An article in the Journal of Strain Analysis (1983, Vol. 18, No. 2) compares several
methods for predicting the shear strength for steel plate girders. Data for two of
these methods, the Karlsruhe and Lehigh procedures, when applied to nine specific
girders, are shown in Table 10-2. We wish to determine whether there is any
difference (on the average) between the two methods.
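
A paired comparison of this kind reduces to a one-sample t-test on the within-pair differences. The sketch below illustrates that reduction under stated assumptions: the function name paired_t is hypothetical, and the four pairs of numbers are invented for illustration; they are not the girder data of Table 10-2.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second, delta0=0.0):
    """Return (t0, degrees of freedom) for a paired t-test on matched observations."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)  # mean within-pair difference
    s_d = stdev(diffs)   # sample standard deviation of the differences
    t0 = (d_bar - delta0) / (s_d / sqrt(n))
    return t0, n - 1


# Hypothetical paired measurements (not data from this module).
method_a = [12, 15, 11, 14]
method_b = [10, 12, 10, 11]
t0, df = paired_t(method_a, method_b)
print(f"t0 = {t0:.3f}, df = {df}")
```

The computed t0 is compared with the t distribution on n - 1 degrees of freedom, where n is the number of pairs.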
