Mathematics Soln
2)
Definition (Independent vs. Dependent Samples)- Two samples are called independent
if they are drawn from two different populations and the elements of one sample have no
relationship to the elements of the second sample.
Example- A car magazine is comparing the total repair costs incurred during the first
three years on two sports cars, the T-999 and the XPY. Random samples of 45 T-999s and 51
XPYs are taken. Let X1 , X2 , · · · , X45 be the repair costs for the 45 T-999s and Y1 , Y2 , · · · , Y51
be the repair costs for the 51 XPYs. These two samples are independent.
Example- A nutritionist compares the average weights of 50 participants before and after
they went through a weight-loss program. Let X1 , X2 , · · · , X50 be the weights before and
Y1 , Y2 , · · · , Y50 be the weights after completing the program. These samples are dependent
because they involve the same 50 people.
Consider two populations with population means µ1 and µ2 and population SDs σ1 and σ2.
We would like to perform estimation and hypothesis testing for the difference in the
population means µ1 − µ2 . For this purpose, we take n1 samples X1 , X2 , · · · , Xn1 from
population 1 and n2 samples Y1 , Y2 , · · · , Yn2 from population 2. Let X̄ and Ȳ be the sample
means for the two populations, i.e.,
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_{n_1}}{n_1}$$
and
$$\bar{Y} = \frac{Y_1 + Y_2 + \cdots + Y_{n_2}}{n_2}.$$
Then it is easy to see that X̄ − Ȳ serves as an unbiased estimator for µ1 − µ2 , i.e.,
µX̄−Ȳ = µ1 − µ2 .
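The unbiasedness claim can be checked empirically. The following is a minimal simulation sketch, not part of the original notes: the population parameters (means 10 and 7, SDs 2 and 3), the sample sizes, the number of replications, and the function name `simulate_mean_difference` are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_mean_difference(mu1, sigma1, n1, mu2, sigma2, n2, reps, seed=0):
    """Return `reps` simulated values of X-bar minus Y-bar.

    Both populations are taken to be normal purely for convenience.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        x_bar = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        y_bar = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(x_bar - y_bar)
    return diffs


# Hypothetical populations with mu1 - mu2 = 10 - 7 = 3.
diffs = simulate_mean_difference(10.0, 2.0, 30, 7.0, 3.0, 40, reps=5000)
avg_diff = statistics.fmean(diffs)
print(f"average of X-bar - Y-bar over 5000 replications: {avg_diff:.3f}")
```

The average of the simulated differences should land close to µ1 − µ2 = 3, which is what unbiasedness predicts.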
What can we say about the SD and sampling distribution of X̄ − Ȳ ? When the two samples
are independent, we have the following result:
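Fact- If the two samples are independent, then
$$\sigma_{\bar{X}-\bar{Y}} = \sqrt{\sigma_{\bar{X}}^2 + \sigma_{\bar{Y}}^2}. \qquad (?)$$
Moreover, if either of the following conditions holds:
(1) The population distributions of both population 1 and population 2 are (approximately) normal.
(2) The sample sizes satisfy n1 ≥ 30 and n2 ≥ 30.
then X̄ − Ȳ is (approximately) normally distributed:
$$\bar{X}-\bar{Y} \sim N\left(\mu_{\bar{X}-\bar{Y}},\ \sigma^2_{\bar{X}-\bar{Y}}\right).$$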
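Remark- If $n_1/N_1 \le 0.05$ and $n_2/N_2 \le 0.05$, then $\sigma_{\bar{X}} = \sigma_1/\sqrt{n_1}$ and $\sigma_{\bar{Y}} = \sigma_2/\sqrt{n_2}$, and we can write (?) as
$$\sigma_{\bar{X}-\bar{Y}} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$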
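The formula in this Remark is easy to evaluate directly. The sketch below is illustrative only: the function name `se_mean_difference` is hypothetical, n1 = 45, n2 = 51 and σ2 = 1000 come from the repair-cost example, and σ1 = 800 is an assumed value that is not stated in the surviving text (it is chosen because it reproduces the 183.93 used later in the hypothesis test).

```python
import math


def se_mean_difference(sigma1, n1, sigma2, n2):
    """SD of X-bar minus Y-bar for two independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)


# sigma1 = 800 is an ASSUMED value; the other inputs are from the example.
se = se_mean_difference(sigma1=800, n1=45, sigma2=1000, n2=51)
print(f"sigma of X-bar - Y-bar: {se:.2f}")
```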
Remark- Throughout the rest of this chapter we assume the two samples under study are
independent and the populations satisfy one of the conditions provided in the previous Fact.
$$E = z\,\sigma_{\bar{X}-\bar{Y}}$$
is the margin of error and z > 0 is the unique number that satisfies
$$P(Z < z) = 1 - \alpha/2.$$
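The critical value z can be obtained from the inverse normal CDF instead of a table. This is a minimal sketch using Python's standard library; the helper names `z_critical` and `margin_of_error` are hypothetical, α = 0.05 is an assumed significance level chosen for illustration, and 183.93 is the standard deviation of X̄ − Ȳ that the text uses for the repair-cost example.

```python
from statistics import NormalDist


def z_critical(alpha):
    """Return z > 0 with P(Z < z) = 1 - alpha/2 for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def margin_of_error(alpha, sd_of_difference):
    """E = z * sigma_{X-bar - Y-bar}."""
    return z_critical(alpha) * sd_of_difference


alpha = 0.05  # ASSUMED level, not taken from the text
z = z_critical(alpha)
E = margin_of_error(alpha, 183.93)
print(f"z = {z:.3f}, E = {E:.1f}")
```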
and
n2 = 51, ȳ = 3850, σ2 = 1000.
Finally, x̄ − ȳ = 3300 − 3850 = −550 and the desired interval estimate is given by
H0 : µ1 = µ2 , H1 : µ1 ≠ µ2 .
x̄ − ȳ ≥ t OR x̄ − ȳ ≤ −t
for some t > 0. The critical points are where x̄ − ȳ = ±t. The relationship between
α and t comes through
(3) The evidence lies on the critical points when x̄ − ȳ = ±t, i.e., 3300 − 3850 = ±t
giving t = 550. Also, recall that
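$$\bar{X}-\bar{Y} \sim N\left(\mu_{\bar{X}-\bar{Y}},\ \sigma^2_{\bar{X}-\bar{Y}}\right).$$
Under H0, with $\sigma_{\bar{X}-\bar{Y}} = 183.93$, this becomes
$$\bar{X}-\bar{Y} \sim N(0,\ 183.93^2).$$
We have
$$\begin{aligned}
\text{p-value} &= P\left(\frac{\bar{X}-\bar{Y}-0}{183.93} \ge \frac{550-0}{183.93} \,\middle|\, H_0 \text{ is true}\right) + P\left(\frac{\bar{X}-\bar{Y}-0}{183.93} \le \frac{-550-0}{183.93} \,\middle|\, H_0 \text{ is true}\right) \\
&= P(Z \ge 2.99) + P(Z \le -2.99) \\
&= 2P(Z \le -2.99) \\
&= 2 \times 0.0014 \\
&= 0.0028.
\end{aligned}$$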
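The same p-value can be reproduced numerically. The sketch below uses only the numbers that appear in the computation above (x̄ − ȳ = 3300 − 3850 and the standard deviation 183.93); the function name `two_sided_z_p_value` is a hypothetical helper, not something defined in the notes.

```python
from statistics import NormalDist


def two_sided_z_p_value(diff, sd_of_difference, null_value=0.0):
    """Return (z, p) for a two-sided z-test of H0: mu1 - mu2 = null_value."""
    z = (diff - null_value) / sd_of_difference
    p = 2 * NormalDist().cdf(-abs(z))  # both tails, by symmetry
    return z, p


z_stat, p_value = two_sided_z_p_value(3300 - 3850, 183.93)
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

Because the code does not round z to two decimals before looking up the tail area, its result can differ from the table-based value in the last digit.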
We denote the common value of σ1 and σ2 by σ, i.e., σ1 = σ2 = σ. Recall that the sample variances
for the two samples are the random variables
$$S_1^2 = \frac{\sum X_i^2 - \dfrac{\left(\sum X_i\right)^2}{n_1}}{n_1 - 1}, \qquad S_2^2 = \frac{\sum Y_i^2 - \dfrac{\left(\sum Y_i\right)^2}{n_2}}{n_2 - 1}.$$
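As a quick check of the shortcut formula for the sample variance, the sketch below compares it with the library routine `statistics.variance`. The data values and the function name `sample_variance_shortcut` are hypothetical and chosen only for illustration.

```python
import statistics


def sample_variance_shortcut(values):
    """Sample variance via (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return (total_sq - total**2 / n) / (n - 1)


data = [4.0, 7.0, 1.0, 9.0, 4.0]  # hypothetical sample
s2 = sample_variance_shortcut(data)
print(s2, statistics.variance(data))
```

The shortcut form is convenient for hand computation; for long data sets with large values it can lose precision in floating point, which is one reason library routines use a different algorithm.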
µS 2 = σ 2 .
p
is the margin of error and t > 0 is the unique number that satisfies
n1 = 25, x̄ = 29, s1 = 7
and
n2 = 23, ȳ = 22, s2 = 6.2.
We need to find t > 0 such that $P(T_{25+23-2} > t) = \alpha/2$, i.e., $P(T_{46} > t) = 0.05$. Using Table
V, we read t = 1.679. Then the margin of error is
$$E = t\,s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 1.679 \times 6.63 \times \sqrt{\frac{1}{25} + \frac{1}{23}} = 3.216.$$
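The arithmetic in this example can be reproduced as follows. This is a sketch under one stated assumption: the definition of the pooled standard deviation does not appear in the surrounding text, so `pooled_sd` below uses the standard pooled estimator, which does reproduce the value 6.63 used above. The critical value 1.679 is taken from the text's table lookup rather than computed.

```python
import math


def pooled_sd(s1, n1, s2, n2):
    """Pooled SD (ASSUMED standard form): weights each variance by its df."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


n1, x_bar, s1 = 25, 29.0, 7.0
n2, y_bar, s2 = 23, 22.0, 6.2
t_crit = 1.679  # read from Table V in the text, df = 46

sp = pooled_sd(s1, n1, s2, n2)
E = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
interval = (x_bar - y_bar - E, x_bar - y_bar + E)
print(f"sp = {sp:.2f}, E = {E:.3f}")
print(f"interval for mu1 - mu2: ({interval[0]:.3f}, {interval[1]:.3f})")
```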
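$$\frac{\bar{x}-\bar{y}}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \ge t \quad \text{OR} \quad \frac{\bar{x}-\bar{y}}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \le -t$$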
$$\frac{\bar{X}-\bar{Y}}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \sim T_{n_1+n_2-2}.$$
Example- Consider the setup in the previous example. Test at a 5% significance level
whether the two population means are different.
We have
α = 0.05,
n1 = 25, x̄ = 29, s1 = 7
and
n2 = 23, ȳ = 22, s2 = 6.2.
H0 : µ1 = µ2 , H1 : µ1 ≠ µ2 .
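$$\frac{\bar{x}-\bar{y}}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \ge t \quad \text{OR} \quad \frac{\bar{x}-\bar{y}}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \le -t,$$
for some t > 0. The critical points are where
$$\frac{\bar{x}-\bar{y}}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} = \pm t.$$
The relationship between α and t comes through
$$\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true}) = P\left(\frac{\bar{X}-\bar{Y}}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \ge t \ \text{ OR } \ \frac{\bar{X}-\bar{Y}}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \le -t \,\middle|\, H_0 \text{ is true}\right).$$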
(3) Recall
n1 = 25, x̄ = 29, n2 = 23, ȳ = 22, sp = 6.63.
$$\frac{\bar{X}-\bar{Y}}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}} \sim T_{n_1+n_2-2} = T_{25+23-2} = T_{46}.$$
Then
p-value = P (T46 ≥ 3.654 OR T46 ≤ −3.654)
= 2P (T46 ≥ 3.654).
p-value ≤ 0.002.
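The test statistic 3.654 and the bound on the p-value can be checked without a table. The sketch below is illustrative and uses only the standard library: since Python has no built-in Student t CDF, the hypothetical helper `t_upper_tail` integrates the t density numerically with Simpson's rule. The inputs are the values recalled in step (3).

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T_df >= t) for t >= 0: 0.5 minus Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


n1, x_bar, n2, y_bar, sp = 25, 29.0, 23, 22.0, 6.63
df = n1 + n2 - 2
t_stat = (x_bar - y_bar) / (sp * math.sqrt(1 / n1 + 1 / n2))
p_value = 2 * t_upper_tail(abs(t_stat), df)  # two-sided test

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.5f}")
print("reject H0 at the 5% level:", p_value <= 0.05)
```

A table only brackets the p-value, which is why the text reports an upper bound; numerical integration gives a specific value that should fall below that bound.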