Chapter 8 - Hypothesis Testing - 2populations - L1 - Jan 2024

Uploaded by

afiqmaniyamin
Chapter 8

Hypothesis Testing
- Two Populations

L1: Hypothesis testing for means (two populations, variances known)

L2: Hypothesis testing for means (two populations, variances unknown but equal)

Learning Objectives

At the end of the lesson, students should be able to:

- Carry out hypothesis testing for the difference in means of two normal populations (variances known)
- Carry out hypothesis testing for the difference in means of two normal populations (variances unknown but equal)

Comparing Two Population Means

- Comparison between two different probability distributions. The purpose of the study is to determine whether there is a difference between the two probability distributions.
- The experimenter has two sets of data observations:

$x_1, x_2, \ldots, x_n$ from Population A
$y_1, y_2, \ldots, y_n$ from Population B

One aspect of the assessment is the comparison between the means $\mu_A$ and $\mu_B$.

If $\mu_A = \mu_B$: Population A = Population B (the probability distributions are equal, provided they can differ only in their means)

If $\mu_A \neq \mu_B$: the probability distributions are different


Comparing Two Population Means

Normal distribution, known variances

Normal distribution, unknown variances:
Case 1: $\sigma_1^2 = \sigma_2^2$
Case 2: $\sigma_1^2 \neq \sigma_2^2$
TEST ABOUT TWO NORMAL MEANS
WHEN TWO VARIANCES ARE KNOWN
Assumptions:

1. $X_{11}, X_{12}, X_{13}, \ldots, X_{1n_1}$ is a random sample from population 1 with mean $\mu_1$ and variance $\sigma_1^2$.
2. $X_{21}, X_{22}, X_{23}, \ldots, X_{2n_2}$ is a random sample from population 2 with mean $\mu_2$ and variance $\sigma_2^2$.
3. The two populations represented by $X_1$ and $X_2$ are independent.
4. Both populations are normally distributed.

Point estimator for the difference between the two means: $\mu_1 - \mu_2$ is estimated by $\bar{x}_1 - \bar{x}_2$.


TEST ABOUT TWO NORMAL MEANS
WHEN TWO VARIANCES ARE KNOWN
- Test about the difference $\mu_1 - \mu_2$

Null hypothesis:

$H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$, or $H_1: \mu_1 > \mu_2$, or $H_1: \mu_1 < \mu_2$

Test statistic:

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \sim N(0,1) \quad \text{if } \mu_1 = \mu_2$$

Alternative Hypothesis        Rejection Criteria (Reject H0)

$H_1: \mu_1 \neq \mu_2$       $Z > z_{\alpha/2}$ or $Z < -z_{\alpha/2}$
$H_1: \mu_1 > \mu_2$          $Z > z_{\alpha}$
$H_1: \mu_1 < \mu_2$          $Z < -z_{\alpha}$
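A $100(1-\alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$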

The procedures used in hypothesis testing for two populations are the same as in hypothesis testing for one population.

The conclusion of the test can also be based on the confidence interval and the p-value.
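
To illustrate that the critical-value, p-value and confidence-interval routes lead to the same decision, here is a minimal Python sketch of the two-sided procedure for known variances. The function name `two_sample_z`, its parameters, and the input numbers are hypothetical and chosen only for illustration; `NormalDist` is from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Two-sided z test of H0: mu1 = mu2 and CI for mu1 - mu2 (variances known)."""
    std_normal = NormalDist()                      # standard normal N(0, 1)
    diff = xbar1 - xbar2                           # point estimate of mu1 - mu2
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)     # standard error of the difference
    z = diff / se                                  # test statistic under H0
    z_crit = std_normal.inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    p_value = 2 * (1 - std_normal.cdf(abs(z)))     # two-sided p-value
    ci = (diff - z_crit * se, diff + z_crit * se)  # 100(1 - alpha)% interval
    return {"z": z, "z_crit": z_crit, "p_value": p_value, "ci": ci}


# Hypothetical inputs, used only to exercise the function.
result = two_sample_z(xbar1=50.0, xbar2=48.0, sigma1=4.0, sigma2=3.0, n1=40, n2=30)

reject_by_z = abs(result["z"]) > result["z_crit"]               # critical-value rule
reject_by_p = result["p_value"] < 0.05                          # p-value rule
reject_by_ci = not (result["ci"][0] <= 0.0 <= result["ci"][1])  # 0 outside the CI
print(result, reject_by_z, reject_by_p, reject_by_ci)
```

For a two-sided test of $\mu_1 - \mu_2 = 0$, the three flags always agree, because each rule is an algebraic restatement of $|Z| > z_{\alpha/2}$.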
Example 1:
A random sample of size $n_1 = 25$, taken from a normal population with standard deviation 5.2, has a mean of 81. A second random sample of size $n_2 = 36$, taken from a different normal population with standard deviation 3.4, has a mean of 76. Do the data indicate that the true means $\mu_1$ and $\mu_2$ are different? Carry out a test at $\alpha = 0.01$.
Solution
1. From the problem context, identify the parameter of interest.
Parameter of interest: the difference between the true means, $\mu_1 - \mu_2$.

2. State the null hypothesis $H_0$ and the appropriate alternative hypothesis $H_1$.
$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \neq \mu_2$

3. Determine the appropriate test statistic.

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

4. Critical value, given $\alpha = 0.01$:
$\alpha/2 = 0.01/2 = 0.005$, so $z_{\alpha/2} = z_{0.005} = 2.58$.

5. State the rejection region for the statistic.
Reject $H_0$ if $Z < -z_{0.005}$ or $Z > z_{0.005}$, that is, reject $H_0$ if $Z < -2.58$ or $Z > 2.58$.
6. Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute the value.

$\bar{x}_1 = 81$, $\bar{x}_2 = 76$, $\sigma_1 = 5.2$, $\sigma_2 = 3.4$, $n_1 = 25$, $n_2 = 36$

Compute the value of the test statistic:

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} = \frac{(81 - 76) - 0}{\sqrt{5.2^2/25 + 3.4^2/36}} = 4.22$$

7. Make a decision.

Decision: since $Z = 4.22 > 2.58$, we reject $H_0$. The data show sufficient evidence that $\mu_1 \neq \mu_2$.
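
The arithmetic in steps 4 to 7 can be checked numerically. The snippet below is a small sketch using only the Python standard library; the variable names are illustrative, and the critical value is computed from the normal quantile function rather than read from a table, so it appears as 2.576 instead of the rounded 2.58.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 1
xbar1, xbar2 = 81.0, 76.0
sigma1, sigma2 = 5.2, 3.4
n1, n2 = 25, 36
alpha = 0.01

se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)    # standard error of xbar1 - xbar2
z = (xbar1 - xbar2) / se                      # test statistic under H0: mu1 = mu2
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{0.005}
reject_h0 = abs(z) > z_crit                   # two-sided rejection rule

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```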
Example 1:
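Find a 99% CI on the difference in means.

$\bar{x}_1 = 81$, $\bar{x}_2 = 76$, $\sigma_1 = 5.2$, $\sigma_2 = 3.4$, $n_1 = 25$, $n_2 = 36$

Solution. From the formula:

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

$$(81 - 76) - 2.58\sqrt{\frac{(5.2)^2}{25} + \frac{(3.4)^2}{36}} \;\le\; \mu_1 - \mu_2 \;\le\; (81 - 76) + 2.58\sqrt{\frac{(5.2)^2}{25} + \frac{(3.4)^2}{36}}$$

$$1.944 \le \mu_1 - \mu_2 \le 8.056$$

The 99% confidence interval is [1.944, 8.056]. Since 0 is not in the interval, this agrees with the decision to reject $H_0$.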
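
For readers who want the intermediate values, the interval can be unpacked as follows (a worked-arithmetic sketch; figures are rounded, so the last digit may differ slightly depending on where rounding is applied):

```latex
\begin{align*}
\sqrt{\frac{5.2^2}{25} + \frac{3.4^2}{36}} &= \sqrt{1.0816 + 0.3211} \approx 1.1844 \\
z_{0.005} \times 1.1844 &= 2.58 \times 1.1844 \approx 3.056 \\
5 - 3.056 \;\le\; \mu_1 - \mu_2 &\;\le\; 5 + 3.056
\end{align*}
```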


Example 2:
The burning rates of two different solid-fuel propellants used in rocket systems are being studied. It is known that both propellants have approximately the same standard deviation of burning rate, namely 3 cm/second. Two random samples, each of 20 specimens, are tested, and the sample mean burning rates are 18 cm/second and 24 cm/second respectively.

i. Test the hypothesis that both propellants have the same mean burning rate, using the P-value approach.
ii. Construct a 95% two-sided CI on the difference in means, $\mu_1 - \mu_2$. What is the practical meaning of this interval?
Solution
1. From the problem context, identify the parameter of interest.
Parameter of interest: the difference between the true means, $\mu_1 - \mu_2$.

2. State the null hypothesis $H_0$ and the appropriate alternative hypothesis $H_1$.
$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \neq \mu_2$

3. Determine the appropriate test statistic.

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

6. Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute the value.

$\bar{x}_1 = 18$, $\bar{x}_2 = 24$, $\sigma_1^2 = 9$, $\sigma_2^2 = 9$, $n_1 = n_2 = 20$

Compute the value of the test statistic:

$$Z = \frac{(18 - 24) - 0}{\sqrt{\dfrac{9}{20} + \dfrac{9}{20}}} = -6.32$$
Find the p-value.

P-value $= 2(1 - \Phi(|{-6.32}|)) = 2(1 - \Phi(6.32)) \approx 2(1 - 1) = 0$

Since the p-value < 0.05, we reject the null hypothesis at $\alpha = 0.05$.

The two propellants do not have the same mean burning rate.
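
A normal table rounds $\Phi(6.32)$ to 1, which is why the p-value appears as 0. The following sketch, using only the Python standard library, shows that the p-value is tiny but not exactly zero. The variable names are illustrative, and the use of `erfc` is a choice made here for numerical accuracy, not something taken from the slides.

```python
from math import erfc, sqrt

# Data from Example 2
xbar1, xbar2 = 18.0, 24.0
sigma = 3.0          # common known standard deviation
n1 = n2 = 20

z = (xbar1 - xbar2) / sqrt(sigma**2 / n1 + sigma**2 / n2)

# Two-sided p-value 2 * (1 - Phi(|z|)). Because 1 - Phi(x) = 0.5 * erfc(x / sqrt(2)),
# this equals erfc(|z| / sqrt(2)), which avoids the cancellation in 1 - Phi(...)
# when the tail probability is very small.
p_value = erfc(abs(z) / sqrt(2))

print(f"z = {z:.2f}, p-value = {p_value:.2e}")
```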


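Find a 95% CI on the difference in means.

From the formula:

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

$$(18 - 24) - 1.96\sqrt{\frac{9}{20} + \frac{9}{20}} \;\le\; \mu_1 - \mu_2 \;\le\; (18 - 24) + 1.96\sqrt{\frac{9}{20} + \frac{9}{20}}$$

$$-7.859 \le \mu_1 - \mu_2 \le -4.141$$

Since $\mu_1 - \mu_2 = 0$ is not in the interval, we reject $H_0$. The two propellants do not have the same mean burning rate. In practical terms, the interval indicates that the mean burning rate of the second propellant exceeds that of the first by between about 4.1 and 7.9 cm/second.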
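
As a quick numerical check of the interval endpoints, the calculation can be sketched in a few lines of Python. The variable names are illustrative, and 1.96 is the tabulated $z_{0.025}$ used in the text. Both endpoints come out negative, with lower limit $-7.859$ and upper limit $-4.141$.

```python
from math import sqrt

# Data from Example 2
xbar1, xbar2 = 18.0, 24.0
var1 = var2 = 9.0      # known variances (sigma = 3 cm/second)
n1 = n2 = 20
z_crit = 1.96          # z_{0.025}, as used in the text

se = sqrt(var1 / n1 + var2 / n2)   # standard error of xbar1 - xbar2
margin = z_crit * se               # half-width of the 95% interval
lower = (xbar1 - xbar2) - margin
upper = (xbar1 - xbar2) + margin
zero_inside = lower <= 0.0 <= upper

print(f"{lower:.3f} <= mu1 - mu2 <= {upper:.3f}; contains 0: {zero_inside}")
```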
TEST ABOUT TWO MEANS
WHEN TWO VARIANCES ARE UNKNOWN

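- Test about the difference $\mu_1 - \mu_2 = 0$

Case 1: $\sigma_1^2 = \sigma_2^2$

Null hypothesis:

$H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$, or $H_1: \mu_1 > \mu_2$, or $H_1: \mu_1 < \mu_2$

Test statistic:

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where $S_p^2$ is the pooled estimator of the common variance $\sigma^2$:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$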
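
The pooled estimator is a weighted average of the two sample variances, with weights proportional to the degrees of freedom $n_1 - 1$ and $n_2 - 1$. A minimal Python sketch follows; the function names `pooled_variance` and `pooled_t_statistic` and the input numbers are hypothetical, introduced only for illustration.

```python
from math import sqrt


def pooled_variance(s1, s2, n1, n2):
    """Pooled estimate of the common variance from sample standard deviations s1, s2."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)


def pooled_t_statistic(xbar1, xbar2, s1, s2, n1, n2):
    """T statistic for H0: mu1 = mu2, assuming equal population variances."""
    sp = sqrt(pooled_variance(s1, s2, n1, n2))
    return (xbar1 - xbar2) / (sp * sqrt(1 / n1 + 1 / n2))


# Hypothetical inputs: with equal sample sizes the pooled variance
# is the plain average of the two sample variances, (4 + 16) / 2 = 10.
sp2_equal_n = pooled_variance(s1=2.0, s2=4.0, n1=8, n2=8)

# With unequal sample sizes the larger sample gets more weight.
sp2_unequal_n = pooled_variance(s1=2.0, s2=3.0, n1=10, n2=15)

t_stat = pooled_t_statistic(xbar1=10.0, xbar2=8.0, s1=2.0, s2=4.0, n1=8, n2=8)
print(sp2_equal_n, sp2_unequal_n, t_stat)
```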
TEST ABOUT TWO NORMAL MEANS WHEN TWO
VARIANCES ARE UNKNOWN

- Test about the difference $\mu_1 - \mu_2 = 0$

Case 1: $\sigma_1^2 = \sigma_2^2$

Critical region:

Alternative Hypothesis        Rejection Criteria (Reject H0)

$H_1: \mu_1 \neq \mu_2$       $T > t_{\alpha/2,\,n_1+n_2-2}$ or $T < -t_{\alpha/2,\,n_1+n_2-2}$
$H_1: \mu_1 > \mu_2$          $T > t_{\alpha,\,n_1+n_2-2}$
$H_1: \mu_1 < \mu_2$          $T < -t_{\alpha,\,n_1+n_2-2}$
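Confidence interval:

A $100(1-\alpha)\%$ two-sided confidence interval for $\mu_1 - \mu_2$ when the variances are unknown but equal:

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$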
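
The link between the two-sided test and this interval can be made explicit. The short derivation below is a sketch under the Case 1 assumptions. The shorthand $d = \bar{x}_1 - \bar{x}_2$, $se = S_p\sqrt{1/n_1 + 1/n_2}$ and $t = t_{\alpha/2,\,n_1+n_2-2}$ is introduced here for convenience and is not part of the original notation.

```latex
\begin{align*}
|T| \le t
  &\iff -t \le \frac{d - 0}{se} \le t \\
  &\iff -t \cdot se \le d \le t \cdot se \\
  &\iff d - t \cdot se \;\le\; 0 \;\le\; d + t \cdot se
\end{align*}
```

So the two-sided test fails to reject $H_0: \mu_1 = \mu_2$ at level $\alpha$ exactly when 0 lies inside the $100(1-\alpha)\%$ confidence interval.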
Example 3:

Professor Adam taught the same large lecture course for two terms. Except for negligible differences, the courses were the same. However, one met at 8am and the other met at 11am. The two courses were given final exams of the same degree of difficulty and covering the same material. Both exams were worth 100 points. A random sample of 16 students from the 8am class had an average score of 73.2 with standard deviation 8.1. A random sample of 16 students from the 11am class had an average score of 78.1 with standard deviation 10. Assume that the population variances are the same and that the data are drawn from normal distributions.

i. Do these data indicate that the mean score for the 11am class is higher than the mean score for the 8am class? Use $\alpha = 0.05$.

ii. What is the P-value for this test? What is your conclusion?

iii. Construct a 95% two-sided CI for the difference in mean scores. Interpret this interval.
Solution
1. From the problem context, identify the parameter of interest.
Parameter of interest: the difference between the mean scores, $\mu_1 - \mu_2$ (population 1 is the 8am class, population 2 is the 11am class).

2. State the null hypothesis $H_0$ and the appropriate alternative hypothesis $H_1$.

$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 < \mu_2$

3. Determine the appropriate test statistic.

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

4. Critical value, given $\alpha = 0.05$: $t_{\alpha,\,n_1+n_2-2} = t_{0.05,30} = 1.697$

5. State the rejection region for the statistic.

Reject $H_0$ if $T < -1.697$.


6. Compute the value of the test statistic.

$\bar{x}_1 = 73.2$, $\bar{x}_2 = 78.1$, $s_1 = 8.1$, $s_2 = 10.0$, $n_1 = 16$, $n_2 = 16$

$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(15)(8.1)^2 + (15)(10)^2}{16 + 16 - 2}} = 9.1$$

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{(73.2 - 78.1) - 0}{9.1\sqrt{\dfrac{1}{16} + \dfrac{1}{16}}} = -1.52$$


7. Make a decision.

Decision: since $T = -1.52 > -1.697$, we fail to reject $H_0$. There is not enough evidence to say that the mean score at 11am is higher than at 8am.
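
The pooled standard deviation and test statistic can be reproduced with a short Python sketch. The variable names are illustrative, and the critical value 1.697 is the tabulated $t_{0.05,30}$ quoted in the text rather than a computed quantile.

```python
from math import sqrt

# Data from Example 3
xbar1, s1, n1 = 73.2, 8.1, 16    # 8am class
xbar2, s2, n2 = 78.1, 10.0, 16   # 11am class
t_crit = 1.697                   # t_{0.05,30}, table value quoted in the text

# Pooled standard deviation, assuming equal population variances
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Test statistic under H0: mu1 = mu2
t_stat = (xbar1 - xbar2) / (sp * sqrt(1 / n1 + 1 / n2))

# Lower-tailed test, H1: mu1 < mu2
reject_h0 = t_stat < -t_crit

print(f"sp = {sp:.2f}, T = {t_stat:.2f}, reject H0: {reject_h0}")
```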
Find the p-value.
From the t-table with 30 df, $|T| = 1.52$ lies between $t_{0.10,30} = 1.310$ and $t_{0.05,30} = 1.697$, which gives $0.05 < p < 0.10$.

Since $p > 0.05$, we fail to reject $H_0$ at the 0.05 level of significance and conclude that there is not enough evidence to say that the mean score of the 11am class is higher than that of the 8am class.
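
A t-table only brackets the p-value. As a rough cross-check, the lower-tail probability can be approximated by numerically integrating the t density. The sketch below uses only the Python standard library; the helper names `t_pdf` and `t_cdf` and the choice of Simpson's rule are assumptions made here for illustration, not part of the original solution.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) with Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # approximates P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


# Lower-tailed p-value for Example 3: P(T <= -1.52) with 30 degrees of freedom
p_value = t_cdf(-1.52, df=30)
print(f"p-value = {p_value:.4f}")
```

The computed value falls inside the bracket $0.05 < p < 0.10$ obtained from the table.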
Find a 95% CI on the difference in means, where $t_{0.025,30} = 2.042$:

$\bar{x}_1 = 73.2$, $\bar{x}_2 = 78.1$, $s_p = 9.1$, $n_1 = 16$, $n_2 = 16$

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$$(73.2 - 78.1) - (2.042)(9.1)\sqrt{\frac{1}{16} + \frac{1}{16}} \;\le\; \mu_1 - \mu_2 \;\le\; (73.2 - 78.1) + (2.042)(9.1)\sqrt{\frac{1}{16} + \frac{1}{16}}$$

$$-11.47 \le \mu_1 - \mu_2 \le 1.67$$

Since $\mu_1 - \mu_2 = 0$ is inside the interval, we fail to reject $H_0$ and conclude that there is not enough evidence to say that the mean score at 11am is higher than at 8am.
Exercise:
The diameter of steel rods manufactured on two different extrusion machines is being investigated. Two random samples of sizes $n_1 = 15$ and $n_2 = 17$ are selected, with $\bar{x}_1 = 8.73$, $s_1^2 = 0.35$ and $\bar{x}_2 = 8.68$, $s_2^2 = 0.40$ respectively. Assume that the data are drawn from normal distributions with equal variances.

a) Is there evidence to support the claim that the two machines produce rods with different mean diameters? Use the p-value approach.

b) Construct a 95% two-sided CI on the difference in mean rod diameter.
Solution
1. From the problem context, identify the parameter of interest.
Parameter of interest: the difference between the true means, $\mu_1 - \mu_2$.

2. State the null hypothesis $H_0$ and the appropriate alternative hypothesis $H_1$.
$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \neq \mu_2$

3. Determine the appropriate test statistic.

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

6. Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute the value.

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{14(0.35) + 16(0.40)}{30}} = 0.614$$

$$T = \frac{(8.73 - 8.68) - 0}{0.614\sqrt{\dfrac{1}{15} + \dfrac{1}{17}}} = 0.230$$
Find the p-value.

P-value $= 2P(t_{30} > 0.230) > 2(0.40)$, so P-value > 0.80.

Since the p-value > 0.05, we do not reject the null hypothesis and conclude that there is no evidence that the two machines produce rods with different mean diameters.
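b) 95% confidence interval, with $t_{0.025,30} = 2.042$:

Let $E$ denote the margin of error:

$$E = t_{\alpha/2,\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 2.042(0.614)\sqrt{\frac{1}{15} + \frac{1}{17}}$$

$$(8.73 - 8.68) - E \;\le\; \mu_1 - \mu_2 \;\le\; (8.73 - 8.68) + E$$

$$-0.394 \le \mu_1 - \mu_2 \le 0.494$$

The 95% confidence interval is [-0.394, 0.494].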
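
The exercise figures can be checked with a short Python sketch. It uses $\bar{x}_1 = 8.73$, the value that appears in the worked solution, and the tabulated $t_{0.025,30} = 2.042$; the variable names are illustrative.

```python
from math import sqrt

# Data from the exercise (sample variances are given, not standard deviations)
xbar1, var1, n1 = 8.73, 0.35, 15
xbar2, var2, n2 = 8.68, 0.40, 17
t_crit = 2.042   # t_{0.025,30}, table value quoted in the solution

# Pooled standard deviation under the equal-variance assumption
sp = sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
se = sp * sqrt(1 / n1 + 1 / n2)   # standard error of xbar1 - xbar2

t_stat = (xbar1 - xbar2) / se     # test statistic under H0: mu1 = mu2
lower = (xbar1 - xbar2) - t_crit * se
upper = (xbar1 - xbar2) + t_crit * se

print(f"sp = {sp:.3f}, T = {t_stat:.3f}, CI = [{lower:.3f}, {upper:.3f}]")
```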
