Estimation and Hypothesis Testing

This document provides an overview of estimation and hypothesis testing for two populations. It discusses constructing confidence intervals and performing hypothesis tests on: 1) The difference between two population means for small, independent samples with unequal standard deviations using the t-distribution. 2) The difference between two population proportions for large, independent samples using the normal distribution. Examples and exercises are provided to illustrate calculating confidence intervals and conducting hypothesis tests to compare two populations.

Uploaded by

Wan Hamzah
Copyright © All Rights Reserved

Estimation & Hypothesis Testing (For Two Populations)


BEKG 2462
ENGINEERING MATHEMATICS & STATISTICS
Learning Outcomes
At the end of this lecture, students should be able to:
1. Construct a confidence interval and perform a hypothesis test about the
difference between two population means for small, independent samples
with unequal standard deviations.
2. Construct a confidence interval and perform a hypothesis test about the
difference between two population proportions for large, independent
samples.
Inferences About the Difference Between Two Population Means for Small and
Independent Samples: Unequal Standard Deviations
The t-distribution is used to make inferences about $\mu_1 - \mu_2$ when the
following conditions hold:
(a) The two populations from which the two samples are drawn are normally or
approximately normally distributed.
(b) The samples are small ($n_1 < 30$ and $n_2 < 30$) and independent.
(c) The standard deviations $\sigma_1$ and $\sigma_2$ of the two populations are
unknown and unequal.
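The degrees of freedom are given by

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

The degrees of freedom given by the formula are always rounded down to the
nearest integer.

Estimate of the Standard Deviation of $\bar{x}_1 - \bar{x}_2$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$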

Interval Estimation of $\mu_1 - \mu_2$
The $(1 - \alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\, s_{\bar{x}_1 - \bar{x}_2}$$


The test statistic $t$ for $\bar{x}_1 - \bar{x}_2$ is given by

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$$
Example
Assuming that the two populations are normally distributed with unequal and
unknown population standard deviations, construct a 98% confidence interval
for 𝜇1 − 𝜇2 for the following:
Solution
Exercise
A manufacturing company is interested in buying one of two machines. The company
tested the two machines for production purposes. The first machine was run for 15 days
and produced an average of 111 items per day with a standard deviation of 10 items.
The second machine was run for 18 days and produced an average of 118 items per day
with a standard deviation of 8 items. Assume that the production per day for each
machine is normally distributed and that the standard deviations of the daily
productions of the two populations are unequal.
(a) Make a 95% confidence interval for the difference between the two population
means.
(b) Using the 1% significance level, can you conclude that the mean number of items
produced per day by the first machine is lower than that of the second machine?
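The exercise can be checked numerically. The following is a minimal Python sketch, not an official solution: the helper name `welch_summary` is hypothetical, and the two critical values are assumed to be read from a standard t table at the rounded-down degrees of freedom (they would need to be looked up again if the data change).

```python
import math


def welch_summary(n1, s1, n2, s2):
    """Return (standard error of x1_bar - x2_bar, degrees of freedom rounded down)."""
    v1 = s1 ** 2 / n1  # s1^2 / n1
    v2 = s2 ** 2 / n2  # s2^2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, math.floor(df)


# Data from the exercise: machine 1 and machine 2
n1, mean1, s1 = 15, 111.0, 10.0
n2, mean2, s2 = 18, 118.0, 8.0

se, df = welch_summary(n1, s1, n2, s2)
diff = mean1 - mean2

# ASSUMPTION: critical values taken from a t table for df = 26
T_CRIT_0025 = 2.056  # area 0.025 in the right tail, used for the 95% interval
T_CRIT_001 = 2.479   # area 0.01 in the right tail, used for the 1% left-tailed test

# (a) 95% confidence interval for mu1 - mu2
ci = (diff - T_CRIT_0025 * se, diff + T_CRIT_0025 * se)

# (b) H0: mu1 = mu2 versus H1: mu1 < mu2, so reject H0 when t < -t_alpha
t_stat = (diff - 0.0) / se
reject_h0 = t_stat < -T_CRIT_001

print(f"df = {df}, standard error = {se:.4f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```

Under these assumptions the test statistic is about -2.19, which does not fall below -2.479, so the sketch does not reject the null hypothesis at the 1% level.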
Inferences About the Difference Between Two Population Proportions for Large
and Independent Samples
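Mean, Standard Deviation, and Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is (approximately) normal with:
The mean of $\hat{p}_1 - \hat{p}_2$, denoted by $\mu_{\hat{p}_1 - \hat{p}_2}$, is given by

$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$$

The standard deviation of $\hat{p}_1 - \hat{p}_2$, denoted by $\sigma_{\hat{p}_1 - \hat{p}_2}$, is given by

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} = \sqrt{\frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2}}$$

where $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$.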
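Interval Estimation of $p_1 - p_2$
The $(1 - \alpha)100\%$ confidence interval for $p_1 - p_2$ is given by

Known $p_1$ and $p_2$:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\, \sigma_{\hat{p}_1 - \hat{p}_2}, \qquad \sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

Unknown $p_1$ and $p_2$:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\, s_{\hat{p}_1 - \hat{p}_2}, \qquad s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$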

The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal
provided that both samples are large.
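Hypothesis Test about $p_1 - p_2$
The test statistic $z$ for $\hat{p}_1 - \hat{p}_2$ is given by:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{s_{\hat{p}_1 - \hat{p}_2}}$$

where

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \qquad \bar{q} = 1 - \bar{p}$$

The value of $\bar{p}$ is called the pooled sample proportion and is given by

$$\bar{p} = \frac{\hat{p}_1 n_1 + \hat{p}_2 n_2}{n_1 + n_2}$$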
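Hypothesis Statement and Rejection Region

$H_0: p_1 = p_2$ versus $H_1: p_1 > p_2$: reject $H_0$ if $z > z_{\alpha}$.

$H_0: p_1 = p_2$ versus $H_1: p_1 < p_2$: reject $H_0$ if $z < -z_{\alpha}$.

$H_0: p_1 = p_2$ versus $H_1: p_1 \neq p_2$: reject $H_0$ if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.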
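The three rejection regions can be expressed as a small decision function. This is an illustrative Python sketch only: the function name `rejects_h0` and the labels `"greater"`, `"less"` and `"two-sided"` are assumptions made here, and the critical values come from the standard library's `statistics.NormalDist` rather than a printed normal table, so they carry more decimal places than table values.

```python
from statistics import NormalDist


def rejects_h0(z, alpha, alternative):
    """Return True if the observed z falls in the rejection region.

    alternative: "greater"   for H1: p1 > p2  (reject if z > z_alpha)
                 "less"      for H1: p1 < p2  (reject if z < -z_alpha)
                 "two-sided" for H1: p1 != p2 (reject if |z| > z_{alpha/2})
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return z > std_normal.inv_cdf(1 - alpha)
    if alternative == "less":
        return z < -std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Hypothetical z values, chosen only to exercise each branch
print(rejects_h0(2.0, 0.05, "greater"))
print(rejects_h0(-1.8, 0.05, "less"))
print(rejects_h0(1.8, 0.05, "two-sided"))
```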
Example
Construct a 97% confidence interval for $p_1 - p_2$ for the following:

$n_1 = 300$, $\hat{p}_1 = 0.53$, $n_2 = 200$, $\hat{p}_2 = 0.59$
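Solution

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} = \sqrt{\frac{(0.53)(0.47)}{300} + \frac{(0.59)(0.41)}{200}} = 0.0452$$

From the normal table with $\alpha/2 = 0.015$, the z-value is 2.17.

Thus, the 97% confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\, s_{\hat{p}_1 - \hat{p}_2} = (0.53 - 0.59) \pm 2.17(0.0452) = (-0.1581, 0.0381)$$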
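The interval in this example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper name `two_proportion_ci` is hypothetical, and the z-value is computed with `statistics.NormalDist` instead of being read from a table, so the endpoints may differ from the hand calculation in the fourth decimal place.

```python
import math
from statistics import NormalDist


def two_proportion_ci(p_hat1, n1, p_hat2, n2, confidence):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    se = math.sqrt(p_hat1 * (1 - p_hat1) / n1 + p_hat2 * (1 - p_hat2) / n2)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    diff = p_hat1 - p_hat2
    return diff - z * se, diff + z * se


# Values from the example: n1 = 300, p_hat1 = 0.53, n2 = 200, p_hat2 = 0.59
lower, upper = two_proportion_ci(0.53, 300, 0.59, 200, 0.97)
print(f"97% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

Because the interval contains zero, the sample data are consistent with no difference between the two population proportions.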
Example
A consumer protection agency wanted to check whether the proportions of luggage
lost by two airline companies differ. A sample of 600 pieces of luggage from airline
company P showed that 12 were lost. Another sample of 700 pieces of luggage from
airline company Q showed that 13 were lost.
(a) What is the point estimate of the difference between the two population
proportions?
(b) Construct a 94% confidence interval for the difference in the proportions of all
luggage lost between airline companies P and Q.
(c) Testing at the 3% significance level, can you conclude that the proportions of all
luggage lost between airline companies P and Q are different?
Solution (a)

$n_1 = 600$, $\hat{p}_1 = \frac{12}{600} = 0.02$, $n_2 = 700$, $\hat{p}_2 = \frac{13}{700} = 0.0186$

The point estimate of $p_1 - p_2$ is

$$\hat{p}_1 - \hat{p}_2 = \frac{12}{600} - \frac{13}{700} = \frac{1}{700} = 0.0014$$

Solution (b)

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} = \sqrt{\frac{(0.02)(0.98)}{600} + \frac{(0.0186)(0.9814)}{700}} = 0.0076$$

From the normal table with $\alpha/2 = 0.03$, the z-value is 1.88.
Thus, the 94% confidence interval for $p_1 - p_2$ is

$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\, s_{\hat{p}_1 - \hat{p}_2} = 0.0014 \pm 1.88(0.0076) = (-0.0129, 0.0157)$$
Solution (c)

Step 1: Hypothesis statement:

$H_0: p_1 = p_2$
$H_1: p_1 \neq p_2$

Step 2: With $\alpha = 0.03$, $\alpha/2 = 0.015$, so the critical values are $\pm 2.17$ and the
rejection regions are $z < -2.17$ and $z > 2.17$.
Step 3: Test statistic:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{s_{\hat{p}_1 - \hat{p}_2}}, \qquad s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\bar{p}\,\bar{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \qquad \bar{p} = \frac{\hat{p}_1 n_1 + \hat{p}_2 n_2}{n_1 + n_2}$$

$$\bar{p} = \frac{0.02(600) + 0.0186(700)}{1300} = 0.0192$$

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{0.0192(0.9808)\left(\frac{1}{600} + \frac{1}{700}\right)} = 0.0076$$

$$z = \frac{(0.0014) - (0)}{0.0076} = 0.1842$$

Since $-2.17 < z < 2.17$, we fail to reject $H_0$. At the 3% significance level there is
not enough evidence to conclude that the proportions of all luggage lost between
airline companies P and Q are different.
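The pooled two-proportion z test for the luggage data can be sketched in Python as a check on the hand calculation. The function name `pooled_z` is hypothetical, and the critical value is computed with `statistics.NormalDist` rather than read from a table. Because the code keeps full precision instead of rounding intermediate values to four decimals, its z comes out near 0.187 rather than the 0.1842 obtained by hand; the decision is the same.

```python
import math
from statistics import NormalDist


def pooled_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the pooled sample proportion."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # equals (p_hat1*n1 + p_hat2*n2) / (n1 + n2)
    q_bar = 1 - p_bar
    se = math.sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2 - 0.0) / se  # hypothesised difference p1 - p2 is 0


# Luggage data: 12 lost out of 600 (airline P), 13 lost out of 700 (airline Q)
z = pooled_z(12, 600, 13, 700)

alpha = 0.03
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject_h0 = abs(z) > z_crit

print(f"z = {z:.4f}, critical values = +/-{z_crit:.2f}, reject H0: {reject_h0}")
```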
Data Analysis using MS EXCEL
