Week 4 - Statistical hypothesis testing (2)(1)

The document outlines various statistical hypothesis testing methods, including Z-tests, t-tests, paired t-tests, F-tests, and tests for proportions. It provides learning objectives, assumptions, calculations, and examples for each method to compare population means, variances, and proportions. The document emphasizes the importance of normal distribution and random sampling in conducting these tests.

PRACTICE in BIOSTATISTICS

Statistical Hypothesis testing (2)
Z-test statistic
t-test statistic
Paired t-test statistic
F-test statistic
Learning objectives

• Test hypotheses and construct confidence intervals about the difference in 2 population means using the Z-test statistic
• Test hypotheses and construct confidence intervals about the difference in 2 population means using the t-test statistic
• Test hypotheses and construct confidence intervals about the difference in 2 population proportions using the Z-test statistic
• Test hypotheses and construct confidence intervals about 2 population variances using the F-test statistic
Difference in 2 Means using Z-test statistic

• Using the difference in the two sample means to test the difference in the population means
• Assumptions: - $\delta^2$ or $\delta$ (the population variance or standard deviation, conventionally written $\sigma$) is known
- Samples are independent
- Random sampling data
- Data is normally distributed, or the sample sizes are large (both $n_1 > 30$ and $n_2 > 30$)
*** The central limit theorem implies that the difference in two sample means is approximately normally distributed for large sample sizes, regardless of the shape of the populations
Difference in 2 Means using Z-test statistic

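• Hypotheses: $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 \neq 0$ (two-tailed), or $H_a: \mu_1 - \mu_2 > 0$ or $H_a: \mu_1 - \mu_2 < 0$ (one-tailed)
• Decision rule: reject $H_0$ if the Z-score falls beyond the critical value for the chosen $\alpha$, or equivalently if the p-value is less than $\alpha$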

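• Calculation:
$$Z_{score} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\delta_1^2}{n_1} + \dfrac{\delta_2^2}{n_2}}}$$

$$A\%\ CI_{\mu_1 - \mu_2} = (\bar{x}_1 - \bar{x}_2) \pm Z_{\frac{1 - A\%}{2}} \sqrt{\frac{\delta_1^2}{n_1} + \frac{\delta_2^2}{n_2}}$$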
A sample of 87 men showed that the average calcium
depletion per year is 3352 µg. The population standard
deviation is 1100 µg. A sample of 76 women showed
that the average calcium depletion per year is 5727 µg,
with a population standard deviation of 1700 µg. A
researcher wants to “prove” that women lose more
calcium. If they use 𝛼 = 0.001 and these sample data,
will they be able to reject a null hypothesis that
women annually lose as much (or less) calcium as men
do?
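The calcium example can be worked through numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and the one-tailed direction (H0: women lose as much or less, Ha: women lose more) is taken from the wording of the question.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from the calcium-depletion example (µg per year).
n_men, mean_men, sd_men = 87, 3352.0, 1100.0
n_women, mean_women, sd_women = 76, 5727.0, 1700.0
alpha = 0.001

# Assumed hypotheses: H0: mu_women - mu_men <= 0, Ha: mu_women - mu_men > 0.
standard_error = sqrt(sd_women**2 / n_women + sd_men**2 / n_men)
z_score = ((mean_women - mean_men) - 0.0) / standard_error

# One-tailed (right-tail) critical value from the standard normal distribution.
z_critical = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z_score > z_critical

print(f"Z = {z_score:.2f}, critical Z = {z_critical:.2f}, reject H0: {reject_h0}")
```

With these data the Z-score is about 10.42, far beyond the critical value of about 3.09, so the null hypothesis is rejected at $\alpha = 0.001$.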
Suppose we want to conduct a hypothesis test to
determine whether the average annual growth for an
animal species is different from the average annual
growth of another species.
Difference in 2 Means using t-test statistic

• Using the difference in the two sample means to test the difference in the population means
• Assumption: - 𝛿 2 or 𝛿 is unknown
- The variances are equal 𝜹𝟏 𝟐 = 𝜹𝟐 𝟐
- Samples are independent
- Random sampling data
- Data is normally distributed
Difference in 2 Means using t-test statistic

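• Hypotheses: $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 \neq 0$ (two-tailed), or $H_a: \mu_1 - \mu_2 > 0$ or $H_a: \mu_1 - \mu_2 < 0$ (one-tailed)
• Decision rule: reject $H_0$ if the t-score falls beyond the critical t value for the chosen $\alpha$ and $df = n_1 + n_2 - 2$, or equivalently if the p-value is less than $\alpha$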

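• Calculation:
$$t_{score} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \quad \text{where } df = n_1 + n_2 - 2$$

$$A\%\ CI_{\mu_1 - \mu_2} = (\bar{x}_1 - \bar{x}_2) \pm t_{\frac{1 - A\%}{2}} \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \quad \text{where } df = n_1 + n_2 - 2$$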
A researcher conducted an experiment on different hatching methods for shrimp. The researcher wants to know whether the methods yield different numbers of hatched shrimp, so that the result can be used to choose a method for the process.

Use the sample data to test whether the population average number of hatched shrimp differs between the methods, as a measure of the effectiveness of the hatching methods.
A coffee manufacturer is interested in estimating the
difference in the average daily coffee consumption of
regular‐coffee drinkers and decaffeinated‐coffee
drinkers. Its researcher randomly selects 13
regular‐coffee drinkers and asks how many cups of
coffee per day they drink. He randomly locates 15
decaffeinated‐coffee drinkers and asks how many cups
of coffee per day they drink. The average for the
regular‐coffee drinkers is 4.35 cups, with a standard
deviation of 1.20 cups. The average for the
decaffeinated‐coffee drinkers is 6.84 cups, with a
standard deviation of 1.42 cups. The researcher
assumes, for each population, that the daily
consumption is normally distributed, and he constructs
a 95% confidence interval to estimate the difference in
the averages of the two populations.
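The coffee example can be computed directly from the pooled-variance confidence interval formula. This is a sketch under stated assumptions: the critical value 2.056 is the standard t-table entry for a two-sided 95% interval with 26 degrees of freedom, hard-coded here because Python's standard library has no t distribution, and the difference is taken as regular minus decaffeinated.

```python
from math import sqrt

# Summary statistics from the coffee example (cups per day).
n1, mean1, s1 = 13, 4.35, 1.20   # regular-coffee drinkers
n2, mean2, s2 = 15, 6.84, 1.42   # decaffeinated-coffee drinkers

df = n1 + n2 - 2                 # 26 degrees of freedom
t_critical = 2.056               # assumed t-table value: two-sided 95%, df = 26

# Pooled variance, valid under the equal-variance assumption.
pooled_variance = (s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / df
standard_error = sqrt(pooled_variance) * sqrt(1 / n1 + 1 / n2)

difference = mean1 - mean2
margin = t_critical * standard_error
ci_lower, ci_upper = difference - margin, difference + margin

print(f"95% CI for mu1 - mu2: ({ci_lower:.2f}, {ci_upper:.2f})")
```

The interval is roughly (-3.52, -1.46) cups. Because it excludes zero, these data suggest the two populations differ in average daily consumption.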
Difference in 2 Means using paired t-test statistic

• Using the sample difference (D) between individual matched samples to determine whether there is a difference between the populations
• Assumptions: - $\delta^2$ or $\delta$ is unknown
- Samples are dependent, as in:
before & after studies
the same subject under different conditions
twin studies
- Random sampling data
- Data is normally distributed
Difference in 2 Means using paired t-test statistic

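• Hypotheses: $H_0: D = 0$ versus $H_a: D \neq 0$ (two-tailed), or $H_a: D > 0$ or $H_a: D < 0$ (one-tailed)
• Decision rule: reject $H_0$ if the t-score falls beyond the critical t value for the chosen $\alpha$ and $df = n - 1$, or equivalently if the p-value is less than $\alpha$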

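• Calculation:
$$t_{score} = \frac{\bar{d} - D}{\dfrac{s_d}{\sqrt{n}}}, \quad \text{where } df = n - 1, \text{ with } \bar{d} = \frac{\sum d}{n} \text{ and } s_d = \sqrt{\frac{\sum d^2 - \dfrac{(\sum d)^2}{n}}{n - 1}}$$

$$A\%\ CI_{D} = \bar{d} \pm t_{\frac{1 - A\%}{2}} \frac{s_d}{\sqrt{n}}, \quad \text{where } df = n - 1$$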
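The paired calculation can be sketched in a few lines of Python. The before and after measurements below are hypothetical, invented only to illustrate the arithmetic, and the hypothesized mean difference D is assumed to be 0.

```python
from math import sqrt
from statistics import stdev

# Hypothetical paired measurements (illustration only, not from the slides).
before = [10, 12, 9, 11, 13]
after = [12, 14, 9, 12, 15]

d = [a - b for a, b in zip(after, before)]   # pairwise differences
n = len(d)
hypothesized_D = 0                           # assumed value under H0

d_bar = sum(d) / n
# Standard deviation of the differences via the computational formula.
s_d = sqrt((sum(x**2 for x in d) - sum(d)**2 / n) / (n - 1))

t_score = (d_bar - hypothesized_D) / (s_d / sqrt(n))
df = n - 1

# Cross-check against the library's sample standard deviation.
assert abs(s_d - stdev(d)) < 1e-12

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, t = {t_score:.2f}, df = {df}")
```

For these made-up numbers the t-score is 3.5 with 4 degrees of freedom; it would then be compared with the critical t value from a table at the chosen $\alpha$.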
Suppose a stock market investor is
interested in determining whether
there is a significant difference in the
W/H (weight to height) ratio for
2-year-old children of different ethnic
groups in Vietnam at $\alpha = 0.01$. In an
effort to study this question, the
investor randomly samples nine ethnic
groups from Vietnam and records the
W/H ratios for each of these groups at
the end of year 1 and at the end of
year 2.
An analyst estimated with a
99% level of confidence that
there is no difference in the
number of bacteria colonies
with and without antibiotic
treatment. Calculate the
confidence interval for the mean
difference to check whether this
estimate holds.
Difference in 2 Proportions using Z-test statistic

• Using the sample proportions to make statistical inferences about two population proportions
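• Hypotheses: $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 \neq 0$ (two-tailed), or $H_a: p_1 - p_2 > 0$ or $H_a: p_1 - p_2 < 0$ (one-tailed)
• Decision rule: reject $H_0$ if the Z-score falls beyond the critical value for the chosen $\alpha$, or equivalently if the p-value is less than $\alpha$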

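• Calculation:
$$Z_{score} = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p} \cdot \bar{q} \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \quad \text{with } \bar{p} = \frac{x_1 + x_2}{n_1 + n_2} \text{ and } \bar{q} = 1 - \bar{p}$$

$$A\%\ CI_{p_1 - p_2} = (\hat{p}_1 - \hat{p}_2) \pm z_{\frac{1 - A\%}{2}} \sqrt{\frac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \frac{\hat{p}_2 \cdot \hat{q}_2}{n_2}}$$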
Calculate the confidence interval and test the hypothesis that the two
population proportions differ, at the indicated significance level:

a/ $n_1 = 100$, $x_1 = 24$, $n_2 = 95$, $x_2 = 39$ at $\alpha = 0.001$

b/ $n_1 = 400$, $x_1 = 48$, $n_2 = 480$, $x_2 = 187$ at $\alpha = 0.02$
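Both parts of this exercise can be checked with a short Python sketch. It assumes a two-tailed test and a confidence level of $1 - \alpha$ for the interval, neither of which the exercise states explicitly; the function name `two_proportion_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_test(x1, n1, x2, n2, alpha):
    """Return (z_score, z_critical, confidence_interval, reject_h0)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2

    # The test statistic uses the pooled proportion under H0: p1 - p2 = 0.
    p_bar = (x1 + x2) / (n1 + n2)
    q_bar = 1 - p_bar
    z_score = (p1_hat - p2_hat) / sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))

    # Assumed two-tailed test: critical value at 1 - alpha/2.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)

    # The confidence interval uses the unpooled standard error.
    se_ci = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    margin = z_critical * se_ci
    ci = (p1_hat - p2_hat - margin, p1_hat - p2_hat + margin)

    return z_score, z_critical, ci, abs(z_score) > z_critical

result_a = two_proportion_test(24, 100, 39, 95, alpha=0.001)
result_b = two_proportion_test(48, 400, 187, 480, alpha=0.02)

for label, (z, z_crit, ci, reject) in (("a", result_a), ("b", result_b)):
    print(f"{label}: Z = {z:.2f}, critical = ±{z_crit:.2f}, "
          f"CI = ({ci[0]:.3f}, {ci[1]:.3f}), reject H0: {reject}")
```

Under these assumptions, part a gives a Z-score of about -2.55, which does not exceed the critical value of about 3.29, so H0 is not rejected. Part b gives a Z-score of about -9.00 against a critical value of about 2.33, so H0 is rejected.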


Equality of 2 Variances using F-test statistic

• Using the sample variances to test for equality of two population variances
• Assumptions: - Data is normally distributed
- Random sampling data
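• Hypotheses: $H_0: \delta_1^2 = \delta_2^2$ versus $H_a: \delta_1^2 \neq \delta_2^2$
• Decision rule: reject $H_0$ if the F-score is greater than the right-tail critical F value or less than the left-tail critical F value for the chosen $\alpha$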

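• Calculation:
$$F_{score} = \frac{s_1^2}{s_2^2}, \quad \text{with } df_{numerator} = v_1 = n_1 - 1 \text{ and } df_{denominator} = v_2 = n_2 - 1$$

*** Note: the critical F values in the right tail and the left tail are not symmetric. The left-tail critical value for $(v_1, v_2)$ degrees of freedom is the reciprocal of the right-tail critical value for $(v_2, v_1)$ degrees of freedom.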
Suppose a machine produces metal sheets that are specified to be 22
millimeters thick. Because of the machine, the operator, the raw
material, the manufacturing environment, and other factors, there is
variability in the thickness. Two machines produce these sheets.
Operators are concerned about the consistency of the two machines. To
test consistency, they randomly sample 10 sheets produced by machine 1
and 12 sheets produced by machine 2. The thickness measurements of
sheets from each machine are given in the table. Assume sheet thickness
is normally distributed in the population. How can we test to determine
whether the variance from each sample comes from the same population
variance (population variances are equal) or from different population
variances (population variances are not equal)?
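The F-score calculation for such a comparison can be sketched as follows. The thickness table is not reproduced in this text, so the measurements below are hypothetical and use fewer sheets than the 10 and 12 described; only the arithmetic is illustrated.

```python
from statistics import variance

# Hypothetical sheet thicknesses in millimeters (illustration only).
machine_1 = [22.0, 22.4, 21.6, 22.2, 21.8]
machine_2 = [22.0, 22.1, 21.9, 22.0, 22.2, 21.8]

# Sample variances (n - 1 in the denominator).
s1_squared = variance(machine_1)
s2_squared = variance(machine_2)

f_score = s1_squared / s2_squared
df_numerator = len(machine_1) - 1
df_denominator = len(machine_2) - 1

print(f"F = {f_score:.2f} with df = ({df_numerator}, {df_denominator})")
```

For these made-up values the F-score is about 5.0 with (4, 5) degrees of freedom. It would then be compared with the right-tail and left-tail critical F values from a table at the chosen $\alpha$.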
