Lecture III
Fundamentals of Hypothesis
Testing: One-Sample Tests
Prof. Dr. Innocent NM.
Tel: 677540384/653754070
Email: [email protected]
In this Chapter:
Null Hypothesis: H0
Alternative Hypothesis: H1
The alternative hypothesis (denoted by H1 or Ha or HA) is the
statement that the parameter has a value that somehow differs from the
null hypothesis. It is a statement of what we believe is true if our
sample data cause us to reject the null hypothesis.
The symbolic form of the alternative hypothesis must use one of these
symbols: ≠, <, >.
Critical Region, Critical Value, Test Statistic
The critical region (or rejection region) is the set of all values of the
test statistic that cause us to reject the null hypothesis. For example, see
the red-shaded region in the previous figure
Significance Level
The significance level (denoted by α) is the probability that the test
statistic will fall in the critical region when the null hypothesis is
actually true.
Critical Value
A critical value is any value that separates the critical region (where we reject the
null hypothesis) from the values of the test statistic that do not lead to rejection of
the null hypothesis.
The critical values depend on the nature of the null hypothesis, the sampling
distribution that applies, and the significance level α.
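As a rough illustration of how the critical value depends on α and on the direction of the test, the sketch below computes z critical values for two-tailed, right-tailed, and left-tailed tests. It assumes the test statistic follows a standard normal distribution; the function name critical_values is hypothetical and chosen only for this example.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the z critical value(s) for significance level alpha.

    Assumes a standard normal sampling distribution for the test statistic.
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-c, c)
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)  # all of alpha in the upper tail
    if tail == "left":
        return (z.inv_cdf(alpha),)  # all of alpha in the lower tail
    raise ValueError("tail must be 'two', 'right', or 'left'")


for tail in ("two", "right", "left"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

With α = 0.05 this gives roughly ±1.96 for a two-tailed test and about 1.645 in absolute value for a one-tailed test.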
Two-tailed, Right-tailed, Left-tailed Tests
Right-tailed Test
Left-tailed Test
Hypothesis Testing
The hypothesis test is used to evaluate the results
from a research study in which
1. A sample is selected from the population.
2. The treatment is administered to the sample.
3. After treatment, the individuals in the
sample are measured.
Hypothesis Testing (cont.)
• The purpose of the hypothesis test is to decide
between two explanations:
1. The difference between the sample and
the population can be explained by sampling
error (there does not appear to be a
treatment effect)
2. The difference between the sample and
the population is too large to be
explained by sampling error (there does
appear to be a treatment effect).
Steps in hypothesis testing
Step 1:
Data: determine the variable, the sample size (n), the sample mean (x̄), and
the population standard deviation σ (or the sample standard deviation s if σ is
unknown).
Step 2.
Assumptions: We have two cases:
Case 1: Population is normally or approximately normally
distributed with known or unknown variance (sample size n may
be small or large),
Case 2: Population is not normal with known or unknown variance
(n is large i.e. n≥30).
Step 3
For a given level of significance α, state the null hypothesis, H0.
The null hypothesis always states that the treatment has no effect (no
change, no difference). According to the null
hypothesis, the population mean after treatment is the
same as it was before treatment.
The α level establishes a criterion, or "cut-off", for
making a decision about the null hypothesis.
Step 4
Step 5
P-Values in Hypothesis Testing
Confidence Intervals:
Because a confidence interval estimate of a population parameter contains the likely
values of that parameter, reject a claim that the population parameter has a value
that is not included in the confidence interval
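The confidence-interval approach can be sketched as follows. This is a minimal example that assumes a known population standard deviation and a normal sampling distribution; the numbers (sample mean 173, σ = 40, n = 64, claimed mean 170) and the function name z_confidence_interval are illustrative assumptions for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, conf_level=0.95):
    """Two-sided z confidence interval for a mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_crit * sigma / sqrt(n)  # critical value times standard error
    return xbar - margin, xbar + margin


# Illustrative numbers (assumed for this sketch)
low, high = z_confidence_interval(xbar=173, sigma=40, n=64)
claimed_mean = 170

# Reject the claim only if the claimed value lies outside the interval
reject = not (low <= claimed_mean <= high)
print(round(low, 1), round(high, 1), reject)
```

Here the claimed mean lies inside the interval, so the claim would not be rejected. A two-sided interval at confidence level 1 − α corresponds to a two-tailed test at significance level α.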
…P-Value
The P-value (or p-value or probability value) is the probability of
getting a value of the test statistic that is at least as extreme as the one
representing the sample data, assuming that the null hypothesis is true.
The null hypothesis is rejected if the P-value is no larger than the
significance level α, for example 0.05 or less.
Example:
1. There, we had H0: μ = 300 vs Ha: μ > 300 and Z = 1.18. What is the P-value for
this result?
2. Repeat (1) if Ha: μ ≠ 300 and Z = 1.18.
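One way to work both parts of this example is sketched below. It assumes the test statistic is standard normal under H0; the helper name p_value is hypothetical.

```python
from statistics import NormalDist


def p_value(z_stat, tail):
    """P-value of a z statistic for a right-, left-, or two-tailed test."""
    phi = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "right":
        return 1 - phi(z_stat)  # area above the observed statistic
    if tail == "left":
        return phi(z_stat)  # area below the observed statistic
    if tail == "two":
        return 2 * (1 - phi(abs(z_stat)))  # both tails, by symmetry
    raise ValueError("tail must be 'right', 'left', or 'two'")


p_right = p_value(1.18, "right")  # part 1: Ha: mu > 300
p_two = p_value(1.18, "two")      # part 2: Ha: mu != 300
print(round(p_right, 4), round(p_two, 4))
```

The right-tailed P-value is about 0.119 and the two-tailed P-value is twice that, about 0.238. Neither is below 0.05.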
• The P-value answers the question: What is the
probability of the observed test statistic or one more
extreme when H0 is true?
Interpretation
Conventions*
P > 0.10 non-significant evidence against H0
0.05 < P ≤ 0.10 marginally significant evidence
0.01 < P ≤ 0.05 significant evidence against H0
P ≤ 0.01 highly significant evidence against H0
Examples
P =.27 non-significant evidence against H0
P =.01 highly significant evidence against H0
* It is unwise to draw firm borders for “significance”
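The conventions above can be written as a small lookup. This is only a sketch of the listed cut-offs, and the function name describe_p_value is hypothetical; as the footnote says, the borders should not be treated as firm.

```python
def describe_p_value(p):
    """Map a P-value to the conventional wording for evidence against H0."""
    if not 0 <= p <= 1:
        raise ValueError("a P-value must lie between 0 and 1")
    if p > 0.10:
        return "non-significant"
    if p > 0.05:
        return "marginally significant"
    if p > 0.01:
        return "significant"
    return "highly significant"  # p <= 0.01


for p in (0.27, 0.01):
    print(p, describe_p_value(p))
```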
Summary
The case where sigma is unknown
Illustrative Example: “Body Weight”
• The problem: In the 1970s, 20–29 year old men in the U.S.
had a mean body weight μ of 170 pounds, with a standard deviation
σ of 40 pounds. We test whether the mean body weight in the
population now differs.
Inference on the Mean of a Population,
Variance Known
Assumptions
Hypothesis Testing on the Mean
We wish to test:
Note about Forming Your Own Claims (Hypotheses)
Test Statistic
The test statistic is a value used in making a decision about the null
hypothesis. It is found by converting the sample statistic to a score under the
assumption that the null hypothesis is true.
Example:
A survey of n = 880 randomly selected adult drivers showed that 56%
(or p = 0.56) of those respondents admitted to running red lights. Find
the value of the test statistic for the claim that the majority of all adult
drivers admit to running red lights.
Solution
H0: p = 0.5
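The slide shows only the null hypothesis, so the calculation below is a sketch that assumes the usual one-proportion z statistic, z = (p̂ − p0) / sqrt(p0(1 − p0)/n), with the standard error computed under H0. The function name one_proportion_z is hypothetical.

```python
from math import sqrt


def one_proportion_z(p_hat, p0, n):
    """z statistic for a sample proportion, using the standard error under H0."""
    se = sqrt(p0 * (1 - p0) / n)  # standard error assuming H0: p = p0
    return (p_hat - p0) / se


z_stat = one_proportion_z(p_hat=0.56, p0=0.5, n=880)
print(round(z_stat, 2))
```

The result rounds to 3.56.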
z statistic
• For the illustrative example, μ0 = 170
• We know σ = 40
• Take an SRS of n = 64. Therefore
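  $SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{64}} = 5$
• If we found a sample mean of 173, then
  $z_{\text{stat}} = \frac{\bar{x} - \mu_0}{SE_{\bar{x}}} = \frac{173 - 170}{5} = 0.60$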
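The same arithmetic can be checked in a few lines. The sketch below also reports a two-sided P-value, which is an addition made on the assumption that the body-weight question ("now differs") calls for a two-tailed test; the function name one_sample_z is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(xbar, mu0, sigma, n):
    """Standard error, z statistic, and two-sided P-value for a one-sample z test."""
    se = sigma / sqrt(n)              # standard error of the sample mean
    z_stat = (xbar - mu0) / se        # distance from mu0 in standard errors
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return se, z_stat, p_two_sided


se, z_stat, p = one_sample_z(xbar=173, mu0=170, sigma=40, n=64)
print(se, round(z_stat, 2), round(p, 4))
```

With these numbers the standard error is 5, the z statistic is 0.60, and the two-sided P-value is roughly 0.55, so a sample mean of 173 gives little evidence against H0.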
P(Z>3.56) = 1 – 0.9999 = 0.0001.
Homework
2. An inventor has developed a new, energy-efficient lawn mower engine. He claims
that the engine will run continuously for more than 5 hours (300 minutes) on a single
gallon of regular gasoline. (The leading brand lawnmower engine runs for 300
minutes on 1 gallon of gasoline.)
From his stock of engines, the inventor selects a simple random sample of 50
engines for testing. The engines run for an average of 305 minutes. The true standard
deviation σ is known and is equal to 30 minutes, and the run times of the engines are
normally distributed. Test the hypothesis that the mean run time is more than 300
minutes. Use a 0.05 level of significance.
3. Example: The Brinell scale is a measure of how hard a material is. An engineer
hypothesizes that the mean Brinell score of all subcritically annealed ductile iron
pieces is not equal to 170. The engineer measured the Brinell score of 25 pieces of
this type of iron and calculated the sample mean to be 174.52 and the sample standard
deviation to be 10.31. Perform a hypothesis test that the true average Brinell score is
not equal to 170, and construct the corresponding confidence interval. Set α = 0.01.
Errors in Hypothesis Tests
• Just because the sample mean (following treatment) is
different from the original population mean does not
necessarily indicate that the treatment has caused a change.
Type I Errors
• A Type I error occurs when the sample data appear to show a treatment effect
when, in fact, there is none.
• In this case the researcher will reject the null hypothesis and falsely conclude that
the treatment has an effect.
• Type I errors are caused by unusual, unrepresentative samples. Just by chance the
researcher selects an extreme sample with the result that the sample falls in the
critical region even though the treatment has no effect.
• The hypothesis test is structured so that Type I errors are very unlikely;
specifically, the probability of a Type I error is equal to the alpha level.
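The claim that the Type I error probability equals α can be checked with a small simulation. The sketch below is an illustration under assumed values (μ0 = 170, σ = 40, n = 64, α = 0.05, two-tailed z test): it draws many samples from a population where H0 is true and counts how often the test rejects. The function name type_i_error_rate is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of samples drawn under a true H0 that a two-tailed z test rejects."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: every sample comes from a population with mean mu0
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z_stat = (fmean(sample) - mu0) / se
        if abs(z_stat) > z_crit:
            rejections += 1  # an extreme sample fell in the critical region
    return rejections / trials


rate = type_i_error_rate(mu0=170, sigma=40, n=64, alpha=0.05, trials=5000)
print(rate)
```

The printed rate should land close to 0.05, with some simulation noise.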
Type II Errors
• A Type II error occurs when the sample does not appear to
have been affected by the treatment when, in fact, the
treatment does have an effect.
• In this case, the researcher will fail to reject the null hypothesis
and falsely conclude that the treatment does not have an effect.
Measuring Effect Size
• A hypothesis test evaluates the statistical
significance of the results from a research study.
• That is, the test determines whether or not it is
likely that the obtained sample mean occurred
without any contribution from a treatment effect.
• The hypothesis test is influenced not only by the
size of the treatment effect but also by the size
of the sample.
• Thus, even a very small effect can be significant
if it is observed in a very large sample.
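To illustrate the last point, the sketch below holds a small mean difference fixed and increases only the sample size. The values (a difference of 3 with σ = 40) are assumptions chosen for the example, and z_and_p is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def z_and_p(mean_diff, sigma, n):
    """z statistic and two-sided P-value for a fixed mean difference."""
    z_stat = mean_diff / (sigma / sqrt(n))  # standard error shrinks as n grows
    p = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return z_stat, p


# Same small effect, increasingly large samples
for n in (64, 640, 6400):
    z_stat, p = z_and_p(mean_diff=3, sigma=40, n=n)
    print(n, round(z_stat, 2), round(p, 4))
```

The effect is identical in every row, but the z statistic grows with the square root of n, so the result moves from clearly non-significant at n = 64 to highly significant at n = 6400.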
• Because a significant effect does not necessarily
mean a large effect, it is recommended that the
hypothesis test be accompanied by a measure
of the effect size.
• We use Cohen's d as a standardized measure
of effect size.
• Much like a z-score, Cohen's d measures the
size of the mean difference in terms of the
standard deviation.
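A minimal sketch of this calculation is given below, taking d as the mean difference divided by the standard deviation. Which standard deviation to use (the known population σ or the sample s) is an assumption left to the caller, and the example numbers (mean 173, μ0 = 170, standard deviation 40) are illustrative.

```python
def cohens_d(sample_mean, mu0, sd):
    """Mean difference expressed in standard deviation units."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (sample_mean - mu0) / sd


# Illustrative values (assumed for this sketch)
d = cohens_d(sample_mean=173, mu0=170, sd=40)
print(d)
```

Unlike the z statistic, d does not depend on the sample size, which is why it is reported alongside the test result.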
Power
• β ≡ probability of a Type II error
β = Pr(retain H0 | H0 false)
(the “|” is read as “given”)
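Power of the one-sample z test, where Φ is the standard normal cumulative distribution function:
$$1 - \beta = \Phi\left(-z_{1-\alpha/2} + \frac{|\mu_0 - \mu_a|\sqrt{n}}{\sigma}\right)$$
For μ0 = 170, μa = 190, σ = 40, n = 16, and α = 0.05 (two-sided):
$$1 - \beta = \Phi\left(-1.96 + \frac{|170 - 190|\sqrt{16}}{40}\right) = \Phi(0.04) = 0.5160$$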
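The power calculation with the numbers above (|170 − 190|, n = 16, σ = 40, 1.96, result 0.5160) can be reproduced as follows. This sketch assumes the usual approximation for a two-sided one-sample z test, which ignores the small chance of rejecting in the wrong tail; the function name z_test_power is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mua, sigma, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z test against mean mua."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return z.cdf(-z_crit + abs(mu0 - mua) * sqrt(n) / sigma)


power = z_test_power(mu0=170, mua=190, sigma=40, n=16)
print(round(power, 3))
```

The result is about 0.516, so with n = 16 the test would detect a true mean of 190 only about half the time.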
Sample Size Requirements
Sample size for one-sample z test:
$$n = \frac{\sigma^2 \left(z_{1-\beta} + z_{1-\alpha/2}\right)^2}{\Delta^2}$$
where
1 – β ≡ desired power
α ≡ desired significance level (two-sided)
σ ≡ population standard deviation
Δ = μ0 – μa ≡ the difference worth detecting
Example: Sample Size
Requirement
How large a sample is needed for a one-sample z
test with 90% power and α = 0.05 (two-tailed)
when σ = 40? Let H0: μ = 170 and Ha: μ = 190
(thus, Δ = μ0 − μa = 170 – 190 = −20)
$$n = \frac{\sigma^2 \left(z_{1-\beta} + z_{1-\alpha/2}\right)^2}{\Delta^2} = \frac{40^2 (1.28 + 1.96)^2}{(-20)^2} = 41.99$$
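The sample-size arithmetic can be checked with the short sketch below. It uses the rounded table values from the example (1.28 for 90% power and 1.96 for two-sided α = 0.05), and rounding the result up to a whole number of subjects is the usual practice, stated here as an assumption. The function name sample_size_z is hypothetical.

```python
from math import ceil


def sample_size_z(sigma, delta, z_power, z_alpha):
    """Raw (unrounded) sample size for a one-sample z test."""
    return sigma**2 * (z_power + z_alpha) ** 2 / delta**2


# Rounded table values as used in the example: z for 90% power, z for alpha/2
n_raw = sample_size_z(sigma=40, delta=-20, z_power=1.28, z_alpha=1.96)

# Round up, since a fraction of a subject cannot be sampled
n_required = ceil(n_raw)
print(round(n_raw, 2), n_required)
```

With the table values this gives 41.99, rounded up to 42. Unrounded quantiles give a slightly larger raw value, so the rounded-up answer can differ by one.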