
Week06 Hypothesis Testing

This document provides an overview of hypothesis testing. It discusses the basics, including defining the null and alternative hypotheses, calculating the test statistic, determining the P-value or critical value, and stating conclusions. Specific examples are provided to demonstrate key concepts like writing hypotheses for a claim about a proportion, calculating the test statistic, and determining the P-value. The document emphasizes correctly characterizing a test as one-tailed or two-tailed in order to determine critical values and P-values.

Uploaded by

a.bocus2510
Copyright
© © All Rights Reserved

Lecture 6

Hypothesis Testing
 Review and Preview
 Basics of Hypothesis Testing
 Testing a Claim about a Proportion
 Testing a Claim About a Mean
 Testing a Claim About a Standard Deviation
or Variance
 Type I & Type II errors

SIS 1037Y 2020/2021 2


 We have used descriptive statistics when we
summarized data using tools such as graphs, and
statistics such as the mean and standard deviation.
 Methods of inferential statistics use sample data to
make an inference or conclusion about a population.
 The two main activities of inferential statistics are
using sample data to
◦ (1) estimate a population parameter (such as estimating a
population parameter with a confidence interval), and
◦ (2) test a hypothesis or claim about a population parameter.
 We have seen methods for estimating a population
parameter with a confidence interval, and we now
look at methods of hypothesis testing.



 The main objective of this lecture is to
develop the ability to conduct hypothesis
tests for claims made about a population
proportion p, a population mean μ, or a
population standard deviation σ.



 Genetics: The Genetics & IVF Institute claims
that its XSORT method allows couples to
increase the probability of having a baby girl.
 Business: A newspaper cites a
PriceGrabber.com survey of 1631 subjects
and claims that a majority have heard of
Kindle as an e-book reader.
 Health: It is often claimed that the mean
body temperature is 98.6 degrees. We can
test this claim using a sample of 106 body
temperatures with a mean of 98.2 degrees.



 When conducting hypothesis tests as
described in this and the following topics,
instead of jumping directly to procedures and
calculations, be sure to consider the context
of the data, the source of the data, and the
sampling method used to obtain the sample
data.



 Review and Preview
 Basics of Hypothesis Testing
 Testing a Claim about a Proportion
 Testing a Claim About a Mean
 Testing a Claim About a Standard Deviation
or Variance



 This section presents individual components of a
hypothesis test. We should know and understand
the following:
◦ How to identify the null hypothesis and alternative
hypothesis from a given claim, and how to express both
in symbolic form
◦ How to calculate the value of the test statistic, given a
claim and sample data
◦ How to choose the sampling distribution that is relevant
◦ How to identify the P-value or identify the critical
value(s)
◦ How to state the conclusion about a claim in simple and
nontechnical terms



 A hypothesis is a claim or statement about a
property of a population.
 A hypothesis test is a procedure for testing a
claim about a property of a population.



 The null hypothesis (denoted by H0) is a
statement that the value of a population
parameter (such as proportion, mean, or
standard deviation) is equal to some claimed
value.
◦ We test the null hypothesis directly in the sense that we
assume it is true and reach a conclusion to either reject
H0 or fail to reject H0.
 The alternative hypothesis (denoted by H1 or HA)
is the statement that the parameter has a value
that somehow differs from the null hypothesis.
 The symbolic form of the alternative hypothesis
must use one of these symbols: <, >, ≠.



 Rare Event Rule for Inferential Statistics
◦ If, under a given assumption, the probability of a
particular observed event is exceptionally small, we
conclude that the assumption is probably not
correct.
 Note about Forming Your Own Claims
(Hypotheses)
◦ If you are conducting a study and want to use a
hypothesis test to support your claim, the claim
must be worded so that it becomes the alternative
hypothesis.



Assume that 100 babies are born to 100
couples treated with the XSORT method of
gender selection that is claimed to make girls
more likely.
We observe 58 girls in 100 babies. Write the
hypotheses to test the claim that “with the
XSORT method, the proportion of girls is
greater than the 50% that occurs without any
treatment”.
H0 : p = 0.5
H1 : p > 0.5



 The significance level (denoted by α) is the
probability that the test statistic will fall in the
critical region when the null hypothesis is
actually true (making the mistake of rejecting
the null hypothesis when it is true).
 This is the same α introduced previously.
 Common choices for α are 0.05, 0.01, and
0.10.



 The test statistic is a value used in making a
decision about the null hypothesis, and is
found by converting the sample statistic to a
score with the assumption that the null
hypothesis is true.



 Find the Value of the Test Statistic, Then Find
Either the P-Value or the Critical Value(s)
 First transform the relevant sample statistic
to a standardized score called the test
statistic.
 Then find the P-Value or the critical value(s).



 Again consider the claim that the XSORT
method of gender selection increases the
likelihood of having a baby girl.
 Preliminary results from a test of the XSORT
method of gender selection involved 100
couples who gave birth to 58 girls and 42 boys.
 Use the given claim and the preliminary results
to calculate the value of the test statistic.
 Use the format of the test statistic given above,
so that a normal distribution is used to
approximate a binomial distribution.



 The claim that the XSORT method of gender
selection increases the likelihood of having a
baby girl results in the following null and
alternative hypotheses:
 H0 : p = 0.5
 H1 : p > 0.5
 We work under the assumption that the null
hypothesis is true with p = 0.5.
 The sample proportion of 58 girls in 100 births
results in: p̂ = 58/100 = 0.58



$$z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.58 - 0.5}{\sqrt{(0.5)(0.5)/100}} = 1.60$$
 We know from previous lectures that a z
score of 1.60 is not “unusual”.
 At first glance, 58 girls in 100 births does
not seem to support the claim that the
XSORT method increases the likelihood of
having a girl (more than a 50% chance).
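
The test statistic above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_proportion_z` is an illustrative choice, not something from the lecture.

```python
import math

def one_proportion_z(x, n, p0):
    """z = (p_hat - p0) / sqrt(p0 * q0 / n): normal approximation to the binomial."""
    p_hat = x / n          # sample proportion
    q0 = 1 - p0            # q under the null hypothesis
    return (p_hat - p0) / math.sqrt(p0 * q0 / n)

# XSORT preliminary results: 58 girls among 100 births, H0: p = 0.5
z = one_proportion_z(58, 100, 0.5)
print(round(z, 2))  # 1.6
```

Note that the standard deviation in the denominator uses the value of p assumed in the null hypothesis, not the sample proportion.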



 The tails in a distribution are the extreme
regions bounded by critical values.
 Determinations of P-values and critical values
are affected by whether a critical region is in
two tails, the left tail, or the right tail. It,
therefore, becomes important to correctly
characterize a hypothesis test as two-tailed,
left-tailed, or right-tailed.



Two-tailed test:
H0: =
H1: ≠
α is divided equally between the two tails of the critical region



Left-tailed test:
H0: =
H1: <
All α in the left tail



Right-tailed test:
H0: =
H1: >
All α in the right tail



 The P-value (or probability value) is the
probability of getting a value of the test statistic
that is at least as extreme as the one
representing the sample data, assuming that the
null hypothesis is true.
 Critical region in the left tail: P-value = area to
the left of the test statistic
 Critical region in the right tail: P-value = area to
the right of the test statistic
 Critical region in two tails: P-value = twice the
area in the tail beyond the test statistic
 The null hypothesis is rejected if the P-value is
very small, such as 0.05 or less.



 The claim that the XSORT method of gender
selection increases the likelihood of having
a baby girl results in the following null and
alternative hypotheses:
 H0 : p = 0.5
 H1 : p > 0.5
 The test statistic was :
$$z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.58 - 0.5}{\sqrt{(0.5)(0.5)/100}} = 1.60$$
 The test statistic of z = 1.60 has an area of
0.0548 to its right, so a right-tailed test
with test statistic z = 1.60 has a P-value of
0.0548
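
The tail-area rules for P-values can be sketched in Python with the standard library's `statistics.NormalDist`. The helper name `normal_p_value` and its `tail` argument are illustrative assumptions, not notation from the lecture.

```python
from statistics import NormalDist

def normal_p_value(z, tail):
    """P-value for a z test statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()                      # mean 0, standard deviation 1
    if tail == "left":
        return std_normal.cdf(z)                   # area to the left of z
    if tail == "right":
        return 1 - std_normal.cdf(z)               # area to the right of z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))    # twice the area beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")

p_value = normal_p_value(1.60, "right")
print(round(p_value, 4))  # 0.0548
```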



 The critical region (or rejection region) is the set
of all values of the test statistic that cause us to
reject the null hypothesis. For example, see the
red-shaded region in the previous figures.
 A critical value is any value that separates the
critical region (where we reject the null
hypothesis) from the values of the test statistic
that do not lead to rejection of the null
hypothesis.
 The critical values depend on the nature of the
null hypothesis, the sampling distribution that
applies, and the significance level α.



 For the XSORT birth hypothesis test (right-tailed,
α = 0.05), the critical value is z = 1.645, and the
critical region is the area to the right of z = 1.645.



 Do not confuse a P-value with a proportion p.
 Know this distinction:
◦ P-value = probability of getting a test statistic at
least as extreme as the one representing sample
data
◦ p = population proportion



 The decision procedure depends on whether you are
using the P-value method or the critical value method.
 Decision Criterion
◦ P-value Method:
 Using the significance level α:
 If P-value ≤ α, reject H0.
 If P-value > α, fail to reject H0.
◦ Critical Value Method:
 If the test statistic falls within the critical region, reject H0.
 If the test statistic does not fall within the critical region, fail
to reject H0.



 For the XSORT baby gender test, the test had a
test statistic of z = 1.60 and a P-Value of
0.0548. We tested:
◦ H0 : p = 0.5
◦ H1 : p > 0.5
 Using the P-Value method, we would fail to
reject the null at the α = 0.05 level.
 Using the critical value method, we would fail
to reject the null because the test statistic of z
= 1.60 does not fall in the rejection region.
 We come to the same decision using either
method.
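
As a rough illustration, the two decision criteria can be applied side by side to the XSORT numbers. This sketch assumes a right-tailed z test and uses the standard library's `NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
z = 1.60                                      # test statistic from the XSORT example

# P-value method: reject H0 when the P-value is at most alpha
p_value = 1 - NormalDist().cdf(z)             # right-tailed test
reject_by_p_value = p_value <= alpha

# Critical value method: reject H0 when z falls in the right-tail critical region
z_critical = NormalDist().inv_cdf(1 - alpha)
reject_by_critical_value = z > z_critical

print(round(z_critical, 3), reject_by_p_value, reject_by_critical_value)
# 1.645 False False
```

Both flags come out `False`, so both methods lead to "fail to reject H0".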



 State a final conclusion that addresses the
original claim with wording that can be
understood by those without knowledge of
statistical procedures
 Example:
◦ For the XSORT baby gender test, there was not
sufficient evidence to support the claim that the
XSORT method is effective in increasing the
probability that a baby girl will be born.



 Never conclude a hypothesis test with a
statement of “reject the null hypothesis” or
“fail to reject the null hypothesis.”
 Always make sense of the conclusion with a
statement that uses simple nontechnical
wording that addresses the original claim.
 Accept Versus Fail to Reject
◦ Some texts use “accept the null hypothesis.”
◦ We are not proving the null hypothesis.
◦ Fail to reject says more correctly that the available
evidence is not strong enough to warrant rejection
of the null hypothesis.



 Review and Preview
 Basics of Hypothesis Testing
 Testing a Claim about a Proportion
 Testing a Claim About a Mean
 Testing a Claim About a Standard Deviation
or Variance



 Complete procedures for testing a hypothesis (or
claim) made about a population proportion.
 Components introduced in the previous weeks for
the P-value method, the traditional method or the
use of confidence intervals will be used.
 Two common methods for testing a claim about a
population proportion are:
◦ (1) to use a normal distribution as an approximation to the
binomial distribution, and
◦ (2) to use an exact method based on the binomial
probability distribution.



 Basic Methods of Testing Claims
about a Population Proportion p
 Notation
◦ n = sample size or number of trials
◦ p̂ = x/n (sample proportion)
◦ p = population proportion
◦ q= 1 – p



 1) The sample observations are a simple
random sample.
 2) The conditions for a binomial distribution
are satisfied.
 3) The conditions np ≥ 5 and nq ≥ 5 are both
satisfied, so the binomial distribution of
sample proportions can be approximated by
a normal distribution with μ = np and
σ = √(npq).
 Note: p is the assumed proportion not the
sample proportion.



 Test statistic:
$$z = \frac{\hat{p} - p}{\sqrt{pq/n}}$$
 P-values: Use the standard normal
distribution (Table) & figures
 Critical Values: Use the standard normal
distribution



 P-value = probability of getting a test
statistic at least as extreme as the one
representing sample data
 Computer programs and calculators usually
provide a P-value, so the P-value method is
used.
 See Figure on page ??.



 Critical Value Method
◦ See previous notes
 Confidence Interval Method
◦ In general, for two-tailed hypothesis tests,
construct a confidence interval with a
confidence level corresponding to the
significance level



 When testing claims about a population proportion,
the traditional method and the P-value method are
equivalent and will yield the same result since they
use the same standard deviation based on the
claimed proportion p.
 However, the confidence interval uses an estimated
standard deviation based upon the sample proportion
p̑.
 Consequently, it is possible that the traditional and
P-value methods may yield a different conclusion
than the confidence interval method.
 A good strategy is to use a confidence interval to
estimate a population proportion, but use the P-value
or traditional method for testing a claim about the
proportion.



 Based on information from a survey, 93% of
computer owners believe they have antivirus
programs installed on their computers.
 In a random sample of 400 scanned
computers, it is found that 380 of them (or
95%) actually have antivirus software
programs.
 Use the sample data from the scanned
computers to test the claim that 93% of
computers have antivirus software.



 Requirement check:
1. The 400 computers are randomly selected.
2. There is a fixed number of independent
trials with two categories (computer has an
antivirus program or does not).
3. The requirements np ≥ 5 and nq ≥ 5 are
both satisfied with n = 400
◦ np = 400*0.93 = 372
◦ nq = 400*0.07 = 28



1. The original claim that 93% of computers have
antivirus software can be expressed as p =
0.93.
2. The opposite of the original claim is p ≠ 0.93.
3. The hypotheses are written as:
◦ H0: p = 0.93
◦ H1: p ≠ 0.93
4. For the significance level, we select α = 0.05.
5. Because we are testing a claim about a
population proportion, the sample statistic
relevant to this test is:
◦ p̑, approximated by a normal distribution



6. The test statistic is calculated as
$$z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{\frac{380}{400} - 0.93}{\sqrt{(0.93)(0.07)/400}} = 1.57$$



6. Because the hypothesis test is two-tailed
with a test statistic of z = 1.57, the P-
value is twice the area to the right of z =
1.57.
The P-value is twice 0.0582, or 0.1164.



7. Because the P-value of 0.1164 is greater
than the significance level of α = 0.05, we
fail to reject the null hypothesis.
8. We fail to reject the claim that 93% of
computers have antivirus software. We
conclude that there is not sufficient sample
evidence to warrant rejection of the claim
that 93% of computers have antivirus
programs.
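
The P-value method for the antivirus example can be sketched end to end in Python. This is an illustration under the lecture's normal approximation, using only the standard library. Because the code keeps z unrounded, its P-value comes out near 0.117 rather than the table-based 0.1164 obtained from z = 1.57; the decision is the same.

```python
import math
from statistics import NormalDist

x, n = 380, 400        # computers with antivirus software, sample size
p0 = 0.93              # claimed proportion (H0: p = 0.93, H1: p != 0.93)
alpha = 0.05

p_hat = x / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Two-tailed test: twice the area beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = p_value <= alpha

print(round(z, 2), reject_h0)  # 1.57 False
```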



Steps 1 – 5 are the same as for the P-
value method.
6. The test statistic is computed to be z = 1.57. We
now find the critical values, with the critical
region having an area of α = 0.05, split equally in
both tails: z = –1.96 and z = 1.96.



7. Because the test statistic does not fall in the
critical region, we fail to reject the null
hypothesis.
8. We fail to reject the claim that 93% of
computers have antivirus software. We
conclude that there is not sufficient sample
evidence to warrant rejection of the claim
that 93% of computers have antivirus
programs.



 The claim of p = 0.93 can be tested at the α
= 0.05 level of significance with a 95%
confidence interval.
 Using previous methods, we get:
 0.929 < p < 0.971
 This interval contains p = 0.93, so we do
not have sufficient evidence to warrant the
rejection of the claim that 93% of computers
have antivirus programs.
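
For reference, the 95% confidence interval quoted above can be reproduced as follows. This is a minimal sketch assuming the usual normal-approximation interval p̂ ± z·√(p̂q̂/n); note that it uses the sample proportion in the standard error, unlike the test statistic.

```python
import math
from statistics import NormalDist

x, n = 380, 400
confidence = 0.95

p_hat = x / n
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)      # uses p_hat, not the claimed p

lower, upper = p_hat - margin, p_hat + margin
print(round(lower, 3), round(upper, 3))  # 0.929 0.971

contains_claim = lower < 0.93 < upper                     # True: do not reject p = 0.93
```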



 Testing Claims Using the Exact Method
◦ We can get exact results by using the binomial
probability distribution.
◦ Binomial probabilities are a nuisance to calculate
manually, but technology makes this approach quite
simple.
◦ Also, this exact approach does not require that np ≥ 5
and nq ≥ 5 so we have a method that applies when that
requirement is not satisfied.
◦ To test hypotheses using the exact binomial distribution,
use the binomial probability distribution with the P-
value method, use the value of p assumed in the null
hypothesis, and find P-values as follows:



 Left-tailed test:
◦ The P-value is the probability of getting x or fewer
successes among n trials.
 Right-tailed test:
◦ The P-value is the probability of getting x or more
successes among n trials.
 Two-tailed test:
◦ If p̂ > p, the P-value is twice the probability of
getting x or more successes
◦ If p̂ < p, the P-value is twice the probability of
getting x or fewer successes



 In testing a method of gender selection, 10
randomly selected couples are treated with
the method, and 9 of the babies are girls.

 Use a 0.05 significance level to test the claim
that with this method, the probability of a
baby being a girl is greater than 0.75.



 We will test
◦ H0: p = 0.75
◦ H1: p > 0.75
 using technology to find probabilities in a
binomial distribution with p = 0.75.
 Because it is a right-tailed test, the P-value
is the probability of 9 or more successes
among 10 trials, assuming p = 0.75.



 The calculated probability of 9 or more
successes is 0.2440252, which is the P-
value of the hypothesis test.
 The P-value is high (greater than 0.05), so
we fail to reject the null hypothesis.
 There is not sufficient evidence to support
the claim that with the gender selection
method, the probability of a girl is greater
than 0.75.
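
The exact binomial P-value in this example can be computed directly with `math.comb`. This is one way "technology" might do it; the function name `binomial_upper_tail` is an illustrative choice.

```python
from math import comb

def binomial_upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p): the right-tailed exact P-value."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

# 9 girls among 10 births, H0: p = 0.75, H1: p > 0.75
p_value = binomial_upper_tail(9, 10, 0.75)
print(round(p_value, 7))  # 0.2440252
```

For a left-tailed test the sum would instead run over k from 0 to x, and for a two-tailed test the relevant tail probability would be doubled, as described in the rules above.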



 Review and Preview
 Basics of Hypothesis Testing
 Testing a Claim about a Proportion
 Testing a Claim About a Mean
 Testing a Claim About a Standard Deviation
or Variance



 Methods for testing a claim about a
population mean are presented:
◦ The very realistic and commonly used case in which
the population standard deviation σ is not known.
◦ The procedure when σ is known, which is very rare.



 When σ is not known, we use a “t
test” that incorporates the Student
t distribution.
 n = sample size
 x̄= sample mean
 μx̄ = population mean



 The sample is a simple random sample.
 Either or both of these conditions is satisfied:
The population is normally distributed or n >
30.



 Test statistic:
$$t = \frac{\bar{x} - \mu_{\bar{x}}}{s/\sqrt{n}}$$
 P-values: Use technology or use the Student t
distribution table with degrees of freedom df
= n – 1.
 Critical values: Use the Student t distribution
with degrees of freedom df = n – 1.



1. The Student t distribution is different for
different sample sizes (see notes).
2. The Student t distribution has the same general
bell shape as the normal distribution; its wider
shape reflects the greater variability that is
expected when s is used to estimate σ.
3. The Student t distribution has a mean of t = 0.
4. The standard deviation of the Student t
distribution varies with the sample size and is
greater than 1.
5. As the sample size n gets larger, the Student t
distribution gets closer to the standard normal
distribution.



Listed below are the measured radiation
emissions (in W/kg) corresponding to a sample
of cell phones:
0.38 0.55 1.54 1.55 0.50 0.60 0.92 0.96 1.00 0.86 1.46

Use a 0.05 level of significance to test the claim
that cell phones have a mean radiation level
that is less than 1.00 W/kg.
The summary statistics are:
x̄ = 0.938 and s = 0.423



1. We assume the sample is a simple random
sample.
2. The sample size is n = 11, which is not
greater than 30, so we must check a normal
quantile plot for normality.



The points are reasonably close to a straight line
and there is no other pattern, so we conclude the
data appear to be from a normally distributed
population.



Step 1: The claim that cell phones have a mean
radiation level less than 1.00 W/kg is expressed as
μ < 1.00 W/kg.
Step 2: The alternative to the original claim is μ ≥
1.00 W/kg.
Step 3: The hypotheses are written as:
 H0 : μ = 1.00 W/kg.
 H1 : μ < 1.00 W/kg.
Step 4: The stated level of significance is α = 0.05.
Step 5: Because the claim is about a population
mean μ, the statistic most relevant to this test is
the sample mean: x̄.



 Step 6: Calculate the test statistic and then
find the P-value or the critical value from the
table:
$$t = \frac{\bar{x} - \mu_{\bar{x}}}{s/\sqrt{n}} = \frac{0.938 - 1.00}{0.423/\sqrt{11}} = -0.486$$

 Step 7: P-value method: We have a P-value
of 0.3191. Since the P-value exceeds α =
0.05, we fail to reject the null hypothesis.



Step 7: Critical Value Method: Because the test
statistic of t = –0.486 does not fall in the critical
region bounded by the critical value of t = –1.812,
we fail to reject the null hypothesis.
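
The t statistic can also be computed straight from the eleven listed measurements. This sketch uses Python's `statistics` module; the critical value –1.812 is taken from the lecture's table lookup (df = 10, one tail, α = 0.05) rather than computed, since the standard library has no Student t distribution. Working from unrounded summary statistics gives a value very close to, but not exactly, the –0.486 obtained from the rounded x̄ and s.

```python
import math
import statistics

radiation = [0.38, 0.55, 1.54, 1.55, 0.50, 0.60, 0.92, 0.96, 1.00, 0.86, 1.46]
mu0 = 1.00                          # value of the mean assumed in H0

n = len(radiation)
x_bar = statistics.mean(radiation)  # sample mean
s = statistics.stdev(radiation)     # sample standard deviation (n - 1 divisor)

t = (x_bar - mu0) / (s / math.sqrt(n))

t_critical = -1.812                 # from the t table, as stated in the lecture
reject_h0 = t < t_critical          # left-tailed test

print(round(x_bar, 3), round(s, 3), reject_h0)  # 0.938 0.423 False
```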



 Step 8: Because we fail to reject the null
hypothesis, we conclude that there is not
sufficient evidence to support the claim that
cell phones have a mean radiation level that
is less than 1.00 W/kg.



 Use a table to find a range of values for the
P-value corresponding to the given results.
 a) In a left-tailed hypothesis test, the sample
size is n = 12, and the test statistic is t = –2.007.
 b) In a right-tailed hypothesis test, the
sample size is n = 12, and the test statistic is
t = 1.222.
 c) In a two-tailed hypothesis test, the sample
size is n = 12, and the test statistic is t = –3.456.



 We can use a confidence interval for testing a
claim about μ.
 For a two-tailed test with a 0.05 significance
level, we construct a 95% confidence interval.
 For a one-tailed test with a 0.05 significance
level, we construct a 90% confidence interval.
 Using the cell phone example, construct a
confidence interval that can be used to test the
claim that μ < 1.00 W/kg, assuming a 0.05
significance level.
 Note that a left-tailed hypothesis test with α =
0.05 corresponds to a 90% confidence interval.
 We find: 0.707 W/kg < μ < 1.169 W/kg



 Because the value of μ = 1.00 W/kg is
contained in the interval, we cannot reject the
null hypothesis that μ = 1.00 W/kg .
 Based on the sample of 11 values, we do not
have sufficient evidence to support the claim
that the mean radiation level is less than 1.00
W/kg.
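
The 90% confidence interval for the cell phone example can be checked with a short calculation. This sketch takes the summary statistics and the critical value t = 1.812 (df = 10) from the lecture as given; the variable names are illustrative.

```python
import math

n = 11
x_bar = 0.938          # sample mean (W/kg), from the lecture
s = 0.423              # sample standard deviation (W/kg), from the lecture
t_star = 1.812         # t table value for df = 10 and 90% confidence

margin = t_star * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 3), round(upper, 3))  # 0.707 1.169

contains_mu0 = lower < 1.00 < upper      # True: cannot reject H0: mu = 1.00
```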



 When σ is known, we use a test that involves
the standard normal distribution.
 In reality, it is very rare to test a claim about
an unknown population mean while the
population standard deviation is somehow
known.
 The procedure is essentially the same as a t
test, with the following exception:



 The test statistic is:
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma/\sqrt{n}}$$
 The P-value can be provided by the
standard normal distribution.
 The critical values can be found using the
standard normal distribution.



 If we repeat the cell phone radiation example,
with the assumption that σ = 0.480 W/kg, the
test statistic is:
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma/\sqrt{n}} = \frac{0.938 - 1.00}{0.480/\sqrt{11}} = -0.43$$
 The example refers to a left-tailed test, so the P-
value is the area to the left of z = –0.43, which is
0.3336.
 Since the P-value is large, we fail to reject the
null and reach the same conclusion as before.
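
A minimal sketch of the σ-known version of the cell phone test, using the standard library's `NormalDist`. Since the code keeps z unrounded, the P-value it produces is about 0.334, marginally different from the table value 0.3336 that corresponds to z = –0.43.

```python
import math
from statistics import NormalDist

n = 11
x_bar = 0.938      # sample mean (W/kg)
mu0 = 1.00         # mean assumed in H0
sigma = 0.480      # population standard deviation, assumed known
alpha = 0.05

z = (x_bar - mu0) / (sigma / math.sqrt(n))
p_value = NormalDist().cdf(z)      # left-tailed test: area to the left of z
reject_h0 = p_value <= alpha

print(round(z, 2), reject_h0)  # -0.43 False
```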



 Review and Preview
 Basics of Hypothesis Testing
 Testing a Claim about a Proportion
 Testing a Claim About a Mean
 Testing a Claim About a Standard Deviation
or Variance



 Methods for testing a claim made about a
population standard deviation σ or population
variance σ².
 These methods use the chi-square distribution that
was introduced previously.



n = sample size
s = sample standard deviation
s² = sample variance
σ = claimed value of the population standard deviation
σ² = claimed value of the population variance

1. The sample is a simple random sample.
2. The population has a normal distribution.
(This is a much stricter requirement than the requirement of
a normal distribution when testing claims about means.)



 Test Statistic
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$



• P-values: Use technology or Table.
• Critical Values: Use Table.
• In either case, the degrees of freedom = n –1.
The χ² test here is not robust against a departure
from normality, meaning that the test does not
work well if the population has a distribution that is
far from normal.
The condition of a normally distributed population
is therefore a much stricter requirement here than
it was previously.



 All values of χ² are nonnegative, and the
distribution is not symmetric (see the Figure
on the next slide).
 There is a different distribution for each
number of degrees of freedom.
 The critical values are found in the associated
table using n – 1 degrees of freedom.



[Figure: Properties of the Chi-Square Distribution, shown with the chi-square distribution for 10 and 20 df. There is a different distribution for each number of df.]



Listed below are the heights (inches) for a simple
random sample of ten supermodels:
70 71 69.25 68.5 69 70 71 70 70 69.5
Consider the claim that supermodels have heights
that have much less variation than the heights of
women in the general population.
We will use a 0.01 significance level to test the
claim that supermodels have heights with a
standard deviation that is less than 2.6 inches.
Summary Statistics: s = 0.7997395,
and s² = 0.6395833



Requirement Check:
1. The sample is a simple random sample.
2. We check for normality, which seems
reasonable based on the normal quantile plot.



 Step 1: The claim that “the standard
deviation is less than 2.6 inches” is expressed
as σ < 2.6 inches.
 Step 2: If the original claim is false, then σ ≥
2.6 inches.
 Step 3: The hypotheses are:
◦ H0: σ = 2.6 inches.
◦ H1: σ < 2.6 inches.
 Step 4: The significance level is α = 0.01.
 Step 5: Because the claim is made about σ,
we use the chi-square distribution.



Step 6: The test statistic is calculated as
follows:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(10 - 1)(0.7997395)^2}{2.6^2} = 0.852$$
with 9 degrees of freedom.



Step 6: The critical value of χ² = 2.088 is found
from the table, and it corresponds to 9 degrees of
freedom and an “area to the right” of 0.99.



 Step 7: Because the test statistic is in the
critical region, we reject the null hypothesis.
 There is sufficient evidence to support the
claim that supermodels have heights with a
standard deviation that is less than 2.6
inches.
 Heights of supermodels have much less
variation than heights of women in the
general population.
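
The chi-square test statistic and the critical value decision can be reproduced directly from the ten listed heights. This is a minimal sketch with Python's `statistics` module; the critical value 2.088 is the table value quoted in the lecture, not something the code derives.

```python
import statistics

heights = [70, 71, 69.25, 68.5, 69, 70, 71, 70, 70, 69.5]   # inches
sigma0 = 2.6                     # standard deviation assumed in H0

n = len(heights)
s_squared = statistics.variance(heights)      # sample variance (n - 1 divisor)

chi2 = (n - 1) * s_squared / sigma0**2

chi2_critical = 2.088            # table value: df = 9, area to the right = 0.99
reject_h0 = chi2 < chi2_critical # left-tailed test

print(round(chi2, 3), reject_h0)  # 0.852 True
```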



 P-values are generally found using
technology, but a table can be used if
technology is not available.
 The P-value is 0.0002897.
 Since the P-value = 0.0002897, we can reject
the null hypothesis (it is under the 0.01
significance level).
 We reach the same exact conclusion as before
regarding the variation in the heights of
supermodels as compared to the heights of
women from the general population.
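
Python's standard library has no chi-square distribution, so the sketch below evaluates the left-tail area with the power series for the regularized lower incomplete gamma function. That is one possible way for "technology" to obtain the P-value, not necessarily what the lecture's software does; the function name `chi2_cdf` is illustrative. The result should land close to the quoted 0.0002897.

```python
import math

def chi2_cdf(x, df, terms=200):
    """P(chi-square with df degrees of freedom <= x), for x > 0.

    Uses P(a, y) = y^a e^(-y) / Gamma(a) * sum_k y^k / (a (a+1) ... (a+k)),
    with a = df / 2 and y = x / 2.
    """
    a, y = df / 2, x / 2
    term = 1.0 / a
    total = term
    for k in range(1, terms):
        term *= y / (a + k)
        total += term
    return total * math.exp(a * math.log(y) - y - math.lgamma(a))

# Supermodel heights: s = 0.7997395, n = 10, H0: sigma = 2.6 (left-tailed test)
chi2 = (10 - 1) * 0.7997395**2 / 2.6**2
p_value = chi2_cdf(chi2, df=9)
reject_h0 = p_value <= 0.01
print(reject_h0)  # True
```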



Since the hypothesis test is left-tailed using a
0.01 level of significance, we can run the test by
constructing an interval with 98% confidence.
Using the previous methods, and the critical
values found in our table, we can construct the
following interval:



Based on this interval, we can support the
claim that σ < 2.6 inches, reaching the
same conclusion as using the P-value
method and the critical value method.



 Review and Preview
 Basics of Hypothesis Testing
 Testing a Claim about a Proportion
 Testing a Claim About a Mean
 Testing a Claim About a Standard Deviation
or Variance
 Type I & Type II errors



 A Type I error is the mistake of rejecting the
null hypothesis when it is actually true.
◦ The symbol α is used to represent the probability of
a type I error.
 A Type II error is the mistake of failing to
reject the null hypothesis when it is actually
false.
◦ The symbol β (beta) is used to represent the
probability of a type II error.



 Definition
◦ The power of a hypothesis test is the probability
1–β of rejecting a false null hypothesis.
◦ The probability of seeing a true effect if one
exists.
◦ The value of the power is computed by using a
particular significance level α and a particular
value of the population parameter that is an
alternative to the value assumed true in the null
hypothesis.



 Type I error rate (or significance level): the
probability of finding an effect that is not
real (false positive).
 If we require a P-value < 0.05 for statistical significance, then
when the null hypothesis is true, about 1 test in 20 will give a
positive result just by chance.
 Type II error rate: the probability of missing
an effect (false negative).
 Statistical power: the probability of finding
an effect if it is there (the probability of not
making a type II error).
 When we design studies, we typically aim for a power of
80% (allowing a false negative rate, or type II error rate, of
20%).
Type I and Type II Errors



Result Possibilities
H0: Innocent

Jury Trial
Verdict             Actual Situation: Innocent    Actual Situation: Guilty
Innocent            Correct                       Error
Guilty              Error                         Correct

Hypothesis Test
Decision            Actual Situation: H0 True     Actual Situation: H0 False
Do Not Reject H0    1 – α                         Type II Error (β)
Reject H0           Type I Error (α)              Power (1 – β)



 Assume that we are conducting a hypothesis
test of the claim that a method of gender
selection increases the likelihood of a baby girl,
so that the probability of a baby girl is p >
0.5.
 Here are the null and alternative
hypotheses:
◦ H0 : p = 0.5
◦ H1 : p > 0.5
 a) Identify a type I error.
 b) Identify a type II error.



 a) A type I error is the mistake of rejecting a
true null hypothesis:
◦ We conclude the probability of having a girl is
greater than 50%, when in reality, it is not. Our
data misled us.
 b) A type II error is the mistake of failing to
reject the null hypothesis when it is false:
◦ There is no evidence to conclude the probability of
having a girl is greater than 50% (our data misled
us), but in reality, the probability is greater than
50%.
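To put numbers on these two errors, here is a minimal sketch using an exact binomial test. The sample size n = 100 births and the true probability p = 0.6 are hypothetical values chosen for illustration; the lecture does not specify them.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n = 100        # hypothetical number of births in the study
alpha = 0.05   # significance level
p_true = 0.6   # hypothetical true probability of a girl

# Reject H0: p = 0.5 when the number of girls is at least k_crit, the
# smallest count whose upper-tail probability under H0 is at most alpha.
k_crit = next(k for k in range(n + 1) if upper_tail(k, n, 0.5) <= alpha)

type_i_prob = upper_tail(k_crit, n, 0.5)          # reject although p = 0.5
type_ii_prob = 1 - upper_tail(k_crit, n, p_true)  # fail to reject although p = 0.6

print(f"reject H0 when girls >= {k_crit}")
print(f"P(type I error)  = {type_i_prob:.4f}")
print(f"P(type II error) = {type_ii_prob:.4f}")
```

Because the count of girls is discrete, the actual type I error probability is somewhat below the nominal 0.05.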



 α & β Have an Inverse Relationship
 For any fixed α, an increase in the sample
size n will cause a decrease in β.
 For any fixed sample size n, a decrease in α
will cause an increase in β. Conversely, an
increase in α will cause a decrease in β.
 To decrease both α and β, increase the
sample size.
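The trade-off can be tabulated numerically. This sketch assumes an upper-tailed z-test for a mean with known σ and uses the formula β = P(z < (μ0 – μt)/(σ/√n) + zα) that appears later in these slides. The function name and the values μ0 = 100, μt = 103, σ = 10 are hypothetical.

```python
from statistics import NormalDist

def beta_upper_tailed(mu0, mu_true, sigma, n, alpha):
    """Type II error probability of an upper-tailed z-test (sigma known)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    shift = (mu0 - mu_true) / (sigma / n ** 0.5)
    return NormalDist().cdf(shift + z_alpha)

alphas = (0.10, 0.05, 0.01)
sizes = (25, 50, 100)

# Rows: sample size n. Columns: beta at alpha = 0.10, 0.05, 0.01.
table = {n: [beta_upper_tailed(100, 103, 10, n, a) for a in alphas]
         for n in sizes}

for n, row in table.items():
    print(n, [round(b, 3) for b in row])
```

Reading across a row, β grows as α shrinks. Reading down a column, β shrinks as n grows.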



 True Value of Population Parameter
◦ β Increases When the Difference Between the Hypothesized
Parameter & the True Value Decreases
 β Increases When Significance Level α
Decreases
 β Increases When σ Increases
 β Increases When Sample Size n Decreases



 Just as 0.05 is a common choice for a
significance level, a power of at least 0.80 is a
common requirement for determining that a
hypothesis test is effective.
◦ Some argue that the power should be higher, such
as 0.85 or 0.90.
 When designing an experiment, we might
consider how much of a difference between the
claimed value of a parameter and its true value is
an important amount of difference.
 When designing an experiment, a goal of having
a power value of at least 0.80 can often be used
to determine the minimum required sample size.
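As one way to turn a power target into a sample size, the sketch below uses the standard one-sided z-test result n ≥ ((zα + zβ)·σ/δ)², where δ is the smallest difference worth detecting. It assumes σ is known. The function name and the example values δ = 2, σ = 4 are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Smallest n giving at least the requested power in a one-sided z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # cutoff for the significance level
    z_beta = NormalDist().inv_cdf(power)       # quantile matching the target power
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical design: detect a difference of 2 units when sigma = 4.
n_needed = min_sample_size(delta=2, sigma=4)
print(n_needed)
```

Halving the difference to be detected roughly quadruples the required sample size, because δ enters the formula squared.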



 Type II error probabilities depend on:
◦ α
◦ sample size
◦ population variance
◦ difference between actual and hypothesized means

 How is the type II error probability calculated?



 Begin with the usual picture (assuming Ha: μ >
μ0)

Translate to a slightly
different rejection
rule…

[Figure: standard normal curve, horizontal axis marked at 0 and zα]
 If the rule is to reject H0 if z = (x̄ – μ0)/(σ/√n) > zα, then
an equivalent rule is to reject when x̄ > μ0 + zα(σ/√n)

[Figure: sampling distribution of x̄ under H0, horizontal axis marked at μ0 and μ0 + zα(σ/√n)]
 The type II error probability (β) is the blue area,
where μt is the true population mean.

[Figure: sampling distribution of x̄ for the true mean, horizontal axis marked at μt and μ0 + zα(σ/√n)]
 So to find β, we need to find the area
to the left of μ0 + zα(σ/√n).

◦ Standardize: (score – actual mean) / (standard error)
= ([μ0 + zα(σ/√n)] – μt) / (σ/√n)

◦ Simplify and we get:

β = P(z < (μ0 – μt)/(σ/√n) + zα)
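The simplification step can be written out line by line. The following is a sketch in LaTeX using the same symbols as the slide, for the upper-tailed case Ha: μ > μ0 with σ known:

```latex
\begin{align*}
\beta &= P\!\left(\bar{x} < \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}} \;\middle|\; \mu = \mu_t\right) \\
      &= P\!\left(z < \frac{\left[\mu_0 + z_\alpha\,\sigma/\sqrt{n}\right] - \mu_t}{\sigma/\sqrt{n}}\right) \\
      &= P\!\left(z < \frac{\mu_0 - \mu_t}{\sigma/\sqrt{n}} + z_\alpha\right)
\end{align*}
```

The last line follows because the term zα(σ/√n) divided by σ/√n is simply zα.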

 Suppose the mean weight of King Penguins found
in an Antarctic colony last year was 5.2 kg.
Assume the actual mean population weight is 5.4
kg, and the population standard deviation is 0.6
kg.
 At 0.05 significance level, what is the probability
of having type II error for a sample size of 9
penguins?
 Given,
◦ H0: μ0 = 5.2, HA: μA = 5.4, σ = 0.6, n = 9
 To Find,
◦ Beta or Type II Error rate



 Step 1:
 Let us first calculate the critical value c of the sample
mean. Substitute the values of μ0, σ and n in the formula
(-1.645 is the lower-tail critical z at α = 0.05),

(c - μ0) / (σ / √n) = -1.645
(c - 5.2) / (0.6 / √9) = -1.645
c - 5.2 = -0.329
c = 4.87



 Step 2:
 With the critical value c = 4.87, the test rejects H0 when
x̄ < c, so β is the probability that x̄ ≥ c when the true mean is μA,
◦ β = P(z > (c - μA) / (σ / √n))
◦ [ z = (x̄ - μA) / (σ / √n) ]
◦ Substitute the values in the above equation,

β = P(z > (4.87 - 5.4) / (0.6 / √9))
= P(z > -2.65)
= 0.9960

◦ β is large here because the true mean (5.4) lies above μ0, on the
opposite side from the lower-tail rejection region.
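The penguin numbers can be checked with a short script. This sketch assumes the lower-tailed setup implied by the critical z of -1.645 (reject H0 when the sample mean falls below c). The variable names are illustrative.

```python
from statistics import NormalDist

mu0, mu_a, sigma, n, alpha = 5.2, 5.4, 0.6, 9, 0.05

se = sigma / n ** 0.5                # standard error: 0.6 / 3 = 0.2
z_crit = NormalDist().inv_cdf(alpha) # lower-tail critical z, about -1.645
c = mu0 + z_crit * se                # critical sample mean, about 4.87

# Probability that the sample mean is NOT below c when the true mean is mu_a,
# i.e. the probability of failing to reject H0 (the type II error probability).
p_not_reject = 1 - NormalDist().cdf((c - mu_a) / se)
# Its complement is the power of the test against mu_a.
power = 1 - p_not_reject

print(f"c = {c:.3f}")
print(f"P(fail to reject H0 | mu = {mu_a}) = {p_not_reject:.4f}")
print(f"power = {power:.4f}")
```

Under this setup the probability of failing to reject H0 is about 0.996 and the power is about 0.004.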



                                         True state of null hypothesis
Your Statistical Decision                H0 True                            H0 False
                                         (example: the drug doesn’t work)   (example: the drug works)

Reject H0                                Type I error (α)                   Correct
(ex: you conclude that the drug works)

Do not reject H0                         Correct                            Type II Error (β)
(ex: you conclude that there is
insufficient evidence that the drug
works)
 A particular compound is not hazardous in
drinking water if it is present at a rate of no
more than 25ppm. A watchdog group believes
that a certain water source does not meet this
standard.
◦ μ: mean amount of the compound (in ppm)
H0: μ ≤ 25
Ha: μ > 25

◦ If the watchdog group decides to gather data and formally
conduct this test, describe type I and type II errors in the
context of this scenario and the consequences of each.



 Type I error:
◦ Stating that the evidence indicates the water is unsafe
when, in fact, it is safe.
◦ The watchdog group will have potentially initiated a
clean-up where none was required ($$ wasted).
 Type II error:
◦ Stating that there is no evidence that the water is
unsafe when, in fact, it is unsafe.
◦ The opportunity to note (and repair) a potential health
risk will be missed.



 A lobbying group has been advocating a particular
ballot proposal. One week before the election, they
are considering moving some of their advertising
efforts to other issues. If the proposal has a support
level of at least 55%, they will feel it’s “safe” and
move money to other campaigns.
◦ p: proportion of people who support the proposal
H0: p ≥ 0.55
Ha: p < 0.55

◦ If the lobbying group decides to gather data and formally conduct this
test, describe type I and type II errors in the context of this scenario
and the consequences of each.
 Type I error:
◦ Stating that the evidence indicates the support level is
less than 55% (and the proposal may be in jeopardy of
failing) when that is not the case.
◦ The lobbying group will have kept advertising dollars
aimed at this proposal when they could have been
spent elsewhere.
 Type II error:
◦ Stating that the proposal appears to have a “safe” level
of support when that is not the case.
◦ The lobbying group would shift funds away from
supporting this proposal even though it may still be in
need of that support.



 For our drinking water scenario, suppose the
survey was taken on 35 water samples and the
test was to be conducted at α = 0.05. If the
actual mean concentration is 27 ppm and the
standard deviation is 4 ppm, what is the
probability of a type II error?
 Substituting into the formula, we get:
◦ β = P(z < (μ0 – μt)/(σ/√n) + zα)
= P(z < (25 – 27)/(4/√35) + 1.645)
= P(z < -1.31) = 0.0951 (from table)
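The same formula shows how β depends on how far the true mean is from 25 ppm. In the sketch below, the true mean of 27 ppm comes from the example above. The other true means (25.5, 26 and 28 ppm) and the function name are hypothetical additions for comparison.

```python
from statistics import NormalDist

def beta_water(mu_true, mu0=25, sigma=4, n=35, alpha=0.05):
    """Type II error probability for the upper-tailed water test (sigma known)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_cut = (mu0 - mu_true) / (sigma / n ** 0.5) + z_alpha
    return NormalDist().cdf(z_cut)

# beta for several possible true mean concentrations (ppm)
betas = {mu: beta_water(mu) for mu in (25.5, 26, 27, 28)}

for mu, b in betas.items():
    print(f"true mean {mu} ppm: beta = {b:.4f}")
```

For a true mean of 27 ppm this gives a value close to the table-based 0.0951. The small gap comes from rounding z to two decimals when reading the table. The closer the true mean is to 25 ppm, the larger β becomes.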



 A tire manufacturer claims that its tires last 35000
miles, on average. A consumer group wishes to test
this, believing it is actually less. The group plans to
assess lifetime of tires on a sample of 35 cars and
test these assumptions at α = 0.05. If the standard
deviation of tire life is 4000 miles, what is the
probability of a type II error if the actual mean
lifetime of the tires is 32000 miles?
 A few things change for a lower-tailed test (Ha: μ < 35000):
◦ β = 1 - P(z < (μ0 – μt)/(σ/√n) - zα)
= 1 - P(z < (35000 – 32000)/(4000/√35) - 1.645)
= 1 - P(z < 2.79) = 1 - 0.9974 = 0.0026
 What if α = 0.001?
 The z-score changes:
◦ β = 1 - P(z < (μ0 – μt)/(σ/√n) - zα)
= 1 - P(z < (35000 – 32000)/(4000/√35) - 3.090)
= 1 - P(z < 1.35) = 1 - 0.9115 = 0.0885

 A more stringent α (lower P(type I error)) increases the
type II error rate, all else being equal.
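The tire calculation at both significance levels can be reproduced with a few lines. This sketch assumes a lower-tailed z-test with known σ. The function name is illustrative.

```python
from statistics import NormalDist

def beta_lower_tailed(mu0, mu_true, sigma, n, alpha):
    """Type II error probability of a lower-tailed z-test (sigma known)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_cut = (mu0 - mu_true) / (sigma / n ** 0.5) - z_alpha
    return 1 - NormalDist().cdf(z_cut)

# Tire example: claimed mean 35000 miles, true mean 32000, sigma 4000, n = 35.
beta_05 = beta_lower_tailed(35000, 32000, 4000, 35, alpha=0.05)
beta_001 = beta_lower_tailed(35000, 32000, 4000, 35, alpha=0.001)

print(f"alpha = 0.05:  beta = {beta_05:.4f}")
print(f"alpha = 0.001: beta = {beta_001:.4f}")
```

The results differ slightly from the table-based values because the table rounds z to two decimals. The direction of the change is the same: the smaller α gives the larger β.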



 Basics of Hypothesis Testing
 Testing a Claim about a Proportion, Mean,
Standard Deviation or Variance
 Tables
 Different methods
 Suitability of methods
 Type I & Type II errors



Comments?
