Lecture 4 Hypothesis Testing slides

The document outlines the steps of hypothesis testing, which involves stating a hypothesis, determining the distribution of sample means, identifying the critical region, and making decisions based on sample data. It discusses one-tailed tests, the p-value approach, and the types of errors that can occur, including Type I and Type II errors. Additionally, it covers the power of a test and the types of one-sample tests based on sample size and population distribution.

Uploaded by

Aaliyan Bandealy
Copyright
© All Rights Reserved

Lecture 4 Hypothesis Testing

Lecture outline

1. Hypothesis testing steps
2. One tail test
3. Types of one sample tests

Introduction

- Hypothesis testing is a statistical method that uses sample
  data to evaluate a hypothesis about a population.
- Hypothesis testing therefore allows us to use sample data to
  draw inferences about the population.
- The conclusions drawn can always be wrong, but with
  hypothesis testing we can put a probability on the risk of a
  wrong conclusion and make a decision beyond reasonable doubt.

Procedure of hypothesis testing

1. State a hypothesis about a population.
2. Use the hypothesis to predict the distribution of sample means.
3. Specify which sample means are consistent with the null
   hypothesis and which are not (specify the critical region).
4. Compare the obtained sample data with the prediction made
   from the hypothesis and draw a conclusion.

Example 6.3

- The weights of full-term female babies born to mothers in
  Australia have been recorded over ten years and found to have
  a mean of 3.34 kg and a standard deviation of 0.55 kg.
- Medical researchers ask: Does heavy smoking affect babies’
  weights?
- A random sample of 100 mothers who were heavy smokers
  throughout their pregnancy was identified, and data on their
  babies’ weights were collected.

Step 1: Stating a hypothesis

- We state two opposing hypotheses:
  1. The null hypothesis H0 states that there is no change, no effect, no
     difference: nothing happened, hence the name null.
  2. The alternative hypothesis H1 states that there is a change, a
     difference or a relationship for the general population.
- In symbols, we write:

$$H_0 : \mu = \mu_0$$
$$H_1 : \mu \neq \mu_0$$

Example

- In our example, the null hypothesis states that smoking has
  no effect on babies’ weights.

$$H_0 : \mu_{\text{with smoking}} = 3.34 \tag{1}$$

  (Even with smoking, the mean weight of babies is still 3.34 kg.)

- The alternative hypothesis states that smoking has an effect
  on babies’ weights and will change the mean weight of babies.

$$H_1 : \mu_{\text{with smoking}} \neq 3.34 \tag{2}$$

  (With smoking, the mean weight of babies is different from 3.34 kg.)

Step 2: Determine sample mean distribution

- We use the null hypothesis to specify the distribution of sample
  means that ought to be obtained if the null hypothesis is true.
- Example: the null hypothesis states that smoking does not affect
  babies’ weights. If this is true, the population mean is
  µ = 3.34 and the sample mean has a distribution with a mean
  of 3.34.
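
The sampling distribution under the null hypothesis can be sketched numerically. The snippet below is a minimal illustration, not part of the original slides. It assumes the sample mean is approximately normal and that babies of smoking mothers share the population standard deviation of 0.55 kg. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 3.34    # population mean under H0 (kg), from Example 6.3
sigma = 0.55  # population standard deviation (kg), from Example 6.3
n = 100       # sample size, from Example 6.3

# Standard deviation of the sample mean (the standard error).
standard_error = sigma / sqrt(n)

# Assumed normal sampling distribution of the sample mean under H0.
sampling_dist = NormalDist(mu=mu0, sigma=standard_error)

# Range that holds the middle 95% of sample means if H0 is true.
low = sampling_dist.inv_cdf(0.025)
high = sampling_dist.inv_cdf(0.975)

print(f"standard error = {standard_error:.3f} kg")
print(f"middle 95% of sample means: {low:.3f} kg to {high:.3f} kg")
```

Sample means outside this range would be unusual if smoking truly had no effect.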

Step 3: Determine critical region

- We specify exactly which sample means are consistent with
  the null hypothesis and which are not.
- This is done by dividing the distribution of sample means into
  two sections:
  1. Sample means that are likely to be obtained if H0 is true, i.e.
     sample means that are close to the null hypothesis.
  2. Sample means that are very unlikely to be obtained if H0 is
     true, i.e. sample means that are very different from the null
     hypothesis.

Sample mean distribution and critical region

Figure: Sample mean distribution

Critical region

- The extreme sample values are very unlikely to be obtained if
  the null hypothesis is true.
- The region that contains these values is called the critical region.
- The boundaries that define the critical region are determined
  by the level of significance, or alpha level (α).
- For example, if α = 0.05, the boundaries are selected such
  that the total probability of extreme sample values is equal to
  0.05.

Critical region

- In most cases, the distribution of sample means is normal, and
  we use the alpha level and the standard normal table to locate
  the boundaries of the critical region.
- For example, when α = 0.05, the extreme 5% is split
  between the two tails of the distribution, with exactly
  2.5% in each tail.
- In Table 2, a tail probability of 0.025 corresponds to z = 1.96.
  Thus the extreme 5% lies in the tails of the distribution beyond
  z = 1.96 and z = −1.96.
- These values define the boundaries of the critical region for a
  hypothesis test using α = 0.05.
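
The table lookup can also be reproduced in code. The following is a small sketch using Python's standard library in place of the printed normal table. The function name `two_tailed_critical_values` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist


def two_tailed_critical_values(alpha):
    """Return (lower, upper) z boundaries that leave alpha/2 in each tail."""
    upper = NormalDist().inv_cdf(1 - alpha / 2)
    return -upper, upper


lower, upper = two_tailed_critical_values(0.05)
print(f"critical region: z < {lower:.2f} or z > {upper:.2f}")
```

Passing a different alpha level, such as 0.01, moves the boundaries further into the tails.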

Critical region

Figure: Critical region

Step 4

- Use sample data to compute the sample mean.
- Compare the sample mean with the null hypothesis by using a z-score.
- The z-score describes where the sample mean is located relative to
  the hypothesized population mean from the null hypothesis.
- The z-score formula for a sample mean is:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}, \tag{3}$$

  where x̄ is the sample mean, µ is the population mean under
  the null hypothesis, σ is the population standard
  deviation, and n is the sample size.

14 / 30
Step 4

- Use the z-score to make a decision.
- If the z-score is located in the critical region, reject the null
  hypothesis.
- If the z-score does not lie in the critical region, then the data
  do not provide strong evidence that the null hypothesis is
  wrong.
- In that case the sample mean is reasonably close to the population
  mean specified in the null hypothesis, and our conclusion is not to
  reject the null hypothesis.
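
The decision rule can be sketched as a short function. This is a minimal illustration under stated assumptions: the slides do not report the observed sample mean for the smoking example, so the value 3.20 kg below is purely hypothetical, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_tailed(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, critical value, reject H0?) for a two-tailed z-test."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, critical, abs(z) > critical


# mu0, sigma and n come from Example 6.3; the sample mean is hypothetical.
z, critical, reject = z_test_two_tailed(
    sample_mean=3.20, mu0=3.34, sigma=0.55, n=100
)
decision = "reject H0" if reject else "do not reject H0"
print(f"z = {z:.3f}, critical value = ±{critical:.3f}: {decision}")
```

With a hypothetical sample mean closer to 3.34 kg, such as 3.30 kg, the same function returns a z-score outside the critical region.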

15 / 30
Step 4

Table: Hypothesis testing vs jury trial

Steps                 Hypothesis testing               Jury trial

Null hypothesis       Treatment has no effect          Defendant is innocent
                                                       (innocent until proven guilty)
Data collection       Collect sample data              Police gather evidence
Not enough evidence   Fail to reject null hypothesis   Fail to find defendant guilty
Wrong conclusion      There is no treatment effect     Defendant is innocent

16 / 30
One tail test

- If the treatment effect can only go in one direction, a
  one-tailed test is more appropriate.
- For example, we usually expect that smoking reduces baby
  weight.
- The null hypothesis is as before, but the alternative
  hypothesis changes to:

$$H_1 : \mu_{\text{with smoking}} < 3.34 \tag{4}$$

  (With smoking, the mean weight of babies is lower than 3.34 kg.)

17 / 30
One tail test

- The critical region is located entirely in the left tail.
- The proportion specified by the alpha level is not divided
  between two tails, but is contained entirely in one tail.
- In our example, if we use α = 0.05, then the whole 5% is
  located in the left tail.
- The z-score boundary for the critical region would be
  z = −1.645 (often rounded to −1.65).
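
A short sketch shows how the choice of tails changes the decision. The z-score of −1.8 used here is a hypothetical value chosen for illustration. It is not a result from the slides.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# Left-tailed test: the whole alpha sits in the left tail.
left_tail_cv = std_normal.inv_cdf(alpha)
# Two-tailed test: alpha is split, alpha/2 in each tail.
two_tail_cv = std_normal.inv_cdf(1 - alpha / 2)

z = -1.8  # hypothetical observed z-score

reject_one_tailed = z < left_tail_cv
reject_two_tailed = abs(z) > two_tail_cv

print(f"left-tailed boundary: {left_tail_cv:.3f}, reject: {reject_one_tailed}")
print(f"two-tailed boundaries: ±{two_tail_cv:.3f}, reject: {reject_two_tailed}")
```

The same z-score falls in the critical region of the one-tailed test but not in that of the two-tailed test, which is why the direction of the alternative should be fixed before looking at the data.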

18 / 30
p-value

- Instead of using critical values, we can use a p-value approach.
- The p-value is the probability of obtaining test results at least
  as extreme as the observed result in the sample, when the null
  hypothesis is correct.
- The p-value equals the alpha level for which the critical value
  would coincide with the observed value of the test statistic.
- We reject H0 when the p-value is lower than the alpha level and
  do not reject H0 when the p-value is higher than the alpha level.
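
The p-value for a z-test can be computed directly from the standard normal distribution. The sketch below assumes a z-statistic has already been calculated. The function name `p_value`, its `tail` argument, and the example z-score of −2.0 are illustrative assumptions.

```python
from statistics import NormalDist


def p_value(z, tail="two"):
    """p-value of a z-statistic for a left-, right-, or two-tailed test."""
    phi = NormalDist().cdf
    if tail == "left":
        return phi(z)
    if tail == "right":
        return 1 - phi(z)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


z = -2.0  # hypothetical observed z-score
alpha = 0.05
for tail in ("left", "two"):
    p = p_value(z, tail)
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{tail}-tailed p-value = {p:.4f}: {decision}")

# At the critical value itself, the p-value equals the alpha level.
boundary = NormalDist().inv_cdf(alpha)
print(f"p-value at the left-tailed boundary = {p_value(boundary, 'left'):.4f}")
```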

19 / 30
Errors in hypothesis testing

- A hypothesis test uses a sample as the basis for reaching a
  general conclusion about the whole population.
- A sample provides only limited information about the
  population, and there is always the possibility that an
  incorrect conclusion will be made.
- There are two kinds of errors that can be made.

20 / 30
Types of error

- A type I error occurs when a researcher rejects a null hypothesis
  that is actually true.
- In our example, it occurs when we conclude that smoking has
  an impact on the babies’ weights while in fact it has no effect.
- A type I error occurs when we have an extreme,
  nonrepresentative sample.
- The probability that we get an extreme, nonrepresentative
  sample is the probability that the samples have means in the
  critical region.
- The probability of a type I error is equal to the alpha level.
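
A simulation can make the link between the alpha level and the type I error rate concrete. The sketch below is an illustration, not part of the slides. It assumes H0 is true, draws sample means directly from their assumed normal sampling distribution, and uses a fixed random seed so the run is repeatable.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_type_i_rate(mu0, sigma, n, alpha=0.05, trials=20000, seed=0):
    """Fraction of two-tailed z-tests that reject H0 when H0 is true."""
    rng = random.Random(seed)
    standard_error = sigma / sqrt(n)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw one sample mean from its sampling distribution under H0.
        sample_mean = rng.gauss(mu0, standard_error)
        z = (sample_mean - mu0) / standard_error
        if abs(z) > critical:
            rejections += 1
    return rejections / trials


# Parameters of Example 6.3; H0 is assumed true in this simulation.
rate = simulate_type_i_rate(mu0=3.34, sigma=0.55, n=100)
print(f"simulated type I error rate = {rate:.4f}")
```

The simulated rate should land close to the chosen alpha level of 0.05, with small deviations due to sampling noise.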

21 / 30
Types of error

- A type II error is the failure to reject a false null hypothesis.
- In our example, a type II error means that the impact of
  smoking on babies’ weights exists, but the hypothesis test fails
  to detect it.
- A type II error is more likely when the effect of the treatment is
  relatively small.
- The treatment does influence the sample, but the magnitude
  of the effect is not big enough to move the sample mean into
  the critical region.
- The probability of a type II error is often denoted by β.
- The power of a statistical test is the probability that it does
  not make a type II error and is therefore equal to 1 − β.

22 / 30
Power of test

- Consider a one-tailed (right-tailed) z-test with α = 0.05.
- The probability of a type II error is the probability that the
  z-score is less than 1.645:

$$\beta = \Pr\left(\frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} < 1.645\right)$$

- To calculate β and the test power, we need to know the true
  population mean (µ0 is just a hypothesized population mean,
  which may or may not be true).

23 / 30
Power of test

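- Suppose the true population mean is µ = µ1, i.e. the sample mean
  is normally distributed with mean µ1 and standard deviation σ/√n:

$$\bar{X} \sim N\left(\mu_1, \sigma / \sqrt{n}\right)$$

- From the previous slide, the test power can be written as:

$$1 - \beta = 1 - \Pr\left(\frac{\bar{X} - \mu_1}{\sigma / \sqrt{n}} < \frac{\mu_0 - \mu_1}{\sigma / \sqrt{n}} + 1.645\right) = 1 - \Phi\left(\frac{\mu_0 - \mu_1}{\sigma / \sqrt{n}} + 1.645\right)$$

  where Φ is the cumulative distribution function of the standard
  normal distribution.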

24 / 30
Power of test

- For µ0 that is much lower than µ1, (µ0 − µ1)/(σ/√n) is a large
  negative number and test power is large.
- It is much easier to reject a very wrong null hypothesis.
- An increase in sample size shrinks the standard error σ/√n, which
  magnifies the standardized difference between µ0 and µ1 and
  increases test power.
- For µ0 that is close to µ1 or higher than µ1, test power is
  small and it is difficult to reject the wrong null hypothesis.
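
These statements about power can be checked numerically. The sketch below implements the power formula for a right-tailed z-test. The function name `power_right_tailed` and the numbers (µ0 = 100, µ1 = 103, σ = 15) are hypothetical values chosen only to show the trend.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test when the true mean is mu1."""
    critical = NormalDist().inv_cdf(1 - alpha)
    shift = (mu0 - mu1) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(shift + critical)


# Hypothetical setting: H0 says mu = 100, the true mean is 103, sigma = 15.
powers = {n: power_right_tailed(100, 103, 15, n) for n in (25, 100, 400)}
for n, value in powers.items():
    print(f"n = {n:3d}: power = {value:.4f}")

# When the true mean equals mu0, the rejection probability is just alpha.
print(f"mu1 = mu0: {power_right_tailed(100, 100, 15, 100):.4f}")
```

Power grows with the sample size, and it drops to the alpha level or below when the true mean is equal to or lower than µ0.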

25 / 30
Example

- Suppose we want to do an experiment to see if a particular
  drug increases the number of hours a person can go without
  food. Suppose that from previous research, we know the
  population standard deviation is σ = 10.
- Our hypotheses are:

$$H_0 : \mu = 50$$
$$H_1 : \mu > 50$$

- We collect a sample of 100 patients and find that the sample
  mean is X̄ = 52. To test the claim that µ = 50, we use a z-test.
  The test statistic is:

$$z = \frac{52 - 50}{10 / \sqrt{100}} = 2$$

- For α = 0.05, the critical value is CV = 1.645, and we reject
  H0 when z > CV. Here z = 2 > 1.645, so we reject H0.

26 / 30
Example

- Suppose we wish to find the power of the test when the true
  mean is 52. We do this by first calculating the
  probability of a type II error:

$$\beta = \Phi\left(\frac{\mu_0 - \mu_1}{\sigma / \sqrt{n}} + 1.645\right) = \Phi\left(\frac{50 - 52}{10 / \sqrt{100}} + 1.645\right) = \Phi(-0.355) \approx 0.3613$$

- The power of the test is:

$$1 - \beta = 1 - 0.3613 = 0.6387$$

- Therefore, when the true population mean is 52 hours, a sample
  of 100 patients allows the test to reject the false null
  hypothesis with a probability of about 0.6387.
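
The power calculation for this example can be cross-checked by simulation. The sketch below is an illustration rather than the lecture's own method. It evaluates the analytic formula with µ0 = 50, µ1 = 52, σ = 10 and n = 100, then estimates the same probability by repeatedly drawing sample means from the assumed true sampling distribution. The seed and the number of trials are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

mu0, mu1, sigma, n, alpha = 50, 52, 10, 100, 0.05
standard_error = sigma / sqrt(n)
critical = NormalDist().inv_cdf(1 - alpha)  # right-tailed boundary

# Analytic power: 1 - Phi((mu0 - mu1) / (sigma / sqrt(n)) + critical value).
analytic_power = 1 - NormalDist().cdf((mu0 - mu1) / standard_error + critical)

# Monte Carlo estimate: how often the z-test rejects H0 when mu = mu1.
rng = random.Random(1)  # fixed seed so the run is repeatable
trials = 50000
rejections = sum(
    (rng.gauss(mu1, standard_error) - mu0) / standard_error > critical
    for _ in range(trials)
)
simulated_power = rejections / trials

print(f"analytic power  = {analytic_power:.4f}")
print(f"simulated power = {simulated_power:.4f}")
```

The two figures should agree up to simulation noise, which gives an independent check on the arithmetic in the formula.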

27 / 30
Types of one sample tests

- The available information determines the type of test statistic to
  be used (Figure 3).
- If the sample size is large, we can use the z-statistic.
- If the sample size is small, we need the additional condition
  that the population is normally distributed.
- For a small sample, if the population standard deviation is
  known, we use the z-statistic; otherwise, we use the t-statistic.
- When the population standard deviation is unknown, we use the
  sample standard deviation in its place.
- The t-statistic has an associated quantity called the degrees of
  freedom, which is denoted by ν and is given by ν = n − 1.
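
The selection rules can be written as a small decision function. This is a sketch of the rules as listed above, not a reproduction of Figure 3. The slides do not say how large a "large" sample is, so the cut-off of 30 is an assumption (a common rule of thumb), and the function names are illustrative.

```python
def choose_test_statistic(n, sigma_known, population_normal, large_n=30):
    """Return 'z', 't', or None (no test from the list applies).

    large_n=30 is an assumed rule-of-thumb cut-off, not taken from the slides.
    """
    if n >= large_n:
        return "z"  # large sample: z-statistic
    if not population_normal:
        return None  # small sample from a non-normal population
    return "z" if sigma_known else "t"


def degrees_of_freedom(n):
    """Degrees of freedom of the one-sample t-statistic."""
    return n - 1


print(choose_test_statistic(n=100, sigma_known=False, population_normal=False))
print(choose_test_statistic(n=12, sigma_known=True, population_normal=True))
print(choose_test_statistic(n=12, sigma_known=False, population_normal=True))
print(degrees_of_freedom(12))
```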

28 / 30
Types of one sample tests

Figure: Test statistics for different cases of one sample tests

29 / 30
Example 6.6

- Find the critical values for the t-statistic in the following
  cases:
  a. two-tailed, α = 0.05, ν = 6
  b. two-tailed, α = 0.01, ν = 12
  c. one-tailed, α = 0.05, ν = 7
  d. one-tailed, α = 0.01, ν = 11
Solution: Use Table 4
  a. row ν = 6, column α = 0.05: t = ±2.447
  b. row ν = 12, column α = 0.01: t = ±3.055
  c. Table 4 is for two-tailed tests. To apply it to a one-tailed test,
     multiply α by 2, so use row ν = 7, column α = 0.1: t = 1.895
     (with the sign given by the direction of the test)
  d. likewise, row ν = 11, column α = 0.02: t = 2.718
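
The rule for reading a two-tailed table can be sketched as a helper that returns the row and column to look up. It assumes, as the solution states, that Table 4 is indexed by two-tailed α. The helper name `table_alpha` is illustrative, and the code does not compute the critical values themselves.

```python
def table_alpha(alpha, tails):
    """Column to use in a two-tailed t-table for a one- or two-tailed test."""
    if tails == 2:
        return alpha
    if tails == 1:
        return 2 * alpha  # a one-tailed alpha sits in the 2 * alpha column
    raise ValueError("tails must be 1 or 2")


# (tails, alpha, degrees of freedom) for cases a to d of Example 6.6.
cases = {"a": (2, 0.05, 6), "b": (2, 0.01, 12), "c": (1, 0.05, 7), "d": (1, 0.01, 11)}

lookups = {
    label: (nu, table_alpha(alpha, tails))
    for label, (tails, alpha, nu) in cases.items()
}
for label, (nu, column) in lookups.items():
    print(f"{label}. row nu = {nu}, column alpha = {column}")
```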
