Biostatistics of HKU MMEDSC Session 7

The document outlines a lecture on designing studies and calculating appropriate sample sizes. It discusses key concepts like type I and II errors, significance levels, and power. It covers the perspectives of Fisher and Neyman-Pearson on hypothesis testing and how current practice synthesizes aspects of both approaches. An example randomized controlled trial comparing two drugs for migraine is presented to illustrate statistical hypothesis testing.

Outline

Designing studies
CMED6100 – Session 7

ST Ali

School of Public Health


The University of Hong Kong

30 October 2021

sli.do/#hkubiostat21

ST Ali CMED6100 – Session 7 Slide 2



Announcements

• Practical 3 (sample size calculations) next week.

• Ask course-related questions via the Moodle forums only (not via email)
  – Clearly mention the question numbers, slide numbers, and assignment or practical session details
  – Post your queries to the relevant forum
• For anything that needs to be discussed in person, come to office hours
• Don’t ask questions you are supposed to answer yourself (e.g. assignment questions before the deadline)

Objectives

After the lecture, students should be able to:

• Define type I and II errors, significance levels, and power;

• Describe the determinants of power;


• Calculate or comment on appropriate sample sizes in a variety of settings.
  – Formulas for 2-group calculations will be provided if needed
• Practical 3 will cover the use of an online sample size calculation tool.


Statistical hypothesis: an assertion or statement about a population characteristic (µ)

Null hypothesis: the hypothesis of no difference (H0: µ = µ0)
Alternative hypothesis: the hypothesis complementary to the null hypothesis (H1 or HA: µ ≠ µ0)

The critical value separates the rejection and non-rejection regions when testing H0: µ = µ0 against H1: µ = µ1 (> µ0).
Pr(Type I error) = α (level of significance)
Pr(Type II error) = β (1 − β is the power of the test)

Practical outcomes
You should be able to:
• Define key terms such as type I and II errors, power, significance level, confidence level, etc.
• Describe the determinants of power

• In a given scenario, classify the type of sample size calculation required: one or two groups; dichotomous or continuous outcome.
• Perform the appropriate sample size calculation using
computer software (not by hand)
Introduction Errors Power Summary

Perspective of scientific investigation

Recall the lecture on hypothesis testing.


Karl Popper (1902–1994) stated that a scientist who wishes to prove an effect should follow two logical steps:

1. Set up a null hypothesis that NO effect exists.

2. Try to build sufficient evidence to DISPROVE the null hypothesis.

His idea was that no theory is completely correct, but if not falsified, it can be accepted as truth.


Fisher’s approach
• Set a null hypothesis (usually ‘no effect’ or ‘no difference
between two groups’)
• Perform the study, collect and analyse data.

• Calculate the p-value as the probability of observing such unusual data (or more unusual data) if the null hypothesis is true.
• Flexible interpretation of p-value
– 0.05 is a reasonable threshold but should not be strict

• Take action accordingly.



The Neyman-Pearson approach

• Set a null hypothesis (not necessarily ‘no difference’)
• Set an alternative hypothesis
  – e.g. H0: relative risk = 1 (null hypothesis)
  – e.g. H1: relative risk ≠ 1 (alternative hypothesis)

• Decide what action will be taken if your study provides evidence (1) in support of, or (2) against the null hypothesis.

• Set a decision rule as a ‘significance level’, α, for the p-value, e.g. α = 0.05.


The Neyman-Pearson approach (cont.)


• Perform the study, collect and analyse data.

• Calculate the p-value as the probability of observing such unusual data (or more unusual data) if the null hypothesis is true.

• Strict interpretation of the p-value:
  – If p < α, reject the null hypothesis as false; otherwise, if p > α, accept the null hypothesis as true (the exact p-value is not important).
  – (Note – an echo of this approach is seen in some journals which quote only “ns”, “p < 0.05” etc. instead of giving exact p-values)

• Take action accordingly.

• What are the consequences of the resulting action?



Errors in drawing conclusions:

                          Is there a true difference?
                          Yes               No
Reject H0 with p ≤ α      —                 Type I error
Accept H0 with p > α      Type II error     —

The type I error (α) corresponds to the false-positive risk.
The type II error (β) corresponds to the false-negative risk.



Costs/benefits associated with each decision

                          Is there a true difference?
                          Yes    No
Reject H0 with p ≤ α      A      B
Accept H0 with p > α      C      D

To determine the appropriate values of α and β, we need to know something about the values A, B, C, D and the likelihood that there really is a true difference.



The clash of titans!


• Clearly these two approaches to statistical inference are different.
• Fisher’s approach only involves rejecting or failing to reject a null hypothesis of no difference.
• The Neyman-Pearson approach involves rejecting or accepting a null hypothesis vs an alternative hypothesis
  – Decision-making approach (focus on controlled experiments?)
  – Cost vs benefit of correct decisions, and of type I and type II errors, should be taken into account.
• What’s the best way?

Synthesis of approaches

• These two approaches are often confused and mixed up in textbooks and the literature.
• Current practice in medical research is a kind of synthesis:
  – Set a significance level before the start of the experiment (usually α = 0.05).
  – Run the study, collect and analyse the data.
  – Use a null hypothesis of no difference between groups.
  – If p < α reject the null hypothesis, otherwise if p > α fail to reject the null hypothesis.



Example experiment 1
• A small RCT comparing drug A and drug B for migraine (drug A vs.
drug B).

• Randomise 32 patients to each arm.

• After conducting the study, we analyse the data and estimate the
relative risk of headache on A vs B.

• On average 25% of patients would experience the primary outcome of this study (1+ severe headache within one month of randomisation), i.e. 25% is the true value for the risk on drug A and drug B, and the true RR is 1.

• We use α = 0.05 as the level for statistical significance.



Example experiment 1 (v1)

• Maybe these were the results:

  Drug     Sample size    Headache, n (%)
  Drug A   32             8 (25%)
  Drug B   32             8 (25%)

• Then RR = 0.25/0.25 = 1.00
• The 95% confidence interval for this RR is (0.43, 2.34)
• The p-value under the null hypothesis RR = 1 is 1.00.



Example experiment 1 (v2)


• If someone somewhere else had done the experiment, maybe these were the results:

  Drug     Sample size    Headache, n (%)
  Drug A   32             4 (13%)
  Drug B   32             10 (31%)

• Then RR = 0.40, with 95% CI (0.14, 1.14)
• The p-value under the null hypothesis RR = 1 is 0.09.
• We correctly fail to reject the null hypothesis.

Example experiment 1 (v3)

• Or in another scenario, maybe these were the results:

  Drug     Sample size    Headache, n (%)
  Drug A   32             14 (44%)
  Drug B   32             4 (13%)

• Then RR = 3.50, with 95% CI (1.29, 9.49)
• The p-value under the null hypothesis RR = 1 is 0.01.
• This time we incorrectly reject the null hypothesis.



Example experiment 1

• It is possible (but unlikely) that by chance we could observe such unusual results that we mistakenly reject a true null hypothesis, at any given level of statistical significance.
• At the 5% significance level, we will make this mistake in 5% of studies in which the null hypothesis is true.
• The results of 40 repetitions of this experiment are shown on the next slide ...



Example experiment 1 (v1-40)

[Figure: RR estimates and 95% confidence intervals from 40 repetitions of the experiment.] Some of the 95% confidence intervals don’t even include the ‘true’ value 1.0. Approximately 5% of these intervals will not include 1, since we pre-specified a significance level of 0.05.
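The repeated experiments above can also be sketched in code. Below is a minimal simulation, assuming a pooled two-proportion z-test as the analysis method (the slides do not name the exact test used, and all function names here are ours), that estimates how often a true null hypothesis is rejected:

```python
import math
import random

def two_prop_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # identical all-or-nothing arms: no evidence of a difference
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided tail probability from the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def type_i_error_rate(p_true=0.25, n=32, alpha=0.05, reps=5000, seed=1):
    """Simulate RCTs in which the null is true; count false positives."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x_a = sum(rng.random() < p_true for _ in range(n))
        x_b = sum(rng.random() < p_true for _ in range(n))
        if two_prop_p_value(x_a, n, x_b, n) < alpha:
            rejections += 1
    return rejections / reps

# With a true risk of 25% in both arms and n = 32 per arm, the estimated
# rejection rate should be close to the nominal alpha of 0.05.
print(type_i_error_rate())
```

The z-test here is only one reasonable choice; an exact test would give slightly different small-sample behaviour.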

Type I error (α)


• The significance level is also called the type I error risk (α) and is defined
as the probability of incorrectly rejecting a true null hypothesis.
• We must be quite sure before we claim the existence of a real effect. To
do otherwise would be dangerous.
• A type I error risk of α = 5% is traditionally used in most medical
research, occasionally α =1% or even 0.1%.
• For α = 5%, this means that we will only make a mistake by rejecting a
true null hypothesis 5 times out of 100.
• Why aren’t we more concerned about making a mistake by rejecting a true null hypothesis? – we could definitely make fewer mistakes if we set a lower type I error risk (e.g. 1%) ...

Example experiment 2
• Another small RCT comparing two treatments for migraine (drug A
vs. drug C). Outcome is the same as before (1+ severe headache
within one month of randomisation).

• Drug A is the same as before, i.e. 25% is the true value for the risk of the outcome on drug A. However, drug C is much less effective, with a 75% risk of the outcome, and the true RR for A vs C is 0.33.

• Randomise 32 patients to each arm.

• After conducting the study, we analyse the data and estimate the
relative risk of headache on A vs C.

• We use p < 0.05 as the level for statistical significance.



Example experiment 2 (v1)

• Maybe these were the results:

  Drug     Sample size    Headache, n (%)
  Drug A   32             8 (25%)
  Drug C   32             24 (75%)

• Then RR = 0.33, with 95% CI (0.18, 0.67)
• The p-value under the null hypothesis RR = 1 is < 0.001.
• So we reject the null hypothesis and conclude A is better.



Example experiment 2 (v2)


• But what if someone else had done the study and these were the results:

  Drug     Sample size    Headache, n (%)
  Drug A   32             17 (53%)
  Drug C   32             24 (75%)

• Then RR = 0.71, with 95% CI (0.48, 1.04)
• The p-value under the null hypothesis RR = 1 is 0.08.
• So we fail to reject the null hypothesis.

Example experiment 2 (v1-40)

[Figure: RR estimates and 95% confidence intervals from 40 repetitions of the experiment.] Quite a few (10/40) of the 95% confidence intervals include 1, and in these experiments we wouldn’t be justified in rejecting the null hypothesis. Sometimes our hypothesis tests will produce false negative conclusions.

Type II error (β)


• The type II error risk (β) is defined as the probability of
incorrectly failing to reject a false null hypothesis.
• This is analogous to a false-negative risk.
• 1 − β is known as the power of a hypothesis test, and is the
probability of correctly rejecting a false null hypothesis.
• Phase 3 RCTs will typically be designed to have a power of
70%–90% for effect sizes that are considered clinically
important.
• Earlier we asked why we should use a significance level of 5%
rather than a lower level of (say) 1% ...
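Power, as defined above, can also be estimated by simulation: generate many trials in which the alternative is true and count how often the null is rejected. A minimal sketch, again assuming a pooled two-proportion z-test as the analysis (the slides do not specify the test behind their power figures, so exact values may differ; the function names are ours):

```python
import math
import random

def two_prop_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # identical all-or-nothing arms
    z = (x1 / n1 - x2 / n2) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def estimated_power(p1, p2, n, alpha=0.05, reps=5000, seed=1):
    """Fraction of simulated two-arm trials that reject the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n))
        x2 = sum(rng.random() < p2 for _ in range(n))
        if two_prop_p_value(x1, n, x2, n) < alpha:
            rejections += 1
    return rejections / reps

# For a fixed true difference (25% vs 75%), power grows with sample size.
print(estimated_power(0.25, 0.75, n=8))
print(estimated_power(0.25, 0.75, n=32))
```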

Relationship between α and β

[Figure: power (1 − β) plotted against significance level (α) from 0 to 0.15.] Our study had power ≈ 0.75 to detect the true RR of 0.33 at a significance level of α = 0.05. If we had specified a stricter α of 0.01, we would have had very low power.



Balance between α and β

• Typically we will choose α smaller than β.
• For example, many randomised trials have α = 0.05 and β = 0.2 (i.e. 80% power).
• It is usually considered preferable to make a false negative error (e.g. fail to approve a new beneficial intervention) than to make a false positive error (e.g. licence a new (expensive!) drug which is no more effective than the existing therapy).



What about the effect size?

[Figure: power vs significance level for effect sizes RR = 0.33 or 3.0, RR = 0.50 or 2.0, and RR = 0.67 or 1.5.] We would have lower power to detect a smaller effect size (here, an effect which is closer to the null RR of 1.0) at any given significance level.



What about the sample size?

[Figure: power vs significance level for n = 32, 20, and 10 per arm.] Example experiment 2 had n = 32 in each arm and a true RR of 0.33. A smaller sample would reduce our power to detect this effect size, at a given significance level.



We will have higher power if . . .

• The significance level is less strict (larger α).
• The effect we want to detect is larger (we are not trying to detect small effects).
• The sample size is larger.
• Our continuous outcome is less variable between patients (see examples later).
• Also note that some statistical methods are inherently more powerful than others
  – e.g. parametric tests are more powerful than non-parametric tests; most sample size calculations are based on parametric tests unless there is a strong reason to believe this is inappropriate.

Summary table of α and β

                      Is there a true difference?
                      Yes                       No
Reject H0             1 − β (power)             α (type I error risk / significance)
Fail to reject H0     β (type II error risk)    1 − α (confidence)


Background Continuous outcome Dichotomous outcome Exercise Odds ratios Other issues

Part II

Sample size calculations


Ethics in RCTs

In randomised controlled trials (RCTs), it is particularly important to get the right sample size:

“A small study with no chance of detecting a clinically significant difference between treatments is unfair to all the subjects put to the risk and discomfort of the trial.”

“A study which is too large may be unfair if one treatment is proven to be more effective and so a large number of patients receive inferior treatment.”

(Altman & Gore, 1982)

Research Ethics

The previous comments are not only true for RCTs ...

In a questionnaire survey of doctors, is it ethical to send a questionnaire to 10,000 doctors, when it would have been sufficient to question only 1,000?

Alternatively, is it ethical to send a questionnaire to only 500 patients, when you would need at least 5,000 returns to be able to answer your research question?

What information do we need to calculate sample size?

• The desired significance level (α), and the desired power (1 − β);
  – Higher confidence and power require a bigger sample
• The size of the effect we want to detect (perhaps a ‘clinically important difference’?)
  – We need a bigger sample to detect a smaller effect
• The likely variability of the measurements
  – If measurements are more variable, we need a bigger sample



What other information might we need?


• Background knowledge (literature review);

• Research objectives;

• Outcome measures;

• Proposed statistical methods of analysis;


• An estimate of the resources required;
– Money;
– Time;

• An idea of the proportion of eligible participants that will agree to participate.

Different types of data

• We will look at sample size calculations for two general situations:
  – Continuous outcome variables (not necessarily following a Normal distribution);
  – Dichotomous (binary) outcome variables;
• For each outcome variable we will look at two types of calculation:
  – One-group calculations
  – Two-group calculations





Continuous outcome (1 group)


• We want to produce an estimate of the mean of a continuous
outcome variable, to a particular fixed precision.
• How can we calculate the required sample size n?
• Determine the likely variability, σ, of the measurements.
– Usually based on previous studies in the literature, or on our
own pilot studies.
• Decide what significance level (α) we want, for example
α = 0.05 will produce a 95% confidence interval.
• Decide how wide (w ) we want the final confidence interval to
be, in terms of the units of the outcome variable.

Continuous outcome (1 group) – cont

• For the standard choice of α = 0.05, the formula is:

      n = 16σ² / w²

• Why? A 95% confidence interval is the estimate ± 1.96σ/√n, so the width is w = 2 × 1.96σ/√n.
• For the standard choice α = 0.05, (2 × 1.96)² ≈ 16.
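As a quick sketch, this rule of thumb is easy to wrap in a small helper (the function name is ours, and the result is rounded up because sample sizes must be whole numbers). The blood-pressure example on the next slide (σ = 10, w = 4) gives n = 100:

```python
import math

def one_group_continuous_n(sigma, width):
    """Sample size for estimating a mean to a given 95% CI width.

    Rule of thumb from the slide: n = 16 * sigma^2 / w^2, since (2 * 1.96)^2 ~ 16.
    """
    return math.ceil(16 * sigma ** 2 / width ** 2)

# Blood pressure example: SD 10 mmHg, desired CI width 4 mmHg (i.e. +/- 2 mmHg).
print(one_group_continuous_n(sigma=10, width=4))  # → 100
```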



Example (continuous, 1 group)


• Study to estimate the mean systolic blood pressure of elderly
care home residents.
• When I analyse the data, I want my 95% confidence interval
to have a margin of error of ±2mmHg (or, equivalently, to
have width 4mmHg).
• From a pilot study, I think that the subjects’ systolic blood
pressures will follow a Normal distribution with a standard
deviation of approximately 10mmHg.
• How many patients (n) do I need to measure?
• From above, we specify α = 0.05, w = 4, σ = 10.

Example (continuous, 1 group) – cont

• Using the formula given before,

      n = 16σ² / w² = (16 × 10²) / 4² = 100.

• I need to measure the systolic blood pressures of 100 patients.



Continuous outcome (2 groups)


• We want to detect a difference in a continuous outcome variable between two equally-sized groups;
• How can we calculate the required sample size n of each of the two groups?
• Determine the means µ1 and µ2 that we expect to see;
• Or determine µ1 and then choose µ2 so that the difference between µ2 and µ1 is a ‘clinically important difference’.
• Determine the variability, σ, of the measurements.
• Usually, our ‘estimates’ are based on previous studies in the literature, or on our own pilot studies.

Continuous outcome (2 groups) – cont

For standard choices of power 1 − β = 0.80 and significance level α = 0.05, the formula for the number of subjects required in each group is:

      n = 16 / Δ²,   where   Δ = (µ1 − µ2) / σ.
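A minimal helper for this formula (again the naming is ours, rounding up to a whole number per group). The diet-pill example on the following slides (µ1 = −4, µ2 = −5, σ = 2.5) gives n = 100 per group:

```python
import math

def two_group_continuous_n(mu1, mu2, sigma):
    """Per-group sample size for 80% power at alpha = 0.05.

    Rule of thumb from the slide: n = 16 / Delta^2, Delta = (mu1 - mu2) / sigma.
    """
    delta = (mu1 - mu2) / sigma
    return math.ceil(16 / delta ** 2)

# Diet pill example: mean weight change -4 kg (placebo) vs -5 kg (pill), SD 2.5 kg.
print(two_group_continuous_n(mu1=-4, mu2=-5, sigma=2.5))  # → 100
```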



Example (continuous, 2 groups)


• I want to test a ‘magic diet pill’.

• I will take a group of people, and randomly assign them to take either this diet pill, or a placebo.
• Patients should take the pill for 6 months, and at the end of this period I will compare the weight changes in the two groups of patients.
• I hope that the diet pill will be more effective in reducing weight, but I should set up the experiment to test the null hypothesis µ1 = µ2.

Example (continuous, 2 groups) – cont

• From my background knowledge of the patient population, I think that the mean weight loss in the placebo group is likely to be approximately 4 kg, with standard deviation 2.5 kg.
• I would like to detect a difference if the diet pill reduces weight by at least 1 kg on top of the 4 kg in the placebo group.
• So I specify µ1 = −4, µ2 = −5, and σ = 2.5.



Example (continuous, 2 groups) – cont

• I want to have 80% power and a significance level of 5%

      Δ = (µ1 − µ2) / σ = (−4 − (−5)) / 2.5 = 0.4
      n = 16 / Δ² = 16 / 0.4² = 100

• Therefore I would need 100 patients in each group, or a total of 200 patients.





Dichotomous outcome (1 group)


• We want to produce an estimate of the prevalence of a dichotomous outcome variable, to a particular fixed precision. This could be the prevalence of a particular condition.
• How can we calculate the required sample size n?
• Determine the probability p that the outcome variable will take the value 1 rather than 0, or equivalently the prevalence.
• Decide what significance level (α) we want; for example α = 0.05 will produce a 95% confidence interval.
• Decide how wide (w) we want the confidence interval to be.



Dichotomous outcome (1 group) – cont

• For the standard choice of α = 0.05, the formula is:

      n = 16p(1 − p) / w²

• Why? The width of a 95% confidence interval is w = 2 × 1.96σ/√n (i.e. the estimate ± 1.96σ/√n).
• We can approximate σ² by p(1 − p).
• (2 × 1.96)² ≈ 16.
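A small helper for this formula (our own naming; the intermediate result is rounded to clear floating-point noise before taking the ceiling). The smoking-prevalence example on the next slide (p = 0.08, w = 0.02) gives n = 2944:

```python
import math

def one_group_prevalence_n(p, width):
    """Sample size for estimating a prevalence to a given 95% CI width.

    Rule of thumb from the slide: n = 16 * p * (1 - p) / w^2.
    """
    n = 16 * p * (1 - p) / width ** 2
    return math.ceil(round(n, 6))  # round off float noise before the ceiling

# Smoking prevalence example: expected p = 8%, CI width 2% (i.e. +/- 1%).
print(one_group_prevalence_n(p=0.08, width=0.02))  # → 2944
```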



Example (dichotomous, 1 group)

• I want to investigate the prevalence p of smoking in the Hong Kong population in individuals aged 20–30.
• I want my 95% confidence interval to be precise to ±1% (or, equivalently, to have a width of 2%).
• I think that approximately 8% of the people in this age group are current smokers.
• How many individuals do I need to question?
• From above, we specify α = 0.05, w = 0.02, p = 0.08.



Example (dichotomous, 1 group) – cont

• Using the formula given before,

      n = 16p(1 − p) / w² = (16 × 0.08 × 0.92) / 0.02² = 2944.

• I need to question 2944 individuals about their current smoking status.
• ... extra note ... if I want to use a questionnaire, and I expect that 50% of subjects will complete the survey, I need to send out 6000 questionnaires.



Dichotomous outcome (2 groups)


• We want to detect a difference in a dichotomous outcome variable (e.g. prevalence) between two equally-sized groups;
• How can we calculate the required sample size n of each of the two groups?
• Determine the probabilities p1 and p2 of the binary outcome variable being 1 in each of the two groups;
• Or determine p1 and then choose p2 so that the difference between p2 and p1 is a ‘clinically important difference’.
• Decide what power and significance level we want to use.

Dichotomous outcome (2 groups) – cont

• For standard choices of power 1 − β = 0.80 and significance level α = 0.05, the formula for the number of subjects in each group is:

      n = 8 / Δ²,   where   Δ² = (p1 − p2)² / [p1(1 − p1) + p2(1 − p2)].
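The same rule of thumb as a helper (our own naming; the intermediate result is rounded to clear floating-point noise before the ceiling). The appointment-reminder example on the following slides (p1 = 0.70, p2 = 0.80) gives n = 296 per group:

```python
import math

def two_group_proportions_n(p1, p2):
    """Per-group sample size for 80% power at alpha = 0.05.

    Rule of thumb from the slide: n = 8 / Delta^2, where
    Delta^2 = (p1 - p2)^2 / (p1 * (1 - p1) + p2 * (1 - p2)).
    """
    delta_sq = (p1 - p2) ** 2 / (p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil(round(8 / delta_sq, 6))  # round off float noise

# Appointment-reminder example: attendance 70% (control) vs 80% (telephoned).
print(two_group_proportions_n(p1=0.70, p2=0.80))  # → 296
```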



Example (dichotomous, 2 groups)

• Patients often don’t turn up to their appointments.
• I want to investigate whether a telephone reminder will make patients more likely to attend.
• For a group of 2n patients in my surgery, I will randomise half (n) of them to be reminded about their appointment by telephone. The other n will not be reminded by telephone.
• How big should n be?



Example (dichotomous, 2 groups) – cont

• Define group 1 as the non-telephoned control group, and group 2 as the telephoned group.
• From my background knowledge of the patient population, I know that approximately 70% of patients attend their appointments anyway.
• I think that a telephone reminder will cause an extra 10% of patients to attend their appointment.
• So I specify p1 = 0.70, and p2 = 0.80.



Example (dichotomous, 2 groups) – cont

• I want to have 80% power and a significance level of 5%

      Δ² = (p1 − p2)² / [p1(1 − p1) + p2(1 − p2)] = 0.10² / (0.7 × 0.3 + 0.8 × 0.2) = 0.01 / 0.37 ≈ 0.0270
      n = 8 / Δ² = 8 / 0.0270 ≈ 296

• Therefore I would need 296 patients in each group, or a total of 592 patients.





Sample sizes for odds ratios

• For a rule-of-thumb calculation, odds ratios can be rephrased as ratios of probabilities, and then we can use the calculations for binary outcomes in two groups.
• Recall that

      OR = [p1 / (1 − p1)] / [p2 / (1 − p2)],

  where p1 = proportion exposed in the cases, and p2 = proportion exposed in the controls.



Example (odds ratio)

• In a case-control study, we would like to compare the immunisation coverage in a group of tuberculosis cases (group 1) to a group of controls (group 2).
• I will find n cases and n controls.
• How big should n be?



Example (odds ratio) – cont

• A pilot study has suggested that approximately 30% of the controls are vaccinated.
• An odds ratio of 2 would be considered an important difference.
• So I specify p2 = 0.30, and calculate p1 = 0.46.

Note – For the calculation: OR = 2 = [p1 / (1 − p1)] / (0.3 / 0.7), so p1 / (1 − p1) = 2 × 0.3 / 0.7 ≈ 0.857, and then p1 = 0.857 / 1.857 ≈ 0.46.
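The whole odds-ratio calculation can be sketched end to end (function names are ours; the final step is the same n = 8/Δ² rule used for two-group binary outcomes):

```python
import math

def p1_from_odds_ratio(odds_ratio, p2):
    """Convert a target odds ratio and control proportion p2 into p1."""
    odds1 = odds_ratio * p2 / (1 - p2)
    return odds1 / (1 + odds1)

def two_group_proportions_n(p1, p2):
    """Per-group sample size for 80% power at alpha = 0.05 (n = 8 / Delta^2)."""
    delta_sq = (p1 - p2) ** 2 / (p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil(round(8 / delta_sq, 6))  # round off float noise

# TB case-control example: 30% of controls vaccinated, target OR = 2.
p1 = p1_from_odds_ratio(odds_ratio=2, p2=0.30)
print(round(p1, 2))                       # → 0.46
print(two_group_proportions_n(p1, 0.30))  # → 141 per group
```

Note that using the unrounded p1 = 6/13 ≈ 0.4615 gives a slightly smaller n than the slide’s 143, which plugs in the rounded value p1 = 0.46.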



Example (odds ratio) – cont

• I want to have 80% power and a significance level of 5%

      Δ² = (p1 − p2)² / [p1(1 − p1) + p2(1 − p2)] = 0.16² / (0.46 × 0.54 + 0.3 × 0.7) = 0.0256 / 0.4584 ≈ 0.0558
      n = 8 / Δ² = 8 / 0.0558 ≈ 143

• Therefore I would need 143 patients in each group, or a total of 286 patients.



Other effect measures

• It is also possible, though not so straightforward, to calculate sample sizes for other outcome measures, such as:
  – Categorical data;
  – Correlation coefficients;
  – Count data;
  – Regression coefficients for regression models;
  – Survival data;
• Note that for regression models (except with survival data) it is usually the case that adjusting for covariates will improve the power of a study, compared to a crude comparison between two groups.



Unbalanced designs

• Sometimes we may prefer to use unbalanced two-group designs (i.e. one group is bigger than the other):
  – One intervention is much more expensive?
  – In a case-control study, cases are very rare?
• Again, this is possible.
• There is one example of this in practical 3.



Practical 3

• Other levels of power.

• Other scenarios (e.g. fixed sample size).

• Use computer rather than calculating by hand.

• Presentation of sample size calculations.


Post-hoc power evaluation

• Suppose we conduct a cohort study, and find that the association between smoking and breast cancer is fairly strong but non-significant, with risk ratio 2.0 (95% CI: 0.91, 3.32; p = 0.09).
• Would it be a good idea to check what the power of our study was to detect a risk ratio of at least 2.0, 3.0 or 4.0? . . .



Post-hoc power evaluation

• . . . not really. We already know what the answer will be,

• The power must be fairly low to detect a risk ratio of 2.0 or


smaller with statistical significance since our study did not
detect this risk ratio with statistical significance.

• Risk ratios as high as 4.0 are not consistent with our data
(the upper limit of our 95% confidence interval was 3.32).


Post-hoc power evaluation

• All relevant information is contained in the estimate and
confidence interval – our best estimate of the risk ratio is 2.0
and our findings are most consistent with risk ratios between
0.91 and 3.32.

• Power will be low to detect effects between the point estimate
and the null value (e.g. 1.5 in our example), and higher to
detect effects further away from the null value (e.g. 2.5).
Effects outside the confidence interval are not very consistent
with our data.

Relative effect sizes

• In our example sample size calculations for continuous
outcome variables, we were careful to specify the effect size of
interest (µ1 − µ2), and to consider also the standard deviation
of measurements (σ).

• There is an alternative approach where we instead describe
the power to detect various effect sizes of the form
∆ = (µ1 − µ2)/σ.


Cohen’s effect sizes


• Jacob Cohen has defined ‘small’ (0.2), ‘medium’ (0.5) and ‘large’
(0.8) effect sizes.

• An effect size of 0.8 would correspond to a difference between
groups of 0.8 standard deviations.

• With effect sizes, we avoid having to know anything about σ.

• Generally it is not a good idea to use these in sample size
calculations:
– because we should be honest about the clinical significance of the
effect size our study is powered to detect (1 mmHg, 0.1 kg/m2)
rather than a vague statement about ‘a medium effect.’
– because effect sizes depend on the variability of measurements.
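As an illustration, the per-group sample size implied by each of Cohen's conventional effect sizes follows from the two-group formula n = 2(z1−α/2 − zβ)²/∆². A minimal sketch in Python (not part of the original slides; it assumes the standard α = 0.05 and 80% power):

```python
import math

def n_per_group(delta, z_alpha=1.96, z_beta=-0.84):
    """Per-group sample size to detect a standardised effect size
    delta = (mu1 - mu2) / sigma in a two-group comparison."""
    return 2 * (z_alpha - z_beta) ** 2 / delta ** 2

# Cohen's conventional effect sizes at alpha = 0.05, power = 0.80
for label, delta in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    print(f"{label}: n = {n_per_group(delta):.1f} per group")
# small: ~392, medium: ~62.7, large: ~24.5 (round up in practice)
```

Note how quickly n grows as ∆ shrinks – one reason a vague ‘small effect’ target can quietly multiply a study's size.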

Danger of effect sizes in power calculations

• If I want to compare the efficacy of two stimulants in
increasing pulse rate, my measurements would be more precise
(less variable) if I use an electronic pulse monitor (EPM) than
if I simply use a wristwatch.

• The mean pulse rate of the two groups will be compared with
a 2 sample t-test.

• I could state that I wish to be able to detect a medium effect
size (∆ = 0.5) . . .


Danger of effect sizes in power calculations


• Study should have 80% power to detect a medium effect size
of ∆ = 0.5 (i.e. µ1 − µ2 = 0.5σ).
• Using either measurement device, we include 128 participants.
• But what absolute difference will we have 80% power to
detect? 1bpm, 5bpm, 10bpm?

Choice of σ Absolute Sample size


method effect required
Wristwatch ? ? 128
EPM ? ? 128

Danger of effect sizes in power calculations


• After further research, I decide that if I use the wristwatch I
will have σ1 = 4 bpm while if I use the EPM σ2 = 2 bpm.
• If I use the wristwatch I can only detect a difference of at
least 2 bpm, whereas using the EPM I can detect a difference
of 1 bpm.

Choice of σ Absolute Sample size


method effect required
Wristwatch 4 2 128
EPM 2 1 128

Danger of effect sizes in power calculations


• What if I decide that I want to detect a difference of at least
1 bpm?

• We would require a much larger sample if we chose to use the
less precise assessment method:

Choice of method    σ    Absolute effect    Sample size required
Wristwatch          4    1                  510
EPM                 2    1                  128
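The table's sample sizes can be checked against the two-group formula n = 2(z1−α/2 − zβ)²σ²/(µ1 − µ2)² per group. A sketch in Python (α = 0.05 and 80% power assumed; the totals come out near the 510 and 128 quoted above, with small differences due to rounding conventions):

```python
import math

def n_per_group(sigma, diff, z_sum=2.8):
    """Per-group n, rounded up; z_sum = z_{1-alpha/2} - z_beta
    = 1.96 + 0.84 for alpha = 0.05 and 80% power."""
    return math.ceil(2 * z_sum ** 2 * sigma ** 2 / diff ** 2)

# To detect an absolute difference of 1 bpm:
print("wristwatch:", 2 * n_per_group(sigma=4, diff=1), "participants in total")
print("EPM:       ", 2 * n_per_group(sigma=2, diff=1), "participants in total")
```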


Try to avoid relative effect sizes


• The variance (or precision) of measurements can have a huge
effect on the power and sample size calculation, and should
not be ignored.
• If you are planning a study but really have no idea of what
absolute effect size it might be important to detect
– do further research, speak to some experts.
• If you are planning a study but really have no idea about the
variance of measurements
– make conservative assumptions, and provide power calculations
that allow for a range of plausible values.
Pragmatic Multiple testing Review Further reading Appendix – Continuous Appendix – Dichotomous

Pragmatic approaches
• Norman et al. BMJ 2012; 345:e5278

• Note the difficulty in estimating important parameters

• Note the logistical and financial constraints (often we choose the
largest study that we can afford)

• Propose ‘pragmatic’ choice of sample size based on other studies in
the literature, and logistical/financial constraints.

• Criticisms:
– Ethics of inappropriate sample size (but small studies can still be
useful if included in meta analyses).
– Justification for conducting studies without enough background

Comment on multiple hypothesis tests


• Example: we conduct an RCT of drug A vs drug B for the
treatment of migraine. We perform a hypothesis test on the results,
with α = 0.05.

• The results are not statistically significant, i.e. no evidence that
either drug is superior.

• Then we split the data into two groups, a high-risk and a low-risk
group, and perform a hypothesis test on each group, each test with
α = 0.05.

• What is the chance of incorrectly rejecting a true null hypothesis in
the second analysis?

Multiple hypothesis tests


• In a test with a type I error risk of 0.05 and when the null
hypothesis is true, we have a 95% chance to correctly fail to reject
the null hypothesis.

• In two such tests (assuming independence), the chance to correctly
fail to reject the null hypotheses in both tests is 0.95 × 0.95 ≈ 0.90.

• Therefore the chance of incorrectly rejecting at least one true null
hypothesis is increased to around 10%.

• In general for κ tests at α = 0.05 each with a true null hypothesis,
the chance of incorrectly rejecting at least one true null hypothesis
is 1 − 0.95^κ.
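This inflation is easy to verify numerically; a small sketch (writing κ as k):

```python
def family_wise_error(k, alpha=0.05):
    """P(at least one false positive) across k independent tests,
    each at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** k

for k in (1, 2, 5, 10):
    print(k, round(family_wise_error(k), 4))
# 1 test: 0.05; 2 tests: ~0.0975; 5 tests: ~0.2262; 10 tests: ~0.4013
```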

The Bonferroni correction

• When α is small, we can approximate (1 − α)^κ ≈ 1 − κα.
– E.g. 0.95^κ ≈ 1 − 0.05κ, and 1 − 0.95^κ ≈ 0.05κ.

• Then we can simply divide our desired overall significance level by κ for
each hypothesis test.

• E.g. if we perform 5 significance tests, and we want an overall 5% type I
error risk, then each hypothesis test would have to give a p-value below
0.01 (=0.05/5) to be considered statistically significant.

• This is known as the Bonferroni correction.
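A sketch of the correction in Python (assuming independent tests; the achieved family-wise error then sits just under the target):

```python
def bonferroni_alpha(overall_alpha, k):
    """Per-test threshold keeping the family-wise type I error
    at roughly overall_alpha across k tests."""
    return overall_alpha / k

per_test = bonferroni_alpha(0.05, 5)
print(round(per_test, 4))                 # 0.01 per test
print(round(1 - (1 - per_test) ** 5, 4))  # achieved FWER, about 0.049
```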


Review

• The significance level, or type I error risk (α), is a false
positive risk.

• The type II error risk (β) is a false negative risk.

• The power of a hypothesis test is (1 − β).

• Our test will have increased power if we:
– increase the sample size;
– increase the type I error risk (α);
– specify that we want to detect a larger difference.
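Each of these levers can be checked with a quick normal-approximation power calculation (a sketch only; it uses the two-sample z-test formula from the appendix, and the values diff = 5 and sigma = 10 are illustrative):

```python
import math

def norm_cdf(x):
    """Standard Normal CDF, computed from the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power(n, diff, sigma, z_crit=1.96):
    """Approximate power of a two-sample z-test with n per group."""
    se = sigma * math.sqrt(2 / n)
    return norm_cdf(abs(diff) / se - z_crit)

for n in (20, 40, 80):                      # larger n -> more power
    print(n, round(power(n, diff=5, sigma=10), 2))
print(round(power(40, diff=7.5, sigma=10), 2))              # larger difference
print(round(power(40, diff=5, sigma=10, z_crit=1.645), 2))  # alpha = 0.10
```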


Review (cont.)

• In determining the sample size for a study, we need to know:
– The objectives of the study and the study design;
– The outcome variables and the method of analysis;
– The variability of measurements;
– Depending on whether we have 1 or 2 groups:
• the width of the confidence interval that we want to produce;
• the magnitude of any treatment effect and the power required;
– The significance level that we want to use;
– The resources available; the estimated response fraction.


Different effect sizes

• Sometimes we will quote the power of a hypothesis test to
detect various effect sizes.

• E.g. “Our study has 95%, 80% and 60% power to detect
relative risks of 2.0, 1.5 and 1.25, respectively.”

• Or we may present a short table (see practical 3).


Two important tips

• I would like to estimate mean height precisely, where the
width of the confidence interval is no more than 1cm
– mean ±1cm?
– mean ±0.5cm?
• I calculated sample size for a comparison of two groups, and
came up with n = 500.
– is that 500 in each group?
– or 500 overall (250 in each group)?


Not covered

• Sample size calculations for more complex studies.
– e.g. cluster randomized designs

• The 2-group approaches discussed today are meant for studies
that aim to detect a difference between two groups. Different
calculations are needed for studies that aim to demonstrate
equivalence between two groups.


Summary

• There are ethical considerations in study design, particularly in
choosing appropriate sample sizes.
– underpowered studies are a waste of time and can also lead to
misleading research findings (Ioannidis, 2005, PLoS Med)

• Consider consulting a statistician if you have difficulty with a
sample size calculation.


References
• Bland JM. The tyranny of power: is there a better way to
calculate sample size? BMJ, 2009; 339:b3985.

• Gore SM, Altman DG (eds). Statistics in Practice: Articles
Published in the British Medical Journal. (1982).

• Ioannidis JP. Why most published research findings are false.
PLoS Med, 2005; 2(8):e124.

• Norman G, et al. Sample size calculations: should the
emperor’s clothes be off the peg or made to measure? BMJ,
2012; 345:e5278.

Further reading

• BMJ Statistics at square one (chapter)

– http://bmj.bmjjournals.com/collections/statsbk

• Altman and Bland. Absence of evidence is not evidence of absence

– http://bmj.bmjjournals.com/cgi/content/full/311/7003/485
• Lenth RV. Some practical considerations for effective sample size
determination. American Statistician, 2001; 55(3): 187–193.

• Hoenig JM and Heisey DM. The abuse of power: the pervasive fallacy of
power calculations for data analysis. American Statistician, 2001; 55(1):
19–24.


Appendix – Continuous outcome (2 groups)


• (No need to remember these formulae!)

• Where did the formula come from? Using Normal distribution


theory, the distribution of the sample difference X¯1 − X¯2
between the two groups will be Normal with mean µ1 − µ2 ,
and standard error, s, where
r
σ2 σ2
s= +
n n

• Or equivalently, s 2 = 2σ 2 /n

• Notice that s depends inversely on n.



Continuous outcome (2 groups) – cont

• The null hypothesis (µ1 − µ2 = 0) will be rejected if:

|X̄1 − X̄2| / s > z1−α/2

• Since the sample difference X̄1 − X̄2 follows a Normal
distribution, we can consider the ‘standardised’ distribution of
the sample difference:

(X̄1 − X̄2 − (µ1 − µ2)) / s = (X̄1 − X̄2)/s − (µ1 − µ2)/s


Continuous outcome (2 groups) – cont

• The power of the test (1 − β) is given by 1 − Φ(u), where u is
the value of the standardised statistic at which the null
hypothesis is just rejected. In other words,

Φ(z1−α/2 − (µ1 − µ2)/s) = β

• Note that if Φ(u) = β, then u = zβ.


Continuous outcome (2 groups) – cont

• Therefore we want to have s such that

|µ1 − µ2| / s = z1−α/2 − zβ

• Squaring and re-arranging gives:

(µ1 − µ2)² / s² = n(µ1 − µ2)² / (2σ²) = (z1−α/2 − zβ)²

• Then rearrange even further ...


Continuous outcome (2 groups) – cont

• We get

n = 2(z1−α/2 − zβ)² σ² / (µ1 − µ2)²

• For the standard choices of α = 0.05 and 1 − β = 0.80, the
values of z1−α/2 and zβ are 1.96 and −0.84.

• Then 2(z1−α/2 − zβ)² = 15.68 ≈ 16

• (No need to remember these formulae!)
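The formula, and the handy ‘rule of 16’ approximation that follows from it (n ≈ 16σ²/(µ1 − µ2)² per group), can be sketched in Python (α = 0.05 and 80% power assumed; the inputs mu_diff = 5 and sigma = 10 are illustrative):

```python
import math

def n_continuous(mu_diff, sigma, z_alpha=1.96, z_beta=-0.84):
    """Per-group n = 2 (z_{1-alpha/2} - z_beta)^2 sigma^2 / (mu1 - mu2)^2,
    rounded up."""
    return math.ceil(2 * (z_alpha - z_beta) ** 2 * sigma ** 2 / mu_diff ** 2)

def n_rule_of_16(mu_diff, sigma):
    """Shortcut using 2 (z_{1-alpha/2} - z_beta)^2 = 15.68, rounded to 16."""
    return math.ceil(16 * sigma ** 2 / mu_diff ** 2)

print(n_continuous(mu_diff=5, sigma=10))  # 63 per group
print(n_rule_of_16(mu_diff=5, sigma=10))  # 64 per group
```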


Dichotomous outcome (2 groups) – cont

• For standard choices of (power) 1 − β = 0.80, and
(significance level) α = 0.05, the formula for the number of
subjects in each group is:

n = 8 / ∆², where ∆² = (p1 − p2)² / [p1(1 − p1) + p2(1 − p2)]

• We can call this the ‘rule of 8’.


Dichotomous outcome (2 groups) – cont

• The derivation of this formula is very similar to the derivation
of the continuous outcome sample size formula. A more
detailed formula, for significance level α and power (1 − β) is:

n = (z1−α/2 − zβ)² [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)²
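The shortcut and the detailed formula can be compared directly; a sketch in Python, with illustrative proportions p1 = 0.2 and p2 = 0.4:

```python
import math

def n_detailed(p1, p2, z_alpha=1.96, z_beta=-0.84):
    """Per-group n for comparing two proportions (normal approximation)."""
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha - z_beta) ** 2 * var / (p1 - p2) ** 2)

def n_rule_of_8(p1, p2):
    """'Rule of 8' shortcut, rounded to the nearest whole number."""
    delta_sq = (p1 - p2) ** 2 / (p1 * (1 - p1) + p2 * (1 - p2))
    return round(8 / delta_sq)

print(n_detailed(0.2, 0.4))   # 79 per group
print(n_rule_of_8(0.2, 0.4))  # 80 per group
```

The two agree closely, as expected, since 8 is just (1.96 + 0.84)² ≈ 7.84 rounded up.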
