Biostatistics of HKU MMEDSC Session 7
Designing studies
CMED6100 – Session 7
ST Ali
30 October 2021
sli.do/#hkubiostat21
Announcements
• Ask related questions via the Moodle forums only (not via email)
– State clearly the question number, slide number, or the assignment or
practical session concerned
– Post your queries in the respective forum
• For any further clarification that needs to be discussed in person, come to
office hours
• Don’t ask questions that you are supposed to answer yourself (e.g. assignment
questions before the deadline)
ST Ali CMED6100 – Session 7 Slide 4
Outline
Objectives
Critical Value
$H_0: \mu = \mu_0$ versus $H_1: \mu = \mu_1$ (with $\mu_1 > \mu_0$)
Pr(Type I error) = $\alpha$ (the level of significance)
Pr(Type II error) = $\beta$ ($1 - \beta$ is the power of the test)
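A minimal sketch of how the critical value ties the two error probabilities together for the one-sided test above. It assumes a known standard deviation and a normally distributed sample mean; the function name and the numbers (mu0 = 0, mu1 = 1, sigma = 2, n = 25) are illustrative assumptions, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist

def critical_value_and_power(mu0, mu1, sigma, n, alpha=0.05):
    """One-sided z-test of H0: mu = mu0 against H1: mu = mu1 (> mu0)."""
    se = sigma / sqrt(n)
    # Reject H0 when the sample mean exceeds this critical value.
    crit = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = Pr(sample mean <= crit | mu = mu1) = Pr(Type II error).
    beta = NormalDist(mu1, se).cdf(crit)
    return crit, 1 - beta

# Illustrative (assumed) inputs.
crit, power = critical_value_and_power(mu0=0.0, mu1=1.0, sigma=2.0, n=25)
print(f"critical value = {crit:.3f}, power = {power:.3f}")
```

Lowering alpha moves the critical value further from mu0, which raises beta and so lowers the power.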
Practical outcomes
You should be able to:
• Define key terms such as type I and type II errors, power,
significance, confidence, etc.
• Describe the determinants of power
Fisher’s approach
• Set a null hypothesis (usually ‘no effect’ or ‘no difference
between two groups’)
• Perform the study, collect and analyse data.
Synthesis of approaches
Example experiment 1
• A small RCT comparing two treatments for migraine (drug A vs.
drug B).
• After conducting the study, we analyse the data and estimate the
relative risk of headache on A vs B.
Example experiment 1
Approximately 5% of these intervals will not include 1, since we pre-specified a significance level of 0.05.
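The 5% figure can be checked by simulation. The sketch below assumes both drugs have the same true risk, so the true RR is 1, which is what the statement presupposes. The 25% risk, the 200 patients per arm, the number of simulated trials, and the log-scale Wald interval are all illustrative assumptions.

```python
import random
from math import exp, log, sqrt

def rr_confidence_interval(events_a, events_b, n, z=1.96):
    """Wald 95% CI for the relative risk, computed on the log scale."""
    rr = (events_a / n) / (events_b / n)
    se_log = sqrt(1 / events_a - 1 / n + 1 / events_b - 1 / n)
    return exp(log(rr) - z * se_log), exp(log(rr) + z * se_log)

def fraction_excluding_one(risk=0.25, n=200, n_trials=2000, seed=1):
    """Simulate trials with the same true risk in both arms (true RR = 1)."""
    rng = random.Random(seed)
    excluded = 0
    for _ in range(n_trials):
        a = sum(rng.random() < risk for _ in range(n))
        b = sum(rng.random() < risk for _ in range(n))
        lower, upper = rr_confidence_interval(a, b, n)
        if lower > 1 or upper < 1:
            excluded += 1
    return excluded / n_trials

frac = fraction_excluding_one()
print(f"fraction of intervals excluding 1: {frac:.3f}")
```

Each interval that excludes 1 here is a type I error, so the printed fraction should be close to the significance level.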
Example experiment 2
• Another small RCT comparing two treatments for migraine (drug A
vs. drug C). Outcome is the same as before (1+ severe headache
within one month of randomisation).
• Drug A is same as before i.e. 25% is the true value for the risk of
the outcome on drug A. However, drug C is much less effective with
a 75% risk of the outcome, and the true RR for A vs C is 0.33.
• After conducting the study, we analyse the data and estimate the
relative risk of headache on A vs C.
[Figure: power (1 − β) plotted against significance level (α), with curves for RR = 0.33 or RR = 3.0 and for RR = 0.67 or RR = 1.5, and for sample sizes n = 20 and n = 10.]
We would have lower power to detect a smaller effect (an RR closer to 1.0) at any given significance level.
A smaller sample would reduce our power to detect this effect size, at a given significance level.
• We are not looking to detect smaller effects (we are only interested in
larger effects).
• The sample size is larger
• Also note that some statistical methods are inherently more powerful
than others
– e.g. parametric tests are generally more powerful than non-parametric
tests when their assumptions hold; most sample size calculations are
based on parametric tests unless there is a strong reason to believe
this is inappropriate.
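The determinants of power listed above (effect size, sample size, significance level) can be explored numerically. This is a rough sketch using a normal approximation for the difference between two risks, not necessarily the method behind the slides. The risks 25% vs 75% come from example experiment 2; the 37.5% risk used to give RR = 0.67 and the function name are assumptions.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two risks, n per group."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return NormalDist().cdf(abs(p1 - p2) / se - z_crit)

for n in (10, 20):
    for alpha in (0.05, 0.10):
        large = power_two_proportions(0.25, 0.75, n, alpha)   # RR = 0.33
        small = power_two_proportions(0.25, 0.375, n, alpha)  # RR = 0.67 (assumed risks)
        print(f"n={n:2d} alpha={alpha:.2f}  power: RR=0.33 {large:.2f}, RR=0.67 {small:.2f}")
```

Power rises with a larger effect, a larger sample, or a less strict significance level, in line with the bullets above.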
Part II
Ethics in RCTs
Research Ethics
The previous comments are not only true for RCTs ...
• Research objectives;
• Outcome measures;
Background
Continuous outcome
Dichotomous outcome
Exercise
Odds ratios
Other issues
\[ n = \frac{16\sigma^2}{w^2} \]
• Why? A 95% confidence interval is the estimate $\pm 1.96\sigma/\sqrt{n}$,
so the width is $w = 2 \times 1.96\sigma/\sqrt{n}$.
\[ n = \frac{16\sigma^2}{w^2} = \frac{16 \times 10^2}{4^2} = 100. \]
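The width-based calculation is easy to script. A small sketch follows, with hypothetical function names, showing the rule of thumb next to the version that keeps (2 × 1.96)² unrounded:

```python
from math import ceil

def n_for_ci_width(sigma, width):
    """Rule of thumb n = 16 * sigma^2 / w^2 for a 95% CI of total width w."""
    return ceil(16 * sigma**2 / width**2)

def n_for_ci_width_exact(sigma, width, z=1.96):
    """Same calculation without rounding (2 * 1.96)^2 up to 16."""
    return ceil((2 * z * sigma / width) ** 2)

print(n_for_ci_width(10, 4))        # 100, matching the worked example
print(n_for_ci_width_exact(10, 4))  # 97
```

Rounding the constant up to 16 errs slightly on the side of a larger sample.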
• How can we calculate the required sample size n of each of the two
groups?
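\[ n = \frac{16}{\Delta^2}, \qquad \Delta = \frac{\mu_1 - \mu_2}{\sigma}. \]
\[ \Delta = \frac{\mu_1 - \mu_2}{\sigma} = \frac{-4 - (-5)}{2.5} = 0.4 \]
\[ n = \frac{16}{\Delta^2} = \frac{16}{0.4^2} = 100 \]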
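The same two-step calculation (standardise the difference, then apply the rule of thumb) can be sketched as below. The function name is hypothetical, and the rounding step is only there to absorb floating-point noise before rounding up.

```python
from math import ceil

def n_per_group_continuous(mu1, mu2, sigma):
    """Rule of thumb n = 16 / Delta^2 per group, with Delta = (mu1 - mu2) / sigma."""
    delta = (mu1 - mu2) / sigma
    # Round first so floating-point noise cannot bump the result up by one.
    return ceil(round(16 / delta**2, 6))

print(n_per_group_continuous(-4, -5, 2.5))  # 100 per group
```

Halving the standardised difference quadruples the required sample size, because n depends on 1/Δ².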
• Determine the probability p that the outcome variable will take the
value 1 rather than 0, or equivalently the prevalence.
\[ n = \frac{16\,p(1 - p)}{w^2} \]
• $(2 \times 1.96)^2 \approx 16$.
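A short sketch of the proportion version of the width formula. The inputs (prevalence 0.3, total interval width 0.1) and the function name are hypothetical, chosen only to show the arithmetic.

```python
from math import ceil

def n_for_proportion_ci(p, width):
    """Rule of thumb n = 16 * p * (1 - p) / w^2 for a 95% CI of total width w."""
    # Round first so floating-point noise cannot bump the result up by one.
    return ceil(round(16 * p * (1 - p) / width**2, 6))

# Hypothetical inputs: prevalence 0.3, interval no wider than 0.1 (i.e. +/- 0.05).
print(n_for_proportion_ci(0.3, 0.1))  # 336
```

Because p(1 − p) peaks at p = 0.5, using p = 0.5 gives the most conservative sample size when the prevalence is unknown.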
\[ n = \frac{8}{\Delta^2}, \qquad \Delta^2 = \frac{(p_1 - p_2)^2}{p_1(1 - p_1) + p_2(1 - p_2)}. \]
\[ \Delta^2 = \frac{(p_1 - p_2)^2}{p_1(1 - p_1) + p_2(1 - p_2)} = \frac{0.10^2}{0.7 \times 0.3 + 0.8 \times 0.2} = 0.0270 \]
\[ n = \frac{8}{\Delta^2} = \frac{8}{0.0270} = 296 \]
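The arithmetic for two proportions can be checked with a few lines. A sketch with hypothetical function names, using the risks 0.7 and 0.8 from the worked example:

```python
def delta_squared(p1, p2):
    """Standardised squared difference between two proportions."""
    return (p1 - p2) ** 2 / (p1 * (1 - p1) + p2 * (1 - p2))

def n_per_group_dichotomous(p1, p2):
    """Rule of thumb n = 8 / Delta^2 per group."""
    return 8 / delta_squared(p1, p2)

d2 = delta_squared(0.7, 0.8)
print(f"Delta^2 = {d2:.4f}")                                      # 0.0270
print(f"n = {n_per_group_dichotomous(0.7, 0.8):.0f} per group")  # 296
```

Here Δ² = 0.01 / 0.37, so n = 8 × 37 = 296 per group.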
• Recall that
\[ \mathrm{OR} = \frac{p_1/(1 - p_1)}{p_2/(1 - p_2)}, \]
\[ \Delta^2 = \frac{(p_1 - p_2)^2}{p_1(1 - p_1) + p_2(1 - p_2)} = \frac{0.16^2}{0.46 \times 0.54 + 0.3 \times 0.7} = 0.0558 \]
\[ n = \frac{8}{\Delta^2} = \frac{8}{0.0558} = 143 \]
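When the target effect is stated as an odds ratio, it first has to be converted to a risk in the comparison group. The sketch below assumes that the worked numbers (p1 = 0.46, p2 = 0.3) correspond to an odds ratio of 2; that value and the function names are assumptions, not taken from the slides.

```python
def p1_from_odds_ratio(p2, odds_ratio):
    """Solve OR = [p1/(1-p1)] / [p2/(1-p2)] for p1."""
    odds1 = odds_ratio * p2 / (1 - p2)
    return odds1 / (1 + odds1)

def n_per_group_from_or(p2, odds_ratio):
    """Rule of thumb n = 8 / Delta^2 per group, with p1 implied by the odds ratio."""
    p1 = p1_from_odds_ratio(p2, odds_ratio)
    delta2 = (p1 - p2) ** 2 / (p1 * (1 - p1) + p2 * (1 - p2))
    return p1, 8 / delta2

# Assumption: OR = 2 with p2 = 0.3, which gives p1 close to 0.46.
p1, n = n_per_group_from_or(p2=0.3, odds_ratio=2.0)
print(f"p1 = {p1:.2f}, n = {n:.0f} per group")
```

Carrying p1 unrounded gives a slightly smaller n than the 143 obtained from the rounded value 0.46, which shows how sensitive the result is to small changes in the inputs.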
Unbalanced designs
Practical 3
• Risk ratios as high as 4.0 are not consistent with our data
(the upper limit of our 95% confidence interval was 3.32).
• The mean pulse rates of the two groups will be compared with
a two-sample t-test.
Pragmatic approaches
• Norman et al. BMJ 2012; 345:e5278
• Criticisms:
– Ethics of inappropriate sample size (but small studies can still be
useful if included in meta-analyses).
– Justification for conducting studies without enough background
• Then we split the data into two groups, a high risk and a low risk
group, and perform a hypothesis test on each group, each test with
α = 0.05.
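To see why this matters, consider the chance of at least one false positive across the subgroup tests. The sketch assumes that the null hypothesis is true in every subgroup and that the tests are independent (plausible here because the two groups do not overlap); the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

print(f"{familywise_error_rate(0.05, 1):.4f}")  # 0.0500
print(f"{familywise_error_rate(0.05, 2):.4f}")  # 0.0975
```

With two tests at α = 0.05 the overall type I error rate is close to 10%, and it keeps growing as more subgroups are tested.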
Review
Review (cont.)
• E.g. “Our study has 95%, 80% and 60% power to detect
relative risks of 2.0, 1.5 and 1.25, respectively.”
Not covered
Summary
References
• Bland JM. The tyranny of power: is there a better way to
calculate sample size? BMJ, 2009; 339:b3985.
Further reading
– http://bmj.bmjjournals.com/collections/statsbk
– http://bmj.bmjjournals.com/cgi/content/full/311/7003/485
• Lenth RV. Some practical considerations for effective sample size
determination. American Statistician, 2001; 55(3):187–193.
• Hoenig JM and Heisey DM. The abuse of power: the pervasive fallacy of
power calculations for data analysis. American Statistician, 2001;
55(1):19–24.
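• Or equivalently, $s^2 = 2\sigma^2/n$
\[ \frac{|\bar{X}_1 - \bar{X}_2|}{s} > z_{1-\alpha/2}. \]
\[ \frac{|\mu_1 - \mu_2|}{s} = z_{1-\alpha/2} - z_\beta \]
\[ \frac{(\mu_1 - \mu_2)^2}{s^2} = \frac{n(\mu_1 - \mu_2)^2}{2\sigma^2} = (z_{1-\alpha/2} - z_\beta)^2 \]
• We get
\[ n = \frac{2(z_{1-\alpha/2} - z_\beta)^2 \sigma^2}{(\mu_1 - \mu_2)^2} \]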
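The general formula can be evaluated directly instead of relying on a rounded constant. In this sketch z_β is taken as the β quantile of the standard normal (negative when β < 0.5), which matches the sign convention in the formula above. The default of 80% power at a two-sided 5% level is an assumption, chosen because 2(1.96 + 0.84)² ≈ 15.7, close to the constant 16 in the rule of thumb. The inputs reuse the earlier continuous-outcome example.

```python
from statistics import NormalDist

def n_per_group_exact(mu1, mu2, sigma, alpha=0.05, power=0.80):
    """n = 2 (z_{1-alpha/2} - z_beta)^2 sigma^2 / (mu1 - mu2)^2, with beta = 1 - power."""
    z = NormalDist().inv_cdf
    multiplier = 2 * (z(1 - alpha / 2) - z(1 - power)) ** 2
    return multiplier * sigma**2 / (mu1 - mu2) ** 2

print(f"{n_per_group_exact(-4, -5, 2.5):.1f}")              # about 98.1
print(f"{n_per_group_exact(-4, -5, 2.5, power=0.90):.1f}")  # about 131.3
```

Asking for 90% rather than 80% power raises the requirement by roughly a third.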
\[ n = \frac{8}{\Delta^2}, \qquad \Delta^2 = \frac{(p_1 - p_2)^2}{p_1(1 - p_1) + p_2(1 - p_2)}. \]
\[ n = \frac{(z_{1-\alpha/2} - z_\beta)^2 \, [p_1(1 - p_1) + p_2(1 - p_2)]}{(p_1 - p_2)^2}. \]
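For proportions the constant 8 plays the same role, since (1.96 + 0.84)² ≈ 7.85. A sketch of the unrounded formula follows, again assuming 80% power at a two-sided 5% level and treating z_β as the β quantile; the function name is hypothetical and the risks 0.7 and 0.8 are those of the earlier dichotomous example.

```python
from statistics import NormalDist

def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """n = (z_{1-alpha/2} - z_beta)^2 [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2."""
    z = NormalDist().inv_cdf
    multiplier = (z(1 - alpha / 2) - z(1 - power)) ** 2
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return multiplier * variance_sum / (p1 - p2) ** 2

print(f"{n_per_group_proportions(0.7, 0.8):.0f}")  # about 290, vs 296 from 8 / Delta^2
```

The rule of thumb overshoots by a few patients per group, which is usually an acceptable margin at the planning stage.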