Randomised Controlled Trials (RCTS) - Sample Size. The Magic Number

This document discusses sample size calculations for randomized controlled trials (RCTs). It explains that the sample size needs to be large enough to provide a reliable answer to the research question but not too large as to include unnecessary subjects. The key factors in determining sample size are the type of comparison being made (e.g. superiority, equivalence, non-inferiority), the expected effect size of the therapies, the type of primary outcome (e.g. proportion, mean), and the desired power and significance level of the statistical test. Formulas are provided to calculate the required sample size based on these factors.

Basic Statistics For Doctors Singapore Med J 2003 Vol 44(4) : 172-174

Randomised Controlled Trials (RCTs) – Sample Size: The Magic Number?
Y H Chan

Clinical Trials and Epidemiology Research Unit, 226 Outram Road Blk A #02-02, Singapore 169039
Y H Chan, PhD, Head of Biostatistics
Correspondence to: Y H Chan. Tel: (65) 6317 2121 Fax: (65) 6317 2122 Email: chanyh@cteru.gov.sg

INTRODUCTION
A common question posed to a biostatistician by a medical researcher is “How many subjects do I need to obtain a significant result for my study?”. That magic number! In the manufacturing industry, it is permitted to test thousands of components in order to derive a conclusive result, but in medical research the sample size has to be “just large enough” to provide a reliable answer to the research question. If the sample size is too small, it is a waste of time doing the study as no conclusive results are likely to be obtained, and if the sample size is too large, extra subjects may be given a therapy which perhaps could have been proven to be non-efficacious with a smaller sample size(1).

Another major reason, besides the scientific justification for doing a study, why a researcher wants an estimate of the sample size is to calculate the cost of the study, which will determine the feasibility of conducting the study within budget. This magic number will also help the researcher to estimate the length of his/her study. For example, the calculated sample size may be 50 (a manageable number), but if the yearly accrual of subjects is 10 (assuming all subjects give consent to be in the study), it will take at least five years to complete the study! In that case a multicentre study is encouraged.

STATISTICAL THEORY ON SAMPLE SIZE CALCULATIONS
The Null Hypothesis is set up to be rejected. The philosophical argument is: it is easier to prove a statement is false than to prove it is true. For example, suppose we want to prove that “all cats are black”. Even if you point out black cats to me everywhere, there is still doubt that a white cat could be lying under a table somewhere. But once you bring me a white cat, the hypothesis of “all cats are black” is disqualified.

Hence if we are interested in comparing two therapies, the null hypothesis will be “there is no difference” versus the Alternative Hypothesis of “there is a difference”. From the above philosophical argument, not being able to reject the null hypothesis does not mean that it is true (just that we do not have enough evidence to reject it).

We want to reject the null hypothesis but could be committing a Type I Error: rejecting the null hypothesis when it is true. In a research study, there is no such thing as “my results are correct” but rather “how much error I am committing”. For example, suppose that in the population there is actually no difference between two therapies (but we do not know this; that is why we are doing the study) and that, after conducting the study, a significant difference (p<0.05) was found.

There are only two possible reasons for this significant difference (assuming that we have controlled for bias of any kind). One is that there is actually a difference between the two therapies, and the other is chance. The p-value gives us this “amount of chance”. If the p-value is 0.03, then a difference at least this large would arise by chance alone 3% of the time if the two therapies truly did not differ. If the p-value is very small, then this difference happening by chance is very unlikely, and it is thus attributed to the difference in therapies (still with a small possibility of being “wrong”).

The other situation is not being able to reject the null hypothesis when it is actually false (Type II Error). As mentioned, the main aim of clinical research is to reject the null hypothesis, and we could achieve this by controlling the type II error(2). This is given by the Power of the study (1 – type II error): the probability of rejecting the null hypothesis when it is false. Conventionally, the power is set at 80% or more; the higher the power, the bigger the sample size required.

To be conservative, a two-sided test (which requires a larger sample size) is usually carried out rather than a one-sided test, which assumes that the test therapy will perform clinically better than the standard or control therapy.

SAMPLE SIZE CALCULATIONS
To estimate a sample size which will ethically answer the research question of an RCT with a reliable conclusion, the following information should be available.

Type of comparison(3)

Superiority trials
To show that a new experimental therapy is superior to a control treatment.
Null Hypothesis: The test therapy is not better than the control therapy by a clinically relevant amount.
Alternative Hypothesis: The test therapy is better than the control therapy by a clinically relevant amount.

Equivalence trials
Here the aim is to show that the test and control therapies are equally effective.
Null Hypothesis: The two therapies differ by a clinically relevant amount.
Alternative Hypothesis: The two therapies do not differ by a clinically relevant amount.

Non-inferiority trials
For non-inferiority, the aim is to show that the new therapy is as effective as, but need not be superior to, the control therapy. This applies when the test therapy could be cheaper or has fewer side effects, for example.
Null Hypothesis: The test therapy is inferior to the control therapy by a clinically relevant amount.
Alternative Hypothesis: The test therapy is not inferior to the control therapy by a clinically relevant amount. A 1-sided test is performed in this case.

Type of configuration(4)

Parallel design
The most commonly used design. Subjects are randomised to one or more arms of different therapies and treated concurrently.

Crossover design
For this design, subjects act as their own control and are randomised to a sequence of two or more therapies with a washout period in between therapies. It is appropriate for chronic conditions which return to their original level once therapy is discontinued.

Type I error and Power(5)
The type I error is usually set at two-sided 5% and the power at 80% or 90%.

Effect size of therapies
The effect size specifies the accepted clinical difference between two therapies that a researcher wants to observe in a study.

There are three usual ways to get the effect size:
a. from past literature.
b. if no past literature is available, one can do a small pilot study to determine the estimated effect sizes.
c. clinical expectations.

To calculate the sample size, besides knowing the type of design to be used, one has to classify the type of the primary outcome.

Proportion outcomes
The primary outcome of interest is dichotomous (success/failure, yes/no, etc). For example, 25% of the subjects on the standard therapy had a successful outcome and it is of clinical relevance only if we observe a 40% (effect size) absolute improvement for those on the study therapy (i.e. 65% of the subjects will have a successful outcome). How many subjects do we need to observe a significant difference?

For a two-sided test of 5%, a simple formula to calculate the sample size is given by

$$m \text{ (size per group)} = c \times \frac{\pi_1(1 - \pi_1) + \pi_2(1 - \pi_2)}{(\pi_1 - \pi_2)^2}$$

where c = 7.9 for 80% power and 10.5 for 90% power, and π1 and π2 are the proportion estimates. Thus from the above example, π1 = 0.25 and π2 = 0.65. For an 80% power, we have

$$m \text{ (size per group)} = 7.9 \times \frac{0.25(1 - 0.25) + 0.65(1 - 0.65)}{(0.25 - 0.65)^2} = 20.49$$

Hence 21 X 2 = 42 subjects will be needed.

Table I shows the required sample size per group for π1 & π2 in steps of 0.1 for powers of 80% & 90% at two-sided 5%.

Table I
π1 \ π2   0.2        0.3        0.4        0.5        0.6        0.7        0.8        0.9
0.1       199 (266)  62 (82)    32 (42)    20 (26)    14 (17)    10 (12)    7 (9)      5 (6)
0.2       –          294 (392)  82 (109)   39 (52)    23 (30)    15 (19)    10 (13)    7 (9)
0.3       –          –          356 (477)  93 (125)   42 (56)    24 (31)    15 (19)    10 (12)
0.4       –          –          –          388 (519)  97 (130)   42 (56)    23 (30)    14 (17)
0.5       –          –          –          –          388 (519)  93 (125)   39 (52)    20 (26)
0.6       –          –          –          –          –          356 (477)  82 (109)   32 (42)
0.7       –          –          –          –          –          –          294 (392)  62 (82)
0.8       –          –          –          –          –          –          –          199 (266)
Numbers in ( ) are for 90% power
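
As a rough illustration of the proportion-outcome formula, the sketch below recomputes the worked example (π1 = 0.25, π2 = 0.65, 80% power) in Python. The function names `power_constant` and `size_per_group_proportions` are hypothetical. The article does not say where c comes from; the sketch assumes it is the usual squared sum of the standard normal quantiles for a two-sided 5% test and for the chosen power, which agrees with the quoted 7.9 and 10.5 to within rounding.

```python
import math
from statistics import NormalDist


def power_constant(power, alpha=0.05):
    """Assumed origin of c: (z_{1-alpha/2} + z_{power})**2 for a two-sided test."""
    z = NormalDist()  # standard normal distribution
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


def size_per_group_proportions(p1, p2, c=7.9):
    """Simple formula from the text: m = c * [p1(1-p1) + p2(1-p2)] / (p1 - p2)**2."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    return c * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2


# Worked example from the text: 25% vs 65% success, 80% power (c = 7.9).
m = size_per_group_proportions(0.25, 0.65, c=7.9)
per_group = math.ceil(m)  # round up to whole subjects
total = 2 * per_group     # two parallel groups

print(f"c(80%) ~ {power_constant(0.80):.2f}, c(90%) ~ {power_constant(0.90):.2f}")
print(f"m = {m:.2f}, per group = {per_group}, total = {total}")
```

With c = 7.9 this gives m = 20.49, i.e. 21 per group and 42 in total, as in the text. Individual Table I entries can differ slightly from this simple formula (for π1 = 0.1 and π2 = 0.2 it gives about 197.5, against 199 in the table), so the table was presumably produced with a more exact calculation.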
Continuous outcomes

Two independent samples
The primary outcome of interest is the mean difference in an outcome variable between two treatment groups. For example, it is postulated that a good clinical response difference between the active and placebo groups is 0.2 units with an SD of 0.5 units. How many subjects will be required to obtain statistical significance for this clinical difference?

A simple formula, for a two-sided test of 5%, is

$$m \text{ (size per group)} = \frac{2c}{\delta^2} + 1$$

where

$$\delta = \frac{\mu_2 - \mu_1}{\sigma}$$

is the standardised effect size,
μ1 and μ2 are the means of the two treatment groups,
σ is the common standard deviation, and
c = 7.9 for 80% power and 10.5 for 90% power.

From the above example, δ = 0.2/0.5 = 0.4 and for an 80% power, we have m (size per group) = (2 X 7.9)/(0.4 X 0.4) + 1 = 99.75.
Hence 100 X 2 = 200 subjects will be needed.

Table II shows the required sample size per group for values of δ in steps of 0.1 for powers of 80% & 90% at two-sided 5%.

Paired samples
In this case, we have the pre and post mean difference of the two treatment groups and a simple formula is

$$\text{Total sample size} = \frac{c}{\delta^2} + 2$$

Table III shows the total size required for values of δ in steps of 0.1 for powers of 80% and 90% at two-sided 5%.

SAMPLE SIZE SOFTWARE
Many sample size calculation programs are available on the Internet and even on most computers. The main point to note in using such software is to understand the proper instructions for getting the sample size. One could enter some data into a program, and unless an error message is obtained, it is most likely that the magic number generated will be accepted by the user. For this number to be “correct”, the right formula must be used for the right type of design and primary outcome. It is important to note that nearly all the programs provide the sample size for one group and not the total (except for paired designs).

A simple-to-use PC-based sample size program, affordable in cost, is Sampsize version 2.1 by Machin et al(6), but it can only be installed on Windows 98 and below. Software packages with network capabilities include SPSS (www.spss.com), STATA (www.stata.com) and Power & Precision (www.PowerAnalysis.com), to mention a few. Thomas & Krebs(7) gave a review of the various statistical power analysis software, comparing the pros and cons.

CONCLUSIONS
This article has thus far covered the basic discussions for simple sample size calculations with two aims in mind. Firstly, a researcher could calculate his/her own sample size given the types of design and measures of outcome mentioned above; secondly, it provides some knowledge of what information will be needed when coming to see a biostatistician for sample size determination. If one is interested in doing an equivalence/non-inferiority study or a study with survival outcomes, it is recommended that a biostatistician be consulted.

REFERENCES
1. Fayers PM & Machin D. Sample size: how many patients are necessary? British Journal of Cancer 1995; 72:1-9.
2. Muller KE & Benignus VA. Increasing scientific power with statistical power. Neurotoxicology & Teratology, 1992; 14:211-9.
3. Schall R, Luus H & Erasmus T. Type of comparison, introduction to clinical trials, editors Karlberg J & Tsang K, 1998; pp:258-66.
4. Chan YH. Study design considerations: study configurations, introduction to clinical trials, editors Karlberg J & Tsang K, 1998; pp:249-57.
5. Thomas L & Juanes F. The importance of statistical power analysis: an example from animal behaviour. Animal Behaviour, 1996; 52:856-9.
6. Machin D, Campbell M, Fayers P & Pinol A. Sample size tables for clinical studies, 2nd edition. Blackwell Science, 1997.
7. Thomas L & Krebs CJ. A review of statistical power analysis software. Bulletin of the Ecological Society of America 1997, 78(2): 126-39.

Table II
δ            0.1    0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9
80% power    1,571  394  176  100  64   45   33   26   21
90% power    2,103  527  235  133  86   60   44   34   27
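
The two-independent-samples calculation behind Table II can be sketched in a few lines of Python. This is only an illustration of the simple formula m = 2c/δ² + 1 with the rounded constant c = 7.9 quoted in the text; the function name `size_per_group_means` is hypothetical.

```python
import math


def size_per_group_means(mean_diff, sd, c=7.9):
    """Simple formula from the text: m = 2c / delta**2 + 1, with delta = mean_diff / sd."""
    if sd <= 0 or mean_diff == 0:
        raise ValueError("sd must be positive and mean_diff non-zero")
    delta = mean_diff / sd  # standardised effect size
    return 2 * c / delta ** 2 + 1


# Worked example from the text: difference of 0.2 units, SD of 0.5 units, 80% power.
m_means = size_per_group_means(mean_diff=0.2, sd=0.5, c=7.9)
per_group_means = math.ceil(m_means)  # round up to whole subjects
total_means = 2 * per_group_means     # two treatment groups

print(f"m = {m_means:.2f}, per group = {per_group_means}, total = {total_means}")
```

For the worked example this gives m = 99.75, so 100 subjects per group and 200 in total. Because c = 7.9 is a rounded constant, results for small δ can differ from the Table II entries by a few subjects.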

Table III
δ            0.1    0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9
80% power    792    200  90   52   34   24   19   15   12
90% power    1,052  265  119  68   44   32   24   19   15
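
A similar sketch for paired samples, reading the paired-samples formula in the text as total sample size = c/δ² + 2 (the layout of that formula is damaged in this copy, so this reading is an assumption). The function name `total_size_paired` is hypothetical.

```python
import math


def total_size_paired(delta, c=7.9):
    """Paired-samples formula as read from the text: total = c / delta**2 + 2."""
    if delta == 0:
        raise ValueError("delta must be non-zero")
    return c / delta ** 2 + 2


# Two standardised effect sizes, both at 80% power (c = 7.9).
for delta in (0.4, 0.5):
    n_total = total_size_paired(delta)
    print(f"delta = {delta}: total subjects (rounded up) = {math.ceil(n_total)}")
```

Rounded up, these give 52 and 34 subjects, which agree with the 80% power entries of Table III for δ = 0.4 and δ = 0.5. Note that this is a total, not a per-group size.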
