
Statistics

Sample size calculation in clinical research


Priya Ranganathan, Vishal Deo¹, C. S. Pramesh²

Department of Anaesthesiology, Tata Memorial Centre, Homi Bhabha National Institute, ²Department of Surgical Oncology and Administration, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, Maharashtra, ¹National Institute for Research in Digital Health and Data Science, Indian Council of Medical Research, New Delhi, India

Abstract Calculation of sample size is an essential part of research study design since it affects the reliability and
feasibility of the research study. In this article, we look at the principles of sample size calculation for
different types of research studies.

Keywords: Epidemiologic methods, research design, sample size

Address for correspondence: Dr. Priya Ranganathan, Department of Anaesthesiology, Tata Memorial Centre, Homi Bhabha National Institute,
Mumbai ‑ 400 012, Maharashtra, India.
E‑mail: [email protected]
Received: 30‑05‑24, Accepted: 04‑06‑24, Published: 04-07-24.

This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution‑NonCommercial‑ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non‑commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.

For reprints contact: [email protected]

Website: www.picronline.org
DOI: 10.4103/picr.picr_100_24

How to cite this article: Ranganathan P, Deo V, Pramesh CS. Sample size calculation in clinical research. Perspect Clin Res 2024;15:155-9.

© 2024 Perspectives in Clinical Research | Published by Wolters Kluwer - Medknow
Perspectives in Clinical Research | Volume 15 | Issue 3 | July-September 2024

INTRODUCTION

In previous articles, we have looked at a variety of research study designs.[1‑5] Once the research design, objectives, and endpoints have been finalized, the researcher needs to calculate the sample size for the study. The sample size is the number of participants or other units required in a study to be able to answer the research question reliably. The sample size drives the budget of the study, allows the researcher to determine the feasibility of the study, and could lead to changes in the proposed study design, methodology, or outcome.

WHY IS SAMPLE SIZE DETERMINATION IMPORTANT?

The sample size of a study determines the reliability of the study results. The internal validity of a study, especially in terms of precision or power, depends largely on the sample size.[6] The sample size also impacts the cost and duration of the study. A study with an adequate sample size is sufficiently powered to detect a treatment effect. As discussed in the next section, a study that is too small may not be able to detect an effect reliably, as the smaller sample size leads to a wider confidence interval of the estimated effect. This, in turn, increases the chance that an observed effect is not statistically significant, and the study may fail to detect an existing important treatment effect. On the other hand, a study that is too large is overpowered and may detect effects that are statistically significant but are too small to be of clinical relevance. Even if the real effect size is large and clinically relevant, opting for a larger sample size than required may be deemed unethical because it unnecessarily subjects more participants to an inferior treatment in the control arm. In addition, this may lead to substantial wastage of resources and affect the feasibility of the study. Thus, it is important that the sample size is calculated properly using relevant methods to power the study adequately.

Irrespective of the type of study, the determination of sample size requires information on certain parameters associated with the outcome of interest. Since many
of these parameters may be estimates (and not true measurements), the researcher should note that sample size calculations are approximations and not absolute measures. In the subsequent sections, we will discuss sample size calculation for some commonly used research study designs, such as descriptive and comparative or analytical studies (clinical trials, cross‑sectional studies, case–control studies, and cohort studies).

KEY CONCEPTS IN SAMPLE SIZE CALCULATION

Population versus sample
Studies are based on samples that are considered to be representative of the population. Since the sample is a subset of the population, the sample‑based estimates are likely to differ from the true population values. Moreover, since samples are selected randomly from the population, if the same study is conducted repeatedly, we will get different estimates of the true but unknown population measure.[7] Thus, we may practically visualize a range of estimates being produced if the same study is repeated under the same setting. With a valid study design and an adequate sample size, the width of this range is minimized at a given level of confidence (discussed later in this section). Therefore, the sample result is always reported as a summary statistic with a range of values that represent the possible interval containing the true population values, and the width of the interval depends on the level of confidence determined by how certain the researcher wants to be about the estimate.

Size of the population
Sample size estimates require knowledge of the size of the population. Most often, this is unknown and is taken as infinite (for very large populations). Occasionally, the researcher may have a more definite idea of the size of the population and may be able to provide an estimate. For example, Alghamdi et al. looked at work‑related stress and burnout levels among Saudi commercial pilots.[8] Since this was a defined and small population, they were able to estimate the total number of pilots (the population size) to be approximately 2000.

Hypothesis
Every research study starts with a hypothesis, which is a statement of belief that the researcher wishes to prove. Since this statement forms the basis of the research, it is often called a research hypothesis.[9] While constructing a statistical test within a study to scientifically prove or disprove this belief, a null hypothesis is defined, suggesting that the effect being claimed does not exist. This is complemented by an alternate hypothesis which is in agreement with the research hypothesis. The null and alternate hypotheses are statistical hypotheses. Based on the empirical evidence produced by the study, we either accept or reject the null hypothesis. Rejection of the null hypothesis results in acceptance of the alternate hypothesis, suggesting the presence of a statistically significant effect.

For example, for comparative studies, the null hypothesis may be stated as “there is no difference between groups” and an alternate hypothesis as “there is a difference between groups.” For single‑arm studies such as prevalence studies, where we want to detect or estimate the prevalence of a disease or condition, the prevalence is taken as zero in the null hypothesis and is taken as the estimate, say “x” percent (x > 0), in the alternate hypothesis based on the sample estimate. To be precise, the research hypothesis needs to be laid out clearly at the start of the study as it determines the primary objective and the basis for sample size calculation.

Summary statistic
The study outcome is reported as a single statistic that serves as an estimate of the population measure of interest and is calculated as a function of sample observations. The choice of statistic depends on the population measure to be estimated, which further depends on the objectives of the study. This could be a mean, a proportion, an odds ratio, a risk ratio, a hazard ratio, etc. The calculation of sample size depends on the type of statistic that will be used.

Expected event rates
Both single‑arm and comparative studies require knowledge of the baseline event rate (prevalence in single‑arm studies and control group event rate in comparative studies). In addition, for comparative studies, the researcher needs to know the event rate in the comparator group (the effect size). This can be estimated by scoping the published literature, by reviewing clinical data, by carrying out pilot studies, or by “guesstimation,” but the last only if none of the other options is feasible.

Variance
Variance is a concept used for numerical data and refers to the degree of spread of data within the population. The larger the spread (the more heterogeneous the data), the higher the sample size that will be needed to estimate outcomes. Since it is not possible to directly measure the variance in the population, we often assume that the sample variance represents the population variance.

Type 1 error/Level of confidence
This represents the certainty with which the estimated intervals of the summary statistic contain the true
value of the population measure being estimated. It is conventionally set at 5% (which means that about 95 out of every 100 times we conduct the study under the same setting, the resultant intervals of the estimate will contain the true population value).[7,9]

SAMPLE SIZE FOR SINGLE GROUP STUDIES (DESCRIPTIVE OR CROSS‑SECTIONAL SURVEYS OR PREVALENCE STUDIES)

These studies aim to measure the prevalence of a particular factor in the population. The inputs needed for sample size calculation are:
a. The predicted or expected prevalence in the population ‑ Sample sizes are very high for extremely low (<10%) and extremely high (>90%) prevalence rates. Between prevalence rates of 10 and 90%, the sample size is maximum for prevalence rates of 50%. Sometimes, it is suggested that an arbitrary value of 50% may be used if the prevalence is completely unknown. However, this approach has its drawbacks
b. The degree of precision ‑ This represents the margin of error with which we want to estimate the prevalence and determines the width of the confidence intervals of the estimated prevalence. For example, if we want to estimate a prevalence of 25% with a 2% margin of error, it means that we want to estimate the prevalence to be between 23% and 27%
c. The level of confidence
d. Population size – It can be either known (a finite number) or be assumed to be unknown (infinite). The mathematical formula for calculating sample size is different for the two scenarios. In general, it is preferred to assume the population to be infinite, unless the target population is small
e. Degree of attrition or nonresponse.

Readers can refer to other papers for further information on this topic.[10,11]

EXAMPLES IN PUBLISHED LITERATURE

Al‑Ramahi carried out a cross‑sectional study of adherence to medications among Palestinian hypertensive patients.[12] Since there was no available literature showing the prevalence of adherence among the Palestinian community, a 50% expected prevalence was used. The calculated sample size was 384, and to account for attrition, 500 patients were included in the study. In the study by Alghamdi et al., the sample size of 311 was determined considering a 5% margin of error, 95% confidence level, and a 60.3% burnout prevalence rate.[8] Zaidi et al. surveyed the awareness and practice of self‑medication among the general public in Jeddah and Makkah.[13] At a 95% confidence level with an estimated 50% response distribution, a margin of error ± 5%, and accounting for a 5% nonresponse rate, the estimated sample size was 404.

SAMPLE SIZE FOR CLINICAL TRIALS

A typical two‑arm phase 3 clinical trial with a superiority hypothesis aims to prove that a treatment is superior to a control. The inputs needed for sample size calculation are:[14]
a. Type 1 error: This refers to the probability of falsely rejecting the null hypothesis. For a superiority study, it refers to the probability of finding a difference by chance where a true difference does not exist
b. Type 2 error: This refers to the probability of falsely accepting the null hypothesis. For a superiority study, it refers to the probability of missing a true treatment effect. The power of the study is the converse of the type 2 error and refers to the ability to find the existing difference. Conventionally, the type 2 error is set as 10% or 20%, which means that the power of the study is 90% or 80%
c. Expected effect size: Refers to the minimum difference in effect between the study arms that would be considered significant. The required sample size is inversely related to the effect size. Very small effect sizes require large studies and may not be feasible or clinically meaningful. Very large effect sizes require small studies but may not be scientifically plausible and will result in negative studies
d. Variance of the outcome measure
e. Allocation ratio between control and experimental arms: Most often, participants are randomized in a 1:1 ratio between experimental and control arms. However, sometimes, investigators may choose to allot additional patients to the experimental arm or to the control arm, which will impact sample size.

EXAMPLES IN PUBLISHED LITERATURE

Fernandez‑Lopez compared flexible and conventional treatments for gestational diabetes mellitus.[15] They found that the variance in the weight of newborn babies of mothers with gestational diabetes mellitus was 1.8 kg. To detect a difference in mean weight of 700 g between newborns in the two groups, at 95% confidence and 80% power, it would be necessary to have 82 patients per group. Assuming 15% attrition, 96 patients were needed per group.

The DREAMS study hypothesized that preoperative dexamethasone would reduce postoperative vomiting in
patients undergoing elective bowel surgery.[16] The initial sample size was calculated at 80% power and 5% type I error, based on a 24% proportional reduction in the number of participants experiencing vomiting in the first 24 h after surgery (corresponding to a reduction from 37% to 28% based on an earlier large trial). After accounting for a 10% loss to follow‑up, the sample size was determined to be 950 patients.

SAMPLE SIZE FOR COMPARATIVE COHORT STUDIES

Cohort studies include groups of exposed and nonexposed individuals who are followed up to determine outcomes. The inputs needed for sample size calculation are similar to those of a clinical trial, and include the power of the study, the type 1 error, the probability of outcomes in the exposed and unexposed groups, and the ratio of exposed to unexposed participants.[17]

SAMPLE SIZE FOR COMPARATIVE CASE–CONTROL STUDIES

Case–control studies include individuals with the outcome (cases) and without the outcome (controls) who are then assessed for exposure status. The sample size depends on the power of the study, the type 1 error, the ratio of cases to controls, and the expected odds of exposure in the cases versus the controls.[18]

ADDITIONAL ASPECTS

• The calculated sample size for comparative studies usually refers to the sample required per arm of the study.
• Guidelines for reporting research studies mandate that all aspects of sample size calculation should be reported in the research manuscript. This allows readers and reviewers to assess whether there has been any bias in the conduct of the study.
• In some cases, sample size calculations may be more complex. The sample size may need to be adjusted for interim analyses if the overall type 1 error is to be maintained at the desired level. Factorial designs, cluster randomized designs, equivalence and noninferiority trials, and early phase trials, all require additional inputs for sample size calculation.[19] Similarly, sample size calculation may be based on study outcomes such as correlation, association, agreement, sensitivity, or specificity.[20,21]
• A variety of online tools are available for the calculation of sample size. However, readers need to be aware that there are many nuances to sample size calculation, and it is best to take expert input from a statistician rather than rely completely on these tools.

Financial support and sponsorship
Nil.

Conflicts of interest
There are no conflicts of interest.

REFERENCES

1. Aggarwal R, Ranganathan P. Study designs: Part 2 – Descriptive studies. Perspect Clin Res 2019;10:34‑6.
2. Ranganathan P, Aggarwal R. Study designs: Part 3 – Analytical observational studies. Perspect Clin Res 2019;10:91‑4.
3. Aggarwal R, Ranganathan P. Study designs: Part 4 – Interventional studies. Perspect Clin Res 2019;10:137‑9.
4. Ranganathan P, Pramesh CS, Aggarwal R. Equivalence trials. Perspect Clin Res 2022;13:114‑7.
5. Ranganathan P, Pramesh CS, Aggarwal R. Non‑inferiority trials. Perspect Clin Res 2022;13:54‑7.
6. Shih WJ, Aisner J. Statistical Design, Monitoring, and Analysis of Clinical Trials: Principles and Methods. Boca Raton: Chapman and Hall/CRC; 2021.
7. Ranganathan P, Pramesh CS, Buyse M. Common pitfalls in statistical analysis: “P” values, statistical significance and confidence intervals. Perspect Clin Res 2015;6:116‑7.
8. Alghamdi AA, Alghamdi AH. Determining the best approach: Comparing and contrasting the impact of different coping strategies on work‑related stress and burnout among Saudi commercial pilots. Cureus 2023;15:e41948.
9. Ranganathan P, Cs P. An introduction to statistics: Understanding hypothesis testing and statistical errors. Indian J Crit Care Med 2019;23:S230‑1.
10. Naing L, Nordin RB, Abdul Rahman H, Naing YT. Sample size calculation for prevalence studies using Scalex and ScalaR calculators. BMC Med Res Methodol 2022;22:209.
11. Khaled Fahim N, Negida A. Sample size calculation guide – Part 1: How to calculate the sample size based on the prevalence rate. Adv J Emerg Med 2018;2:e50.
12. Al‑Ramahi R. Adherence to medications and associated factors: A cross‑sectional study among Palestinian hypertensive patients. J Epidemiol Glob Health 2015;5:125‑32.
13. Zaidi SF, Hakami AY, Khan MA, Khalid AA, Haneef AK, Natto SS, et al. The awareness and practice of self‑medication among the general public in Jeddah and Makkah. Cureus 2023;15:e39706.
14. Negida A, Fahim NK, Negida Y, Ahmed H. Sample size calculation guide – Part 5: How to calculate the sample size for a superiority clinical trial. Adv J Emerg Med 2019;3:e49.
15. Fernández‑López M, Blanco‑Carnero JE, Guardia‑Baena JM, de Paco‑Matallana C, Aragón‑Alonso A, Hernández‑Martínez AM. Flexible treatment of gestational diabetes mellitus adjusted according to intrauterine fetal growth versus treatment according to strict maternal glycemic parameters: A randomized clinical trial. BMJ Open Diabetes Res Care 2022;10:e002915.
16. DREAMS Trial Collaborators and West Midlands Research Collaborative. Dexamethasone versus standard treatment for postoperative nausea and vomiting in gastrointestinal surgery: Randomised controlled trial (DREAMS trial). BMJ 2017;357:j1455.
17. Khaled Fahim N, Negida A. Sample size calculation guide – Part 2: How to calculate the sample size for an independent cohort study. Adv J Emerg Med 2019;3:e12.
18. Fahim NK, Negida A, Fahim AK. Sample size calculation guide – Part 3: How to calculate the sample size for an independent case‑control study. Adv J Emerg Med 2019;3:e20.
19. Negida A. Sample size calculation guide – Part 6: How to calculate the sample size for a non‑inferiority or an equivalence clinical trial. Adv J Emerg Med 2020;4:e15.
20. Negida A, Fahim NK, Negida Y. Sample size calculation guide – Part 4: How to calculate the sample size for a diagnostic test accuracy study based on sensitivity, specificity, and the area under the ROC curve. Adv J Emerg Med 2019;3:e33.
21. Negida A. Sample size calculation guide – Part 7: How to calculate the sample size based on a correlation. Adv J Emerg Med 2020;4:e34.
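Worked example for single‑group (prevalence) studies. The inputs listed for single‑group studies (expected prevalence, margin of error, level of confidence, population size, and nonresponse) can be combined in the widely used normal‑approximation formula n = z²p(1 − p)/d², with a finite population correction when the population size is known. The sketch below is only an illustration under stated assumptions: the function name `prevalence_sample_size` and its parameters are ours, the nonresponse adjustment is assumed to take the form n/(1 − rate), and the article does not state which formula or rounding rule each cited study used.

```python
import math
from statistics import NormalDist


def prevalence_sample_size(prevalence, margin, confidence=0.95,
                           population=None, nonresponse=0.0):
    """Approximate sample size for estimating a proportion (illustrative helper)."""
    # Two-sided standard normal quantile for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size when the population is treated as infinite.
    n = z ** 2 * prevalence * (1 - prevalence) / margin ** 2
    # Finite population correction, applied only when a population size is given.
    if population is not None:
        n = n / (1 + (n - 1) / population)
    # Inflate for expected nonresponse or attrition (assumed form: n / (1 - rate)).
    return n / (1 - nonresponse)


# 50% prevalence, 5% margin, 95% confidence
print(round(prevalence_sample_size(0.5, 0.05)))                     # 384
# 60.3% prevalence, 5% margin, population of about 2000
print(round(prevalence_sample_size(0.603, 0.05, population=2000)))  # 311
# 50% prevalence, 5% margin, 5% nonresponse
print(round(prevalence_sample_size(0.5, 0.05, nonresponse=0.05)))   # 404
# 25% prevalence estimated to within 2 percentage points, rounded up
print(math.ceil(prevalence_sample_size(0.25, 0.02)))                # 1801
```

Under these assumptions, rounding to the nearest whole number gives the same figures as those quoted for the three published examples (384, 311, and 404). Rounding up, as in the last line, is the more conservative choice when planning a study.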

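Worked example for two‑arm superiority trials. The inputs listed for clinical trials (type 1 error, power, effect size, and variance) enter the standard normal‑approximation formulas for comparing two means or two proportions with 1:1 allocation. The sketch below is a simplified illustration, not the method of any cited trial: the function names are ours, unequal allocation, continuity corrections, and attrition are left out, and the choice between a one‑sided and a two‑sided test is exposed as a parameter because it changes the result noticeably.

```python
import math
from statistics import NormalDist


def _z(probability):
    """Standard normal quantile."""
    return NormalDist().inv_cdf(probability)


def n_per_arm_means(sd, difference, alpha=0.05, power=0.80, two_sided=True):
    """Per-arm sample size to detect a difference between two means (1:1 allocation)."""
    z_alpha = _z(1 - alpha / 2) if two_sided else _z(1 - alpha)
    z_beta = _z(power)
    return 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / difference ** 2


def n_per_arm_proportions(p_control, p_treatment, alpha=0.05, power=0.80):
    """Per-arm sample size to detect a difference between two proportions (two-sided)."""
    z_alpha = _z(1 - alpha / 2)
    z_beta = _z(power)
    variance_sum = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return (z_alpha + z_beta) ** 2 * variance_sum / (p_control - p_treatment) ** 2


# Birth weight: spread of 1.8 kg treated as a standard deviation, difference of 0.7 kg.
print(math.ceil(n_per_arm_means(1.8, 0.7, two_sided=False)))  # 82
print(math.ceil(n_per_arm_means(1.8, 0.7, two_sided=True)))   # 104

# Vomiting rates of 37% versus 28%, 5% two-sided type 1 error, 80% power.
print(math.ceil(n_per_arm_proportions(0.37, 0.28)))           # 422
```

With these inputs, the one‑sided calculation gives 82 per arm and the two‑sided calculation gives 104, which shows how strongly the stated assumptions drive the result. For the 37% versus 28% comparison, the simple formula gives about 422 per arm before any allowance for loss to follow‑up. Published trials may use different variants of these formulas.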

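Illustration of the level of confidence and of precision. The key concepts state that repeating the same study yields a range of estimates, and that a larger sample narrows the interval at a given level of confidence. The small simulation below sketches this for a sample mean. All values are hypothetical (a true mean of 50, a standard deviation of 10, and sample sizes of 30 and 120), the standard deviation is treated as known for simplicity, and the function name `simulate_coverage` is ours.

```python
import random
from statistics import NormalDist, mean


def simulate_coverage(true_mean=50.0, sd=10.0, n=30, confidence=0.95,
                      repeats=2000, seed=1):
    """Repeat a hypothetical study many times; return (coverage, half_width)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Half-width of the interval around each sample mean (SD treated as known).
    half_width = z * sd / n ** 0.5
    hits = 0
    for _ in range(repeats):
        sample_mean = mean(rng.gauss(true_mean, sd) for _ in range(n))
        # The interval sample_mean +/- half_width contains the true mean
        # exactly when the two are within half_width of each other.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / repeats, half_width


coverage_30, width_30 = simulate_coverage(n=30)
coverage_120, width_120 = simulate_coverage(n=120)
print(f"n=30:  coverage={coverage_30:.3f}, half-width={width_30:.2f}")
print(f"n=120: coverage={coverage_120:.3f}, half-width={width_120:.2f}")
```

In both runs, the proportion of intervals containing the true mean should be close to 95%, while the interval for n = 120 is half as wide as that for n = 30, since the half‑width shrinks with the square root of the sample size.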