Sample Size in Medical Research Ethics

This document discusses sample size and statistical power in medical research. It explains that studies with too small a sample will lack the power to detect clinically important effects and may therefore be unethical. It uses an example to show how statistical power increases with sample size, and how sample size can be calculated prospectively to give adequate power to detect a clinically meaningful effect if one is present.

1336 BRITISH MEDICAL JOURNAL VOLUME 281 15 NOVEMBER 1980

Medicine and Mathematics

Statistics and ethics in medical research

III How large a sample?
DOUGLAS G ALTMAN

Division of Computing and Statistics, Clinical Research Centre, Harrow, Middx HA1 3UJ
DOUGLAS G ALTMAN, BSC, medical statistician (member of scientific staff)

Whatever type of statistical design is used for a study, the problem of sample size must be faced. This aspect, which causes considerable difficulty for researchers, is perhaps the most common reason for consulting a statistician. There are also, however, many who give little thought to sample size, choosing the most convenient number (20, 50, 100, etc) or time period (one month, one year, etc) for their study. They, and those who approve such studies, should realise that there are important statistical and ethical implications in the choice of sample size for a study.

A study with an overlarge sample may be deemed unethical through the unnecessary involvement of extra subjects and the correspondingly increased costs. Such studies are probably rare. On the other hand, a study with a sample that is too small will be unable to detect clinically important effects. Such a study may thus be scientifically useless, and hence unethical in its use of subjects and other resources. Studies that are too small are extremely common, to judge by surveys of published research.¹ ² The ethical implications, however, have only rarely been recognised.³ ⁴

The approach to the calculation of sample size will depend on the complexity of the study design. I will discuss it here in the context of trying to ascertain whether a new treatment is better than an existing one, since it will help if the ideas are illustrated by one of the most common types of research.

Significance tests and power

Despite their widespread use in medical research significance tests are often imperfectly understood. In particular, few medical researchers know what the power of a test is. This is perhaps because most simple books and courses on medical statistics do not discuss it in any detail, even though it is a concept fundamental to understanding significance tests. Some of the general implications, however, are well appreciated, such as the awareness that the more subjects there are, the greater the likelihood of statistical significance.

Formally, the power of a significance test is a measure of how likely that test is to produce a statistically significant result for a population difference of any given magnitude. Practically, it indicates the ability to detect a true difference of clinical importance. The power may be calculated retrospectively to see how much chance a completed study had of detecting (as significant) a clinically relevant difference. More importantly, it may be used prospectively to calculate a suitable sample size. If the smallest difference of clinical relevance can be specified we can calculate the sample size necessary to have a high probability of obtaining a statistically significant result (that is, high power) if that is the true difference. For a continuous variable, such as weight or blood pressure, it is also necessary to have a measure of the usual amount of variability. A simple example will, I hope, illustrate the relation between the sample size and the power of a test.

FIG 1: Relation between sample size and power to detect as significant (p<0.05 or p<0.01) a difference of 0.5 cm when standard deviation is 2 cm. [Graph not reproduced: power, on a scale of 0 to 1.0, plotted against total study size from 0 to 1200.]

AN EXAMPLE

Suppose we wish to carry out a milk-feeding trial on 5-year-old children when a random half of the children are given extra milk every day for a year. We know that at this age children's height gain in 12 months has a mean of about 6 cm and a standard deviation of 2 cm. We consider that an extra increase in height in the milk group of 0.5 cm on average will be an important difference, and we want a high probability of detecting a true difference at least that large.

Figure 1 shows the power of the test for a true difference of
0.5 cm. The increase in power with increasing sample size is clearly seen, as is the relation with the significance level. For any given sample size the probability of obtaining a result significant at either the 5% or 1% level, given a true difference in growth of 0.5 cm, can be read off. Power of 80-90% is recommended; fig 1 shows that to achieve an 85% chance of detecting the specified difference of 0.5 cm significant at the 1% level, we would need a total of about 840 children.

If we are told that we can have at most 500 children in all, what will the power be now? Figure 1 shows that the power drops from 85% to 60%. We are now more than twice as likely to miss a true difference of 0.5 cm at the 1% level, although the power is still about 80% for a test at the 5% level of significance. Alternatively, and not shown by fig 1, this size of study achieves the same power as the larger one for a difference of 0.65 cm instead of 0.5 cm. Whether or not this is thought sufficient will depend on how far one is prepared to alter one's criteria of acceptability for the sake of expediency. Although they are to some extent arbitrary, it is generally advisable to stick closely to the prestated criteria.

FIG 2: Nomogram for a two-sample comparison of a continuous variable, relating power, total study size, the standardised difference, and significance level. [Nomogram not reproduced: scales for standardised difference (0.0 to 1.2), total study size (N), and power (0.05 to 0.995), with lines for significance levels 0.05 and 0.01.]

A NEW SIMPLE METHOD

The formula on which these calculations are based is not particularly simple. Graphs are preferable, but because so many variables are concerned, a large set of graphs like fig 1 would be necessary to calculate sample size for any problem. Greater flexibility, however, is achieved by the nomogram shown in fig 2. This makes use of the standardised difference, which is equal to the postulated true difference (usually the smallest medically relevant difference) divided by the estimated standard deviation. So in the previous example the standardised difference of interest was 0.5/2.0=0.25. The nomogram is appropriate for calculating power for a two-sample comparison of a continuous measurement with the same number of subjects in each
group. The only restriction is the common requirement that the variable that is being measured is roughly Normally distributed.

The nomogram gives the relation between the standardised difference, the total study size, the power, and the level of significance. Given the significance level (5% or 1%),* by joining with a straight line the specific values for two of the variables the required value for the other variable can easily be read off the third scale. By using this nomogram, it is both simple and quick to assess the effect on the power of varying the sample size, the effect on the required sample size of changing the difference of importance, and so on. It is easy to confirm the earlier calculations for the milk-feeding trial.

An estimate of the standard deviation should usually be available, either from previous studies or from a pilot study. Note that the nomogram is not strictly appropriate for retrospective calculations. Although it will be reasonably close for samples larger than 100, for smaller samples it will tend to overestimate the power.

QUALITATIVE DATA

For many studies the outcome measure is not continuous but qualitative, for example, where one is looking for the presence or absence of some condition or comparing survival rates. Peto et al⁵ have discussed calculating sample size for such studies, and they emphasise the problem of getting enough subjects when either the condition is rare or the expected improvement is not large. For example, about 1600 subjects would be needed to have a power of 90% of detecting (at p<0.05) a reduction in mortality from 15% to 10%. Although the sample size will in general need to be much larger for studies including qualitative outcome measures, the logic behind the calculations is exactly the same as with continuous data, except that a prior estimate of the standard deviation is not needed. Several authors have published graphs for general use.⁶⁻⁸

OTHER TYPES OF STUDY

Sequential designs are similarly amenable to the incorporation of considerations of power at the design stage. Indeed, it is probably much more common here than for ordinary randomised studies. For these, and for more complicated designs, it may be particularly helpful to enlist the aid of a statistician when thinking about sample size.

Conclusions

The idea behind using the concept of power to calculate sample size is to maximise, so far as practicable, the chances of finding a real and important effect if it is there, and to enable us to be reasonably sure that a negative finding is strong grounds for believing that there is no important difference. The effect of the approach outlined above is to make clinical importance and statistical significance coincide, thus avoiding a common problem of interpretation.

Before embarking on a study the appropriate sample size should be calculated. If not enough subjects are available then the study should not be carried out or some additional source of subjects should be found.⁵ (It should also be borne in mind that expected accession rates tend to be over-optimistic.) The calculations affecting sample size and power should be reported when publishing results. A study² of 172 randomised controlled trials published in the New England Journal of Medicine and the Lancet from 1973 to 1976 found that none mentioned a prior estimate of the required sample size, and none specified a clinically relevant difference that might allow calculation of the power of their study. Obviously in most of these studies such calculations were not done.

It is surprising and worrying that in such an ethically sensitive area as clinical trials so little attention has been given to an aspect that can have major ethical consequences. If the sample size is too small there is an increased risk of a false-negative finding. A recent survey¹ of 71 supposedly negative trials found that two-thirds of them had at least a 10% risk of missing a true improvement of 50%. In only one of the 71 studies was power mentioned as having been considered before carrying out the study. It is surely ethically indefensible to carry out a study with only a small chance of detecting a treatment effect unless it is a massive one, and with a consequently high probability of failure to detect an important therapeutic effect.

*As in the example these are two-tailed significance levels.

This is the third in a series of eight articles.
No reprints will be available from the authors.

References

1 Freiman JA, Chalmers TC, Smith H, Kuebler RR. The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial. N Engl J Med 1978;299:690-4.
2 Ambroz A, Chalmers TC, Smith H, Schroeder B, Freiman JA, Shareck EP. Deficiencies of randomized control trials. Clinical Research 1978;26:280A.
3 Newell DJ. Type II errors and ethics. Br Med J 1978;iv:1789.
4 Anonymous. Controlled trials: planned deception? Lancet 1979;i:534-5.
5 Peto R, Pike MC, Armitage P, et al. Design and analysis of randomized clinical trials requiring prolonged observation of each patient. I Introduction and design. Br J Cancer 1976;34:585-612.
6 Aleong J, Bartlett DE. Improved graphs for calculating sample sizes when comparing two independent binomial distributions. Biometrics 1979;35:875-81.
7 Boag JW, Haybittle JL, Fowler JF, Emery EW. The number of patients required in a clinical trial. Br J Radiol 1971;44:122-5.
8 Mould RF. Clinical trial design in cancer. Clin Radiol 1979;30:371-81.

A right-handed 46-year-old stonemason developed a right axillary vein thrombosis. No haematological, biochemical, or physical abnormalities were found to account for his thrombosis, and he has recovered well taking anticoagulants. Might his condition have been related to his occupation?

It might have been, especially if he had had a spell off work. Axillary vein thrombosis commonly results from unaccustomed use of the arm, including upward movements that compress the vein between clavicle and first rib.

What are the health hazards of taking small babies to public swimming pools?

Mother and baby bathing is a rewarding experience for both parent and child. It aids physical development of the baby and augments the psychological "bonding." Many public bathing pools have special mother (father) and baby bathing sessions, and those interested are advised to try to use this facility. There is the safety advantage of a poolside attendant being present. The best age to start for the baby is from 9 to 12 months, although some enthusiasts may start earlier. Much depends on the development of the baby and the confidence of the parent. The pool should be reasonably warm, between 80-85°F (26-30°C) (most public baths are 70-75°F (21-24°C)), and it is most important to let the baby gain confidence by holding him and only gradually allowing independence in the water. It is preferable to have only parents and babies in the pool, as excited older children shouting and splashing may be frightening. It is unwise to take a baby bathing until at least 1-1 hours after his last meal. There is no more risk of contracting any infection than in any other social activity, and provided the parent is not over-enthusiastic the chance of an accident is negligible. Small babies take to bathing readily, and parents who have used the special sessions confirm that parent and baby bathing is well worth while.
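
Returning to the milk-feeding trial in the article: the figures quoted there (about 840 children for 85% power at the 1% level, and about 60% and 80% power at the 1% and 5% levels with 500 children) can be cross-checked numerically. The following is a minimal Python sketch, not the formula behind fig 1 or the nomogram, which the article does not give. It assumes the usual large-sample Normal approximation for a two-tailed comparison of two equal-sized groups and ignores the negligible chance of a significant result in the wrong direction. The function names power_two_sample and total_sample_size are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard Normal distribution: mean 0, SD 1


def power_two_sample(std_diff, total_n, alpha=0.05):
    """Approximate power of a two-tailed, two-sample comparison of means.

    std_diff is the standardised difference (true difference / SD) and
    total_n is the total study size, split equally between two groups.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # The standard error of the difference in means is 2 * SD / sqrt(total_n),
    # so the expected test statistic is std_diff * sqrt(total_n) / 2.
    return _STD_NORMAL.cdf(std_diff * sqrt(total_n) / 2 - z_alpha)


def total_sample_size(std_diff, power=0.85, alpha=0.05):
    """Approximate total study size for the requested power, rounded up."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(4 * (z_alpha + z_power) ** 2 / std_diff ** 2)


if __name__ == "__main__":
    d = 0.5 / 2.0  # milk-feeding trial: 0.5 cm difference, SD 2 cm
    print("total size for 85% power at 1%:", total_sample_size(d, 0.85, 0.01))
    print("power with 500 children at 1%:", round(power_two_sample(d, 500, 0.01), 2))
    print("power with 500 children at 5%:", round(power_two_sample(d, 500, 0.05), 2))
    print("power for 0.65 cm, 500 children, 1%:",
          round(power_two_sample(0.65 / 2.0, 500, 0.01), 2))
```

Under these assumptions the sketch gives values close to those the article reads from fig 1. Small discrepancies are to be expected, since the article's figures are read from a graph and rounded.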
