Primer: Methods Primers
Mendelian randomization
Eleanor Sanderson 1,2 ✉, M. Maria Glymour3, Michael V. Holmes 1,4,5, Hyunseung Kang6,
Jean Morrison7, Marcus R. Munafò1,8,9, Tom Palmer 1,2, C. Mary Schooling 10,11,
Chris Wallace12,13, Qingyuan Zhao14 and George Davey Smith 1,2,9
Abstract | Mendelian randomization (MR) is a term that applies to the use of genetic variation
to address causal questions about how modifiable exposures influence different outcomes. The
principles of MR are based on Mendel’s laws of inheritance and instrumental variable estimation
methods, which enable the inference of causal effects in the presence of unobserved confounding.
In this Primer, we outline the principles of MR, the instrumental variable conditions underlying
MR estimation and some of the methods used for estimation. We go on to discuss how the
assumptions underlying an MR study can be assessed and describe methods of estimation that
are robust to certain violations of these assumptions. We give examples of a range of studies in
which MR has been applied, the limitations of current methods of analysis and the outlook for MR
in the future. The differences between the assumptions required for MR analysis and other forms
of epidemiological studies mean that MR can be used as part of a triangulation across multiple
sources of evidence for causal inference.
✉e-mail: eleanor.sanderson@bristol.ac.uk
https://doi.org/10.1038/s43586-021-00092-5

Instrumental variable (IV)
A variable associated with an exposure that is not associated with the outcome through any other pathway.

Natural experiment
Natural experiments are variation in any exposures or risk factors that occurred by chance in the population without conscious or deliberate intervention from investigators or scientists.

Mendelian randomization (MR) uses genetic variation to address causal questions about whether modifiable exposures influence health, developmental or social outcomes1. Exposures can be any factor robustly associated with genetic variation in individuals; for example, an exposure could include measurable characteristics of an individual such as body mass index (BMI) or less directly observable traits such as the expression of a particular gene in a specific tissue.

The statistical methodology for MR is generally based on instrumental variable (IV) analysis. An IV, or ‘instrument’, is related to the exposure but not to the outcome of interest, other than through its association with the exposure. IV analysis was first proposed a century ago and is an approach to causal inference that uses an IV to make causal effect estimates in the presence of unobserved confounding of the exposure and the outcome. IV analyses can be applied to any source of variation in an exposure that is unrelated to the outcome, including investigator-initiated treatment randomization in a randomized controlled trial (RCT) or when a natural experiment provides a plausible source of exogenous or unconfounded variation2–4. MR is based on the assumption that genetic variants provide a source of such exogenous variation in the exposure and can therefore act as an IV1. MR can be applied using any genetic variation that satisfies the requirements of an IV5, although it is usually implemented using single-nucleotide polymorphisms (SNPs). Box 1 further outlines the principles of MR.

Using genetic variants in this way, MR avoids bias from unobserved confounding of the exposure and outcome. However, there are important additional assumptions required for causal inference and effect estimation that are different to those used in other causal effect estimation methods. Causal effect estimates from MR can be evaluated within a triangulation of evidence framework, which involves interpreting findings alongside results from complementary approaches that rely on different assumptions. When using this approach, it is important that sources of bias in different study modalities are unrelated to each other so that the magnitude and direction of the bias in one study will not predict the size and direction of bias in the others6–8.

MR studies, especially two-sample studies using previously published summary-level genetic association data, provide a rapid and affordable approach to evaluating causal questions. There is an urgent need for these tools because many causal questions in health research cannot be adequately answered with conventional observational study designs and are not amenable to evaluation with RCTs for logistical or ethical reasons. MR is especially appealing because it relies on assumptions that differ from those of conventional observational studies and therefore circumvents some of their common biases1,8. The range of applications of MR and closely related methods for understanding causal mechanisms has increased rapidly in the past 20 years. The increasing availability of data and the vast expansion of IV methods have overcome some of the original barriers to MR caused by lack of data and the inability to assess the robustness of results obtained1. Major investments in collecting genetic data within large research studies have
enabled numerous applications of MR and allowed for increased statistical power and more precise effect estimates. Further, methodological innovation to enhance MR analyses is flourishing and innovations aim to allow for correct estimation with more plausible assumptions and estimate more complex effects, which include independent effects of multiple phenotypes or age-sensitive exposures. We therefore focus on the principles of MR and detail a few core MR estimation methods. The methods for MR listed here should not be taken as a definitive list of all potential methods available.

In this Primer, we provide guidance on the underlying principles of MR, discuss the information necessary to decide whether an MR approach is appropriate and feasible, and review best contemporary practices for MR. We outline the principles and assumptions underlying MR, along with the data required. Next, we detail the core methods for estimation of causal effects and explain how the assumptions underlying MR can be verified or subjected to sensitivity analyses. We then describe a range of studies that have applied MR in different settings, detail the importance of triangulating MR results with findings using other study designs and discuss steps to improving the openness of research involving MR. Finally, we outline sources of bias that may affect MR studies that cannot be corrected for with current methods and discuss some of the challenges and opportunities for MR in the future.

Author addresses
1 Medical Research Council (MRC) Integrative Epidemiology Unit, University of Bristol, Bristol, UK.
2 Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK.
3 Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA.
4 MRC Population Health Research Unit, University of Oxford, Oxford, UK.
5 Clinical Trial Service Unit and Epidemiological Studies Unit, Nuffield Department of Population Health, University of Oxford, Oxford, UK.
6 Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA.
7 Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
8 School of Psychological Science, University of Bristol, Bristol, UK.
9 National Institute for Health Research (NIHR), Biomedical Research Centre, University of Bristol, Bristol, UK.
10 School of Public Health, Li Ka Shing, Faculty of Medicine, The University of Hong Kong, Hong Kong, China.
11 School of Public Health, City University of New York, New York, NY, USA.
12 MRC Biostatistics Unit, University of Cambridge, Cambridge, UK.
13 Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), University of Cambridge, Cambridge, UK.
14 Statistical Laboratory, University of Cambridge, Cambridge, UK.

Confounder
A trait that influences both the exposure and outcome of interest.

Experimentation
The essence of an MR design is that the association between a genetic variant (G) and an outcome (Y) can be used to test whether and by how much the exposure of interest (X) influences the outcome, provided that the genetic variant is associated with the exposure of interest and has no other source of association with the outcome1,8,9. Bias originating from confounding of the exposure and outcome should not influence the MR estimate. The rationale of MR studies parallels that of RCTs, in which randomization influences the treatment received by participants, there are no confounders of randomization and the outcome, and randomization has no plausible mechanism to influence health outcomes other than through treatment (Fig. 1). In RCTs, randomly assigned treatment therefore evaluates the effect of treatment on the outcome, whereas in MR, a genetic variant is treated as a naturally occurring form of randomization.

As an example, Fig. 2a shows a directed acyclic graph for an RCT aimed at estimating the causal effect of lowering circulating levels of the inflammatory marker C-reactive protein (CRP) on systolic blood pressure (SBP), in which participants are randomized to receive a CRP-lowering medication or placebo. Alternatively, the effect of long-term differences in circulating CRP could be estimated with MR by considering a genetic variant that is known to alter CRP levels (Fig. 2b). The directed acyclic graphs for both studies are the same as long as certain assumptions are satisfied (discussed below).

In our hypothetical RCT, an intention-to-treat analysis can be conducted to determine whether the treatment influences the outcome by comparing SBP among individuals randomly assigned to the CRP-lowering medication with SBP in participants randomly assigned to placebo10,11. Intention-to-treat analysis estimates the effect on the outcome of being assigned to the group allocated to treatment, rather than receiving that treatment. A frequently used approach for analysis is to compare the mean SBP among individuals randomized to treatment to the mean SBP among individuals randomized to control:

$$\beta_1 = E(\mathrm{SBP} \mid G = 1) - E(\mathrm{SBP} \mid G = 0) \tag{1}$$

where $\beta_1$ is the effect on SBP of being assigned to the treatment group, G is an indicator of randomization and SBP is measured systolic blood pressure. Alternatively, a linear regression can be used:

$$E(\mathrm{SBP} \mid G) = \beta_0 + \beta_1 G \tag{2}$$

where $\beta_0$ is a constant. As there are no confounders of randomization and SBP, there is no need to control for any variables to derive an unconfounded estimate of the effect of randomization. Therefore, in a setting where G is binary, $\beta_1$ as estimated in Eq. 1 is identical to $\beta_1$ as estimated in Eq. 2 and both estimate the causal effect of randomized treatment groups on SBP. Being randomized to CRP-lowering medication should only affect SBP if there is a causal effect of CRP on SBP.

A potential disadvantage of the intention-to-treat estimate, for many questions of substantive interest, is that it does not give the magnitude of the effect of the exposure on the outcome (here, of CRP on SBP). It only determines whether or not there is a causal effect. To estimate the size of that causal effect, the degree to which the instrument affects the exposure must be taken into account. IV analyses are an alternative estimation method that can be used to derive an estimate of the causal effect of the treatment (here, CRP) on the outcome (SBP) by accounting for the size of the association between randomization and CRP3,4,12–15. In this scenario, randomization becomes the instrument for the estimation. In its simplest form, IV analysis takes
the ratio of the effect of randomization on SBP to the effect of randomization on CRP:

$$\gamma_1 = \frac{E[\mathrm{SBP} \mid G = 1] - E[\mathrm{SBP} \mid G = 0]}{E[\mathrm{CRP} \mid G = 1] - E[\mathrm{CRP} \mid G = 0]} \tag{3}$$

where $\gamma_1$ is known as the Wald ratio estimator and CRP is the level of circulating C-reactive protein. The numerator of Eq. 3 is simply Eq. 1, but here the association is scaled by the effect of randomization on CRP. Under the IV conditions described in Box 2, this estimator provides a test of whether there is a causal effect of CRP on SBP.

IV analyses can be applied to any potential source of randomization, including intentionally designed RCTs or quasi-randomization in natural experiments15,16. The term MR is applied when the quasi-randomization arises from genetic variation and a phenotype influenced by the genetic variant is the exposure of interest17,18. The genetic variant is referred to as the genetic instrument. For example, naturally occurring genetic variants in the gene encoding CRP regulate blood levels of CRP and such variants have been used to estimate the effects of circulating CRP levels on SBP19,20.

The above example highlights an important difference between RCTs and MR: RCTs estimate the effect of a particular intervention or treatment over the timeframe of the study, whereas MR estimates the lifetime effects of the genetic variants, as discussed in a recent preprint21. This can lead to substantial differences in the effect estimates obtained, owing to the differences in the time period over which the effects are estimated. There are a number of other differences between RCTs and MR. Although MR was first proposed using family data where the difference in alleles between siblings is random, data limitations mean that most MR is conducted using data on unrelated individuals1. In MR using unrelated individuals, the similarity between the allele groups is not guaranteed as it is in a well-conducted RCT. Further, associations between allele distribution and traits can exist at a population level owing to population stratification or assortative mating. The particular genetic variants used in the MR may also have effects on the outcome that are not due to the exposure received by the individual22. These issues all represent violations of the conditions required for IV estimation, which are described in detail below. How these violations may occur in MR studies and potential mechanisms to detect such violations are discussed in the ‘Results’ and ‘Limitations and optimizations’ sections of this Primer.

Box 1 | The principles of the MR approach
The Mendelian randomization (MR) approach draws on Mendel’s first and second laws of genetic inheritance: the law of segregation and the law of independent assortment206. The law of segregation indicates that at every point in the autosomal genome, offspring randomly inherit one allele from their mother and one allele from their father. The law of independent assortment implies that these alleles will be passed to offspring independently of each other, except in regions of the genome that are genetically linked in the DNA of the parents.
The first extended exposition of MR1 was in the context of family-based studies. Its analogy with randomized controlled trials was in the context of the random allocation of variants from parents to their children. At the time of this first description, adequate family-based data were not available and ‘approximate’ MR in population studies was advocated for instead; indeed, family-based data are still only used in a small minority of published MR studies. The advocacy of population studies was based on the premise that at a population level, genetic variants can identify groups that differ, on average, with respect to a modifiable exposure. In these studies, genetically defined group membership should be unrelated to factors that may confound conventional observational associations, including behavioural, social and physiological exposures that occur after conception1,4,6,206,207. Therefore, genetic associations between traits should be free from confounding and any difference in outcomes between groups defined by genetic variation can be attributed to the genetic variation, assuming no selection bias owing to that genetic variation.

Conditions required for MR estimation
Interpretation of results from MR studies relies on four conditions12,23. The first three of these conditions are usually referred to as the conditions for a valid IV and are required for any IV analysis to test whether the exposure has a causal effect on the outcome. These are described in Box 2. In our simplified example of CRP and SBP, we imagine only a single IV; however, MR is easily extendable to take advantage of multiple genetic variants that influence the same exposure24. When multiple genetic variants can be identified that fulfill the IV conditions, they can be used to improve the statistical power of MR analyses25,26.

The three IV conditions described in Box 2 are sufficient to test the exact null hypothesis as they can determine the presence or lack of a causal effect of the exposure on the outcome. However, they are not sufficient to derive a point estimate of the size of the effect of the exposure on the outcome27,28. This requires an additional condition27 known as a point-estimate-identifying condition or fourth IV condition. Several alternative point-estimate-identifying conditions, which permit subtly different interpretations of the IV estimate, have been described and researchers can adopt the version of the condition which seems most plausible for the setting at hand17,28. Box 3 outlines the most popular of these alternative point-estimate-identifying conditions and the effect estimate obtained from each one. Additionally, the vast majority of MR estimation methods (with non-linear MR29 being the notable exception) impose the assumption that the relationship between the exposure and the outcome is linear across different values of the exposures.

Biases that compromise the interpretation of an RCT can also undermine MR studies. For example, if random assignment in an RCT influences who participates in follow-up assessments, typical analyses of the RCT are biased. Similarly, if the genetic variants used in MR influence who has available outcome data (either owing to differential survival or study participation), the MR study will be biased30.

Finally, data used in MR additionally require the assumption that changes in genetic variation are equivalent in their effects to changes in the exposure through environmental or pharmaceutical manipulation, a concept known as gene–environment equivalence31. Given that genetic variants will influence the developing human from conception, the interpretation is applied to the influence of the variants from conception onwards. These particular MR-related issues are discussed in Box 4.
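
To make the intention-to-treat contrast of Eq. 1 and the Wald ratio of Eq. 3 concrete, the following minimal Python sketch simulates a binary instrument, an unobserved confounder, an exposure and an outcome. It is an illustration only: the function name `wald_ratio`, the variable names and every numerical value (sample size, effect sizes, noise levels) are assumptions made for this example and are not taken from any study cited in this Primer.

```python
import random
from statistics import fmean

def wald_ratio(g, x, y):
    """Ratio estimator of Eq. 3 for a binary instrument g coded 0/1."""
    y1 = fmean(yi for gi, yi in zip(g, y) if gi == 1)
    y0 = fmean(yi for gi, yi in zip(g, y) if gi == 0)
    x1 = fmean(xi for gi, xi in zip(g, x) if gi == 1)
    x0 = fmean(xi for gi, xi in zip(g, x) if gi == 0)
    itt = y1 - y0           # Eq. 1: effect of assignment on the outcome
    first_stage = x1 - x0   # effect of assignment on the exposure
    return itt, first_stage, itt / first_stage

# Simulated data; all values below are illustrative assumptions.
rng = random.Random(0)
n = 20000
true_effect = 2.0  # assumed causal effect of the exposure on the outcome
g, x, y = [], [], []
for _ in range(n):
    gi = rng.randint(0, 1)                  # randomized instrument
    ui = rng.gauss(0, 1)                    # unobserved confounder
    xi = -0.5 * gi + ui + rng.gauss(0, 1)   # exposure (for example, CRP)
    yi = true_effect * xi + 3.0 * ui + rng.gauss(0, 1)  # outcome (for example, SBP)
    g.append(gi)
    x.append(xi)
    y.append(yi)

itt, first_stage, ratio = wald_ratio(g, x, y)

# Conventional regression slope of y on x, which is distorted by the confounder.
mx, my = fmean(x), fmean(y)
naive = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

print(f"ITT difference (Eq. 1): {itt:.3f}")
print(f"Effect of instrument on exposure: {first_stage:.3f}")
print(f"Wald ratio (Eq. 3): {ratio:.3f}")
print(f"Confounded regression slope: {naive:.3f}")
```

In this simulated setting the instrument is independent of the confounder by construction, so the ratio should lie close to the assumed effect of 2.0, whereas the conventional regression slope is pulled away from it by the confounder. The sketch does not address the point-estimate-identifying conditions discussed above; the simulated effect is the same for every individual.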
[Figure: parallel flow diagrams. Mendelian randomization: population; randomization step (random segregation of alleles); wild-type allele versus variants; disease outcomes; statistical tests. Randomized controlled trial: sample; randomization step (random allocation to groups); control versus treatment; disease outcomes; statistical tests.]
Fig. 1 | An overview of MR studies. This overview compares and contrasts the parallels between Mendelian randomization (MR) and randomized controlled trials (RCTs). In MR, randomization is due to the random allocation of alleles. This conceptualization was originally based on between-sibling variation, where allocation of alleles is random and not dependent on population-level variation (see also Box 1). Inference from MR in this way relies on the assumption of gene–environment equivalence: that a change in the exposure caused by genetic variation has the same effect on the outcome as a change in that exposure caused by environmental factors.

Data used for MR estimation
MR studies can be conducted using individual-level data (including genetic and phenotype measures for each individual in the study) or summary data (on the association between each genetic instrument and the exposure and the outcome phenotypes of interest). Summary data are often obtained from genome-wide association studies (GWAS), which estimate the association between SNPs and the exposure and SNPs and the outcome traits.

When individual-level data are used for estimation, the statistical power of an MR analysis (or, equivalently, the precision of the estimate that can be derived) increases in proportion to the sample size and the variance in the exposure explained by the genetic instruments. When summary data are used, the precision of the MR estimate depends on how precisely the associations between the genetic variants and each of the exposure and the outcome have been estimated (in other words, how large the standard error of the estimated association is). Genetic variants typically explain only a small proportion of the variation in the relevant phenotype; as a result, low statistical power and imprecise effect estimates are common in MR studies and well-powered studies usually require large datasets. Power calculators are available for simple MR studies to determine whether a particular sample size is sufficient for the estimation to give reasonably precise results32–35. Simulation studies to determine power are also usually used to accommodate unique data features36.

The association of the proposed genetic instrument with the exposure can be estimated in a sample other than that used to estimate the effect of the proposed genetic instrument on the outcome37. MR conducted in this way is referred to as ‘two-sample MR’. The capacity to use two different samples for MR analyses has dramatically broadened the scope of MR studies because when either the desired exposure or outcome of a study is rare or expensive to measure, it can be difficult to identify a dataset with data on the genetic instrument, exposure and outcome. An important assumption for two-sample MR estimation is that the two samples are from the same underlying population, or more narrowly that the association between the genetic variants and exposure is the same in both samples, although that exposure may not be measured or reported in the sample included in the outcome dataset38. To satisfy this assumption, two-sample approaches usually use data from the most similar populations possible, with respect to genetic ancestry and contextual factors such as the prevalence of environmental exposures and the timeframe in which the measurements were taken.

The method of estimation and applicable sensitivity analyses used in MR depend on whether individual participant or summary-level data are used to conduct the analyses39. Using multiple genetic instruments in combination improves statistical power because the combination increases the total fraction of the exposure variance explained by the instruments26,40. The availability of multiple genetic instruments is also valuable for detecting or avoiding bias if one or more of the IV conditions are not met for some or all of the instruments.
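
As an illustration of how the precision of a summary-data estimate follows from the precision of the two genetic associations, the sketch below computes a Wald ratio from a single SNP–exposure and SNP–outcome association, with standard errors from the usual delta-method approximations. The formulas are standard approximations rather than equations given in this Primer, they assume that the two associations come from non-overlapping samples, and the function name `wald_ratio_summary` and all input values are hypothetical.

```python
import math

def wald_ratio_summary(beta_gx, se_gx, beta_gy, se_gy):
    """Wald ratio from SNP-exposure (gx) and SNP-outcome (gy) summary data.

    Standard errors use delta-method approximations and assume that the
    two associations were estimated in independent samples.
    """
    estimate = beta_gy / beta_gx
    # First-order approximation: ignores uncertainty in beta_gx.
    se_first = se_gy / abs(beta_gx)
    # Second-order approximation: adds the contribution of se_gx.
    se_second = math.sqrt(
        se_gy ** 2 / beta_gx ** 2 + beta_gy ** 2 * se_gx ** 2 / beta_gx ** 4
    )
    return estimate, se_first, se_second

# Hypothetical summary statistics for one SNP (not real GWAS results).
estimate, se_first, se_second = wald_ratio_summary(
    beta_gx=0.10, se_gx=0.01, beta_gy=0.05, se_gy=0.02
)
print(estimate, se_first, se_second)
```

Under these approximations, the standard error of the ratio shrinks as the SNP–outcome association is estimated more precisely and as the SNP–exposure association becomes larger, which is one way to see why variants that explain little of the exposure give imprecise MR estimates.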
Instrument selection
Genetic variants used as instruments for MR should be associated with the exposure of interest, so that they satisfy IV condition 1 (Box 2). This can be through the use of variants with known functionality or through the selection of variants that are robustly associated with the exposure. GWAS can potentially identify a large number of SNPs that predict a selected phenotype and many MR studies use SNPs identified in credible GWAS as genome-wide significant predictors of the exposure of interest for estimation, that is, those SNPs associated with the exposure with $P < 5.0 \times 10^{-8}$ (ref.41).

[Figure: two directed acyclic graphs. Panel a, an RCT to test whether lowering CRP lowers SBP, with nodes: randomization to CRP-lowering medication, CRP (X), SBP (Y) and confounders (U). Panel b, an MR study to test whether lowering CRP lowers SBP, with nodes: genetic variant associated with lower CRP (G), CRP (X), SBP (Y) and confounders (U).]
Fig. 2 | Illustration of a randomized control study and instrumental variable estimation. A randomized controlled trial (RCT) (panel a) and a Mendelian randomization (MR) study (panel b) to estimate the effect of lowering C-reactive protein (CRP) on systolic blood pressure (SBP). The arrows highlighted in red show the causal effect of interest.

When using individual data, overlap between the dataset used for instrument discovery and the dataset
used for estimation can introduce a bias known as ‘winner’s curse’. The goal of IV is to remove the effect on the exposure of variation due to confounders of the exposure and outcome. However, the best fitting model for the association of a SNP and the exposure will, by chance, pick up some variation owing to confounders. Although this bias is small and unimportant if the SNP has a very strong effect on the exposure, this is rarely the case. When many SNPs are used as IVs, each with a very small effect, this can create a non-trivial bias towards the conventional effect estimate, known as weak instrument bias42. This can be avoided through bias correction calculations or by using a two-sample approach and applying jackknife resampling to the estimation43–45. In a jackknife estimation, the data are divided into groups and each is then used for estimation, with instrument discovery conducted in the rest of the sample. The results for each group are then meta-analysed to obtain a result for the whole dataset (see preprint46).

Bias due to overfitting is a concern when summary-level data are used for estimation if the effect of the SNP on the exposure is estimated in a dataset that overlaps with the dataset used to estimate the SNP–outcome association. Recent research has suggested that overlap between the samples used may not bias the results obtained by as much as previously thought, unless the instruments are not strongly associated with the exposure, and methods have been proposed to estimate the size of this bias and to correct for it43 (see preprints44,47).

Results
This section outlines methods used for MR estimation, tests for violation of the first IV condition and methods of estimation that are robust to particular violations of the second and third IV conditions. Here, we cover the main methods used for estimation. A number of other papers are available that cover guidelines for reading48, conducting39 and interpreting49 results from MR studies. STROBE guidelines for the consistent reporting of MR studies have also been published50,51. Additionally, the MR dictionary provides an extensive glossary of terms used in MR.

Individual-level data
Estimating causal effects. When using individual-level data in MR estimation, genetic variants can either be used as separate instruments or combined into an allele score25. An allele score is generated by adding up the number of risk-increasing alleles for all the variants selected as instruments. This score can be unweighted, so that each SNP makes the same contribution, or weighted, so that the number of risk-increasing alleles at each SNP is multiplied by the estimated effect of that SNP on the exposure25. Weighted scores provide increased instrument strength and power, although there are cases in which the unweighted approach is preferable, for example if the definition of the exposure in the discovery dataset differs from the exposure variable in the estimation data. In such a case the weights will reflect the weight of the SNP on an exposure that is not the exposure included in the estimation. The more similar the definition of the exposure is in each sample, the more preferable the weighted approach will be. Differences in scaling alone will not affect the preference for a weighted score. Ideally, both SNPs and weights should be selected from a dataset that does not overlap with the dataset used to obtain the MR estimates, such as those from GWAS in non-overlapping datasets52. If many SNPs that each have only a small effect on the exposure are being used, combining them into a single score can
increase the power of the analysis and reduce the risk of bias from many weak instruments26. However, if any SNPs violate IV conditions 2 or 3 (if any of the component SNPs influence the outcome through a mechanism other than the exposure of interest) then the allele score will also violate that condition.

Box 3 | Point-estimate-identifying conditions
The instrumental variable (IV) conditions described in Box 2 are sufficient to test for the presence of a causal effect. However, performing estimation and interpretation of the causal effect requires at least one additional assumption. The effect of the exposure (X) on the outcome (Y) may differ for different people. These differences require additional assumptions to be placed on the relationship between the instruments, exposure and outcome to identify both the causal effect of the exposure on the outcome, and to whom that causal effect estimate applies. Each assumption gives a slightly different interpretation for the causal effects obtained from Mendelian randomization (MR) analysis.
There are two frequently used assumptions for point-estimate-identifying conditions. The first option is homogeneity of the effect of the exposure on the outcome, or that either (a) the effect of the exposure on the outcome is the same for everyone, regardless of the starting value of X or any other individual characteristics, or (b) the effect of the exposure on the outcome does not depend on the value of the instrument. Option (a) gives the interpretation that the causal effect estimate is ‘the causal effect of the exposure on the outcome’, whereas option (b) gives the interpretation that the effect estimate obtained is the ‘population average of the causal effect of the exposure on the outcome’. The second assumption is monotonicity in the association between the genetic variants and the exposure, namely that the direction of the effect of the genetic variant on the exposure is the same for everyone2,27,210–212. This gives the interpretation that the effect estimate is the effect of the exposure on the outcome in those people whose exposure is changed by the instrument. In MR, this is the average effect of differences in the exposure that are attributable to differences in the genetic variants. For continuous exposures or outcomes, violation of the IV conditions allowing point identification can be assessed through examination of the variance of the trait by the level of the instrument (either per-allele or in a binary dominant model, as appropriate). Violations will lead to differences in the variance of the trait across the level of the instrument213.
Which assumption is most relevant will depend on the particular estimation; however, the assumption of monotonicity is usually relevant for MR estimation. The point-estimate-identifying condition remains an area of debate and methodology development, with researchers identifying additional possible assumptions that would support a causal interpretation of the IV effect estimate.

Estimation of causal effects using individual-level data is usually implemented with some version of two-stage least-squares (2SLS) estimation (alternative methods include likelihood approaches that are common in structural equation modelling)53. 2SLS estimation for MR uses genetic variants to obtain a predicted value of the exposure ($\hat{X}$) that is not associated with any of the unmeasured confounders. The first stage can be written as:

$$X = \pi_0 + G\pi + v_x \tag{4}$$

where X is the exposure of interest; G is an n × L matrix

term assumed to be unrelated to $v_x$. The four conditions for IV estimation imply that the assumption of independence of u and $v_x$ is met and the estimated value of β (that is, $\hat{\beta}$, obtained from Eq. 5) is a consistent estimator for the effect of X on Y. If the estimation is implemented using an allele score, Eq. 4 is replaced with:

$$X = \pi_0 + \pi S + v_x \tag{6}$$

where S is the allele score (weighted or unweighted) and π is a single coefficient for the association of the genetic score with the exposure. The second stage of the analysis, Eq. 5, is the same whether we are using individual SNPs as instruments or an allele score. In both cases, the standard error should not be computed using the standard formula for linear models and should be corrected for the additional uncertainty owing to the inclusion of $\hat{X}$ in the estimation. IV estimation software packages implement this correction as standard.

Additional measured covariates can be incorporated into both stages of the estimation. The use of additional covariates should be considered carefully because covariates can be influenced by the exposure or the outcome. In either of these situations, controlling for such a covariate could bias the MR effect estimate54–56.

Assessment of IV conditions. Regardless of the statistical method being used, it is important to assess the IV conditions. The first IV condition can be tested using a first-stage F statistic, which tests the association between the SNPs and the exposure. If the genetic instruments are not strongly associated with the exposure, then weak instrument bias can be introduced into the estimation42. The first-stage F statistic should be reported in all MR analyses. As a general rule, if the first-stage F statistic is greater than 10, the level of this bias is small57,58. A cut-off of F > 10 has been used as a conventional threshold for a strong instrument in some studies. We note that this should not be used as a rigid rule and an F statistic < 10 does not indicate that this instrument should not be used, rather that weak instrument bias should be considered as an issue in analysis.

Although the second and third IV conditions cannot be proved to be true, they can sometimes be disproved. Assessment of these conditions therefore focuses on disproving them, and failure to disprove the conditions is interpreted as supporting the validity of the proposed IV. Genetic variants are fixed at conception, so it is not possible for conventional confounders such as age, sex or environmental risk factors to influence them. However, confounding of the genetic variants with the outcome
of genetic variants, where n is the number of individuals in a sample can be induced by population stratification,
in the dataset and L is the number of SNPs; π is a vector dynastic effects and assortative mating59, violating the
of the effect of each genetic variant on the exposure of second IV condition. This confounding is not easily
length L; π0 is a constant and Vx is a random error term. corrected with current MR methods and is discussed
The outcome is then regressed upon the predicted value in more detail in the ‘Limitations and optimizations’
First-stage F statistic of the exposure, X : section.
Test statistic used to test Violations of the third IV condition can be caused
the strength of association +u
Y = α + βX (5) by pleiotropy, where genetic variants have effects on
between the instrument(s)
and the exposure in an
multiple phenotypes60,61. This can include misspecifica-
instrumental variable where Y is the outcome, α is a constant, β is the effect of tion of the primary phenotype where the phenotype of
estimation. the exposure on the outcome and u is a random error interest is not the phenotype that the SNP is primarily
associated with8,61,62. Additionally, linkage disequilibrium means that the effects of neighbouring genetic variants can introduce additional associations between the variant of interest (and thus the exposure it relates to) and the outcome, creating a bias analogous to that caused by pleiotropy. Pleiotropy in the context of MR is described in Fig. 3. Many MR methods are available that are robust to different forms of pleiotropy, and analyses using these different methods should be carried out in any MR study to determine how sensitive the results are to an assumption of no pleiotropy.

Linkage disequilibrium
Correlation between genetic variants located closely together on the genome.

A final important source of bias in MR, and indeed all studies of observational data, is selection bias63,64. This selection could occur either from differential selection into the sample or selection on a competing risk for the outcome. Selection bias cannot be accounted for easily with existing MR methods and is discussed further in the ‘Limitations and optimizations’ section.
An approach for assessing the IV assumptions that is applicable when there are more IVs than exposures of interest is based on over-identification tests. These tests, such as the Sargan test65, leverage the expectation that if all proposed IVs are valid, they should deliver identical IV effect estimates. If the IV effect estimates from multiple IVs differ to a greater extent than expected owing to sampling error, then at least one is not valid for the exposure–outcome effect of interest. If all IVs are biased in the same way, over-identification tests will not identify the bias; for example, over-identification tests can incorrectly suggest a lack of pleiotropy even when it is present if similar pleiotropic pathways are likely to affect many or all proposed IVs or if there is population stratification biasing the association between many SNPs and the outcome in the same way24. They also rely on the assumption that each IV estimates the same causal effect, which may not be true for complex traits where different genetic variants potentially act as genetic instruments for different aspects of the trait. The weaker the effect of an IV on an exposure, the more imprecise the IV effect estimate will be and therefore the more likely it becomes that an instrument will fail to reject an over-identification test.
One further method for identifying potential violations of the IV conditions when the exposure is binary or categorical is using IV inequality constraints28,66,67. The IV conditions described above imply a set of mathematical patterns that must be true if the conditions are true; these patterns can be used to demonstrate that the IV conditions are not met if the inequalities defined by those patterns do not hold. IV inequalities are rarely especially informative because they identify only extreme violations of the conditions. These inequalities can also be used to define non-parametric bounds for an IV estimate (those that would hold without the fourth, point-estimate-identifying condition discussed above). Although these bounds are often very wide, they can give a sense of how much an IV analysis depends on the point-estimate-identifying condition. An alternative approach for identifying violations of the IV conditions is to examine the association between the genetic variants and other measured causes of the outcome, excluding any variables that are themselves on the same pathway as the exposure of interest (see preprint68)69. If a proposed genetic instrument predicts other causes of the outcome that are not thought to be along the same causal pathway as the exposure, it indicates that the proposed instrument is not valid.
Methods such as sisVIVE70 and adaptive LASSO71 provide MR estimates that are robust to pleiotropy under certain assumptions. These methods assume that multiple IVs are available and that a majority or plurality of the proposed IVs are valid. Given this assumption, it is possible to estimate the magnitude of pleiotropic bias. An alternative approach is to adjust for pleiotropic effects of the genetic variants by accounting for the association between the genetic variants and potentially pleiotropic phenotypes. Methods that apply this approach include constrained IVs72 and multivariable MR73.
Tests to invalidate proposed IVs often draw on subject matter knowledge, such as an understanding of settings in which a genetic variant does not influence the exposure, where the genetic variant may have different effects based on the level of an environmental variable (known as gene–environment interactions) or where the exposure should have no effect on the outcome, such as a negative control or zero-relevance point. The proposed genetic instrument should not be associated with the outcome in an environmental setting where it is not associated with the exposure unless there are pleiotropic pathways from the genetic variant to the outcome.

Box 4 | Issues interpreting MR results
Gene–environment equivalence
Typically, Mendelian randomization (MR) considers exposures that are modifiable and so evidence of a causal effect of the exposure on the outcome can be used to infer that an exposure intervention will lead to a change in the outcome. However, making such an inference depends on the exposure of interest fulfilling the consistency criterion that however the intervention is applied to alter the exposure, the effect on the outcome is the same. This means that changes in an exposure by either a hypothetical change in genotype or by a change in the environment should produce the same downstream effect on an outcome31,214–216. For example, genotypic influences on circulating cholesterol level or a similar change in cholesterol level induced by dietary influences should lead to the same effect on coronary heart disease. Although many exposures can be closely proxied by genetic variation, for others (such as those that reflect aspects of social deprivation and income) it is unlikely that genetic variation will mimic environment changes exactly217. Gene–environment equivalence is a fundamental principle in MR and consideration should be given to how likely it is to hold when interpreting the results from any MR study.
Interpretation of results for time-varying exposures
Genetic variants are fixed throughout an individual’s lifetime and MR estimates can therefore be interpreted as the ‘lifetime effect’ of the exposure on the outcome1,9. If the association between the genetic variants and the exposure is constant across the life course, this lifetime effect can be interpreted as the effect of having a level of exposure that is a unit higher at every time point across the life course218. However, for many exposures the association between genetic variants and the exposure may vary across the life course; for example, genetic variants associated with body mass index have been shown to have a wide range of differential effects between childhood and adulthood141. In this scenario, MR estimates can be interpreted as the lifetime effect of being on a trajectory for the exposure associated with having an exposure level that is a unit higher at the time it is measured21. Multivariable MR can be used to estimate causal effects of the different time periods and potentially to identify particularly relevant periods across the life course141,219. That MR estimates the lifetime effect of the exposure on the outcome means that MR estimates can be larger than estimates obtained from alternative methods of estimation, such as randomized controlled trials, because the total length of time over which the exposure can have an effect is much longer.
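The individual-level workflow described in this section (the two-stage regression in equations (4) and (5), the first-stage F statistic and a Sargan-type over-identification test) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are simulated, the SNP effects, sample size and true causal effect of 0.3 are arbitrary, and the helper names (solve, ols, predict, sum_sq) are hypothetical rather than taken from any MR package. Second-stage standard errors are deliberately not computed, because the naive values from the second regression would not account for the uncertainty in the predicted exposure.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    k = len(b)
    M = [A[i][:] + [b[i]] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * d for a, d in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def ols(rows, y):
    """Least-squares coefficients; each row must already include the constant 1."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(rows, coef):
    """Fitted values for a design matrix and a coefficient vector."""
    return [sum(c * v for c, v in zip(coef, r)) for r in rows]

def sum_sq(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

# Simulated data: every number here is an illustrative assumption.
random.seed(2022)
n, L, beta_true = 6000, 3, 0.3
snp_effects = [0.4, 0.5, 0.6]
G = [[(random.random() < 0.3) + (random.random() < 0.3) for _ in range(L)]
     for _ in range(n)]                                   # allele counts 0, 1 or 2
U = [random.gauss(0, 1) for _ in range(n)]                # unmeasured confounder
X = [sum(p * g for p, g in zip(snp_effects, row)) + u + random.gauss(0, 1)
     for row, u in zip(G, U)]
Y = [beta_true * x + u + random.gauss(0, 1) for x, u in zip(X, U)]

# First stage (equation 4): regress the exposure on the genetic variants.
Z = [[1.0] + row for row in G]
X_hat = predict(Z, ols(Z, X))
rss_first = sum((x - xh) ** 2 for x, xh in zip(X, X_hat))
f_stat = ((sum_sq(X) - rss_first) / L) / (rss_first / (n - L - 1))

# Second stage (equation 5): regress the outcome on the predicted exposure.
alpha_hat, beta_2sls = ols([[1.0, xh] for xh in X_hat], Y)
beta_ols = ols([[1.0, x] for x in X], Y)[1]               # confounded comparison

# Sargan-type statistic: n * R^2 from regressing 2SLS residuals on the instruments.
resid = [y - alpha_hat - beta_2sls * x for x, y in zip(X, Y)]
fitted_resid = predict(Z, ols(Z, resid))
r2 = 1 - sum((e - f) ** 2 for e, f in zip(resid, fitted_resid)) / sum_sq(resid)
sargan = n * r2                                           # compare with chi-squared, L - 1 df

print("first-stage F:", round(f_stat, 1))
print("OLS estimate:", round(beta_ols, 3), "2SLS estimate:", round(beta_2sls, 3))
print("Sargan statistic:", round(sargan, 2), "on", L - 1, "degrees of freedom")
```

In this simulation the confounder pushes the ordinary regression estimate away from the true effect, whereas the 2SLS estimate should sit close to it. In applied work the dedicated packages listed in Table 2 would normally be used instead, since they also provide corrected standard errors.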
A classic example of this type of analysis is examining the effect of alcohol consumption in populations where subgroups of the population (for example, women in some cultures) do not drink or drink very little74. If the IV conditions are satisfied, there should be no association between genetic variants for alcohol consumption and the outcome under consideration among women in the previous example. Two methods, MR GxE and MR GENIUS, have extended and formalized these concepts and enable the estimation of causal effects in more general settings. MR GxE uses an interaction between the genetic variant and a covariate to create a new IV (see preprint75)76; MR GENIUS uses variation that occurs owing to unobserved interactions between the genetic variants and covariates as the instrument75,77.

Vertical pleiotropy
The phenomenon of a genetic variant associated with multiple phenotypes on the same pathway.

Fig. 3 | Types of pleiotropy. Figure showing different types of pleiotropy in Mendelian randomization (MR), where G is a genetic variant or set of genetic variants associated with the exposure, X is the exposure of interest, Y is the outcome of interest, U is an unmeasured confounder and C is another (potentially unmeasured) phenotype that is also associated with the genetic variants. a,b | Horizontal pleiotropy. Sometimes referred to as biological pleiotropy, this occurs where a genetic variant is associated with multiple phenotypes and these phenotypes lie on different pathways. In horizontal pleiotropy with bias (panel a), the third instrumental variable condition (IV3) is violated because there is a pathway from the genetic variant to the outcome that does not occur via the exposure. In horizontal pleiotropy with no bias (panel b), as the genetic variants are not associated with other phenotypes on the pathway to the outcome, MR estimates are not biased. c | Confounding by linkage disequilibrium. When G2 has an effect on the outcome through a pathway that is not via the exposure, correlation between G1 and G2 creates a bias that is indistinguishable from that shown in panel a. d | Vertical pleiotropy. Another phenotype lies on the genetic variant–exposure–outcome pathway. This could occur either before or after the exposure of interest. Sometimes referred to as mediated pleiotropy, this form of pleiotropy does not bias MR studies and can even be used to elucidate causal intermediaries41. e | Misspecification of the primary phenotype. Vertical pleiotropy can bias MR estimates if the wrong phenotype is specified as the primary phenotype. Here the genetic variants are primarily associated with C. If X is misspecified as the primary phenotype, MR estimation of the effect of X on Y would be biased by the alternative pathways from C to Y8,61. f | In correlated pleiotropy, genetic variants for the exposure are also associated with a confounder of the exposure and outcome. In this setting, the size of the pleiotropic effect is correlated with the size of the association between the genetic variant and the exposure. This form of pleiotropy is particularly hard to detect and correct for. The scenarios in panels b and d produce settings where the pleiotropy will not bias the MR estimation. All other settings violate assumptions IV2 or IV3 and can cause meaningful bias in MR estimation.

Summary-level data
Estimating causal effects. MR estimation with summary-level data requires estimates of $\pi_l$, the estimated effect of genetic variant $l$ on the exposure with variance $\sigma_{x,l}^2$, and $\Gamma_l$, the estimated effect of genetic variant $l$ on the outcome with variance $\sigma_{y,l}^2$. Inverse-variance weighting (IVW) estimation is a meta-analysis of the variant-specific Wald ratios for each variant, which are given as:

$$\beta_l = \frac{\Gamma_l}{\pi_l}$$

where $\beta_l$ is the effect estimated using genetic variant $l$. These individual ratios are weighted by their associated uncertainty; the IVW estimator $\beta_{IVW}$ can therefore be computed as:

$$\beta_{IVW} = \frac{\sum_{l=1}^{L} \pi_l \Gamma_l \sigma_{y,l}^{-2}}{\sum_{l=1}^{L} \pi_l^2 \sigma_{y,l}^{-2}}$$

where $L$ is the total number of genetic variants included as potential IVs37. The IVW estimate can equivalently be obtained by regressing the genetic variant–outcome association, $\Gamma_l$, on the genetic variant–exposure association, $\pi_l$ (without an intercept), weighted by the inverse variance of the SNP–outcome association ($1/\sigma_{y,l}^2$):

$$\Gamma_l = \beta_{IVW}\pi_l + u_l, \quad \text{weighted by } 1/\sigma_{y,l}^2$$

This equation describes a linear regression with the intercept fixed to zero as $u_l \sim N(0, 1)$, and is based on a dataset with $L$ observations.
One important assumption for IVW estimation is that the genetic variants are independent of each other40. This assumption is usually satisfied by removing one of each pair of genetic variants that are in linkage disequilibrium. However, methods are available that can take linkage disequilibrium between genetic variants into account in summary-level MR78,79. It is also important to harmonize the data so that the values of $\Gamma_l$ and $\pi_l$ refer to the same effect alleles80.

Assessment of IV conditions. As with individual-level data analysis, IV conditions need to be assessed for any summary-data MR. A number of different methods are available to correct for horizontal pleiotropy (a violation of the third IV condition) under different assumptions about the causal structure of that pleiotropy. Table 1 lists some of these methods, which primarily draw on three approaches: outlier removal, outlier adjustment and adjustment for specific forms of pleiotropy. Many methods combine more than one of
these approaches. Outlier removal estimation involves the identification and removal of individual genetic variants for which the causal effect estimate obtained using that variant alone lies outside the expected range given the estimates obtained from other variants, so that they do not have an effect on the result obtained. Traditionally, summary-data MR is visualized as a scatter plot of the variant–exposure associations against the variant–outcome associations (Fig. 4a,b); however, this can limit the identification of outliers. Radial MR is a method for visualizing the data that can make outlying data points easier to detect81 (Fig. 4c). An additional approach is to explore the effect of individual SNPs on the overall IV estimate, by methods such as leave-one-out analyses (Fig. 4d). Methods of estimation that use outlier removal include weighted median82, weighted mode83 and MR LASSO84. Outlier adjustment methods identify outlying variants and then perform an adjustment to either the effect obtained from that genetic variant or to the weight given to the estimate from that variant so that the variant has less influence on the overall estimation result. Many pleiotropy-robust MR methods fall into this category, including MR Tryx85, MR PRESSO86, MR Robust84, MR RAPS87, MR GRAPPLE88 and MR CAUSE89. The final broad category of pleiotropy-robust methods for summary-data MR estimation comprises methods that allow most or all of the genetic variants included in the estimation to have pleiotropic effects on the outcome and instead place other constraints on those pleiotropic effects. These methods include MR Egger90 and multivariable MR73,91. Each of these methods imposes strong assumptions on the nature of the pleiotropy. MR Egger analysis assumes that across all instruments, the magnitude of the pleiotropic effect is unrelated to the strength of the association between the genetic variant and the phenotype of interest (known as the InSIDE assumption). This assumption will not hold when there is correlated pleiotropy (Fig. 3f). Multivariable MR assumes that pleiotropic pathways operate through known phenotypes that are also included in the estimation.

Horizontal pleiotropy
The phenomenon of a genetic variant associated with multiple phenotypes on different pathways.

None of the methods described above is truly robust to all types of pleiotropy and each imposes different assumptions on the nature of the pleiotropy and how the pleiotropic effects are accounted for. Furthermore, many methods have less statistical power than conventional MR, leading to very wide confidence intervals. Therefore, a few methods should be selected on the basis of the most plausible assumptions for the application in question and used alongside an IVW MR estimation to perform a sensitivity analysis; this can determine how robust MR results are to the assumption that genetic variants have no pleiotropic effects on the outcome under different alternative specifications. As a minimum, any summary-data MR estimation usually includes weighted median and weighted
[Figure 4. Panels a and b: scatter plots with horizontal axis ‘SNP effect on body mass index’ and fitted lines labelled by MR test (inverse variance weighted in panel a; inverse variance weighted, MR Egger, weighted median and weighted mode in panel b). Panel c: IVW radial plot marking variants, outliers and the IVW line. Panel d: MR leave-one-out sensitivity analysis for ‘Body mass index’ on ‘coronary heart disease’, with one row per excluded SNP and a final row for all SNPs.]
Fig. 4 | Data visualization. Figure showing different visualizations of a summary-data Mendelian randomization (MR) analysis. The example shown is estimating the effect of body mass index on coronary heart disease (CHD). a | A scatter plot of the single-nucleotide polymorphism (SNP)–exposure and SNP–outcome associations for each SNP with an inverse-variance weighting (IVW)-estimated line fitted. The error bars around each point show the standard error of the estimated association between the SNP and the exposure and the SNP and the outcome. b | The same plot with the robust approaches of weighted mode, weighted median and MR Egger added (note that the weighted median line is obscured by the weighted mode line). c | The same data plotted using a radial MR framework to identify outliers; the horizontal axis shows the weight given to each point and the vertical axis shows the weight multiplied by the effect estimate. The IVW-estimated fitted line is shown. d | A leave-one-out analysis where the IVW estimate has been recalculated, excluding one SNP at a time, to look for SNPs that highly influence the overall result. These graphs were created using the ‘TwoSampleMR’ and ‘RadialMR’ R packages, using data from the OpenGWAS project. Code used to create these figures is detailed in the Supplementary information for illustrative purposes.
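The quantities behind panels a, b and d can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the figure: the summary statistics are simulated (20 hypothetical variants, an arbitrary true effect of 0.4 and no pleiotropy), the function names are invented for this example, and all variants are assumed to be independent, harmonized and oriented so that the variant–exposure association is positive. It computes the variant-specific Wald ratios, the IVW estimate, an MR Egger fit (the same weighted regression with an intercept) and a leave-one-out series of IVW estimates.

```python
import random

def wald_ratios(pi, gamma):
    """Variant-specific ratio estimates, gamma_l / pi_l."""
    return [g / p for p, g in zip(pi, gamma)]

def ivw(pi, gamma, se_y):
    """IVW estimate: regression of gamma on pi through the origin, weights 1 / se_y^2."""
    num = sum(p * g / s ** 2 for p, g, s in zip(pi, gamma, se_y))
    den = sum(p * p / s ** 2 for p, s in zip(pi, se_y))
    return num / den

def mr_egger(pi, gamma, se_y):
    """Same weighted regression with an intercept; returns (intercept, slope)."""
    w = [1 / s ** 2 for s in se_y]
    sw = sum(w)
    mean_pi = sum(wi * p for wi, p in zip(w, pi)) / sw
    mean_gamma = sum(wi * g for wi, g in zip(w, gamma)) / sw
    sxy = sum(wi * (p - mean_pi) * (g - mean_gamma) for wi, p, g in zip(w, pi, gamma))
    sxx = sum(wi * (p - mean_pi) ** 2 for wi, p in zip(w, pi))
    slope = sxy / sxx
    return mean_gamma - slope * mean_pi, slope

def leave_one_out(pi, gamma, se_y):
    """IVW estimate recomputed with each variant excluded in turn."""
    return [ivw(pi[:i] + pi[i + 1:], gamma[:i] + gamma[i + 1:], se_y[:i] + se_y[i + 1:])
            for i in range(len(pi))]

# Simulated summary statistics: all values are illustrative assumptions.
random.seed(4)
L, beta_true = 20, 0.4
pi = [random.uniform(0.02, 0.08) for _ in range(L)]        # variant-exposure effects
se_y = [random.uniform(0.001, 0.003) for _ in range(L)]    # SEs of variant-outcome effects
gamma = [beta_true * p + random.gauss(0, s) for p, s in zip(pi, se_y)]

ratios = wald_ratios(pi, gamma)
beta_ivw = ivw(pi, gamma, se_y)
egger_intercept, egger_slope = mr_egger(pi, gamma, se_y)
loo = leave_one_out(pi, gamma, se_y)

print("IVW estimate:", round(beta_ivw, 3))
print("MR Egger intercept and slope:", round(egger_intercept, 4), round(egger_slope, 3))
print("leave-one-out range:", round(min(loo), 3), "to", round(max(loo), 3))
```

The IVW estimate is identical to a weighted mean of the Wald ratios with weights equal to the squared variant–exposure effect divided by the variance of the variant–outcome effect, which is why a single variant with a large weight can dominate the result and why the leave-one-out series is informative.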
mode approaches, although these can be replaced with appropriate alternatives for the application in question. Additionally, these estimation methods will not necessarily identify violations of any IV conditions that are not due to pleiotropy of the nature interrogated by the method. Consequently, consistent results across a range of methods are not a guarantee that results are free from bias. Potential violations of the IV assumptions not due to pleiotropy are discussed in the ‘Limitations and optimizations’ section.
Another form of pleiotropy arises when the exposure for the MR estimation is misspecified and genetic variants associated with a confounder are used as instruments for the exposure under investigation (Fig. 3e). For example, BMI influences circulating CRP and if a genetic variant primarily associated with BMI is included as a genetic variant for CRP, misleading effect estimates of the causal effect of CRP on other phenotypes (including BMI) can be generated61,92. These issues are increasingly important to consider because the sample sizes used in GWAS are increasing, making it more likely that a primary phenotype has been misspecified (in the context of GWAS, this could refer to the detection of genetic variants for an upstream phenotype of the exposure which potentially confounds the exposure and outcome, or genetic variants for the
outcome if the direction of effect has been misspecified). Steiger filtering attempts to correct for this misspecification by removing SNPs that explain more variation in the outcome than the exposure93. Any genetic variant should explain more variation in the phenotypes that it is more proximal to; however, differing measurement error, substantially different sample sizes for each phenotype, or the presence of binary or categorical phenotypes can lead to phenotypes that are less proximal to the genetic variant appearing to have more variation explained by the variant than more proximal phenotypes in the observed data. Additional methods are now being developed that attempt to resolve misspecification and confounding89,94,95.

Software packages
Any statistical package can be used for simple MR estimates as the core IV estimate is derived from a two-step regression model. Deriving correct standard errors requires special calculations, and variations on the standard model have been implemented as packages for common statistical software such as Stata and R. A range of software packages are available in both Stata and R to conduct MR estimation, many of which include a range of assumption tests and options to apply robust methods. The TwoSampleMR R package links to the OpenGWAS project database (see preprint96), a large database of GWAS results that can be used in the estimation. Table 2 gives details of the most popular software packages currently available; an extended list is given in the Supplementary information.

Further extensions of MR methods
Bidirectional MR. In bidirectional MR, two MR analyses are conducted on the same pair of phenotypes by reversing the exposure and the outcome. This method can be used to establish the direction of effect between two variables. For example, extensive observational evidence indicates that hearing loss predicts dementia and it is hypothesized to be an important causal determinant of dementia97; however, it is possible that the neurodegenerative disease that leads to dementia also causes hearing loss and thus the causal direction between hearing loss and dementia is unclear. There are known genotypes for both hearing loss and Alzheimer disease (the most common cause of dementia98–100), and a bidirectional MR would first conduct an MR analysis of the effect of liability to dementia on hearing and then of the effect of hearing on dementia. If genetic variants known to associate with hearing loss influence dementia risk and genetic variants known to associate with dementia do not influence hearing loss, this suggests that hearing loss is a causal determinant of dementia.
Results from bidirectional MR studies should be interpreted with caution. Evidence of an effect in both directions could indicate a true bidirectional relationship between the exposures or be a product of bias from horizontal pleiotropic effects in the variants, misspecification of the primary phenotype, or a violation of the second IV condition owing to confounding of genetic variants and outcome caused by factors such as population stratification and dynastic effects.

Bidirectional relationship
Where an effect acts in both directions between a pair of traits so that changing one will change the other.

Multivariable MR. Multivariable MR is an extension of standard MR that includes multiple exposures, predicted by a set of genetic variants used as instruments. Figure 5 illustrates a multivariable MR with two exposures. Although multiple exposures can be included in a multivariable MR, there must be at least as many genetic variants or scores included as instruments as there are exposures. Multivariable MR can be estimated with either individual-level or summary-level data using extensions of the 2SLS or IVW approaches, respectively73,101. Conditions required for estimation are adapted from the standard IV conditions and are defined as follows: each exposure must be robustly predicted by the instruments, conditional on the other exposures included in the estimation (multivariable instrumental variable condition 1, or MVIV1); there must be no confounders of the outcome and any of the instruments (MVIV2); and none of the instruments can have an effect on the outcome that does not act through at least one of the exposures (MVIV3). If the above conditions are met, the estimates obtained from multivariable MR will be a direct effect of each exposure included on the outcome, given the other exposures included in the estimation73.
Multivariable MR can be used as an approach to address pleiotropic violations of the IV conditions. In a univariable MR where IV3 is violated and the genetic variants used as instruments for an exposure of interest are also thought to be associated with another trait on the path to the outcome, that trait can be included as an additional exposure in the multivariable MR estimation. Multiple, correlated exposures can be included in a multivariable MR; however, including multiple exposures can reduce power and potentially instrument strength and thus the benefit of adding extra exposures must be considered carefully. Bayesian approaches have been proposed for selecting a set of exposures where multiple highly correlated exposures are potentially relevant for an outcome102. In addition, multivariable MR can be used for mediation analysis, as described below.

MR mediation analysis. MR can be used to estimate the proportion of the effect of an exposure on an outcome that is mediated by an intermediate phenotype103,104. Network MR and two-step MR use two univariable MR estimates to do this, estimating the effect of the primary exposure on the intermediate phenotype and the effect of the intermediate phenotype on the outcome105,106. Alternatively, multivariable MR can estimate the direct effect of each exposure on the outcome that is not mediated by the other exposures included in the estimation. If all of the IV conditions are satisfied, this estimate will differ from a univariable MR estimate where all or part of the effect of the exposure on the outcome acts through a mediating phenotype included in the multivariable MR estimation103. Both two-step and multivariable MR can therefore be used as part of a mediation analysis to estimate how much of the effect of an exposure on an outcome acts through an intermediate phenotype103,104. When multiple intermediate phenotypes are thought to be potential mediators, two-step MR can estimate the proportion of the outcome mediated through each of these, whereas multivariable MR including all of the
mediators considered will estimate the total proportion of the effect of the exposure on the outcome that is mediated by the set. If the intermediate phenotype mediators are correlated, or one also mediates the effect of another on the outcome, the total proportion of the outcome mediated by all of the intermediate phenotypes may be less than the sum of the proportion mediated by each one individually; therefore, each of the above approaches will estimate different effects. A detailed description of the use of MR for mediation analysis is given elsewhere104.

Non-linear MR. Standard MR provides only a single effect estimate, which may not be informative if the effect of the exposure varies in a non-linear way (for example, a dose–response curve). With individual-level data and a continuous exposure, non-linear MR can be applied to estimate whether the causal effect of the exposure on the outcome varies across different levels of the exposure29,107. For example, although mortality risk generally increases with BMI, an increase is also seen at very low BMIs; this J-shaped relationship may reflect weight loss in individuals who are unwell, potentially before their illness is diagnosed. Non-linear MR has supported this, although it has also suggested that the J-shape could be caused by the relationship between BMI and mortality risk differing for ever-smokers and never-smokers108.

Testing for interactions between exposures. With individual-level data, it is possible to test for interactions between two exposures using MR. When individual-level data are available to conduct a multivariable MR, interactions between the exposures can be included as additional exposures in the estimation109,110. This requires a multivariable MR estimation including the exposure, the potential effect modifier and the interactions between them included as exposures. The inclusion of these additional terms decreases the statistical power for detecting an effect and should be limited to a single interaction. An alternative approach is to split the allele scores for each exposure into high and low values and to compare outcomes across the resulting four groups by dividing participants up on the basis of their score for each exposure, mimicking a 2 × 2 factorial randomized trial. It should be noted that this approach can have low power compared with the inclusion of an interaction term in a 2SLS regression110.
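The 2 × 2 factorial comparison described above can be sketched on simulated data. The following Python example is only an illustration under stated assumptions: the data-generating model, the effect sizes, the median split and the function name `factorial_contrast` are all hypothetical choices made for this sketch, not taken from the cited studies. It splits two allele scores at their medians, computes the mean outcome in each of the four groups and reports the interaction contrast on the allele-score scale (not on the causal-effect scale).

```python
import random
from statistics import mean, median

random.seed(2022)  # fixed seed so the illustration is repeatable

# Simulated data; every coefficient below is an illustrative assumption.
n = 20000
score_a = [random.gauss(0, 1) for _ in range(n)]  # allele score for exposure A
score_b = [random.gauss(0, 1) for _ in range(n)]  # allele score for exposure B
outcome = []
for ga, gb in zip(score_a, score_b):
    u = random.gauss(0, 1)            # unmeasured confounder of exposures and outcome
    xa = ga + u + random.gauss(0, 1)  # exposure A
    xb = gb + u + random.gauss(0, 1)  # exposure B
    # Outcome model with a true A x B interaction coefficient of 0.5.
    outcome.append(0.2 * xa + 0.2 * xb + 0.5 * xa * xb + u + random.gauss(0, 1))


def factorial_contrast(score_a, score_b, outcome):
    """Mean outcome in each cell of a 2 x 2 median split, plus the interaction contrast."""
    cut_a, cut_b = median(score_a), median(score_b)
    cells = {(a, b): [] for a in (0, 1) for b in (0, 1)}
    for sa, sb, y in zip(score_a, score_b, outcome):
        # 1 = above the median score, 0 = at or below the median score.
        cells[(int(sa > cut_a), int(sb > cut_b))].append(y)
    means = {key: mean(values) for key, values in cells.items()}
    # Difference (high A vs low A) within each stratum of score B.
    effect_a_when_b_low = means[(1, 0)] - means[(0, 0)]
    effect_a_when_b_high = means[(1, 1)] - means[(0, 1)]
    return means, effect_a_when_b_high - effect_a_when_b_low


cell_means, interaction = factorial_contrast(score_a, score_b, outcome)
for (a_high, b_high), value in sorted(cell_means.items()):
    print(f"score A high={a_high}, score B high={b_high}: mean outcome {value:.3f}")
print(f"interaction contrast: {interaction:.3f}")
```

A contrast near zero would be consistent with no interaction on this scale. Dichotomizing the scores discards information, which is one reason this design can have lower power than an interaction term in a 2SLS regression.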
Colocalization and MR
Ever larger GWAS have now provided evidence that hundreds of genetic variants can be associated with many human phenotypes. This, together with the tendency for neighbouring genetic variants to be correlated owing to linkage disequilibrium, could lead to the violation of IV condition 2, in which different neighbouring variants happen to be causally associated to the exposure and outcome through different pathways (Fig. 6a). The bias in this situation is equivalent to that caused by pleiotropy (Fig. 3) and, although it is unlikely that this pattern will arise at many independent genetic locations in MR studies with multiple IVs, it should be a consideration in single-IV studies.

Colocalization analysis can be used to determine whether two traits share causal variants in a single genetic region, without prior knowledge of which variant is causal for either trait. It was originally used to identify potential molecular causes of single GWAS associations and considers the patterns of association across multiple neighbouring genetic variants for the GWAS and exposure traits (including molecular traits). Although this involves an implicit assumption of directionality in its interpretation, the test is not dependent on this assumption and indeed a single pleiotropic variant would satisfy the statistical definition of a shared causal variant (Fig. 6b). Unlike in MR with multiple IVs, the majority of multiple neighbouring genetic variants considered in this analysis are expected to be associated with either trait solely through linkage disequilibrium with one or a small number of causal variants in the region. This explicit use of linkage disequilibrium means that colocalization can be used to check for the violation of IV condition 2 in the form shown in Fig. 6a (and Fig. 3c).

One colocalization method originally proposed by Plagnol et al.111 frames shared causality as the null hypothesis, and rejection of this would indicate violation

Table 2 | Summary of selected software packages for performing MR analyses

Package name | Software | Description
Individual-level data
AER | R | Includes the ivreg function for 2SLS estimation
OneSampleMR | R | Various functions for one-sample IV analyses, including the Sanderson–Windmeijer F statistic, and various estimators (two-stage predictor substitution, two-stage residual inclusion, structural mean models)
ivmodel | R | Various functions for individual-level IV analyses, includes LIML, weak instrument tests and sensitivity analyses
ivtools | R | Various functions for individual-level IV analyses, including functions to fit structural mean models
ivonesamplemr | Stata | Includes various estimators (two-stage predictor substitution, two-stage residual inclusion, structural mean models) for one-sample IV analyses
ivreg2 | Stata | Stata module for extended IVs/2SLS and generalized method of moments estimation
ivregress | Stata | Linear IV estimators including 2SLS
Summary-level data
MendelianRandomization | R | Implements several methods for performing MR analyses with summarized data and an interface with the PhenoScanner database
TwoSampleMR and MR-Base app | R/web-app | MR-base is an analytical platform for MR. TwoSampleMR is the R package providing the functions to perform MR estimation. Both are linked to the OpenGWAS project, a large database of GWAS summary statistics
mrrobust | Stata | Provides various programs for two-sample MR analyses in Stata

2SLS, two-stage least-squares; GWAS, genome-wide association study; IV, instrumental variable; LIML, limited information maximum likelihood; MR, Mendelian randomization.
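As a rough illustration of the 2SLS estimation that several of the individual-level packages in Table 2 implement, the following Python sketch runs the two stages by hand on simulated data with a single genetic instrument. It is not the code of any listed package; the simulated allele frequency, effect sizes and the helper name `slope_intercept` are assumptions made for this example only.

```python
import random
from statistics import mean

random.seed(7)  # fixed seed so the illustration is repeatable


def slope_intercept(x, y):
    """Ordinary least-squares slope and intercept for a single regressor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Simulated individual-level data; all parameter values are illustrative assumptions.
n = 50000
true_effect = 0.4
g, x, y = [], [], []
for _ in range(n):
    gi = sum(random.random() < 0.3 for _ in range(2))  # allele count: 0, 1 or 2
    u = random.gauss(0, 1)                             # unmeasured confounder
    xi = 1.0 * gi + u + random.gauss(0, 1)             # exposure
    yi = true_effect * xi + u + random.gauss(0, 1)     # outcome
    g.append(gi)
    x.append(xi)
    y.append(yi)

# Conventional regression of outcome on exposure is biased by the confounder u.
ols_estimate, _ = slope_intercept(x, y)

# Stage 1: regress the exposure on the instrument and keep the fitted values.
stage1_slope, stage1_intercept = slope_intercept(g, x)
x_hat = [stage1_intercept + stage1_slope * gi for gi in g]

# Stage 2: regress the outcome on the genetically predicted exposure.
tsls_estimate, _ = slope_intercept(x_hat, y)

print(f"true effect:   {true_effect:.3f}")
print(f"OLS estimate:  {ols_estimate:.3f}")
print(f"2SLS estimate: {tsls_estimate:.3f}")
```

With one instrument this point estimate equals the ratio of the instrument–outcome covariance to the instrument–exposure covariance. Standard errors taken naively from the second-stage regression would be incorrect, which is one reason to use a dedicated routine in practice.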
[Figure 6: panel a, distinct causal variants; panel b, shared causal variant, horizontal pleiotropy; panel c, shared causal variant, vertical pleiotropy. Nodes shown: G1, G2, G, X, Y and U.]

Fig. 6 | Illustration of variants in linkage disequilibrium and shared causal variants identified by colocalization. a | An example of distinct causal variants that violate the instrumental variable assumption IV2. G1 and G2 represent two genetic variants and the link between them is non-directional, reflecting linkage disequilibrium. b,c | Examples of a shared causal variant are a violation of assumption IV2 (panel b) and a situation that satisfies the IV assumptions (panel c).

when other lipid fractions were accounted for, suggesting this association was not owing to confounding133.

MR studies have provided accumulating evidence against the observational results above135–138. Such MR studies used a range of genetic variants that act through different mechanisms and showed no protective effect of increased levels of HDL-C on CHD risk. These studies were published alongside the results of several large-scale RCTs of pharmacological interventions that specifically increased HDL-C without a noticeable change in other blood lipids such as LDL-C, which also failed to show a protective effect139,140. These data indicate that the association observed in the more traditional observational studies was likely to have been due to confounding. It is worth reflecting on whether the RCTs would have been embarked upon if the MR study findings were known at the time of their inception134. Indeed, where data already exist, MR studies are relatively cheap to conduct, particularly compared with a large RCT, and can provide additional evidence that can be used to direct which studies are worth following up with RCTs.

However, it must be noted that MR studies are themselves not free from issues of bias or lack of power; evidence from MR studies for the presence or absence of an effect should be triangulated with findings from studies using different methods that would be expected to have different sources of bias6,7.

Testing causation across the life course
A key issue in preventing disease in adulthood is identifying when in the life course harmful exposures must be minimized. For example, if the contribution of exposures in childhood is non-reversible, this evidence would argue in favour of early intervention. This is challenging to appraise using conventional observational epidemiology owing to various features such as time-dependent confounding.

One example of this issue is the relationship between adiposity and adult-onset diseases such as CHD and type 2 diabetes (T2D). An MR study141 took an innovative approach by constructing separate genetic instruments for early-life body size and adult body size. The authors were able to fit a multivariable MR model to elucidate whether childhood body size was detrimental to the risk of CHD or T2D after taking adult body size into account. A direct effect of childhood body size in the multivariable model would suggest that high adiposity in childhood has a long-term effect on health outcomes in adulthood, suggesting that focusing on early interventions in childhood to minimize excess body weight would be helpful in lowering the risk of diseases that typically present in adulthood. As UK Biobank participants were asked for information on their body size at 10 years of age and BMI was measured at recruitment into the study142, these data provided an opportunity to conduct GWAS on body size during childhood and adulthood for the same group of individuals and detected 295 and 557 independent SNPs associated with childhood and adulthood body size, respectively, with a high level of overlap in the SNPs associated with each time period, as expected141. Univariable MR analysis showed that both genetically predicted body size in early life and adulthood were individually related to higher risks of CHD and T2D and a lower risk of breast cancer. By contrast, multivariable MR analysis identified that only adult body size showed an independent causal effect for CHD and T2D, suggesting that the relationship between early-life body size and these outcomes was mediated through adult body size. By contrast, the inverse relationship between genetically predicted body size and breast cancer was stronger for early-life body size than adult body size in the multivariable MR analysis, suggesting an age-dependent relationship between adiposity and the risk of different diseases in adults. This suggests that for children who are overweight, losing weight in their adulthood can still effectively lower risk of T2D and CAD and in this case a metabolically unhealthy childhood can potentially be offset by healthy lifestyle approaches adopted in adulthood.

Such study designs can be applied to other exposure–outcome relationships to determine whether risk factors have cumulative effects or differential influences at different periods of the life course. This information could allow for fine-tuned, age-specific public health interventions that minimize the effects of deleterious, time-dependent risk factors. However, it is very important to bear in mind that effects of harmful exposures may become less evident with increasing age because of selection bias owing to the almost inevitable selection of survivors143.

Estimation of health-care costs
A clear understanding of the health-care costs arising from individual diseases and risk factors is needed to ensure that public health resources are distributed judiciously. RCTs are typically not designed to estimate health-care costs as an outcome and conventional observational studies aimed at assessing health-care costs can be hampered by selection bias and confounding.

Dixon and colleagues144 described a potential application for MR in quantifying the effects of genetically predicted BMI on health-care costs. Their method used data from the UK Biobank, which provided a rich source of data for exploring the causal relationship of lifelong exposures to certain traits and genetic liability to diseases and their economic impact. Using genetic variants associated with higher BMI as instruments in
an individual-level MR study to estimate the effect of BMI on hospitalization costs145, the authors found that higher BMI increased hospital costs with little evidence for non-linearity in this effect. In addition to physiological consequences, body weight has social consequences such as increasing exposure to stigma and discrimination and these MR analyses include the consequences of all such mechanisms for hospitalization costs.

Testing treatment response factors
Identifying whether individuals are likely to respond to a specific therapy is an important component of so-called ‘precision medicine’, whereby the goal is to individualize patient care based on genetic, environmental and lifestyle factors. This can be done in conventional pharmacogenetic studies and RCTs, although the risk of bias in the former and the sample size constraints of the latter mean that neither provides a reliable means of assessing interactions between an individual’s genotype and treatment response.

A recent study by Xu and Burgess146 used a drug-target MR design41,147 to investigate polygenic determinants of the response of LDL-cholesterol levels to treatment with statins. The authors used SNPs in and around the HMGCR locus as a mimic of the pharmacological inhibition of HMG-CoA reductase by statins, and explored genetic variants that might act as effect modifiers of the association between the statin genetic instrument and LDL-cholesterol levels. Polygenic scores did not identify any effect-modifying genetic groups; however, a single variant (rs162724) proximal to the glutamate receptor gene GRM7 and previously associated with major depressive disorder was found to potentially be of interest. The authors postulated that this variant could be related to statin response via concurrent pharmacotherapies for major depressive disorder or via poorer adherence to statin treatment reducing the effect of statins on LDL-cholesterol.

Although the above study did not find evidence of reliable polygenic effect modification, it introduces the concept of agnostic identification of pharmacogenetic interactions within the context of a population-based study. This approach benefits from lack of confounding by indication, compared with a conventional pharmacoepidemiology study design148. However, using a genetic instrument for treatment as part of a drug-target MR means that the underlying magnitude of the effect for which potential genetic effect modifiers are investigated is very small and thus very large sample sizes are needed to identify effects. When using MR in this way, it is important to identify appropriate instruments for estimating the effect of a particular drug. Instruments that are associated with the target of that drug should be used, rather than those associated with the risk factor that the drug acts on41,149,150.

Reproducibility and data deposition
There has been substantial discussion of the importance of ensuring that published research findings are robust, replicable and reproducible in recent years151. In the context of epidemiological research, one area of concern is that findings may be replicated in settings with nearly identical sources of bias. Data with such replication provide little independent confirmation of the initial result and thus even highly consistent replicated findings may not reflect true causal effects. An example is the J-shaped association between alcohol consumption and cardiovascular disease; there is now consensus that this apparent protective effect of moderate levels of consumption is artefactual, as discussed above129. One simple step authors can take to ensure that MR findings are robust and reproducible is to use the STROBE-MR guidelines50,51, which outline how MR studies should be reported to make the approach used in any particular study clear for readers.

The first aim of all studies should be to ensure that steps are taken to detect and minimize biases such as selection bias or bias caused by violation of one of the IV conditions. Triangulation of evidence from multiple methodologies (using different methodologies that are subject to different sources and directions of potential bias) can help to identify bias in MR studies6,7,152. Alignment of results across these different methodologies can improve confidence in an initial causal interpretation. Among the most promising strategies for triangulation is contrasting MR results with results using other IVs, such as policy-based IVs, or results from conventional analyses. For example, there is clear evidence from both MR and the natural experiment of an increase in the school leaving age that an increase in the number of years in education has a causal protective effect on health behaviours such as smoking153–156 (see preprint157). Within MR, using methods that make different assumptions (such as those regarding pleiotropy) and are therefore subject to different sources and directions of potential bias can support this approach, although some important assumptions may be shared by many methods, reducing the potential independent insight to be gained from comparing studies.

Open research can increase the robustness of data by enabling greater scrutiny of data and increased error detection by researchers and the wider research community. Open research approaches for increasing data transparency include protocol pre-registration and sharing of data, code and materials. Summary data from GWAS are often a source of data for MR analysis and are typically publicly available, such as those listed on the OpenGWAS project. Although individual-level data are not made publicly available owing to the sensitive nature of the data, there are a number of large datasets that are accessible to any researchers on application, such as the UK Biobank. Any MR study should clearly indicate the data sources used and link to the dataset used if it is publicly available. The source code for many software packages is openly available (for example, TwoSampleMR and mrrobust on GitHub, and MendelianRandomization on CRAN). Although the analysis code from MR studies is not routinely shared, we would encourage readers of this Primer to do so to enable errors in coding to be more readily identified. Pre-registration of study protocols has not been widely adopted in observational epidemiology, although it
could in principle be applied and would help to protect against biases such as publication bias against null results or findings that do not fit with the anticipated conclusion158 (see preprint159).

Limitations and optimizations
An important limitation of MR studies is the potential confounding of the genetic variants and the outcome (violation of IV condition 2; Box 2). As genetic variants are generally fixed at conception, it is not intuitively clear how confounding of the instrument and the outcome can occur in MR studies. However, population stratification, dynastic effects and assortative mating all induce bias by creating an artefactual relation at the population level between the genetic variants and the outcome, violating the second IV condition64,160–163. Each of these sources of confounding is described in detail in Box 5. This correlation between genetic variants and the outcome can potentially affect most (or all) of the genetic variants used as instruments; it is therefore not easy to correct for using current MR methods given that most assume that the majority of genetic variants satisfy all of the IV conditions60. Considering the potential for biases of the sort described here is therefore crucial in the interpretation of any MR result.

Box 5 | Sources of instrument–outcome confounding in MR studies

Population stratification
Population stratification is the association between genetic variants and phenotypes that occurs because of underlying structure within the population52,220. This underlying structure reflects the fact that genetic mutations accrue and accumulate across generations, and that individuals differentially select partners who are geographically proximal. Within genome-wide association studies, population stratification is often controlled for by adjusting for the top principal components from a principal component analysis of the genetic variants or through the use of linear mixed models221–223. However, there is increasing evidence that these approaches do not fully account for the underlying structure for a number of phenotypes224,225. Population stratification can bias estimates from Mendelian randomization (MR) studies by creating an association between the genetic variants and the outcome as illustrated in panel a of the figure161,224. In the figure, G represents genetic variants, X represents exposure and Y represents outcome in an MR study.

Dynastic effects
Dynastic effects are the direct effects on an individual’s phenotypes of the phenotypes of their parents, and (potentially to a lesser extent) more distantly related relatives such as grandparents. As parental genotypes have a direct effect on the genotype of an individual, if a parent’s phenotype is influenced by their genotype and influences the individual’s phenotype this will induce confounding between the genetic variants and phenotype of the offspring, as illustrated in panel b of the figure162. If the exposure has a non-null causal effect on the outcome in an MR study, these dynastic effects will induce instrument–outcome confounding and bias the results of the MR study161. In the figure, GA, XA and YA are the genetic variants, exposure and outcome respectively for ancestors (such as parents) of the individuals under consideration in the MR estimation.

Assortative mating
Assortative mating occurs when individuals select partners who are more similar to themselves than would be expected by chance, with respect to one or multiple phenotypes226,227. If the genetically influenced level of the phenotype influences selection, this assortment can lead to spurious genetic associations with the phenotype or phenotypes on which the assortment is based or that are causally dependent on the assortment phenotypes. This consequently biases MR estimates involving these phenotypes160,161.

Transmission ratio distortion
Transmission ratio distortion occurs when the transmission of alleles from parents to offspring deviates from the expected probability of 50:50. This can occur owing to processes during meiosis and fertilization favouring one parental allele over another or if the viability of the offspring depends on their genotype. If environmental factors influence the transmission ratio distortion, those environmental factors will become associated with genotype in the offspring. The association between any environmental factor and the genotype can lead to the potential for instrument–outcome confounding in MR if the environmental factor also influences the outcome164. Until recently, data on parent–offspring trios were not available at the scale required to investigate this possibility, but this is now becoming possible228.

[Box 5 figure: panels a and b, causal diagrams with nodes G, X, Y and U, and ancestor nodes GA, XA and YA.]

One solution that can account for confounding owing to dynastic effects and assortative mating is the use of family data to conduct the MR analyses164,165. Within-family MR requires data from either pairs of siblings or mother–father–child trios and allows for the estimation of causal effects using MR after family-level structure has been taken into account161,164. Within-family MR using sibling pairs will also account for any factors acting at a population level that affect siblings equally, such as population stratification. A key limiting factor for within-family MR is the lack of available data and the low power of these studies as a result; however, a GWAS of family data for a range of phenotypes has recently been published, enabling further within-family MR in the future166.

Another type of bias that can arise in MR studies that cannot be easily corrected for is selection bias63. In an MR study, an example of selection bias would be if an individual’s exposure and outcome values affected their participation56. When these phenotypes are partially determined by genetic variants, this will also induce an association between those genetic variants and participation. Study participation has been shown to be heritable and is influenced by a number of different traits, and large studies such as the UK Biobank have been shown to have high levels of selection in those who participate30,167–169.

In addition, most studies recruit survivors of the original birth cohorts. This means all participants must have survived in order to observe whether they get the outcome of interest. Selection of participants on surviving their genetic make-up and the outcome of interest or a competing risk of the outcome effectively applies covariable adjustment on survival into the estimates170–172. This form of selection bias is likely to be particularly problematic for studies of harmful exposures on disease outcomes that occur in later life and will be least evident in studies where the exposure does not affect survival to recruitment173. As such, consideration of whether the genetically instrumented exposures would affect survival to recruitment, age at recruitment or any competing risk of the outcome may help to identify bias. This type of survival bias will affect observational studies of the same research question in similarly aged populations, so it is not an obvious explanation for discrepancies between MR and conventional results. All forms of selection bias
could bias MR estimates and so careful assessment of the potential for selection into the sample or samples used in an MR study is important64. Novel methods are being developed that attempt to detect and correct for selection bias171,174; however, this is an area in which further research is required.

Finally, MR uses genetic variants that are fixed across the life course to estimate the lifetime effects of the exposure of interest. This introduces a potential limitation in the form of canalization, which refers to a natural tendency for the suppression of phenotypic variation among individuals despite contrasting genotypes. Canalization can occur when polymorphic phenotypes expressed during fetal development lead to the development of compensating pathways to mitigate the effects of that expression1,175,176. For example, individuals with genetically elevated fibrinogen levels could become resistant to the effects of higher fibrinogen owing to permanent changes in tissue structure during fetal development. Canalization is seen following dramatic genetic or environmental changes, for example in gene-knockout studies177,178. Such compensation would potentially limit the ability of MR to identify the causal effect of the change in the exposure because the effect of a genetically induced change from conception would be different from the effect of a change in later life. This is an example of a violation of the assumption of gene–environment equivalence (Box 4). Further work is required to understand whether small changes induced by the common polymorphisms used to estimate causal effects in MR lead to the same compensatory effects.

Outlook
The rapid increase in MR publications demonstrates the need for approaches that can contribute to strengthening causal inference. This growth in the quantity of published MR studies comes with anxiety regarding their quality. Papers reporting two-sample MR have grown rapidly over recent years and now constitute a large majority of published studies8,80. These are relatively easy to conduct (perhaps too easy) and they can contain obvious errors, as discussed and demonstrated elsewhere80. Indeed, many such papers simply report MR estimates obtained from applying open-access software to open-access data and in these cases the analyses have, in essence, already been conducted by automated tools, an observation detailed in a preprint article179. The situation with MR is now moving towards the one seen in the meta-analysis literature, with the mass production of redundant, misleading and conflicted publications180. The current explosion in predatory journals unfortunately means that this situation is very unlikely to change. There are now a number of guidelines available for MR estimation, and those regarding the conduct39 and reporting of MR studies50,51 are useful for understanding and identifying whether an MR study has been well conducted and reported properly. For those aiming to keep up with the MR literature, the Twitter account @MR_lit searches for papers and preprint articles and allows readers to rapidly review abstracts to identify papers of interest.

As most contemporary MR studies rely on available GWAS data, they unfortunately suffer from considerable bias with respect to the representativeness of populations according to geography and ancestry181. This can influence the generalizability of MR findings and exacerbate existing inequity in medical research. It can also restrict the scope of MR studies, as some forms of genetic variation are restricted to particular populations. For example, a large-effect genetic variant influencing alcohol consumption that has been of considerable value in MR studies of the effects of alcohol74,129 is only prevalent in East Asian populations. Current international efforts to equalize inclusion of different populations in genetic studies will hopefully begin to address this important issue.

A large area of medical research is aimed at identifying potentially therapeutic influences on disease progression once the disease is established. However, MR studies usually rely on GWAS of the initial development of disease for their outcome data. This means that although MR has been a powerful tool for confirming or discovering factors that cause disease, it does not often identify therapeutic targets182. For example, although MR studies have shown that smoking causes lung cancer183, this is not useful therapeutically following the onset of the disease given that smoking cessation is not a useful treatment once lung cancer has developed. It is plausible that in many cases, factors that cause a disease do not relate to its progression once it is established. For example, the onset and progression of Crohn’s disease are associated with different genetic variants, indicating that different risk factors play a part in onset and development184. It is also possible that the same risk factor could have opposite effects on incidence and progression, as has been suggested for folate intake and colon cancer185. MR of factors influencing disease progression is needed to identify useful treatments186; however, such estimation requires appropriate datasets and as there are currently few of these in existence, efforts should be focused on increasing the availability of such data. Importantly, case-only study designs may be severely compromised by collider bias55,63, which must be taken into account in data analysis182. Further methodological development is required in this domain.

Collider bias
Bias occurring owing to conditioning on a variable that is dependent on both the exposure and outcome or is dependent on causes of the exposure and outcome.

Although the increasing size of GWAS datasets appears to be good news for MR studies, it can also introduce problems; smaller and smaller effect sizes are being identified as ‘genome-wide significant’ in GWAS and it is increasingly likely that such variants affect the trait of interest through an upstream phenotype that might in turn influence the outcomes under investigation. For example, as the GWAS of CRP and vitamin D increased in size, multiple variants that primarily influence adiposity were identified, with adiposity being a confounder of the observational association of these exposures with health outcomes. If these variants are used as instruments for CRP or vitamin D, they will produce highly misleading results. The resulting bias can be accounted for through multivariable MR if the upstream factor is known; however, in many cases it is not known and so the bias will remain undetected. This issue of misspecification of the primary phenotype8,61 requires
more research to identify the extent of the problem of recapitulating confounding in MR studies as GWAS size increases.

When initially presented, it was concluded that “[MR] offers a more robust approach to understanding the effect of some modifiable exposures on health outcomes than does much conventional observational epidemiology”1 and that, where possible, RCTs should follow to establish the effects of interventions. This conclusion remains unchanged, although moving towards formal triangulation6,7,152 of all pertinent evidence as discussed in this Primer should be the goal of all research aimed at identifying causal influences on health and development outcomes.

Published online xx xx xxxx
1. Davey Smith, G. & Ebrahim, S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 32, 1–22 (2003).
2. Angrist, J. D., Imbens, G. W. & Rubin, D. B. Identification of causal effects using instrumental variables. J. Am. Stat. Assoc. 91, 444–455 (1996).
3. Hernán, M. A. & Robins, J. M. Causal Inference: What If (Chapman & Hall/CRC, 2020).
4. Greenland, S. An introduction to instrumental variables for epidemiologists. Int. J. Epidemiol. 29, 722–729 (2000).
5. Zuccolo, L. & Holmes, M. V. Commentary: Mendelian randomization-inspired causal inference in the absence of genetic data. Int. J. Epidemiol. 46, 962–965 (2017).
6. Munafò, M. R., Higgins, J. P. & Davey Smith, G. Triangulating evidence through the inclusion of genetically informed designs. Cold Spring Harb. Perspect. Med. 11, a040659 (2021).
7. Lawlor, D. A., Tilling, K. & Davey Smith, G. Triangulation in aetiological epidemiology. Int. J. Epidemiol. 45, 1866–1886 (2017).
8. Richmond, R. C. & Davey Smith, G. Mendelian randomization: concepts and scope. Cold Spring Harb. Perspect. Med. https://doi.org/10.1101/cshperspect.a040501 (2022).
9. Davey Smith, G. & Ebrahim, S. Mendelian randomization: prospects, potentials, and limitations. Int. J. Epidemiol. 33, 30–42 (2004).
10. Gupta, S. K. Intention-to-treat concept: a review. Perspect. Clin. Res. 2, 109–112 (2011).
11. Ellenberg, J. H. Intent-to-treat analysis versus as-treated analysis. Drug Inf. J. 30, 535–544 (1996).
12. Glymour, M. M. Natural experiments and instrumental variable analyses in social epidemiology. Methods Soc. Epidemiol. 1, 429 (2006).
13. Martens, E. P., Pestman, W. R., de Boer, A., Belitser, S. V. & Klungel, O. H. Instrumental variables: application and limitations. Epidemiology 17, 260–267 (2006).
14. Lousdal, M. L. An introduction to instrumental variable assumptions, validation and estimation. Emerg. Themes Epidemiol. 15, 1 (2018).
15. Angrist, J. D. & Krueger, A. B. Instrumental variables and the search for identification: from supply and demand to natural experiments. J. Econ. Perspect. 15, 69–85 (2001).
16. Rassen, J. A., Brookhart, M. A., Glynn, R. J., Mittleman, M. A. & Schneeweiss, S. Instrumental variables I: instrumental variables exploit natural variation in nonexperimental data to estimate causal relationships. J. Clin. Epidemiol. 62, 1226–1232 (2009).
17. Didelez, V. & Sheehan, N. Mendelian randomization as an instrumental variable approach to causal inference. Stat. Methods Med. Res. 16, 309–330 (2007).
18. Davey Smith, G. Capitalizing on Mendelian randomization to assess the effects of treatments. J. R. Soc. Med. 100, 432–435 (2007).
19. Carlson, C. S. et al. Polymorphisms within the C-reactive protein (CRP) promoter region are associated with plasma CRP levels. Am. J. Hum. Genet. 77, 64–77 (2005).
20. Davey Smith, G. et al. Association of C-reactive protein with blood pressure and hypertension: life course confounding and Mendelian randomization tests of causality. Arterioscler. Thromb. Vasc. Biol. 25, 1051–1056 (2005).
21. Morris, T. T., Heron, J., Sanderson, E., Davey Smith, G. & Tilling, K. Interpretation of Mendelian randomization using one measure of an exposure that varies over time. Preprint at medRxiv https://doi.org/10.1101/2021.11.18.21266515 (2021).
22. Swanson, S. A., Tiemeier, H., Ikram, M. A. & Hernán, M. A. Nature as a trialist? Deconstructing the analogy between Mendelian randomization and randomized trials. Epidemiology 28, 653–659 (2017).
23. Didelez, V., Meng, S. & Sheehan, N. A. Assumptions of IV methods for observational epidemiology. Statist. Sci. 25, 22–40 (2010).
24. Palmer, T. M. et al. Using multiple genetic variants as instrumental variables for modifiable risk factors. Stat. Methods Med. Res. 21, 223–242 (2011).
25. Burgess, S. & Thompson, S. G. Use of allele scores as instrumental variables for Mendelian randomization. Int. J. Epidemiol. 42, 1134–1144 (2013).
26. Davies, N. M. et al. The many weak instruments problem and Mendelian randomization. Stat. Med. 34, 454–468 (2015).
27. Hernán, M. A. & Robins, J. M. Instruments for causal inference: an epidemiologist’s dream? Epidemiology 17, 360–372 (2006).
28. Swanson, S. A., Hernán, M. A., Miller, M., Robins, J. M. & Richardson, T. S. Partial identification of the average treatment effect using instrumental variables: review of methods for binary instruments, treatments, and outcomes. J. Am. Stat. Assoc. 113, 933–947 (2018).
29. Staley, J. R. & Burgess, S. Semiparametric methods for estimation of a nonlinear exposure–outcome relationship using instrumental variables with application to Mendelian randomization. Genet. Epidemiol. 41, 341–352 (2017).
30. Tyrrell, J. et al. Genetic predictors of participation in optional components of UK Biobank. Nat. Commun. 12, 886 (2021).
31. Davey Smith, G. Epigenesis for epidemiologists: does evo-devo have implications for population health research and practice? Int. J. Epidemiol. 41, 236–247 (2012).
32. Freeman, G., Cowling, B. J. & Schooling, C. M. Power and sample size calculations for Mendelian randomization studies using one genetic instrument. Int. J. Epidemiol. 42, 1157–1163 (2013).
33. Walker, V. M., Davies, N. M., Windmeijer, F., Burgess, S. & Martin, R. M. Power calculator for instrumental variable analysis in pharmacoepidemiology. Int. J. Epidemiol. 46, 1627–1632 (2017).
34. Burgess, S. Sample size and power calculations in Mendelian randomization with a single instrumental variable and a binary outcome. Int. J. Epidemiol. 43, 922–929 (2014).
35. Brion, M.-J. A., Shakhbazov, K. & Visscher, P. M. Calculating statistical power in Mendelian randomization studies. Int. J. Epidemiol. 42, 1497–1501 (2012).
36. Morris, T. P., White, I. R. & Crowther, M. J. Using simulation studies to evaluate statistical methods. Stat. Med. 38, 2074–2102 (2019).
37. Burgess, S., Butterworth, A. & Thompson, S. G. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet. Epidemiol. 37, 658–665 (2013).
38. Zhao, Q., Wang, J., Spiller, W., Bowden, J. & Small, D. S. Two-sample instrumental variable analyses using heterogeneous samples. Stat. Sci. 34, 317–333 (2019).
39. Burgess, S. et al. Guidelines for performing Mendelian randomization investigations. Wellcome Open Res. 4, 186 (2019).
40. Pierce, B. L. & Burgess, S. Efficient design for Mendelian randomization studies: subsample and 2-sample instrumental variable estimators. Am. J. Epidemiol. 178, 1177–1184 (2013).
41. Holmes, M. V., Richardson, T. G., Ference, B. A., Davies, N. M. & Davey Smith, G. Integrating genomics with biomarkers and therapeutic targets to invigorate cardiovascular drug development. Nat. Rev. Cardiol. 18, 435–453 (2021).
42. Bound, J., Jaeger, D. A. & Baker, R. M. Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. J. Am. Stat. Assoc. 90, 443–450 (1995).
43. Burgess, S., Davies, N. M. & Thompson, S. G. Bias due to participant overlap in two-sample Mendelian randomization. Genet. Epidemiol. 40, 597–608 (2016).
44. Mounier, N. & Kutalik, Z. Correction for sample overlap, winner’s curse and weak instrument bias in two-sample Mendelian Randomization. Preprint at bioRxiv https://doi.org/10.1101/2021.03.26.437168 (2021).
45. Angrist, J. D. & Krueger, A. B. Split-sample instrumental variables estimates of the return to schooling. J. Bus. Econ. Stat. 13, 225–235 (1995).
46. Fang, S., Hemani, G., Richardson, T. G., Gaunt, T. R. & Davey Smith, G. Evaluating and implementing block jackknife resampling Mendelian randomization to mitigate bias induced by overlapping samples. Preprint at medRxiv https://doi.org/10.1101/2021.12.03.21267246 (2021).
47. Sadreev, I. I. et al. Navigating sample overlap, winner’s curse and weak instrument bias in Mendelian randomization studies using the UK Biobank. Preprint at medRxiv https://doi.org/10.1101/2021.06.28.21259622 (2021).
48. Davies, N. M., Holmes, M. V. & Davey Smith, G. Reading Mendelian randomisation studies: a guide, glossary, and checklist for clinicians. BMJ 362, k601 (2018).
49. Holmes, M. V., Ala-Korpela, M. & Davey Smith, G. Mendelian randomization in cardiometabolic disease: challenges in evaluating causality. Nat. Rev. Cardiol. 14, 577–590 (2017).
50. Skrivankova, V. W. et al. Strengthening the reporting of observational studies in epidemiology using Mendelian randomisation (STROBE-MR): explanation and elaboration. BMJ 375, n2233 (2021).
51. Skrivankova, V. W. et al. Strengthening the reporting of observational studies in epidemiology using Mendelian randomization: the STROBE-MR statement. JAMA 326, 1614–1621 (2021).
52. Lawlor, D. A., Harbord, R. M., Sterne, J. A., Timpson, N. & Davey Smith, G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat. Med. 27, 1133–1163 (2008).
53. Wooldridge, J. M. Econometric Analysis of Cross Section and Panel Data (MIT Press, 2010).
54. Cole, S. R. et al. Illustrating bias due to conditioning on a collider. Int. J. Epidemiol. 39, 417–420 (2009).
55. Munafò, M. R., Tilling, K., Taylor, A. E., Evans, D. M. & Davey Smith, G. Collider scope: when selection bias can substantially influence observed associations. Int. J. Epidemiol. 47, 226–235 (2018).
56. Hernán, M. A., Hernández-Díaz, S. & Robins, J. M. A structural approach to selection bias. Epidemiology 15, 615–625 (2004).
57. Staiger, D. & Stock, J. H. Instrumental variables regression with weak instruments. Report No. 0898-2937 (National Bureau of Economic Research, 1994).
58. Stock, J. H. & Yogo, M. Testing for weak instruments in linear IV regression. Report No. 0898-2937 (National Bureau of Economic Research, 2002).
59. Brumpton, B. et al. Within-family studies for Mendelian randomization: avoiding dynastic, assortative mating, and population stratification biases. Nat. Commun. 11, 1–13 (2020).
60. Hemani, G., Bowden, J. & Davey Smith, G. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum. Mol. Genet. 27, R195–R208 (2018).
61. Davey Smith, G. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, R89–R98 (2014).
62. Burgess, S., Swanson, S. A. & Labrecque, J. A. Are Mendelian randomization investigations immune from bias due to reverse causation? Eur. J. Epidemiol. 36, 253–257 (2021).
63. Griffith, G. J. et al. Collider bias undermines our understanding of COVID-19 disease risk and severity. Nat. Commun. 11, 5749 (2020).
64. Hughes, R. A., Davies, N. M., Davey Smith, G. & Tilling, K. Selection bias when estimating average treatment effects using one-sample instrumental variable analysis. Epidemiology 30, 350–357 (2019).
65. Sargan, J. D. The estimation of economic relationships using instrumental variables. Econometrica 26, 393–415 (1958).
66. Glymour, M. M., Tchetgen Tchetgen, E. J. & Robins, J. M. Credible Mendelian randomization studies: approaches for evaluating the instrumental variable assumptions. Am. J. Epidemiol. 175, 332–339 (2012).
67. Diemer, E. W., Labrecque, J., Tiemeier, H. & Swanson, S. A. Application of the instrumental inequalities to a Mendelian randomization study with multiple proposed instruments. Epidemiology 31, 65–74 (2020).
68. Yang, Q., Sanderson, E., Tilling, K., Borges, M. C. & Lawlor, D. A. Exploring and mitigating potential bias when genetic instrumental variables are associated with multiple non-exposure traits in Mendelian randomization. Preprint at medRxiv https://doi.org/10.1101/19009605 (2019).
69. Lawlor, D. A. et al. Exploring the developmental overnutrition hypothesis using parental–offspring associations and FTO as an instrumental variable. PLoS Med. 5, e33 (2008).
70. Kang, H., Zhang, A., Cai, T. T. & Small, D. S. Instrumental variables estimation with some invalid instruments and its application to Mendelian randomization. J. Am. Stat. Assoc. 111, 132–144 (2016).
71. Windmeijer, F., Farbmacher, H., Davies, N. & Davey Smith, G. On the use of the lasso for instrumental variables estimation with some invalid instruments. J. Am. Stat. Assoc. 114, 1339–1350 (2019).
72. Jiang, L. et al. Constrained instruments and their application to Mendelian randomization with pleiotropy. Genet. Epidemiol. 43, 373–401 (2019).
73. Sanderson, E., Davey Smith, G., Windmeijer, F. & Bowden, J. An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings. Int. J. Epidemiol. 48, 713–727 (2019).
74. Chen, L., Davey Smith, G., Harbord, R. M. & Lewis, S. J. Alcohol intake and blood pressure: a systematic review implementing a Mendelian randomization approach. PLoS Med. 5, e52 (2008).
75. Spiller, W., Hartwig, F. P., Sanderson, E., Davey Smith, G. & Bowden, J. Interaction-based Mendelian randomization with measured and unmeasured gene-by-covariate interactions. Preprint at medRxiv https://doi.org/10.1101/2020.07.27.20162909 (2020).
76. Spiller, W., Slichter, D., Bowden, J. & Davey Smith, G. Detecting and correcting for bias in Mendelian randomization analyses using gene-by-environment interactions. Int. J. Epidemiol. 48, 702–712 (2019).
77. Tchetgen Tchetgen, E. J., Sun, B. & Walter, S. The GENIUS approach to robust Mendelian randomization inference. Stat. Sci. 36, 443–464 (2019).
78. Burgess, S., Dudbridge, F. & Thompson, S. G. Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods. Stat. Med. 35, 1880–1906 (2016).
79. Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun. 9, 224 (2018).
80. Hartwig, F. P., Davies, N. M., Hemani, G. & Davey Smith, G. Two-sample Mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. Int. J. Epidemiol. 45, 1717–1726 (2017).
81. Bowden, J. et al. Improving the visualization, interpretation and analysis of two-sample summary data Mendelian randomization via the radial plot and radial regression. Int. J. Epidemiol. 47, 1264–1278 (2018).
82. Bowden, J., Davey Smith, G., Haycock, P. C. & Burgess, S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet. Epidemiol. 40, 304–314 (2016).
83. Hartwig, F. P., Davey Smith, G. & Bowden, J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int. J. Epidemiol. 46, 1985–1998 (2017).
84. Rees, J. M., Wood, A. M., Dudbridge, F. & Burgess, S. Robust methods in Mendelian randomization via penalization of heterogeneous causal estimates. PLoS ONE 14, e0222362 (2019).
85. Cho, Y. et al. Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework. Nat. Commun. 11, 1010 (2020).
86. Verbanck, M., Chen, C.-Y., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).
87. Zhao, Q., Wang, J., Hemani, G., Bowden, J. & Small, D. S. Statistical inference in two-sample summary-data Mendelian randomization using robust adjusted profile score. Ann. Stat. 48, 1742–1769 (2020).
88. Wang, J. et al. Causal inference for heritable phenotypic risk factors using heterogeneous genetic instruments. PLoS Genet. 17, e1009575 (2021).
89. Morrison, J., Knoblauch, N., Marcus, J. H., Stephens, M. & He, X. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat. Genet. 52, 740–747 (2020).
90. Bowden, J., Davey Smith, G. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
91. Burgess, S. & Thompson, S. G. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am. J. Epidemiol. 181, 251–260 (2015).
92. Bowden, J. & Vansteelandt, S. Mendelian randomization analysis of case-control data using structural mean models. Stat. Med. 30, 678–694 (2011).
93. Hemani, G., Tilling, K. & Davey Smith, G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 13, e1007081 (2017).
94. Brown, B. C. & Knowles, D. A. Welch-weighted Egger regression reduces false positives due to correlated pleiotropy in Mendelian randomization. Am. J. Hum. Genet. 108, 2319–2335 (2021).
95. O’Connor, L. J. & Price, A. L. Distinguishing genetic correlation from causation across 52 diseases and complex traits. Nat. Genet. 50, 1728–1734 (2018).
96. Elsworth, B. L. et al. The MRC IEU OpenGWAS data infrastructure. Preprint at bioRxiv https://doi.org/10.1101/2020.08.10.244293 (2020).
97. Livingston, G. et al. Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. Lancet 396, 413–446 (2020).
98. Snoeckx, R. L. et al. GJB2 mutations and degree of hearing loss: a multicenter study. Am. J. Hum. Genet. 77, 945–957 (2005).
99. Hoffmann, T. J. et al. A large genome-wide association study of age-related hearing impairment using electronic health records. PLoS Genet. 12, e1006371 (2016).
100. Lambert, J.-C. et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat. Genet. 45, 1452–1458 (2013).
101. Burgess, S., Dudbridge, F. & Thompson, S. G. Re: “Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects”. Am. J. Epidemiol. 181, 290–291 (2015).
102. Zuber, V., Colijn, J. M., Klaver, C. & Burgess, S. Selecting likely causal risk factors from high-throughput experiments using multivariable Mendelian randomization. Nat. Commun. 11, 29 (2020).
103. Sanderson, E. Multivariable Mendelian randomization and mediation. Cold Spring Harb. Perspect. Med. 11, a038984 (2020).
104. Carter, A. R. et al. Mendelian randomisation for mediation analysis: current methods and challenges for implementation. Eur. J. Epidemiol. 36, 465–478 (2021).
105. Relton, C. L. & Davey Smith, G. Two-step epigenetic Mendelian randomization: a strategy for establishing the causal role of epigenetic processes in pathways to disease. Int. J. Epidemiol. 41, 161–176 (2012).
106. Burgess, S., Daniel, R. M., Butterworth, A. S., Thompson, S. G. & Consortium, E.-I. Network Mendelian randomization: using genetic variants as instrumental variables to investigate mediation in causal pathways. Int. J. Epidemiol. 44, 484–495 (2015).
107. Burgess, S., Davies, N. M. & Thompson, S. G. Instrumental variable analysis with a nonlinear exposure–outcome relationship. Epidemiology 25, 877 (2014).
108. Sun, Y.-Q. et al. Body mass index and all cause mortality in HUNT and UK Biobank studies: linear and non-linear mendelian randomisation analyses. BMJ 364, l1042 (2019).
109. North, T.-L. et al. Using genetic instruments to estimate interactions in Mendelian randomization studies. Epidemiology 30, e33–e35 (2019).
110. Rees, J., Foley, C. N. & Burgess, S. Factorial Mendelian randomization: using genetic variants to assess interactions. Int. J. Epidemiol. 49, 1147–1158 (2019).
111. Plagnol, V., Smyth, D. J., Todd, J. A. & Clayton, D. G. Statistical independence of the colocalized association signals for type 1 diabetes and RPS26 gene expression on chromosome 12q13. Biostatistics 10, 327–334 (2009).
112. Wallace, C. Statistical testing of shared genetic control for potentially related traits. Genet. Epidemiol. 37, 802–813 (2013).
113. Pavlides, J. M. W. et al. Predicting gene targets from integrative analyses of summary data from GWAS and eQTL studies for 28 human complex traits. Genome Med. 8, 84–84 (2016).
114. Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).
115. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).
116. Wallace, C. Eliciting priors and relaxing the single causal variant assumption in colocalisation analyses. PLoS Genet. 16, e1008720 (2020).
117. Marmot, M. & Brunner, E. Alcohol and cardiovascular disease: the status of the U shaped curve. BMJ 303, 565–568 (1991).
118. Corrao, G., Rubbiati, L., Bagnardi, V., Zambon, A. & Poikolainen, K. Alcohol and coronary heart disease: a meta-analysis. Addiction 95, 1505–1523 (2000).
119. Mukamal, K. J. & Rimm, E. B. Alcohol’s effects on the risk for coronary heart disease. Alcohol. Res. Health 25, 255–261 (2001).
120. US National Library of Medicine. ClinicalTrials.gov https://clinicaltrials.gov/ct2/show/NCT03169530 (2019).
121. Dyer, O. $100m alcohol study is cancelled amid pro-industry “bias”. BMJ 361, k2689 (2018).
122. Mitchell, G., Lesch, M. & McCambridge, J. Alcohol industry involvement in the moderate alcohol and cardiovascular health trial. Am. J. Public Health 110, 485–488 (2020).
123. National Institutes of Health. NIH to end funding for Moderate Alcohol and Cardiovascular Health trial. National Institutes of Health https://www.nih.gov/news-events/news-releases/nih-end-funding-moderate-alcohol-cardiovascular-health-trial (2018).
124. Wild, C. in World Cancer Report 2014 (eds Wild, C. P. & Stewart, B. W.) (World Health Organization, 2014).
125. Secretan, B. et al. A review of human carcinogens — Part E: tobacco, areca nut, alcohol, coal smoke, and salted fish. Lancet Oncol. 10, 1033–1034 (2009).
126. Lawlor, D. A. et al. Exploring causal associations between alcohol and coronary heart disease risk factors: findings from a Mendelian randomization study in the Copenhagen General Population Study. Eur. Heart J. 34, 2519–2528 (2013).
127. Holmes, M. V. et al. Association between alcohol and cardiovascular disease: Mendelian randomisation analysis based on individual participant data. BMJ 349, g4164 (2014).
128. Silverwood, R. J. et al. Testing for non-linear causal effects using a binary genotype in a Mendelian randomization study: application to alcohol and cardiovascular traits. Int. J. Epidemiol. 43, 1781–1790 (2014).
129. Millwood, I. Y. et al. Conventional and genetic evidence on alcohol and vascular disease aetiology: a prospective study of 500 000 men and women in China. Lancet 393, 1831–1842 (2019).
130. Goldstein, J. L. & Brown, M. S. A century of cholesterol and coronaries: from plaques to genes to statins. Cell 161, 161–172 (2015).
131. Miller, G. & Miller, N. Plasma-high-density-lipoprotein concentration and development of ischaemic heart-disease. Lancet 305, 16–19 (1975).
132. Castelli, W. P. et al. HDL cholesterol and other 158. Baldwin, J., Pingault, J.-B., Schoeler, T., Sallis, H. M. 183. Zhou, W. et al. Causal relationships between body
lipids in coronary heart disease. The cooperative & Munafo, M. R. Protecting against researcher bias mass index, smoking and lung cancer: univariable and
lipoprotein phenotyping study. Circulation 55, in secondary data analysis: challenges and solutions. multivariable Mendelian randomization. Int. J. Cancer
767–772 (1977) Eur. J. Epidemiol. 37, 1–10 (2022). 148, 1077–1086 (2021).
133. Emerging Risk Factors Collaboration et al. Major 159. Sallis, H. Triangulation protocol; intergenerational 184. Lee, J. C. et al. Genome-wide association study
lipids, apolipoproteins, and risk of vascular disease. effects of parental substance use on child substance identifies distinct genetic contributions to prognosis
JAMA 302, 1993–2000 (2009). use and mental health outcomes. Preprint at https:// and susceptibility in Crohn’s disease. Nat. Genet. 49,
134. Davey Smith, G. & Phillips, A. N. Correlation without osf.io/s6jv4/ (2021). 262–268 (2017).
a cause: an epidemiological odyssey. Int. J. Epidemiol. 160. Hartwig, F. P., Davies, N. M. & Davey Smith, G. Bias in 185. Kim, Y.-I. Role of folate in colon cancer development
49, 4–14 (2020). Mendelian randomization due to assortative mating. and progression. J. Nutr. 133, 3731S–3739S (2003).
135. Voight, B. F. et al. Plasma HDL cholesterol and risk Genet. Epidemiol. 42, 608–620 (2018). 186. Davey Smith, G., Paternoster, L. & Relton, C.
of myocardial infarction: a Mendelian randomisation 161. Brumpton, B. et al. Avoiding dynastic, assortative When will Mendelian randomization become relevant
study. Lancet 380, 572–580 (2012). mating, and population stratification biases in for clinical practice and public health? JAMA 317,
136. Do, R. et al. Common variants associated with Mendelian randomization through within-family 589–591 (2017).
plasma triglycerides and risk for coronary artery analyses. Nat. Commun. 11, 3519 (2020). 187. Ye, T., Shao, J. & Kang, H. Debiased inverse-variance
disease. Nat. Genet. 45, 1345–1352 (2013). 162. Morris, T. T., Davies, N. M., Hemani, G. & Davey weighted estimator in two-sample summary-data
137. Holmes, M. V. et al. Mendelian randomization of blood Smith, G. Population phenomena inflate genetic Mendelian randomization. Ann. Stat. 49, 2079–2100
lipids for coronary heart disease. Eur. Heart J. 36, associations of complex social traits. Sci. Adv. 6, (2021).
539–550 (2015). eaay0328 (2020). 188. Bowden, J. et al. Improving the accuracy of two-
138. Holmes, M. V. & Davey Smith, G. REVEALing the effect 163. Minică, C. C., Boomsma, D. I., Dolan, C. V., de Geus, E. sample summary-data Mendelian randomization:
of CETP inhibition in cardiovascular disease. Nat. Rev. & Neale, M. C. Empirical comparisons of multiple moving beyond the NOME assumption. Int. J.
Cardiol. 14, 635–636 (2017). Mendelian randomization approaches in the Epidemiol. 48, 728–742 (2018).
139. Barter, P. J. et al. Effects of torcetrapib in patients at presence of assortative mating. Int. J. Epidemiol. 49, 189. Wang, S. & Kang, H. Weak-instrument robust tests in
high risk for coronary events. N. Engl. J. Med. 357, 1185–1193 (2020). two-sample summary-data Mendelian randomization.
2109–2122 (2007). 164. Davies, N. M. et al. Within family Mendelian Biometrics https://doi.org/10.1111/biom.13524 (2021).
140. Riaz, H. et al. Effects of high-density lipoprotein randomization studies. Hum. Mol. Genet. 28, 190. Minelli, C. et al. The use of two-sample methods for
targeting treatments on cardiovascular outcomes: R170–R179 (2019). Mendelian randomization analyses on single large
a systematic review and meta-analysis. Eur. J. Prev. 165. Minică, C. C., Dolan, C. V., Boomsma, D. I., de Geus, E. datasets. Int. J. Epidemiol. 50, 1651–1659 (2021).
Cardiol. 26, 533–543 (2019). & Neale, M. C. Extending causality tests with 191. Burgess, S., Foley, C. N., Allara, E., Staley, J. R. &
141. Richardson, T. G., Sanderson, E., Elsworth, B., genetic instruments: an integration of Mendelian Howson, J. M. M. A robust and efficient method for
Tilling, K. & Davey Smith, G. Use of genetic variation randomization with the classical twin design. Behav. Mendelian randomization with hundreds of genetic
to separate the effects of early and later life adiposity Genet. 48, 337–349 (2018). variants. Nat. Commun. 11, 376 (2020).
on disease risk: Mendelian randomisation study. BMJ 166. Howe, L. J. et al. Within-sibship genome-wide 192. Foley, C. N., Mason, A. M., Kirk, P. D. W. & Burgess, S.
369, m1203 (2020). association analyses decrease bias in estimates MR-Clust: clustering of genetic variants in Mendelian
142. Bycroft, C. et al. The UK Biobank resource with of direct genetic effects. Nat. Genet. (in the press). randomization with similar causal estimates.
deep phenotyping and genomic data. Nature 562, 167. Taylor, A. E. et al. Exploring the association of genetic Bioinformatics 37, 531–541 (2020).
203–209 (2018). factors with participation in the Avon Longitudinal 193. Berzuini, C., Guo, H., Burgess, S. & Bernardinelli, L.
143. Schooling, C. M. Selection bias in population- Study of Parents and Children. Int. J. Epidemiol. 47, A Bayesian approach to Mendelian randomization
representative studies? A commentary on Deaton 1207–1216 (2018). with multiple pleiotropic variants. Biostatistics 21,
and Cartwright. Soc. Sci. Med. 210, 70 (2018). 168. Fry, A. et al. Comparison of sociodemographic 86–101 (2018).
144. Dixon, P., Davey Smith, G., von Hinke, S., Davies, N. M. and health-related characteristics of UK Biobank 194. Xu, S., Fung, W. K. & Liu, Z. MRCIP: a robust
& Hollingworth, W. Estimating marginal healthcare participants with those of the general population. Mendelian randomization method accounting
costs using genetic variants as instrumental variables: Am. J. Epidemiol. 186, 1026–1034 (2017). for correlated and idiosyncratic pleiotropy.
Mendelian randomization in economic evaluation. 169. Pirastu, N. et al. Genetic analyses identify widespread Brief. Bioinform. 22, bbab019 (2021).
PharmacoEconomics 34, 1075–1086 (2016). sex-differential participation bias. Nat. Genet. 53, 195. Qi, G. & Chatterjee, N. Mendelian randomization
145. Dixon, P., Hollingworth, W., Harrison, S., Davies, N. M. 663–671 (2021). analysis using mixture models for robust and efficient
& Davey Smith, G. Mendelian randomization analysis 170. Smit, R. A., Trompet, S., Dekkers, O. M., Jukema, J. W. estimation of causal effects. Nat. Commun. 10, 1941
of the causal effect of adiposity on hospital costs. & le Cessie, S. Survival bias in Mendelian randomization (2019).
J. Health Econ. 70, 102300 (2020). studies: a threat to causal inference. Epidemiology 30, 196. Cheng, Q. et al. MR-LDP: a two-sample Mendelian
146. Xu, Z. M. & Burgess, S. Polygenic modelling of 813 (2019). randomization for GWAS summary statistics
treatment effect heterogeneity. Genet. Epidemiol. 44, 171. Schooling, C. M. et al. Use of multivariable Mendelian accounting for linkage disequilibrium and horizontal
868–879 (2020). randomization to address biases due to competing pleiotropy. NAR Genom. Bioinform 2, lqaa028 (2020).
147. Holmes, M. V. Human genetics and drug development. risk before recruitment. Front. Genet. 11, 610852 197. Zhu, X., Li, X., Xu, R. & Wang, T. An iterative approach
N. Engl. J. Med. 380, 1076–1079 (2019). (2020). to detect pleiotropy and perform Mendelian
148. Kyriacou, D. N. & Lewis, R. J. Confounding 172. Vansteelandt, S., Dukes, O. & Martinussen, T. randomization analysis using GWAS summary
by indication in clinical research. JAMA 316, Survivor bias in Mendelian randomization analysis. statistics. Bioinformatics 37, 1390–1400 (2020).
1818–1819 (2016). Biostatistics 19, 426–443 (2017). 198. Grant, A. J. & Burgess, S. An efficient and robust
149. Schmidt, A. F. et al. Genetic drug target validation 173. Hernán, M. A. Invited commentary: selection bias approach to Mendelian randomization with measured
using Mendelian randomisation. Nat. Commun. 11, without colliders. Am. J. Epidemiol. 185, 1048–1050 pleiotropic effects in a high-dimensional setting.
3255 (2020). (2017). Biostatistics https://doi.org/10.1093/biostatistics/
150. Schmidt, A. F., Hingorani, A. D. & Finan, C. Human 174. Mahmoud, O., Dudbridge, F., Davey Smith, G., kxaa045 (2020).
genomics and drug development. Cold Spring Harb. Munafo, M. & Tilling, K. Slope-Hunter: a robust 199. Iong, D., Zhao, Q. & Chen, Y. A latent mixture model
Perspect. Med. https://doi.org/10.1101/cshperspect. method for index-event bias correction in genome- for heterogeneous causal mechanisms in mendelian
a039230 (2021). wide association studies of subsequent traits. randomization. Preprint at https://arxiv.org/abs/
151. Munafò, M. R. et al. A manifesto for reproducible Nat. Commun. (in the press). 2007.06476 (2020).
science. Nat. Hum. Behav. 1, 0021 (2017). 175. Waddington, C. H. Canalization of development and 200. van der Graaf, A. et al. Mendelian randomization
152. Munafò, M. R. & Davey Smith, G. Robust research the inheritance of acquired characters. Nature 150, while jointly modeling cis genetics identifies causal
needs many lines of evidence. Nature 553, 399–401 563–565 (1942). relationships between gene expression and lipids.
(2018). 176. Debat, V. & David, P. Mapping phenotypes: Nat. Commun. 11, 4930 (2020).
153. Davies, N. M., Dickson, M., Davey Smith, G., canalization, plasticity and developmental stability. 201. Jiang, L., Xu, S., Mancuso, N., Newcombe, P. J. &
van den Berg, G. J. & Windmeijer, F. The causal Trends Ecol. Evol. 16, 555–561 (2001). Conti, D. V. A hierarchical approach using marginal
effects of education on health outcomes in the UK 177. Kitami, T. & Nadeau, J. H. Biochemical networking summary statistics for multiple intermediates
Biobank. Nat. Hum. Behav. 2, 117–125 (2018). contributes more to genetic buffering in human and in a Mendelian randomization or transcriptome
154. Sanderson, E., Davey Smith, G., Bowden, J. & mouse metabolic pathways than does gene analysis. Am. J. Epidemiol. 190, 1148–1158
Munafò, M. R. Mendelian randomisation analysis duplication. Nat. Genet. 32, 191–194 (2002). (2021).
of the effect of educational attainment and cognitive 178. Gu, Z. et al. Role of duplicate genes in genetic robustness 202. DiPrete, T. A., Burik, C. A. P. & Koellinger, P. D.
ability on smoking behaviour. Nat. Commun. 10, against null mutations. Nature 421, 63–66 (2003). Genetic instrumental variable regression:
2949 (2019). 179. Hemani, G. et al. Automating Mendelian randomization explaining socioeconomic and health outcomes in
155. Davies, N. M. et al. Multivariable two-sample through machine learning to construct a putative nonexperimental data. Proc. Natl Acad. Sci. USA 115,
Mendelian randomization estimates of the effects of causal map of the human phenome. Preprint at bioRxiv E4970–E4979 (2018).
intelligence and education on health. eLife 8, e43990 https://doi.org/10.1101/173682 (2017). 203. Howey, R., Shin, S.-Y., Relton, C., Davey Smith, G. &
(2019). 180. Ioannidis, J. P. The mass production of redundant, Cordell, H. J. Bayesian network analysis incorporating
156. Tillmann, T. et al. Education and coronary heart misleading, and conflicted systematic reviews and genetic anchors complements conventional Mendelian
disease: mendelian randomisation study. BMJ 358, meta-analyses. Milbank Q. 94, 485–514 (2016). randomization approaches for exploratory analysis of
j3542 (2017). 181. Martin, A. R. et al. Clinical use of current polygenic causal relationships in complex data. PLoS Genet. 16,
157. Davies, N. M., Dickson, M., Davey Smith, G., risk scores may exacerbate health disparities. e1008198 (2020).
Windmeijer, F. & van den Berg, G. J. The causal Nat. Genet. 51, 584–591 (2019). 204. Schmidt, A. F. & Dudbridge, F. Mendelian
effects of education on adult health, mortality and 182. Paternoster, L., Tilling, K. & Davey Smith, G. Genetic randomization with Egger pleiotropy correction and
income: evidence from Mendelian randomization epidemiology and Mendelian randomization for weakly informative Bayesian priors. Int. J. Epidemiol.
and the raising of the school leaving age. Preprint at informing disease therapeutics: conceptual and 47, 1217–1228 (2017).
SSRN https://doi.org/10.2139/ssrn.3390179 methodological challenges. PLoS Genet. 13, 205. Bucur, I. G., Claassen, T. & Heskes, T. Inferring
(2019). e1006944 (2017). the direction of a causal link and estimating its
effect via a Bayesian Mendelian randomization approach. Stat. Methods Med. Res. 29, 1081–1111 (2019).
206. Davey Smith, G., Holmes, M. V., Davies, N. M. & Ebrahim, S. Mendel’s laws, Mendelian randomization and causal inference in observational data: substantive and nomenclatural issues. Eur. J. Epidemiol. 35, 99–111 (2020).
207. Davey Smith, G. et al. Clustered environments and randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS Med. 4, 1985–1992 (2007).
208. Pearl, J. Causality (Cambridge Univ. Press, 2009).
209. Keele, L., Zhao, Q., Kelz, R. R. & Small, D. Falsification tests for instrumental variable designs with an application to tendency to operate. Med. Care 57, 167–171 (2019).
210. Brookhart, M. A., Rassen, J. A. & Schneeweiss, S. Instrumental variable methods in comparative safety and effectiveness research. Pharmacoepidemiol. Drug Saf. 19, 537–554 (2010).
211. Burgess, S. & Labrecque, J. A. Mendelian randomization with a binary exposure variable: interpretation and presentation of causal estimates. Eur. J. Epidemiol. 33, 947–952 (2018).
212. Wang, L. & Tchetgen Tchetgen, E. Bounded, efficient and multiply robust estimation of average treatment effects using instrumental variables. J. R. Stat. Soc. Ser. B 80, 531–550 (2018).
213. Mills, H. L. et al. Detecting heterogeneity of intervention effects using analysis and meta-analysis of differences in variance between arms of a trial. Epidemiology 32, 846–854 (2021).
214. West-Eberhard, M. J. Developmental Plasticity and Evolution (Oxford Univ. Press, 2003).
215. Zuckerkandl, E. & Villet, R. Concentration-affinity equivalence in gene regulation: convergence of genetic and environmental effects. Proc. Natl Acad. Sci. USA 85, 4784–4788 (1988).
216. Ebrahim, S. & Davey Smith, G. Mendelian randomization: can genetic epidemiology help redress the failures of observational epidemiology? Hum. Genet. 123, 15–33 (2008).
217. Hill, W. D. et al. Molecular genetic contributions to social deprivation and household income in UK Biobank. Curr. Biol. 26, 3083–3089 (2016).
218. Labrecque, J. A. & Swanson, S. A. Interpretation and potential biases of mendelian randomization estimates with time-varying exposures. Am. J. Epidemiol. 188, 231–238 (2018).
219. Sanderson, E., Richardson, T. G., Morris, T. T., Tilling, K. & Davey Smith, G. Estimation of causal effects of a time-varying exposure at multiple time points through Multivariable Mendelian randomization. Preprint at medRxiv https://doi.org/10.1101/2022.01.04.22268740 (2022).
220. Cardon, L. R. & Palmer, L. J. Population stratification and spurious allelic association. Lancet 361, 598–604 (2003).
221. Loh, P.-R. et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat. Genet. 47, 284 (2015).
222. Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).
223. Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904–909 (2006).
224. Lawson, D. J. et al. Is population structure in the genetic biobank era irrelevant, a challenge, or an opportunity? Hum. Genet. 139, 23–41 (2020).
225. Haworth, S. et al. Apparent latent structure within the UK Biobank sample has implications for epidemiological analysis. Nat. Commun. 10, 1–9 (2019).
226. Howe, L. J. et al. Genetic evidence for assortative mating on alcohol consumption in the UK Biobank. Nat. Commun. 10, 5039 (2019).
227. Nordsletten, A. E. et al. Patterns of nonrandom mating within and across 11 major psychiatric disorders. JAMA Psychiat. 73, 354–361 (2016).
228. Bochud, M., Chiolero, A., Elston, R. C. & Paccaud, F. A cautionary note on the use of Mendelian randomization to infer causation in observational epidemiology. Int. J. Epidemiol. 37, 414–416 (2008).

Acknowledgements
E.S., M.R.M., T.P. and G.D.S. are members of the UK Medical Research Council (MRC) Integrative Epidemiology Unit, which is funded by the MRC (MC_UU_00011/1, MC_UU_00011/3 and MC_UU_00011/7) and the University of Bristol. M.M.G. is supported by the National Institutes of Health/National Institute on Aging (NIH/NIA) grant R01AG057869. M.V.H. works in a unit that receives funding from the MRC and is supported by a British Heart Foundation Intermediate Clinical Research Fellowship (FS/18/23/33512) and the National Institute for Health Research Oxford Biomedical Research Centre. H.K. is supported by the National Science Foundation grant DMS-1811414. C.W. is funded by the MRC (MC_UU_00002/4, MC_UU_00002/13) and the Wellcome Trust (WT107881).

Author contributions
Introduction (E.S.); Experimentation (E.S., M.M.G. and T.P.); Results (E.S., M.M.G., T.P. and C.W.); Applications (E.S. and M.V.H.); Reproducibility and data deposition (M.R.M.); Limitations and optimizations (E.S.); Outlook (G.D.S.); Overview of the Primer (E.S., H.K., J.M., C.M.S., Q.Z. and G.D.S.).

Competing interests
The authors declare no competing interests.

Peer review information
Nature Reviews Methods Primers thanks Marianne Benn, Frida Emanuelsson, Sarah Gagliano Taliun and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information
The online version contains supplementary material available at https://doi.org/10.1038/s43586-021-00092-5.

Related links
ivonesamplemr: https://github.com/remlapmot/ivonesamplemr
MendelianRandomization: https://cran.r-project.org/package=MendelianRandomization
MR Dictionary: https://mr-dictionary.mrcieu.ac.uk/
mrrobust: https://github.com/remlapmot/mrrobust
OneSampleMR: https://remlapmot.github.io/OneSampleMR/
STROBE-MR: https://www.strobe-mr.org/
The OpenGWAS project: https://gwas.mrcieu.ac.uk/
TwoSampleMR: https://github.com/MRCIEU/TwoSampleMR
UK Biobank: https://www.ukbiobank.ac.uk/

© Springer Nature Limited 2022