Introduction to Matching in Case-Control and Cohort Studies
SEMINAR
ABSTRACT
Matching is a technique through which patients with and without an outcome of interest (in case-control studies) or
patients with and without an exposure of interest (in cohort studies) are sampled from an underlying cohort to have the
same or similar distributions of some characteristics. This technique is used to increase the statistical efficiency and cost
efficiency of studies. In case-control studies, besides time in risk set sampling, controls are often matched for each case
with respect to important confounding factors, such as age and sex, and covariates with a large number of values or
levels, such as area of residence (e.g., post code) and clinics/hospitals. In the statistical analysis of matched case-control
studies, fixed-effect models such as the Mantel-Haenszel odds ratio estimator and conditional logistic regression model
are needed to stratify matched case-control sets and remove selection bias artificially introduced by sampling controls.
In cohort studies, exact matching is used to increase study efficiency and remove or reduce confounding effects of
matching factors. Propensity score matching is another matching method whereby patients with and without exposure
are matched on their estimated propensity scores (i.e., estimated probabilities of receiving the exposure). If appropriately used, matching can improve
study efficiency without introducing bias and could also present results that are more intuitive for clinicians.
KEY WORDS
matching, case-control study, cohort study, risk set sampling, conditional logistic regression, stratified Cox regression
ANNALS OF CLINICAL EPIDEMIOLOGY
Fig. 1 Graphical representation of cumulative incidence sampling (A), case-control sampling (B), and risk set sampling (C) for 10 example
patients in a cohort. ● indicates an outcome onset and time at selection as a case. ○ indicates time at selection as a control.
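The risk set sampling scheme in panel (C) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the 10-patient `COHORT` is made-up data, and `risk_set_sample`, its `n_controls` and `seed` parameters, and the rule that a control must have a strictly later exit time than the case are all hypothetical choices made for the example.

```python
import random

# Hypothetical cohort of 10 patients: id -> (exit_time, had_outcome).
# exit_time is when follow-up ended (outcome onset or censoring).
COHORT = {
    1: (2.0, True), 2: (9.0, False), 3: (5.0, True), 4: (10.0, False),
    5: (7.5, True), 6: (4.0, False), 7: (10.0, False), 8: (6.0, False),
    9: (8.0, True), 10: (3.0, False),
}


def risk_set_sample(cohort, n_controls=2, seed=0):
    """For each case, sample controls from people still at risk at the case's onset time."""
    rng = random.Random(seed)
    # Process cases in order of outcome onset.
    cases = sorted((t, pid) for pid, (t, event) in cohort.items() if event)
    matched_sets = []
    for case_time, case_id in cases:
        # Risk set: everyone else still under follow-up after case_time.
        # Future cases are eligible, and the same person may be drawn again
        # for a later case (sampling with replacement across matched sets).
        risk_set = [pid for pid, (exit_time, _) in cohort.items()
                    if pid != case_id and exit_time > case_time]
        k = min(n_controls, len(risk_set))  # a case may find fewer controls than planned
        controls = rng.sample(risk_set, k)
        matched_sets.append({"case": case_id, "time": case_time, "controls": controls})
    return matched_sets


MATCHED_SETS = risk_set_sample(COHORT)
for matched_set in MATCHED_SETS:
    print(matched_set)
```

Each printed set holds one case, the time at which it was selected, and the controls drawn from the people still at risk at that time. Because time is matched by construction, the matched set identifier would be the stratification variable in the subsequent analysis.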
MATCHING IN CASE-CONTROL AND COHORT STUDIES
reviewing medical records or questionnaires, (ii) use stored samples to measure new biomarkers, (iii) require adjudication of individual study endpoints with special expertise in their assessment and classification, and (iv) make it convenient to assess triggers of an acute event by flexibly modeling the exposure window at varying proximities to the event of interest [4].

2.2. Purpose of Matching in Case-Control Studies
Similar to cohort studies, case-control studies typically require confounder adjustment using stratified analysis or regression modeling. To further improve statistical efficiency in adjusted analyses, case-control studies may match controls on confounders to be adjusted for, i.e., sampling a control(s) with an identical (or nearly identical) value of confounders for each case. When the total number of cases and controls to be sampled is fixed, the adjusted odds ratio estimates are likely to be less variable (i.e., more statistically efficient) in case-control data matched on strong confounders than in unmatched data.
Besides common confounding factors such as age and sex, area of residence (e.g., post code) or clinics/hospitals (which patients are registered to or visit) are sometimes matched between cases and controls. If variables with a large number of values or levels (e.g., over 1,000 post codes or clinics/hospitals) are adjusted for as “surrogate” confounders in the statistical analysis, at least one case and one control in each area (or clinic/hospital) are needed; otherwise, the data are discarded in the fixed-effect models (stratification). Although a case and control may rarely come from the same area (or clinic/hospital) in unmatched case-control sampling, matching can ensure that the pairs (or sets) of cases and controls are derived from the same area (or clinics/hospitals). Consequently, the odds ratio adjusted for these variables can be efficiently estimated.
Caution is needed for the effect of case-control matching on confounding: matching itself does not have a role in adjusting for confounding factors but rather introduces selection bias [5]. Therefore, as explained later, statistical analysis with fixed-effect adjustment, such as the Mantel-Haenszel odds ratio estimator and conditional logistic regression models, is necessary to estimate an unbiased confounder-adjusted odds ratio. In addition, if a case and controls become too similar by matching too many variables, statistical efficiency in the fixed-effect analysis will be reduced, which is called over-matching [6]. Thus, it is generally not recommended to match many variables in case-control studies.

2.3. Choice of Matching Ratio
Because the number of cases (which are often rare diseases) is usually much smaller than that of potential controls, the matching ratio (i.e., ratio of cases:controls in each matched set) is often set to 1:n. If the ratio is set to 1:1, the design is called a pair-matched case-control study. In practice, many studies set the matching ratio to 1:4 or 1:5, whereas other studies opt to set it to a large ratio, such as 1:7 [7] and 1:10 [8]. In unmatched case-control settings, the gain of statistical power sharply increases until the ratio 1:4 or 1:5 and then slowly increases thereafter [9]. However, this may not always be true in matched case-control settings. The matched case-control design generally requires stratification on matching factors, which completely discards the information of matched sets of cases and controls with concordant exposure (i.e., a set containing people exposed only or people unexposed only). Thus, a 1:4 or 1:5 matching ratio may still have substantial power loss if (i) cases and controls in the same strata of matching factors have similar exposure patterns or (ii) exposure is rare (e.g., <15%) in an underlying cohort [10].
Sometimes, a case cannot find a prespecified number of controls. For example, in a case-control study planning 1:4 matching, some cases could find fewer than four controls. However, it is not necessary to exclude these sets when matching factors or matched sets of cases and controls are stratified in the analysis. The mixture of sets with different matching ratios will not result in a biased estimate as long as an adequate adjustment for matching factors is adopted.

2.4. Choice of Matching With and Without Replacement
It is necessary to decide whether the same individual can be sampled repeatedly as a control (called matching with replacement) or only once (called matching without replacement). Researchers need to choose one of the two as the main analysis, considering the balance between the demerit of not finding a sufficient number of controls (i.e., many sets not achieving the prespecified 1:n matching ratio) by matching without replacement and the demerit of decreased statistical efficiency if the same individual is repeatedly included as a control by matching with replacement. If the number of controls is much larger than that of cases, the choice would not make a big difference in the estimated odds ratios. Notably, in risk set sampling, (i) people with the outcome (i.e., cases) should be potentially selected as controls until they become a case to represent the underlying cohort, and (ii) the same person should be selected as a control
several times at different time points, meaning that matching without replacement is biased [11].

2.5. Statistical Analysis in Matched Case-Control Studies
To remove the selection bias artificially introduced by case-control matching, it is necessary to “stratify” data on matching factors in the statistical analysis. One traditional method is the Mantel-Haenszel odds ratio estimator that stratifies on matching factors themselves (e.g., subgroups by age group and sex, if controls are matched on these factors) or matched sets (e.g., each pair of a case and control). The Mantel-Haenszel estimator adjusts for matching factors as fixed effects and estimates a common odds ratio assumed to be constant across strata. The Mantel-Haenszel odds ratio estimator consistently estimates the common odds ratio even when each stratum contains sparse data (e.g., only two patients, one case and one control, in each stratum), provided that the number of strata increases. Adjusting for confounding factors besides the matching factors by additional stratification within the matching factor strata is infeasible.
As another method, it is much more common to use a conditional logistic regression model, which estimates the common stratum- and covariate-specific odds ratio by stratifying on matching factors while adjusting for other confounders as covariates [12]. For example, when a control is matched on age and hospital of a case, stratification of matched pairs using the conditional logistic regression model will eliminate the confounding effect of these matching variables. Additionally, medical conditions that may confound the exposure-outcome relationship within the age-hospital strata can be adjusted for by including them as covariates in the model without introducing unnecessary bias.
Notably, simple adjustments of matching factors by including them as covariates in (unconditional) logistic regression models are not recommended in matched case-control studies. For example, an unconditional logistic regression model for an outcome, including exposure and age as covariates in age-matched case-control data, provides a biased estimate of the age-adjusted odds ratio, even if the model correctly specifies the association between the outcome and covariates in an underlying cohort [13]. This is because the selection bias induced by matching distorts the association between the outcome and matching factors, resulting in residual bias owing to model misspecification.
Finally, time at matching (time from cohort entry, calendar time, or possibly age as time from birth) can be considered one of the “matching factors” in risk set sampling. If the hazard of disease incidence varies with time and the exposure prevalence changes during follow-up, time should be accounted for as a “confounder.” To do so, one can use the Mantel-Haenszel odds ratio estimator or a conditional logistic regression model, which estimates the hazard ratio constant over time (and across other matching factors, if any) that would be modeled by the Cox proportional hazards model in an underlying cohort.

3. EXAMPLES OF CASE-CONTROL STUDY WITH MATCHING

3.1. Example 1: A Case-Control Study with Primary Data Collection
Hayashi et al. conducted a case-control study to identify factors associated with calciphylaxis (calcific uremic arteriolopathy), a rare and fatal complication characterized by painful skin ulceration and necrosis, in patients undergoing hemodialysis for end-stage renal disease [14]. The researchers representing the Japanese Calciphylaxis Study Group sent questionnaires to hemodialysis centers in Japan and included 28 cases with a definitive diagnosis of calciphylaxis. For each case, two controls matched for age and hemodialysis duration were randomly selected from the same dialysis center. Clinical information, including known and unknown (but suspected) risk factors for calciphylaxis, was collected for cases and controls. Univariable logistic regression analyses showed that warfarin therapy, lower serum albumin levels, higher plasma glucose levels, and higher serum calcium levels were significantly associated with calciphylaxis. A multivariable logistic regression analysis showed that warfarin therapy and lower serum albumin levels (per 1 g/dL decrease) were still significantly associated with calciphylaxis, with adjusted odds ratios of 10.1 (95% confidence interval [CI] 1.63–62.7) and 12.7 (95% CI 2.35–68.6), respectively.

3.2. Example 2: A Case-Control Study with Secondary Use of Existing Cohort Data
Iwagami et al. conducted a case-control study to identify medical diagnoses strongly associated with the incidence of long-term care needs certification, using linked medical and long-term care insurance data from two cities in Japan [15]. The participants were aged ≥75 years, had no previous long-term care needs certification, and had at least one medical insurance claim record during the study period. Cases were people newly certified for long-term care needs during the study period, whereas controls were randomly selected in a 1:4 ratio and matched
for age category, sex, city, and calendar date (index date). Multivariable conditional logistic regression analysis was conducted to estimate the association between 22 categories of medical diagnoses recorded during the period of exposure definition (the 6 months before the index date) and new long-term care needs certification, under the assumption that exposures are independent of each other. Among 38,338 eligible people, 5,434 people newly received long-term care needs certification and were matched with 21,736 controls. In the multivariable conditional logistic regression analysis, the adjusted odds ratio (95% CI) was the largest for femur fractures (8.80 [6.35–12.20]), followed by dementia (6.70 [5.96–7.53]), pneumonia (3.72 [3.19–4.32]), hemorrhagic stroke (3.31 [2.53–4.34]), Parkinson’s disease (2.74 [2.07–3.63]), and other fractures (2.68 [2.38–3.02]).

the level of exposure), and the exposure status of selected patients is assumed to remain unchanged during the follow-up period.
In population-based cohort studies, the exposure status may change according to the calendar time. For example, people without diabetes may be diagnosed as having the disease one day. As another example, patients who have never used a certain drug with potential carcinogenic effects before may start taking it one day. In such situations, researchers can create matched sets of patients with and without the exposure of interest at the same calendar time (Fig. 4). However, the exposure
status of the matched sets is assumed to remain unchanged during the follow-up period. In the presence of time-varying exposures, survival analysis with time-dependent covariates or by censoring the follow-up of a patient when his/her exposure status changes may provide estimates of associational measures (e.g., hazard ratios) free from time-related biases [16]. However, in general, a matched cohort study is unsuitable if the exposure status frequently changes between “on” and “off” in the same patient. Furthermore, although the method exists [17], causal interpretation of associational measures for time-varying exposures estimated in matched cohort studies requires additional consideration.

4.2. Choice of Matching Factors, Matching Ratio, and Matching With or Without Replacement
Matching factors in the secondary use of existing databases often include age (age category or age within a range, such as ±2 years), sex, area of residence (e.g., post code) or clinics/hospitals (which patients are registered to or visit), and calendar time. Although cohort matching on known confounders typically leads to an efficiency gain in adjusted estimates, there are exceptions depending on associational measures (e.g., risk difference or ratio) and underlying models (e.g., additive or multiplicative risk models) [18]. If the statistical efficiency is rather worsened by matching, the resulting estimates suffer from “over-matching.”
Regarding the matching ratio, 1:4 or 1:5 is sometimes chosen in matched-pair cohort studies, whereas 1:1 may be chosen more frequently to prioritize simplicity and intuitiveness. Mixed matching ratios (meaning that, for example, some sets are matched in a ratio of 1:4, whereas other sets are matched in a ratio of 1:3, 1:2, or 1:1 between exposed and unexposed people) will not cause bias if matching variables or matched sets are adjusted for in the analysis. In contrast, as such varying matching ratios do not balance the distributions of matching factors in exposed and unexposed people, the unadjusted comparison in the matched cohort still suffers from confounding bias.
Matching with or without replacement remains the choice of researchers, although matching without replacement may be more intuitive for clinicians.

4.3. Statistical Analysis in Matched-Pair Cohort Study
Unlike case-control matching, non-mixed cohort matching completely or partially removes the confounding effect of matching factors without introducing additional selection bias. Hence, fixed-effect models for matched sets (e.g., conditional logistic regression, stratified Poisson, or stratified Cox regression models) may or may not be an option. Other possible statistical methods include (i) covariate adjustment for matching factors, (ii) random-effect adjustment for matched sets, and (iii) marginal regression modeling without stratification on matching factors but with cluster-robust variance accounting for matched sets as clusters. The differences between fixed-effect models and other possible statistical methods are the estimand and modeling assumptions of the analysis [19].
Caution is needed in the sense that matching can only “balance” distributions in sampled (i.e., matched) data, and such balance is easily affected by additional adjustment for or stratification on other variables. Therefore, ignoring matching factors (i.e., adjusting for additional unmatched variables without adjusting for matching factors) would cause bias in estimates [20].
In some matched-pair cohort studies, the observation time of a patient is prematurely terminated immediately after the follow-up of his/her matched counterpart is completed by an event or censoring [21]. The impact of such termination is minimal when adopting stratified Cox models. However, in statistical methods other than stratified Cox models, termination is not generally encouraged because information is then discarded in an irremediable manner [22].

4.4. Propensity Score Matching
Although the aforementioned matching method is specifically called exact matching, matching based on the propensity score, which is the probability of receiving exposure within the confounder stratum to which a patient belongs, is another type of matching method known as marginal matching [13]. Propensity score matching was featured in a previous paper of this seminar series [23]. Briefly, patients with and without exposure are matched based on estimated propensity scores to receive exposure at a certain time point, mostly at the time of cohort inclusion. Consequently, the distributions of the measured confounding factors defining the propensity scores are balanced between the two groups in the propensity score-matched samples. Researchers using this method should be aware of the theoretical subtleties in propensity score matching, such as the lack of justification for interval estimation for propensity score-matched estimates using off-the-shelf software [24] and bias owing to additional adjustment for risk factors not balanced by propensity score matching [25].
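To make the matching step of Section 4.4 concrete, the following Python sketch pairs exposed and unexposed patients on already-estimated propensity scores. It is only an illustration under several assumptions that do not come from this article: the scores in `EXPOSED` and `UNEXPOSED` are invented, the function `greedy_ps_match` is hypothetical, and the specific choices (greedy 1:1 nearest-neighbor matching without replacement, with a caliper of 0.05 on the raw score scale) are one of many possible designs.

```python
# Hypothetical estimated propensity scores: patient id -> score.
EXPOSED = {"E1": 0.30, "E2": 0.55, "E3": 0.90}
UNEXPOSED = {"U1": 0.28, "U2": 0.33, "U3": 0.52, "U4": 0.10}


def greedy_ps_match(exposed, unexposed, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (pairs, unmatched), where pairs is a list of
    (exposed_id, unexposed_id) and unmatched lists the exposed patients
    with no unexposed patient within the caliper.
    """
    available = dict(unexposed)  # unexposed patients not yet used
    pairs, unmatched = [], []
    # Process exposed patients in order of increasing propensity score.
    for e_id, e_ps in sorted(exposed.items(), key=lambda item: item[1]):
        if not available:
            unmatched.append(e_id)
            continue
        # Closest remaining unexposed patient on the propensity score.
        u_id = min(available, key=lambda u: abs(available[u] - e_ps))
        if abs(available[u_id] - e_ps) <= caliper:
            pairs.append((e_id, u_id))
            del available[u_id]  # without replacement: use each control once
        else:
            unmatched.append(e_id)
    return pairs, unmatched


PAIRS, UNMATCHED = greedy_ps_match(EXPOSED, UNEXPOSED)
print("matched pairs:", PAIRS)
print("unmatched exposed:", UNMATCHED)
```

With these made-up scores, E3 has no unexposed patient within the caliper and is left unmatched, which illustrates that the matched sample may no longer represent all exposed patients. The sketch covers only the matching step; score estimation, balance checks, and the interval estimation issues noted above [24] are outside its scope.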
REFERENCES
1. Wacholder S. The case-control study as data missing by design: estimating risk differences. Epidemiology 1996;7:144–50.
2. Noma H, Tanaka S. Analysis of case-cohort designs with binary outcomes: improving efficiency using whole-cohort auxiliary information. Stat Methods Med Res 2017;26:691–706.
3. Schuemie MJ, Ryan PB, Man KKC, Wong ICK, Suchard MA, Hripcsak G. A plea to stop using the case-control design in retrospective database studies. Stat Med 2019;38:4199–208.
4. Schneeweiss S, Suissa S. Discussion of Schuemie et al. “A plea to stop using the case-control design in retrospective database studies”. Stat Med 2019;38:4209–12.
5. Rothman KJ, Lash TL. 6 Epidemiologic study design with validity and efficiency considerations. Modern epidemiology 4th edition. Lippincott Williams & Wilkins, 2021:105–140.
6. Marsh JL, Hutton JL, Binks K. Removal of radiation dose response effects: an example of over-matching. BMJ 2002;325:327–30.
7. Richardson K, Fox C, Maidment I, Steel N, Loke YK, Arthur A, et al. Anticholinergic drugs and risk of dementia: case-control study. BMJ 2018;361:k1315.
8. Lapi F, Azoulay L, Yin H, Nessim SJ, Suissa S. Concurrent use of diuretics, angiotensin converting enzyme inhibitors, and angiotensin receptor blockers with non-steroidal anti-inflammatory drugs and risk of acute kidney injury: nested case-control study. BMJ 2013;346:e8525.
9. Woodward M. Epidemiology: study design and data analysis. Chapman & Hall: Boca Raton, 1999:265.
10. Hennessy S, Bilker WB, Berlin JA, Strom BL. Factors influencing the optimal control-to-case ratio in matched case-control studies. Am J Epidemiol 1999;149:195–7.
11. Wang MH, Shugart YY, Cole SR, Platz EA. A simulation study of control sampling methods for nested case-control studies of genetic and molecular biomarkers and prostate cancer progression. Cancer Epidemiol Biomarkers Prev 2009;18:706–11.
12. Pearce N. Analysis of matched case-control studies. BMJ 2016;352:i969.
13. Greenland S. Partial and marginal matching in case-control studies. Modern statistical methods in chronic disease epidemiology. Wiley: New York, NY, 1986:35–49.
14. Hayashi M, Takamatsu I, Kanno Y, Yoshida T, Abe T, Sato Y, Japanese Calciphylaxis Study Group. A case-control study of calciphylaxis in Japanese end-stage renal disease patients. Nephrol Dial Transplant 2012;27:1580–4.
15. Iwagami M, Taniguchi Y, Jin X, Adomi M, Mori T, Hamada S, et al. Association between recorded medical diagnoses and incidence of long-term care needs certification: a case control study using linked medical and long-term care data in two Japanese cities. Annals Clin Epidemiol 2019;1:56–68.
16. Suissa S, Dell’Aniello S. Time-related biases in pharmacoepidemiology. Pharmacoepidemiol Drug Saf 2020;29:1101–10.
17. Thomas LE, Yang S, Wojdyla D, Schaubel DE. Matching with time-dependent treatments: a review and look forward. Stat Med 2020;39:2350–70.
18. Greenland S, Morgenstern H. Matching and efficiency in cohort studies. Am J Epidemiol 1990;131:151–9.
19. Shinozaki T, Mansournia MA, Matsuyama Y. On hazard ratio estimators by proportional hazards models in matched-pair cohort studies. Emerg Themes Epidemiol 2017;14:6.
20. Sjölander A, Greenland S. Ignoring the matching variables in cohort studies – when is it valid and why? Stat Med 2013;32:4696–708.
21. Sutradhar R, Baxter NN, Austin PC. Terminating observation within matched pairs of subjects in a matched cohort analysis: a Monte Carlo simulation study. Stat Med 2016;35:294–304.
22. Shinozaki T, Mansournia MA. Hazard ratio estimators after terminating observation within matched pairs in sibling and propensity score matched designs. Int J Biostat 2019;15.
23. Yasunaga H. Introduction to applied statistics—chapter 1 propensity score analysis. Annals Clin Epidemiol 2020;2:33–7.
24. Abadie A, Imbens GW. Matching on the estimated propensity score. Econometrica 2016;84:781–807.
25. Shinozaki T, Nojima M. Misuse of regression adjustment for additional confounders following insufficient propensity score balancing. Epidemiology 2019;30:541–8.
26. Ohbe H, Goto T, Miyamoto Y, Yasunaga H. Risk of cardiovascular events after spouse’s ICU admission. Circulation 2020;142:1691–3.
27. Nagasu H, Yano Y, Kanegae H, Heerspink HJL, Nangaku M, Hirakawa Y, et al. Kidney outcomes associated with SGLT2 inhibitors versus other glucose-lowering drugs in real-world clinical practice: the Japan chronic kidney disease database. Diabetes Care 2021;44:2542–51.