Annals of Clinical Epidemiology 2022;4(2):33–40

SEMINAR

Introduction to Matching in Case-Control and Cohort Studies

Masao Iwagami¹,², Tomohiro Shinozaki³

¹ Department of Health Services Research, Faculty of Medicine, University of Tsukuba
² Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine
³ Tokyo University of Science, Department of Information and Computer Technology

ABSTRACT
Matching is a technique through which patients with and without an outcome of interest (in case-control studies) or
patients with and without an exposure of interest (in cohort studies) are sampled from an underlying cohort to have the
same or similar distributions of some characteristics. This technique is used to increase the statistical efficiency and cost
efficiency of studies. In case-control studies, besides time in risk set sampling, controls are often matched for each case
with respect to important confounding factors, such as age and sex, and covariates with a large number of values or
levels, such as area of residence (e.g., post code) and clinics/hospitals. In the statistical analysis of matched case-control
studies, fixed-effect models such as the Mantel-Haenszel odds ratio estimator and conditional logistic regression model
are needed to stratify matched case-control sets and remove selection bias artificially introduced by sampling controls.
In cohort studies, exact matching is used to increase study efficiency and remove or reduce confounding effects of
matching factors. Propensity score matching is another matching method whereby patients with and without exposure
are matched based on estimated propensity scores to receive exposure. If appropriately used, matching can improve
study efficiency without introducing bias and could also present results that are more intuitive for clinicians.

KEY WORDS
matching, case-control study, cohort study, risk set sampling, conditional logistic regression, stratified Cox regression

1. INTRODUCTION

Matching is mainly used in observational studies, including case-control and cohort studies. Matching is a technique by which patients with and without an outcome of interest (in case-control studies) or patients with and without an exposure of interest (in cohort studies) are sampled from an underlying cohort to have the same or similar distributions of characteristics such as age and sex.

The main purpose of matching is to increase study efficiency for data collection and subsequent statistical analysis. Matching helps researchers reduce the volume of data for collection without much loss of information (i.e., improving cost efficiency) and obtain more precise estimates than simple random sampling of the same number of patients (i.e., improving statistical efficiency). In addition, in cohort studies, matching can remove or reduce confounding effects of matching factors.

This paper aims to introduce basic principles of matching in case-control and cohort studies, with some recent examples.

2. MATCHING IN CASE-CONTROL STUDIES

2.1. Unmatched Case-Control Sampling

A case-control study is a design used to compare levels of exposures between cases and controls defined by the status of outcome of interest. In typical case-control studies, cases are all patients with an outcome in an underlying cohort, with multiple control selection strategies, as explained below. Despite outcome-dependent sampling (which is also called “biased sampling”) that introduces
selection bias, data collection only for cases and controls enables researchers to estimate some associational measures (such as risk ratio, odds ratio, and rate ratio) that would be obtained in an underlying cohort study, unless sampling depends on the exposure status in controls. Specifically, the exposure-outcome odds ratio in a cumulative incidence sampling (also called exclusive sampling) of controls is, in expectation, identical to the odds ratio in the underlying cohort (Fig. 1A), the odds ratio in a case-cohort sampling (also called inclusive sampling) is equal to the risk ratio in the underlying cohort (Fig. 1B), and the odds ratio in a risk set sampling (also called concurrent sampling) is equal to the rate ratio (or hazard ratio, according to analysis models) in the underlying cohort (Fig. 1C). Note that none of the above interpretations of odds ratios requires a rare disease assumption. Researchers can even restore other associational measures (such as risk differences) in the underlying cohort from each sampling design if auxiliary data on patients not selected as cases or controls are available [1, 2].

Fig. 1 Graphical representation of cumulative incidence sampling (A), case-cohort sampling (B), and risk set sampling (C) for 10 example patients in a cohort. ● indicates an outcome onset and time at selection as a case. ○ indicates time at selection as a control.

In Fig. 1, patients are followed up from when they enter the cohort, regardless of the calendar date. This is a common timeframe used in cohort studies such as randomized controlled trials, registry-based cohort studies, and hospital-based cohort studies. Meanwhile, in population-based cohort studies, calendar time is often used as a time frame, where a risk set sampling is usually used to sample controls for each case at the same calendar time (Fig. 2).

Fig. 2 Graphical representation of a risk set sampling for 10 example patients in a population-based cohort. ● indicates an outcome onset and time at selection as a case. ○ indicates time at selection as a control.

In a study requiring primary data collection, case-control study designs are efficient because only information on cases and selected controls, instead of all people in the underlying cohort, is collected and used for statistical analysis. Especially for rare outcomes, a cohort study recruiting many people to observe a sufficient number of outcomes is not feasible. However, a case-control design would still be feasible, with reduced costs and efforts.

In a study with the secondary use of existing cohort data, case-control sampling is usually unnecessary [3]. Such post-hoc sampling would miss the opportunity to estimate absolute risks of the outcome in the cohort, which is an important indicator in evidence-based medicine or policymaking. However, case-control study designs are still used sometimes if researchers want to (i) collect additional data on confounding factors by
reviewing medical records or questionnaires, (ii) use stored samples to measure new biomarkers, (iii) require adjudication of individual study endpoints with special expertise in their assessment and classification, and (iv) conveniently assess triggers of an acute event by flexibly modeling the exposure window at varying proximities to the event of interest [4].

2.2. Purpose of Matching in Case-Control Studies

Similar to cohort studies, case-control studies typically require confounder adjustment using stratified analysis or regression modeling. To further improve statistical efficiency in adjusted analyses, case-control studies may match controls on confounders to be adjusted for, i.e., sampling a control(s) with an identical (or nearly identical) value of confounders for each case. When the total number of cases and controls to be sampled is fixed, the adjusted odds ratio estimates are likely to be less variable (i.e., more statistically efficient) in case-control data matched on strong confounders than in unmatched data.

Besides common confounding factors such as age and sex, area of residence (e.g., post code) or clinics/hospitals (which patients are registered to or visit) are sometimes matched between cases and controls. If variables with a large number of values or levels (e.g., over 1,000 post codes or clinics/hospitals) are adjusted for as “surrogate” confounders in the statistical analysis, at least one case and one control in each area (or clinic/hospital) are needed; otherwise, the data are discarded in the fixed-effect models (stratification). Although a case and control may rarely come from the same area (or clinic/hospital) in unmatched case-control sampling, matching can ensure that the pairs (or sets) of cases and controls are derived from the same area (or clinics/hospitals). Consequently, the odds ratio adjusted for these variables can be efficiently estimated.

Caution is needed for the effect of case-control matching on confounding: matching itself does not have a role in adjusting for confounding factors but rather introduces selection bias [5]. Therefore, as explained later, statistical analysis with fixed-effect adjustment, such as the Mantel-Haenszel odds ratio estimator and conditional logistic regression models, is necessary to estimate an unbiased confounder-adjusted odds ratio. In addition, if a case and controls become too similar by matching too many variables, statistical efficiency in the fixed-effect analysis will be reduced, which is called over-matching [6]. Thus, it is generally not recommended to match many variables in case-control studies.

2.3. Choice of Matching Ratio

Because the number of cases (which are often rare diseases) is usually much smaller than that of potential controls, the matching ratio (i.e., ratio of cases:controls in each matched set) is often set to 1:n. If the ratio is set to 1:1, the design is called a pair-matched case-control study. In practice, many studies set the matching ratio to 1:4 or 1:5, whereas other studies opt to set it to a large ratio, such as 1:7 [7] and 1:10 [8]. In unmatched case-control settings, the gain of statistical power sharply increases until the ratio 1:4 or 1:5 and then slowly increases thereafter [9]. However, this may not always be true in matched case-control settings. The matched case-control design generally requires stratification on matching factors, which completely discards the information of matched sets of cases and controls with concordant exposure (i.e., a set containing people exposed only or people unexposed only). Thus, a 1:4 or 1:5 matching ratio may still have substantial power loss if (i) cases and controls in the same strata of matching factors have similar exposure patterns or (ii) exposure is rare (e.g., <15%) in an underlying cohort [10].

Sometimes, a case cannot find a prespecified number of controls. For example, in a case-control study planning 1:4 matching, some cases could find fewer than four controls. However, it is not necessary to exclude these pairs when matching factors or matched sets of cases and controls are stratified in the analysis. The mixture of pairs with different matching ratios will not result in a biased estimate as long as an adequate adjustment for matching factors is adopted.

2.4. Choice of Matching With and Without Replacement

It is necessary to decide whether the same individual can be sampled repeatedly as a control (called matching with replacement) or only once (called matching without replacement). Researchers need to choose one of the two as the main analysis, considering the balance between the demerit of not finding a sufficient number of controls (i.e., many pairs not achieving the prespecified 1:n matching ratio) by matching without replacement and the demerit of decreased statistical efficiency if the same individual is repeatedly included as a control by matching with replacement. If the number of controls is much larger than that of cases, the choice would not make a big difference in the estimated odds ratios. Notably, in risk set sampling, (i) people with the outcome (i.e., cases) should be potentially selected as controls until they become a case to represent the underlying cohort, and (ii) the same person should be selected as a control
several times at different time points, meaning that matching without replacement is biased [11].

2.5. Statistical Analysis in Matched Case-Control Studies

To remove the selection bias artificially introduced by case-control matching, it is necessary to “stratify” data on matching factors in the statistical analysis. One traditional method is the Mantel-Haenszel odds ratio estimator that stratifies on matching factors themselves (e.g., subgroups by age group and sex, if controls are matched on these factors) or matched sets (e.g., each pair of a case and control). The Mantel-Haenszel estimator adjusts for matching factors as fixed effects and estimates a common odds ratio assumed to be constant across strata. The Mantel-Haenszel odds ratio estimator consistently estimates the common odds ratio even when each stratum contains sparse data (e.g., only two patients, one case and one control, in each stratum), provided that the number of strata increases. Adjusting for confounding factors besides the matching factors by additional stratification within the matching factor strata is infeasible.

As another method, it is much more common to use a conditional logistic regression model, which estimates the common stratum- and covariate-specific odds ratio by stratifying on matching factors while adjusting for other confounders as covariates [12]. For example, when a control is matched on age and hospital of a case, stratification of matched pairs using the conditional logistic regression model will eliminate the confounding effect of these matching variables. Additionally, medical conditions that may confound the exposure-outcome relationship within the age-hospital strata can be adjusted for by including them as covariates in the model without introducing unnecessary bias.

Notably, simple adjustments of matching factors by including them as covariates in (unconditional) logistic regression models are not recommended in matched case-control studies. For example, an unconditional logistic regression model for an outcome, including exposure and age as covariates in age-matched case-control data, provides a biased estimate of age-adjusted odds ratio, even if the model correctly specifies the association between the outcome and covariates in an underlying cohort [13]. This is because the selection bias induced by matching distorts the association between the outcome and matching factors, resulting in residual bias owing to model misspecification.

Finally, time at matching (time from cohort entry, calendar time, or possibly age as time from birth) can be considered one of the “matching factors” in risk set sampling. If the hazard of disease incidence varies with time and the exposure prevalence changes during follow-up, time should be accounted for as a “confounder.” To do so, one can use the Mantel-Haenszel odds ratio estimator or a conditional logistic regression model, which estimates the hazard ratio constant over time (and across other matching factors, if any) that would be modeled by the Cox proportional hazards model in an underlying cohort.

3. EXAMPLES OF CASE-CONTROL STUDY WITH MATCHING

3.1. Example 1: A Case-Control Study with Primary Data Collection

Hayashi et al. conducted a case-control study to identify factors associated with calciphylaxis (calcific uremic arteriolopathy), a rare and fatal complication characterized by painful skin ulceration and necrosis, in patients undergoing hemodialysis for end-stage renal disease [14]. The researchers representing the Japanese Calciphylaxis Study Group sent questionnaires to hemodialysis centers in Japan and included 28 cases with a definitive diagnosis of calciphylaxis. For each case, two controls matched for age and hemodialysis duration were randomly selected from the same dialysis center. Clinical information, including known and unknown (but suspected) risk factors for calciphylaxis, was collected for cases and controls. Univariable logistic regression analyses showed that warfarin therapy, lower serum albumin levels, higher plasma glucose levels, and higher serum calcium levels were significantly associated with calciphylaxis. A multivariable logistic regression analysis showed that warfarin therapy and lower serum albumin levels (per 1 g/dL decrease) were still significantly associated with calciphylaxis, with adjusted odds ratios of 10.1 (95% confidence interval [CI] 1.63–62.7) and 12.7 (95% CI 2.35–68.6), respectively.

3.2. Example 2: A Case-Control Study with Secondary Use of Existing Cohort Data

Iwagami et al. conducted a case-control study to identify medical diagnoses strongly associated with the incidence of long-term care needs certification, using linked medical and long-term care insurance data from two cities in Japan [15]. The participants were aged ≥75 years, had no previous long-term care needs certification, and had at least one medical insurance claim record during the study period. Cases were people newly certified for long-term care needs during the study period, whereas controls were randomly selected in a 1:4 ratio and matched
for age category, sex, city, and calendar date (index date). Multivariable conditional logistic regression analysis was conducted to estimate the association between 22 categories of medical diagnoses recorded during the period of exposure definition (the 6 months before the index date) and new long-term care needs certification, under the assumption that exposures are independent of each other. Among 38,338 eligible people, 5,434 people newly received long-term care needs certification and were matched with 21,736 controls. In the multivariable conditional logistic regression analysis, the adjusted odds ratio (95% CI) was the largest for femur fractures (8.80 [6.35–12.20]), followed by dementia (6.70 [5.96–7.53]), pneumonia (3.72 [3.19–4.32]), hemorrhagic stroke (3.31 [2.53–4.34]), Parkinson’s disease (2.74 [2.07–3.63]), and other fractures (2.68 [2.38–3.02]).

4. MATCHING IN COHORT STUDIES

4.1. Rationale for Matching in Cohort Studies

Matching can also be used in cohort studies. Patients with and without the exposure of interest are matched on some patient characteristics and compared for the incidence of outcomes. Matching is rarely used in observational cohort studies with primary data collection (with some exceptions such as sibling design and spouse survey) probably because most observational cohort studies are conducted without pre-specifying a certain exposure, for a wide range of research questions. In contrast, matching is sometimes used in cohort studies with the secondary use of existing databases to reduce computational burden by selecting a subset of data without sacrificing statistical precision. In addition, unlike case-control matching, cohort matching removes or reduces the confounding effects of matching factors [5].

A matched cohort study may also be conducted from a practical viewpoint: it would provide an intuitive presentation of patient characteristics in “comparable” exposure groups matched on important confounding factors such as age, sex, and calendar time. As crude absolute measures (such as risks and rates) during the follow-up period are easily summarized in exposed and unexposed patients, clinicians unfamiliar with statistical analysis can grasp the difference between the two groups in a non-statistical manner.

In cohort studies, patients with and without the exposure of interest at the start of the follow-up, such as smoking and use of a certain drug, are matched in a 1:1 or 1:n ratio (Fig. 3). In practice, exposure is dichotomized (i.e., presence or absence of exposure, rather than the level of exposure), and the exposure status of selected patients is assumed to remain unchanged during the follow-up period.

Fig. 3 Graphical representation of a matched-pair cohort study for 10 example patients in a cohort. Solid lines indicate that people are exposed, dotted lines denote that people are not exposed, and ● indicates the incidence of outcome.

Fig. 4 Graphical representation of a matched-pair cohort study for 10 example patients in a population-based cohort. Solid lines denote that people are exposed, dotted lines denote that people are not exposed, ▼ indicates the timing of the matched-pair cohort inclusion in the exposed group, ▽ indicates the timing of the matched-pair cohort inclusion in the non-exposed group, and ● indicates the incidence of outcome.

In population-based cohort studies, the exposure status may change according to the calendar time. For example, people without diabetes may be diagnosed as having the disease one day. As another example, patients who have never used a certain drug with potential carcinogenic effects before may start taking it one day. In such situations, researchers can create matched sets of patients with and without the exposure of interest at the same calendar time (Fig. 4). However, the exposure
status of the matched sets is assumed to remain unchanged during the follow-up period. In the presence of time-varying exposures, survival analysis with time-dependent covariates, or censoring the follow-up of a patient when his/her exposure status changes, may provide estimates of associational measures (e.g., hazard ratios) free from time-related biases [16]. However, in general, a matched cohort study is unsuitable if the exposure status frequently changes between “on” and “off” in the same patient. Furthermore, although the method exists [17], causal interpretation of associational measures for time-varying exposures estimated in matched cohort studies requires additional consideration.

4.2. Choice of Matching Factors, Matching Ratio, and Matching With or Without Replacement

Matching factors in the secondary use of existing databases often include age (age category or age within a range, such as ±2 years), sex, area of residence (e.g., post code) or clinics/hospitals (which patients are registered to or visit), and calendar time. Although cohort matching on known confounders typically leads to an efficiency gain in adjusted estimates, there are exceptions depending on associational measures (e.g., risk difference or ratio) and underlying models (e.g., additive or multiplicative risk models) [18]. If the statistical efficiency is rather worsened by matching, the resulting estimates suffer from “over-matching.”

Regarding the matching ratio, 1:4 or 1:5 is sometimes chosen in matched-pair cohort studies, whereas 1:1 may be chosen more frequently to prioritize simplicity and intuitiveness. Mixed matching ratios (meaning that, for example, some pairs are matched in a ratio of 1:4, whereas other pairs are matched in a ratio of 1:3, 1:2, or 1:1 between exposed and unexposed people) will not cause bias if matching variables or matched sets are adjusted for in the analysis. In contrast, as such varying matching ratios do not balance the distributions of matching factors in exposed and unexposed people, the unadjusted comparison in the matched cohort still suffers from confounding bias.

Matching with or without replacement remains the choice of researchers, although matching without replacement may be more intuitive for clinicians.

4.3. Statistical Analysis in Matched-Pair Cohort Study

Unlike case-control matching, non-mixed cohort matching completely or partially removes the confounding effect of matching factors without introducing additional selection bias. Hence, fixed-effect models for matched sets (e.g., conditional logistic regression, stratified Poisson, or stratified Cox regression models) may or may not be an option. Other possible statistical methods include (i) covariate adjustment for matching factors, (ii) random-effect adjustment for matched sets, and (iii) marginal regression modeling without stratification on matching factors but with cluster-robust variance accounting for matched sets as clusters. The differences between fixed-effect models and other possible statistical methods are the estimand and modeling assumptions of the analysis [19].

Caution is needed in the sense that matching can only “balance” distributions in sampled (i.e., matched) data, and such balance is easily affected by additional adjustment for or stratification on other variables. Therefore, ignoring matching factors (i.e., adjusting for additional unmatched variables without adjusting for matching factors) would cause bias in estimates [20].

In some matched-pair cohort studies, a patient's observation time is prematurely terminated immediately after the follow-up of his/her matched counterpart is completed by an event or censoring [21]. The impact of such termination is minimal when adopting stratified Cox models. However, in statistical methods other than stratified Cox models, termination is not generally encouraged because information is then discarded in an irremediable manner [22].

4.4. Propensity Score Matching

Although the aforementioned matching method is specifically called exact matching, matching based on the propensity score, which is the probability of receiving exposure within the confounder stratum to which a patient belongs, is another type of matching method known as marginal matching [13]. Propensity score matching was featured in a previous paper of this seminar series [23]. Briefly, patients with and without exposure are matched based on estimated propensity scores to receive exposure at a certain time point, mostly at the time of cohort inclusion. Consequently, the distributions of the measured confounding factors defining the propensity scores are balanced between the two groups in the propensity score-matched samples. Researchers using this method should be aware of the theoretical subtleties in propensity score matching, such as the lack of justification for interval estimation for propensity score-matched estimates using off-the-shelf software [24] and bias owing to additional adjustment for risk factors not balanced by propensity score matching [25].
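
As a minimal sketch of the propensity score matching step described in Section 4.4, the Python code below pairs each exposed patient with the nearest unexposed patient on an already estimated propensity score. The greedy nearest-neighbor rule, matching without replacement, the absolute-difference caliper of 0.05, the function name `greedy_ps_match`, and the toy scores are all assumptions made for illustration. They are not taken from the studies cited in this paper, and the sketch does not address the interval estimation issue noted in [24].

```python
def greedy_ps_match(exposed, unexposed, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    exposed, unexposed: dict mapping patient id -> estimated propensity score.
    caliper: largest allowed absolute score difference (illustrative default).
    Returns a list of (exposed_id, unexposed_id) pairs.
    """
    available = dict(unexposed)  # unexposed patients not yet used as a match
    pairs = []
    # Process exposed patients in a fixed order so the result is reproducible.
    for eid in sorted(exposed):
        if not available:
            break
        ps = exposed[eid]
        # Closest remaining unexposed patient; ties are broken by id.
        uid = min(available, key=lambda u: (abs(available[u] - ps), u))
        if abs(available[uid] - ps) <= caliper:
            pairs.append((eid, uid))
            del available[uid]  # without replacement: each control is used once
    return pairs


# Hypothetical propensity scores, assumed to come from a previously fitted model.
exposed_scores = {"e1": 0.30, "e2": 0.62, "e3": 0.95}
unexposed_scores = {"u1": 0.28, "u2": 0.33, "u3": 0.60, "u4": 0.10}

print(greedy_ps_match(exposed_scores, unexposed_scores))
# prints [('e1', 'u1'), ('e2', 'u3')]
```

In this toy example, exposed patient `e3` has no unexposed patient within the caliper and is left unmatched, so the matched sample may represent only part of the exposed group.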


5. EXAMPLES OF COHORT STUDY WITH MATCHING

5.1. Example 1: A Cohort Study with Exact Matching

Ohbe et al. conducted a population-based matched cohort study to examine the risk of cardiovascular events after a spouse’s intensive care unit (ICU) admission, using the JMDC claims database, which includes employees of relatively large Japanese companies and their family members in Japan [26]. Among 1,082,208 eligible married couples (2,164,416 spouses), the researchers identified 7,815 spouses of patients who were admitted to the ICU for more than 2 days. From the rest of the study population, they randomly selected a non-exposure group with a ratio of one spouse in the exposure group to four individuals in the non-exposure group, matched for age, sex, and medical insurance status on the same date (index date). When examining the primary outcome, the percentage of any visits for cardiovascular diseases 1–4 weeks after the spouse’s ICU admission was 2.7% (210/7,815) in the exposure group and 2.1% (666/31,250) in the non-exposure group, with an adjusted odds ratio of 1.27 (95% CI, 1.08–1.50). Secondary outcomes, which included any hospitalization for cardiovascular disease or hospitalization for severe cardiovascular events, were also significantly more frequent in the exposure group. The odds ratios became closer to 1 (i.e., the null association) 4 weeks after the index date. Thus, the authors concluded that ICU admission of a spouse can be a risk factor for cardiovascular events 1–4 weeks after the date of the spouse’s ICU admission.

5.2. Example 2: A Cohort Study with Propensity Score Matching

Nagasu et al. conducted a registry-based propensity score-matched cohort study using the Japan Chronic Kidney Disease Database (J-CKD-DB) [27] to examine the protective effects of sodium-glucose cotransporter 2 (SGLT2) inhibitors on kidneys compared with other glucose-lowering drugs. The researchers identified patients with CKD who started SGLT2 inhibitors or other glucose-lowering drugs. On the day of initiation, they calculated a propensity score for SGLT2 inhibitor initiation for each patient and created a 1:1 propensity score-matched cohort (n = 1,033 pairs). Regarding the primary outcome, during follow-up, the mean annual rates of estimated glomerular filtration rate (eGFR) change were −0.47 (95% CI −0.63 to −0.31) and −1.22 (−1.41 to −1.03) mL/min/1.73 m² per year in the SGLT2 inhibitor and other glucose-lowering drug groups, respectively (P < 0.001). Regarding the secondary outcome, there were 30 patients with a composite kidney outcome (50% eGFR decline or end-stage kidney disease) in the SGLT2 inhibitor group (14 events/1,000 patient-years) and 73 in the other glucose-lowering drug group (36 events/1,000 patient-years), with a hazard ratio of 0.40 (95% CI 0.26–0.61). Thus, compared with other glucose-lowering drugs, the initiation of SGLT2 inhibitors was associated with a significantly lower rate of eGFR decline and a lower risk of composite kidney outcome.

6. CONCLUSION

We have provided an overview and some recent examples of matching in case-control and cohort studies. Matching in case-control studies can increase study efficiency, including both cost and statistical efficiencies. Nevertheless, caution is still warranted since inappropriate sampling of controls and application of statistical analysis without stratification would result in a biased estimate. In cohort studies, exact matching can increase efficiency and remove or reduce the confounding effect of matching factors, whereas propensity score matching can be used to balance the distributions of measured confounding factors between exposed and unexposed individuals. If appropriately used, matching can improve study efficiency without introducing bias and can present results that are more intuitive for clinicians.

ACKNOWLEDGMENTS

We would like to thank Dr. Hiroyuki Ohbe of the Department of Clinical Epidemiology and Health Economics, School of Public Health, The University of Tokyo, and Dr. Motohiko Adomi in the Department of Epidemiology, Harvard T.H. Chan School of Public Health, for their critical reading of the manuscript and feedback.

CONFLICT OF INTERESTS

No potential competing interests relevant to this paper are reported.

REFERENCES

1. Wacholder S. The case-control study as data missing by design: estimating risk differences. Epidemiology 1996;7:144–50.
2. Noma H, Tanaka S. Analysis of case-cohort designs with binary outcomes: improving efficiency using whole-cohort auxiliary information. Stat Methods Med Res 2017;26:691–706.
3. Schuemie MJ, Ryan PB, Man KKC, Wong ICK, Suchard MA, Hripcsak G. A plea to stop using the case-control design in retrospective database studies. Stat Med 2019;38:4199–208.
4. Schneeweiss S, Suissa S. Discussion of Schuemie et al. “A plea to stop using the case-control design in retrospective database studies”. Stat Med 2019;38:4209–12.
5. Rothman KJ, Lash TL. 6 Epidemiologic study design with validity and efficiency considerations. Modern epidemiology 4th edition. Lippincott Williams & Wilkins, 2021:105–140.
6. Marsh JL, Hutton JL, Binks K. Removal of radiation dose response effects: an example of over-matching. BMJ 2002;325:327–30.
7. Richardson K, Fox C, Maidment I, Steel N, Loke YK, Arthur A, et al. Anticholinergic drugs and risk of dementia: case-control study. BMJ 2018;361:k1315.
8. Lapi F, Azoulay L, Yin H, Nessim SJ, Suissa S. Concurrent use of diuretics, angiotensin converting enzyme inhibitors, and angiotensin receptor blockers with non-steroidal anti-inflammatory drugs and risk of acute kidney injury: nested case-control study. BMJ 2013;346:e8525.
9. Woodward M. Epidemiology: study design and data analysis. Chapman & Hall: Boca Raton, 1999:265.
10. Hennessy S, Bilker WB, Berlin JA, Strom BL. Factors influencing the optimal control-to-case ratio in matched case-control studies. Am J Epidemiol 1999;149:195–7.
11. Wang MH, Shugart YY, Cole SR, Platz EA. A simulation study of control sampling methods for nested case-control studies of genetic and molecular biomarkers and prostate cancer progression. Cancer Epidemiol Biomarkers Prev 2009;18:706–11.
12. Pearce N. Analysis of matched case-control studies. BMJ 2016;352:i969.
13. Greenland S. Partial and marginal matching in case-control studies. Modern statistical methods in chronic disease epidemiology. Wiley: New York, NY, 1986:35–49.
14. Hayashi M, Takamatsu I, Kanno Y, Yoshida T, Abe T, Sato Y, Japanese Calciphylaxis Study Group. A case-control study of calciphylaxis in Japanese end-stage renal disease patients. Nephrol Dial Transplant 2012;27:1580–4.
15. Iwagami M, Taniguchi Y, Jin X, Adomi M, Mori T, Hamada S, et al. Association between recorded medical diagnoses and incidence of long-term care needs certification: a case control study using linked medical and long-term care data in two Japanese cities. Annals Clin Epidemiol 2019;1:56–68.
16. Suissa S, Dell’Aniello S. Time-related biases in pharmacoepidemiology. Pharmacoepidemiol Drug Saf 2020;29:1101–10.
17. Thomas LE, Yang S, Wojdyla D, Schaubel DE. Matching with time-dependent treatments: a review and look forward. Stat Med 2020;39:2350–70.
18. Greenland S, Morgenstern H. Matching and efficiency in cohort studies. Am J Epidemiol 1990;131:151–9.
19. Shinozaki T, Mansournia MA, Matsuyama Y. On hazard ratio estimators by proportional hazards models in matched-pair cohort studies. Emerg Themes Epidemiol 2017;14:6.
20. Sjölander A, Greenland S. Ignoring the matching variables in cohort studies – when is it valid and why? Stat Med 2013;32:4696–708.
21. Sutradhar R, Baxter NN, Austin PC. Terminating observation within matched pairs of subjects in a matched cohort analysis: a Monte Carlo simulation study. Stat Med 2016;35:294–304.
22. Shinozaki T, Mansournia MA. Hazard ratio estimators after terminating observation within matched pairs in sibling and propensity score matched designs. Int J Biostat 2019;15.
23. Yasunaga H. Introduction to applied statistics: chapter 1 propensity score analysis. Annals Clin Epidemiol 2020;2:33–7.
24. Abadie A, Imbens GW. Matching on the estimated propensity score. Econometrica 2016;84:781–807.
25. Shinozaki T, Nojima M. Misuse of regression adjustment for additional confounders following insufficient propensity score balancing. Epidemiology 2019;30:541–8.
26. Ohbe H, Goto T, Miyamoto Y, Yasunaga H. Risk of cardiovascular events after spouse’s ICU admission. Circulation 2020;142:1691–3.
27. Nagasu H, Yano Y, Kanegae H, Heerspink HJL, Nangaku M, Hirakawa Y, et al. Kidney outcomes associated with SGLT2 inhibitors versus other glucose-lowering drugs in real-world clinical practice: the Japan chronic kidney disease database. Diabetes Care 2021;44:2542–51.
