
RESEARCH METHODOLOGY SERIES

Systematic Reviews and Meta-Analysis: A Guide for Beginners


JOSEPH L MATHEW
From Department of Pediatrics, Advanced Pediatrics Centre, PGIMER, Chandigarh.
Correspondence to: Prof Joseph L Mathew, Department of Pediatrics, Advanced Pediatrics Centre, PGIMER Chandigarh.
[email protected]

Systematic reviews involve the application of scientific methods to reduce bias in review of literature. The key components of a systematic review are a well-defined research question, a comprehensive literature search to identify all studies that potentially address the question, systematic assembly of the studies that answer the question, critical appraisal of the methodological quality of the included studies, data extraction and analysis (with and without statistics), and considerations towards applicability of the evidence generated in a systematic review. These key features can be remembered as six 'A's: Ask, Access, Assimilate, Appraise, Analyze and Apply. Meta-analysis is a statistical tool that provides pooled estimates of effect from the data extracted from individual studies in the systematic review. The graphical output of meta-analysis is a forest plot, which provides information on individual studies and the pooled effect. Systematic reviews of literature can be undertaken for all types of questions and all types of study designs. This article highlights the key features of systematic reviews and is designed to help readers understand and interpret them. It can also serve as a beginner's guide for both users and producers of systematic reviews, and help them appreciate some of the methodological issues.
Keywords: Forest plot, Pooled estimates, Risk of bias, Secondary research.

Published online: June 28, 2021; PII: S097475591600350

Evidence-based (or evidence-informed) healthcare requires the integration of high-quality research evidence, clinical expertise and patient (consumer) values [1]. However, the immense volume of primary research and its diversity in terms of methodology, necessitate that it be reviewed and synthesized to make rational interpretations and decisions. This necessity has led to an entire field of secondary research to synthesize data from primary research. Systematic reviews are the key pillar of such secondary research. The broad principle of systematic reviews is to apply "scientific strategies that limit bias to the systematic assembly, critical appraisal, and synthesis of all relevant research studies on a specific topic" [2]. Thus, in contrast to traditional narrative reviews, there is a rigorous attempt to limit bias in the process of selecting, reviewing and synthesizing primary research studies in systematic reviews. These efforts at minimizing bias have led systematic reviews to be regarded superior to primary research study designs, thereby finding a place at the top of the hierarchy of research evidence. In terms of research methodology, bias can be described as systematic error that leads away from the truth [3]. This is largely avoidable, in contrast to random error which occurs by chance [3], and hence, is unpredictable. The ultimate goal of systematic reviews is to facilitate healthcare decisions that are objective, reproducible and transparent.

Meta-analysis is a statistical tool that is used to mathematically pool data derived from a systematic review, and generate a summary conclusion [4]. Meta-analysis of data is inappropriate if not derived from a systematic review. It would be akin to applying statistical tests on data which are not derived from primary research studies.

This article highlights the key features and methodological issues of systematic reviews and is designed to help readers understand and interpret them. This article is not intended to be a comprehensive handbook to interpret or conduct systematic reviews but can serve as a beginner's guide for both users and producers of systematic reviews.

Systematic reviews are initiated after preparing, registering, and publishing a review protocol. The process is similar to preparing protocols for primary research studies. Registration of systematic review protocols is broadly similar to registration of clinical trial protocols; however, different platforms are used. One such platform is PROSPERO, which serves as a database for registering protocols of systematic reviews [5]. This promotes transparency in the review process.

High quality reviews, such as Cochrane reviews, publish systematic review protocols after stringent peer review. Some journals also publish systematic review protocols, whereas others expect them to be available online for access by anyone. Currently, it is difficult to publish a good quality systematic review without prior registration and publication (or disclosure) of the protocol. This is to ensure that appropriate methodology is used, detailed methods are disclosed beforehand (a priori), and no modifications are made after data become available (post hoc). This makes the review process and the product systematic, objective, reproducible, and transparent (summarized by the acronym SORT).


MAKING SENSE OF A SYSTEMATIC REVIEW

Healthcare professionals reading, appraising or conducting a systematic review should focus on six key aspects (Table I).

Table I Key Aspects of Systematic Reviews

Ask
Interpretation: What is the specific research question 'asked' or addressed in the systematic review?
Remarks: The entire methodology of a systematic review, interpretation of findings, and conclusions depend on this.

Access
Interpretation: What literature sources were accessed (or searched) to identify the primary research studies to be included in the systematic review? What was the 'search strategy'?
Remarks: The focus is to ensure that no study that can potentially answer the research question gets missed.

Assimilate
Interpretation: What strategies were used to assimilate or synthesize or 'put together' the primary research studies?
Remarks: In order to minimize bias, most systematic reviews prudently limit the included studies to those conforming to the best, or sometimes most appropriate, study designs that can answer the research question.

Appraise
Interpretation: How were the included studies critically appraised for methodological quality?
Remarks: This is to estimate the risk of bias in the primary studies, and the potential impact on the systematic review results and conclusions.

Analyze
Interpretation: What data was extracted from each primary study for synthesis? How were the data analyzed? What are the main findings? What is the level of confidence in these findings, based on the methodological aspects of the included studies?
Remarks: The data extracted from the primary studies could be examined with a combination of qualitative and quantitative methods. Meta-analysis helps to obtain a pooled estimate of the included data.

Apply
Interpretation: Can the findings of the systematic review be applied in the patient or population of your interest?
Remarks: Conclusions of a systematic review have to be integrated with clinical expertise and patient preferences/values for a truly evidence-informed healthcare decision.


Ask (Research Question)

The science of evidence-based medicine hinges on the art of framing and addressing research questions [6]. This is the most important step in any research study, including systematic review. The 'PICO format' [7] of research questions is better expanded to 'PICOTS' as follows.

• P (Population and/or Patient and/or Problem): It refers to the people in/for whom the systematic review is expected to be applied.

• I (Intervention): In the context of systematic reviews examining effects of treatment, 'I' encompasses medicines, procedures, health education, public health measures, or bundles/combinations of these. 'I' also includes preventive measures such as vaccination, prophylaxis, health education tools, and packages of such interventions. In some contexts, the intervention is not administered by the study investigators, but by nature, and the investigators are merely observing the effects. Therefore, 'I' can be better expressed as 'Exposure', abbreviated as 'E'. This is also true for systematic reviews of diagnostic test studies (wherein participants are 'exposed to' diagnostic tests), prognostic markers (wherein participants are exposed to one or more factors), and prevalence of certain conditions (wherein participants are naturally exposed to the condition).

• C (Comparison): People not receiving the intervention could receive an alternate intervention, or placebo, or nothing (depending on the research question). However, for some study designs and/or research questions, it may not be feasible to include a Comparison.

• O (Outcome): This refers to the broad parameters by which the effect of 'I' on 'P' in comparison to 'C' can be measured. In general, systematic reviews of interventions focus on efficacy, safety, and sometimes cost. Systematic reviews of diagnostic tests focus on measures of accuracy, reliability, and cost. Multiple specific outcome measures can be analyzed for each outcome being evaluated.

• T (Time-frame): Outcomes are meaningful only when the time-frame in which they are recorded is specified. For example, 'mortality' as an outcome can be recorded in various time-frames. Different outcomes in a systematic review may have different time-frames, which should be specified clearly.

• S (Study design): Multiple study designs may be used in primary studies to address the same research question. However, study designs have inherent risks of bias (by virtue of the design itself), which results in a hierarchy of primary research study designs. Randomized controlled trials (RCT) are associated with the least risk of bias for evaluating interventions. Bias increases in non-randomized trials, other clinical trials, cohort studies (with and without comparison groups), case-control studies, case series, and case reports (in that order). Since the focus of systematic reviews is to review literature minimizing bias as far as possible, some systematic reviews include only methodologically high-quality study designs (such as RCT), whereas others may include various study designs and examine the impact of lower-quality designs separately.

There are other formats (besides PICOTS) for framing and/or presenting research questions. The SPICE acronym covers issues such as setting, population, intervention, comparison and evaluation [8]. It is generally considered helpful to develop questions relating to qualitative research, and for evaluating project proposals and quality improvement. Another tool is SPIDER, which helps to structure qualitative research questions. It summarizes sample, phenomenon of interest, design, evaluation and research type [9]. Yet another format is ECLIPSE [10], that is reportedly helpful for questions addressing healthcare policies or services. The acronym covers expectation, client, location, impact, professionals, and service.

However, the PICO format remains the most popular version, perhaps because it is the oldest, covers a variety of research questions, is 'portable' across study designs, and can be extended to secondary research, health technology assessment, guidelines, and policy issues.

The research question in a systematic review is usually clearly specified in the introduction section. Often, no research question may be found but enough information may be provided for readers to frame one in the PICOTS format. However, systematic reviews that do not specify a research question, or facilitate the construction of one by readers, are likely to result in biased interpretations and should be read with caution. Research questions that have a very narrow or highly focused 'P' run the risk of producing systematic reviews with limited generalizability. On the other hand, very broad questions can generate more noise than signal. The key is to have a research question wherein the elements are balanced to include the population of interest in a non-restrictive manner, yet have a high signal to noise ratio. The PICOTS template is applicable for systematic reviews addressing all types of research questions (Table II).

Access (Literature Search)

This step is designed to identify all literature that can potentially answer the research question. It includes several components to facilitate systematic, objective, reproducible, and transparent (SORT) search and inclusion of studies.

Types of studies: Systematic review authors may include only studies conforming to the most appropriate study design, or choose to include various types of study designs. The advantage of the first approach is that studies with higher risk of bias are eliminated upfront; however, the disadvantage is that there may be insufficient studies of high methodological quality, and these may not truly represent the real-world scenario. The second approach may yield more studies (hence larger sample size) but reduce the confidence in the overall result due to inclusion of lower quality primary studies. The way out is for systematic review authors to either include only the highest quality study design, or include multiple designs but perform separate analyses of high quality versus lower quality designs, and explore the difference.

Types of participants: This refers to the participant characteristics in the primary studies, such as age group, socio-demographic characteristics, duration of disease, and severity. Here also, choosing a very narrow set of criteria limits the generalizability of the systematic review; whereas very broad criteria may end up combining apples and oranges to obtain a pooled result. A useful method is to ensure that the inclusion criteria are broad, but include objective methods of diagnosis and measurement of disease severity. For example, in diagnostic test studies, the participants should include people 'suspected to have the disease' or those 'with potential to have the disease', and not only those confirmed to have the disease.

Types of intervention/exposure: The PICOTS question in the Introduction section identifies the broad contours of the intervention/exposure, whereas the methods section provides greater detail of the intervention such as dosage, frequency of administration, mode of administration, duration of administration, and similar issues. When the intervention is a procedure, the skill/training of the operator and the healthcare setting may be additional factors. For studies measuring behavior change (in response to health education, legislation etc.), the 'intervention' may consist of a 'bundle' involving many different components, with or without reinforcement.

The intervention is actually an 'exposure' in diagnostic test studies, prognosis studies, and prevalence/incidence studies.


Table II Applicability of PICOTS to Systematic Reviews Addressing Various Types of Research Questions

Intervention question. Example: Is plasma exchange therapy beneficial in COVID-19?
P (Patient/Population): People with severe COVID-19. I (Intervention/Exposure): Plasma exchange therapy. C (Comparison): No plasma exchange. O (Outcomes): Mortality, need for invasive ventilation, side effects, cost. T (Time-frame): Within 30 days of treatment (for all outcomes). S (Study design): RCT.

Diagnosis question. Example: Can 'loss of smell' be used to diagnose COVID-19?
P: People with suspected COVID-19. I/E: Confirmation of 'loss of smell'. C: Reverse transcriptase PCR for novel Coronavirus. O: Diagnostic accuracy, cost. T: Not applicable*. S: Diagnostic test study.

Prognosis question. Example: Do people with COVID-19 having co-existing diabetes or hypertension fare worse?
P: People with confirmed COVID-19. I/E: (Controlled and uncontrolled) diabetes or hypertension. C: None of the above. O: Disease severity, need for intensive care, mortality. T: From diagnosis to recovery or discharge or death. S: Cohort study with comparison group.

Prevalence/Incidence question. Example: What proportion of patients with COVID-19 have or develop acute respiratory distress syndrome?
P: People with COVID-19. O: Acute respiratory distress syndrome (ARDS). T: From diagnosis to recovery or discharge or death. S: Cross-sectional study (for prevalence); cohort study (for incidence).

Association question. Example: Does international travel result in COVID-19?
P: Indian citizens, residing in the country. I/E: International travel (within the preceding 21 days). C: No international travel (within the preceding 21 days). O: Development of COVID-19. T: Within 28 days of the date of conclusion of the travel. S: Case-control study.

*Diagnostic test studies are cross-sectional in the sense that the index test (confirmation of loss of smell) and reference test (RT-PCR) should ideally be performed at the same time, or if that is not feasible, within a narrow interval, during which there is no probability of a change in the diagnostic status of a given patient (from negative to positive, or vice versa). Similarly, the gap between the index test and reference test should not be such that people who receive one test may get cured, or drop out, or die before the other test is administered.

Types of comparison: All the details specified for the intervention should be specified for the comparison also. In intervention studies, the comparator may be another intervention (such as the current standard of care), placebo (if that is deemed safe and appropriate on ethical grounds), or no intervention (if safe/appropriate). In diagnostic test studies, there is no separate group of individuals for comparison, but the same group of participants receives the index test (exposure) and the reference test (comparison). Some primary research studies may not have a comparison group (examples are clinical trials without a comparison group, cohort studies without comparison, and prevalence/incidence studies). The information derived from such studies is inferior to those with comparison groups.

Types of outcome measures: Just as in primary research studies, systematic reviews generally have one primary outcome and multiple secondary outcomes. Each outcome may have several methods of measurement/recording. Thus the broad term 'efficacy' may include outcomes like clinical cure, resolution, survival/mortality, need for escalation of therapy, duration of hospitalization, or quality of life measurements. Other surrogate outcomes of efficacy could be laboratory parameters, biomarker levels, radiological findings, or results of combinations of investigations. Each of these outcomes could be measured in multiple ways, and may be recorded at multiple time points, and/or using multiple instruments/tools, all of which are generally reported in the systematic review. Similarly, safety outcomes could include development of adverse events, count of serious adverse events, number of patients developing such events, number of events per patient, need for enhanced monitoring, etc. It is impossible to include every possible outcome measure in a systematic review. However, no important outcomes should be missed; patient-centric outcomes should be included; outcomes measured objectively are preferred; hard outcomes are considered superior to soft outcomes, and purely indirect/surrogate outcomes are less preferred. The methods section should include the time-frame of recording each of the included outcomes. Where the outcomes are recorded multiple times, separate analyses would be necessary for each.

Search methods for identification of studies

Where? This section defines the literature databases accessed to identify all the relevant evidence.


High quality systematic reviews search multiple electronic databases such as Medline, Embase, Cochrane Register of Trials, and other repositories. At the very least, two databases should be searched. Depending on the review question, additional literature databases may also be searched. In addition, most reviewers search other sources of literature including reference lists of included studies (this is referred to as hand-searching), clinical trials registries (for registered trials), conference abstract books/proceedings, and databases of non-indexed journals. In the Indian context, many journals are indexed in IndMED [11], although not in Medline. Similarly, Wangfang Data is a source of Chinese literature [12], and the LILACS database includes Latin American and Caribbean literature [13]. There are also specific databases for different types of clinical problems and/or healthcare specialists. All these additional searches are focused on published sources of evidence. Some authors go further and search sources of unpublished literature (sometimes referred to as grey literature). These may be available through repositories of such studies (for example, the OpenGrey database includes over 7 lakh references of grey literature in Europe) [14].

How? Databases of published and unpublished literature have specific approaches to ensure comprehensive searches for all eligible primary studies. Systematic reviews thus undertake multiple searches of each database, with various combinations of keywords, exploiting the inbuilt filters in some of the databases. Although it may be convenient to search only English language publications, high-quality reviews do not restrict by language or any other criteria. This is so that no bias creeps in through selective inclusion (or exclusion) of primary studies. Such rigour increases the cost, duration, and workload of systematic review authors, but minimizes a major source of bias.
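To illustrate what such a keyword combination can look like, the short Python sketch below assembles a hypothetical PubMed-style Boolean search string for the intervention question in Table II (plasma exchange in COVID-19). The specific terms, field tags and study-design filter are illustrative assumptions for demonstration only, not a validated or recommended search strategy.

# Hypothetical sketch: assembling a PubMed-style Boolean search string for the
# Table II intervention question (plasma exchange in COVID-19). The terms and
# field tags are illustrative, not a validated search strategy.
population = '("COVID-19"[MeSH Terms] OR "SARS-CoV-2"[Title/Abstract])'
intervention = '("Plasma Exchange"[MeSH Terms] OR "plasmapheresis"[Title/Abstract] OR "plasma exchange"[Title/Abstract])'
design_filter = '("randomized controlled trial"[Publication Type] OR "randomised"[Title/Abstract])'

query = " AND ".join([population, intervention, design_filter])
print(query)  # paste into the database search box, adapting syntax per database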
When? Systematic review authors are expected to declare the date of literature search, the period over which each database was searched, and also provide updated searches just before the systematic review is published. All these efforts ensure that the evidence is current and the searches are reproducible.

Who? Literature searching is a key step of systematic reviews, and is generally conducted independently by more than one author. The outputs, eligibility, and selection are compared, and any mismatch is resolved by another independent author. Although not essential, reference managers such as Endnote, Zotero, or Mendeley can be used to compile the search output, remove duplicate publications and obtain the final list of the preliminary search.

Assimilate (Inclusion and Exclusion of Studies)

Generally, a three-step approach is used to confirm the eligibility of primary studies for inclusion in the systematic review. This includes a preliminary screening of each study title, followed by screening the abstract of short-listed titles. The third step is to read the full text of the short-listed abstracts to match against the set of eligibility criteria described above, for deciding on inclusion into the systematic review (or otherwise). Here too, the PICOTS framework is very helpful. Each step is carefully recorded and reasons for exclusion are documented for the studies excluded in the third step. This is done to ensure transparency and objectivity in study selection. It is good practice to ensure that screening of titles, abstracts, and full text for potential inclusion is done by more than one reviewer, working independently.

It is also helpful to prepare a flow diagram showing the results of the literature searches, exclusion of publications with reasons, and the pathway to final inclusion of eligible studies. This is similar to the flow diagram of participant recruitment in trials.

Appraise (Critical Appraisal of Included Studies)

All systematic reviews undertake critical appraisal of included studies for methodological quality. This refers to assessment of the efforts made by investigators of primary studies to minimize bias during the conduct of their study. Bias or systematic error can creep into primary research studies with inappropriate study designs and inappropriate study methods. The former includes choosing study designs that inherently have high(er) risk of bias, and insufficient precautions to address the common sources of bias within each study design. For example, in studies examining interventions, RCT is the ideal study design, and within RCT, sources of bias include selection bias, allocation bias, performance bias, and outcome reporting bias. Inappropriate study methods include using inappropriate tools for measuring outcomes, lack of calibration of instruments used to record outcomes, inappropriate recording methods, inappropriate/insufficient follow-up, etc.

Appraisal in systematic reviews is generally restricted to examination of study design issues and efforts to minimize bias due to this. There are standard online tools available for each type of study design. The Cochrane Risk of Bias tool [15] is considered a standard tool for RCT and includes appraising the methods used (and adequacy thereof) for key design elements in intervention trials viz., random sequence generation, concealment of allocation, blinding of study participants, blinding of outcome assessors, incomplete outcome reporting, and selective outcome reporting. There is an additional element for appraising any other bias. Software tools for systematic reviews, such as the Cochrane Review Manager or RevMan [16], have options for the pictorial representation of quality appraisal of included studies.

The Newcastle Ottawa Scale (NOS) is often used to assess the quality of non-randomized studies including case-control studies, cohort studies, and even qualitative studies [17].


The NOS contains eight items, categorized into three broad perspectives: selection of the study groups; comparability of the groups; and ascertainment of either the exposure or outcome of interest (for case-control or cohort studies, respectively). For each item, a star system is used to allow a semi-quantitative assessment of study quality. High-quality studies are defined by a score of 6 or more out of 9 total points [18].

Another popular tool for non-RCT studies is the Risk of Bias in Non-Randomized Studies of Interventions tool, abbreviated as ROBINS-I [19]. It includes assessments of bias pre-intervention (biases due to confounding as well as participant selection), at intervention (bias in classification of interventions), and post-intervention (biases due to deviations from the intended interventions, missing data, measurement of outcomes, and selective reporting).

The QUADAS-2 tool [20] can be used to evaluate the risk of bias of diagnostic test accuracy studies. It examines the risk of bias in four broad domains viz. patient selection, index test, reference standard, and flow and timing. Among these, the first three are also evaluated in terms of applicability.

There are specific tools for assessing the quality of environmental health studies. These include tools developed by the Office of Health Assessment and Translation (OHAT) and the Integrated Risk Information System (IRIS) [21]. There are also additional tools specific for animal studies. For example, SYRCLE's tool is an adaptation of the Cochrane Risk of Bias tool, and is used to assess internal validity, addressing selection, performance, detection, attrition and reporting biases [22].
Analyze (Data Extraction and Analysis)

Systematic reviewers prepare data extraction forms (that are not published, although Cochrane reviews present these details) which include the following information from each included study: i) Identification characteristics (authors, source, year); ii) Study characteristics (enrolment criteria, sample size, PICOTS information); iii) Appraisal for bias (using standard tools/checklists); iv) Data reflecting the outcomes specified in PICOTS; and v) Additional notes, if any.
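As an illustration of the structure such a form might take, the following minimal Python sketch defines one hypothetical extraction record; the field names mirror items i) to v) above and are illustrative assumptions, not a prescribed template.

# A minimal, hypothetical data extraction record mirroring items i)-v) above.
# Field names and values are illustrative; review teams define their own templates.
study_record = {
    "identification": {"authors": "Author A, Author B", "source": "Journal X", "year": 2021},
    "study_characteristics": {"enrolment_criteria": "children 1-5 y with condition Y",
                              "sample_size": 120,
                              "picots": {"P": "...", "I": "...", "C": "...", "O": "...", "T": "...", "S": "RCT"}},
    "risk_of_bias": {"tool": "Cochrane RoB", "random_sequence_generation": "low",
                     "allocation_concealment": "unclear"},
    "outcome_data": {"mortality_30d": {"events_intervention": 4, "total_intervention": 60,
                                       "events_control": 9, "total_control": 60}},
    "notes": "Data for one secondary outcome requested from study authors",
}
print(study_record["outcome_data"]["mortality_30d"])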
Data to be analyzed could include descriptive data and quantitative data. Narrative synthesis of the extracted data is helpful to understand the perspectives of the primary studies in terms of the PICO elements. A table highlighting the descriptive characteristics of the included studies is very helpful for readers. Quantitative data are extracted for each outcome measure (specified in the review protocol). Data extraction is also generally done independently by more than one reviewer, with provision to resolve discrepancies. Sometimes, published versions of individual studies lack pieces of data that are important for the review. In such situations, the systematic review authors correspond with study authors to obtain missing data (and record the process).

In intervention reviews, numerical data of outcome measures (from included studies) usually conform to either dichotomous data (expressed as proportions) or continuous data (expressed as mean with standard deviations, or variations of this). Other forms of presentation include median (with interquartile ranges). In diagnostic test reviews, each included study provides information on the number of true positive, false positive, true negative, and false negative test results.

The extracted data may be considered for pooled analysis if there is sufficient data (although there is no strict definition for this), and the data are in a format conducive for pooling. For example, data from a study presenting an outcome as mean (standard deviation) is not amenable for pooling with data from another study presenting the same outcome as median (IQR), unless mathematical conversion techniques are applied to convert medians to means. Likewise, in studies reporting diagnostic tests, if only data on sensitivity and specificity are reported without the numbers from which they are derived, it is difficult to pool them. Such problems can be resolved if systematic review authors have access to the raw data from primary studies, and/or are able to undertake individual patient meta-analysis [23].

Meta-Analysis

The statistical procedure for pooling data from individual studies is called meta-analysis. Meta-analysis presents the estimate of effect from each included study, the relative weight of each study in the pool, and the pooled estimate of effect. The relative weight depends on the variance in the result, which is impacted by the sample size and width of the confidence interval of the effect. In general, studies with less variance (i.e., narrower confidence interval of the effect) have greater relative weight, and studies with large sample sizes and narrow intervals have the greatest weight. Understanding the concept of study weights is important because the pooled estimate of effect is not a mathematical average of the data from individual studies, but a weighted average.
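To make the idea of inverse-variance weighting concrete, the following minimal Python sketch pools six invented log risk ratios under a fixed-effect model; the numbers are made up for illustration and are not taken from any real review (or from the fictitious review in Fig. 1).

import math

# Minimal sketch of inverse-variance (fixed-effect) pooling on the log scale.
# Each tuple is a hypothetical study: (log risk ratio, standard error).
studies = [(-0.60, 0.20), (0.10, 0.15), (-0.45, 0.30),
           (0.05, 0.12), (-0.70, 0.25), (-0.20, 0.18)]

weights = [1 / se**2 for _, se in studies]          # less variance -> more weight
pooled_log_rr = sum(w * y for (y, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

lo, hi = pooled_log_rr - 1.96 * pooled_se, pooled_log_rr + 1.96 * pooled_se
print("Relative weights (%):", [round(100 * w / sum(weights), 1) for w in weights])
print("Pooled RR: %.2f (95%% CI %.2f to %.2f)"
      % (math.exp(pooled_log_rr), math.exp(lo), math.exp(hi)))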
The graphical output of meta-analysis is referred to as a forest plot. Although they may seem intimidating, a step-wise approach (Fig. 1) makes it easier to understand and interpret forest plots. Fig. 1 presents a meta-analysis (from a fictitious systematic review) of six hypothetical RCTs comparing Option A vs Option B for a clinical condition.

Step 1: What is the comparison? This is presented at the top of the forest plot and shows the interventions being compared as well as the outcome.


Step 2: What outcome measure is being compared? Each outcome can be represented by several measures. Each outcome measure is analyzed in a separate forest plot.

Step 3: How is the data presented? Dichotomous data are compared using odds ratio (OR), risk ratio or relative risk (RR), or risk difference (RD). All are valid measures. OR are mathematically purer, but RR are easier to understand. RD can be used to calculate the number needed to treat (NNT). Continuous data are presented as mean difference (MD), or weighted mean difference (WMD), or standardized mean difference (SMD). All measures are presented with confidence intervals (usually 95%, but modifiable).
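As a worked illustration, the short Python sketch below derives RR, OR, RD and NNT, with a large-sample 95% confidence interval for the RR, from one hypothetical trial's 2x2 table; the counts are invented for demonstration.

import math

# Hypothetical 2x2 table from a single trial: events and totals in each arm.
a, n1 = 12, 100   # events and participants, intervention arm
c, n2 = 24, 100   # events and participants, control arm
b, d = n1 - a, n2 - c

rr = (a / n1) / (c / n2)                      # risk ratio
or_ = (a * d) / (b * c)                       # odds ratio
rd = (a / n1) - (c / n2)                      # risk difference
nnt = 1 / abs(rd)                             # number needed to treat

# Large-sample 95% CI for the RR, computed on the log scale
se_log_rr = math.sqrt(1/a - 1/n1 + 1/c - 1/n2)
lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
hi = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"RR {rr:.2f} (95% CI {lo:.2f} to {hi:.2f}), OR {or_:.2f}, "
      f"RD {rd:.2f}, NNT {nnt:.0f}")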
Step 4: Which statistical model is used? There are two statistical models viz. fixed effect (FE) and random effects (RE). The FE model assumes that there is a single common estimate of effect, and all studies aim to estimate that common effect. In contrast, the RE model assumes that there is no single common effect, but a distribution of true effects, which varies from study to study [24]. This model considers heterogeneity among studies in terms of participants, biological characteristics, disease characteristics, measurement tools, etc. Thus, in the FE model, it is assumed that studies do not estimate the true effect because of random error, whereas in the RE model, both random error and heterogeneity affect the pooled estimate of effect. Web Fig. 1 presents the differences between FE and RE models of analysis, using the forest plot presented in Fig. 1.
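The following minimal Python sketch contrasts the two models on the same invented log risk ratios as the earlier pooling sketch, using the DerSimonian-Laird estimator of the between-study variance (tau squared) as one common way of fitting an RE model; it illustrates the principle and is not a description of how any particular software package implements it.

import math

# Hypothetical (log risk ratio, standard error) pairs, as in the earlier sketch.
studies = [(-0.60, 0.20), (0.10, 0.15), (-0.45, 0.30),
           (0.05, 0.12), (-0.70, 0.25), (-0.20, 0.18)]
y = [e for e, _ in studies]
w_fe = [1 / se**2 for _, se in studies]                    # fixed-effect weights

mu_fe = sum(w * e for w, e in zip(w_fe, y)) / sum(w_fe)
q = sum(w * (e - mu_fe) ** 2 for w, e in zip(w_fe, y))     # Cochran Q
df = len(studies) - 1
c = sum(w_fe) - sum(w**2 for w in w_fe) / sum(w_fe)
tau2 = max(0.0, (q - df) / c)                              # DerSimonian-Laird tau^2

w_re = [1 / (se**2 + tau2) for _, se in studies]           # random-effects weights
mu_re = sum(w * e for w, e in zip(w_re, y)) / sum(w_re)
se_re = math.sqrt(1 / sum(w_re))

print("tau^2 = %.3f" % tau2)
print("FE pooled RR = %.2f, RE pooled RR = %.2f (95%% CI %.2f to %.2f)"
      % (math.exp(mu_fe), math.exp(mu_re),
         math.exp(mu_re - 1.96 * se_re), math.exp(mu_re + 1.96 * se_re)))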
Step 5: Examine individual studies. The forest plot shows the outcome data for each study, its effect (with confidence interval), relative weight in the pooled analysis, and a pictorial presentation of this data (which is usually a square whose position represents the effect, size represents the weight, and a horizontal line through the square represents the confidence interval).

Step 6: Examine pooled effect. The pooled effect is presented numerically as well as graphically. It represents a weighted average estimate of effect. The pictorial representation is with a diamond whose center corresponds to the pooled effect, and width represents the confidence interval.

A vertical line in the center of the forest plot represents the line of no effect. In the case of RR and OR, this corresponds to 1.0 and implies that the risk ratio (or odds ratio) is 1.0, confirming the absence of a difference between the groups. For mean differences, the line of no effect corresponds to zero, confirming that there is no difference between the groups. Therefore, it is obvious that confidence intervals whose bounds (limits) are on the same side of the line of no effect suggest a statistically significant result, whereas confidence intervals crossing the line of no effect represent estimates that could lie on either side. No further tests of statistical significance are required; however, some forest plots present additional tests for this. Similarly, narrower confidence intervals suggest more precise estimates, and vice versa.
the groups. For mean differences, the line of no effect methodologically lower quality studies. Lower quality
corresponds to zero, confirming that there is no difference studies are prone to higher risk of bias and tend to over-
between the groups. Therefore, it is obvious that confi- estimate the effect of interventions. Results that are not
dence intervals whose bounds (limits) are on the same side sensitive to the exclusion of lower quality studies (meaning
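A minimal Python sketch of the Q and I² calculation described above, on the same invented log risk ratios used in the earlier sketches (it assumes SciPy is available for the chi-square P value):

import math
from scipy.stats import chi2

# Invented (log risk ratio, standard error) pairs from the earlier pooling sketch.
studies = [(-0.60, 0.20), (0.10, 0.15), (-0.45, 0.30),
           (0.05, 0.12), (-0.70, 0.25), (-0.20, 0.18)]
w = [1 / se**2 for _, se in studies]
mu = sum(wi * y for wi, (y, _) in zip(w, studies)) / sum(w)

q = sum(wi * (y - mu) ** 2 for wi, (y, _) in zip(w, studies))   # Cochran Q
df = len(studies) - 1
i2 = max(0.0, (q - df) / q) * 100                               # I-squared (%)
p = chi2.sf(q, df)                                              # P value for Q

print("Q = %.1f on %d df, I^2 = %.0f%%, P = %.3f" % (q, df, i2, p))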
It may also be worth considering sub-group analysis when significant heterogeneity is evident. Here, studies sharing common characteristics are grouped together and pooled estimates of each sub-group are presented along with the overall estimate. Web Fig. 2 presents an example wherein the studies presented in Web Fig. 1 have been split into two sub-groups based on underlying disease severity. It is to be noted that the outcome presented in Web Fig. 2 is different from that in Web Fig. 1.

It should be remembered that studies could have significant heterogeneity if they were so different as to be non-amenable to pooling in a meta-analysis in the first place.

Authors have the option of undertaking sensitivity analysis of the results of meta-analysis. Here, studies with low(er) methodological quality are excluded from the analysis, and the pooled estimates of effect of only the high-quality studies are examined. This helps to determine how 'sensitive' the pooled estimates are to the exclusion of methodologically lower quality studies. Lower quality studies are prone to higher risk of bias and tend to overestimate the effect of interventions. Results that are not sensitive to the exclusion of lower quality studies (meaning that the overall effect remains unchanged, even if the magnitude changes) are expressed as robust results.
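The sketch below illustrates this kind of sensitivity analysis on invented data: the fixed-effect pooled risk ratio is recomputed after dropping the studies judged to be at high risk of bias, and the two pooled estimates are compared. The risk-of-bias labels and numbers are hypothetical.

import math

# Invented studies: (log risk ratio, standard error, risk-of-bias judgement).
studies = [(-0.60, 0.20, "high"), (0.10, 0.15, "low"), (-0.45, 0.30, "high"),
           (0.05, 0.12, "low"), (-0.70, 0.25, "high"), (-0.20, 0.18, "low")]

def pooled_rr(subset):
    """Inverse-variance fixed-effect pooled risk ratio for (log RR, SE) pairs."""
    w = [1 / se**2 for _, se in subset]
    mu = sum(wi * y for wi, (y, _) in zip(w, subset)) / sum(w)
    return math.exp(mu)

all_studies = [(y, se) for y, se, _ in studies]
low_rob_only = [(y, se) for y, se, rob in studies if rob == "low"]

print("Pooled RR, all studies: %.2f" % pooled_rr(all_studies))
print("Pooled RR, low risk-of-bias studies only: %.2f" % pooled_rr(low_rob_only))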


Fig. 1 Step-wise interpretation of a forest plot.

Step 8: Interpret the forest plot. The above steps facilitate interpretation of the pooled estimate of effect of the interventions being compared for one specific outcome, in terms of the parameter used to present the pooled estimate and the statistical model used to combine the data. Additionally, this is done considering the number of studies contributing to the pooled estimate, total number of participants, their individual characteristics and effects, methodological quality, and degree of heterogeneity.

Publication bias: Despite the best efforts of systematic review authors to include all relevant studies addressing the research question, a review may be hampered by the non-availability of primary studies. Generally, primary studies with positive results (i.e. showing evidence of efficacy of interventions) are more likely to be published than those showing negative results. This can result in publication bias, wherein the publication (or non-publication) of some studies determines the direction or strength of the overall evidence [26]. This is why high quality systematic reviews make tremendous efforts to search for unpublished literature.

There are several methods to assess the probability of publication bias in systematic reviews. The Begg and Mazumdar rank correlation test [27] for publication bias correlates the ranks of effect sizes (of the various studies in the meta-analysis) against the ranks of the variance in the treatment effect.

One of the popular methods to assess publication bias is using funnel plots. This refers to a scatter plot of all the studies in a meta-analysis with effect size on the x-axis and standard error on the y-axis. Ideally, the plot also shows the estimated effect size (with confidence intervals) and the predicted effect size (with confidence intervals). The plot also shows a vertical line that runs through the (adjusted) combined effect and the corresponding lower and upper bounds of the confidence interval. Such a plot visually highlights whether there is asymmetry in the distribution of the included studies, which hints at publication bias. This approach works only where there are more than ten studies in the meta-analysis. The Egger regression method shows "the degree of funnel plot asymmetry as measured by the intercept from regression of standard normal deviates against precision" [28].
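A minimal Python sketch of the Egger approach on invented effect sizes (it assumes SciPy is available for the t-distribution P value): each study's standard normal deviate (effect divided by its standard error) is regressed on its precision (1/standard error), and an intercept far from zero hints at funnel-plot asymmetry.

import math
from scipy.stats import t as t_dist

# Invented (effect size, standard error) pairs; real analyses need roughly 10+ studies.
studies = [(-0.60, 0.20), (0.10, 0.15), (-0.45, 0.30), (0.05, 0.12),
           (-0.70, 0.25), (-0.20, 0.18), (-0.90, 0.35), (-0.05, 0.10),
           (-0.55, 0.28), (-0.15, 0.14)]
x = [1 / se for _, se in studies]        # precision
y = [e / se for e, se in studies]        # standard normal deviate (effect / SE)

n = len(studies)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
intercept = my - slope * mx              # the Egger intercept

residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s2 = residual_ss / (n - 2)
se_intercept = math.sqrt(s2 * (1 / n + mx**2 / sxx))
t_stat = intercept / se_intercept
p = 2 * t_dist.sf(abs(t_stat), n - 2)    # two-sided P for intercept != 0

print("Egger intercept %.2f (P = %.2f); values far from zero suggest asymmetry"
      % (intercept, p))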


When publication bias is suspected, systematic review authors should measure the impact of this on the estimated effect. This can be done using the Duval and Tweedie trim and fill technique [29], which mathematically adjusts the pooled effect, accounting for funnel plot asymmetry.

In reviews showing efficacy of interventions with publication bias, Rosenthal analysis or the 'fail-safe N method' was used to try and identify the number of additional studies (with negative results) that would be needed to make the pooled estimate statistically insignificant [30]. Of course, this depends on making assumptions about the data in unobserved/unpublished studies, hence is itself fraught with bias(es).
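One common formulation of this calculation is shown below as a Python sketch on invented per-study Z values, purely to illustrate the arithmetic; real applications derive the Z values from the included studies.

# One common formulation of Rosenthal's 'fail-safe N' on invented Z values:
# the number of unpublished null studies needed to drag the combined result
# below one-tailed significance at alpha = 0.05 (critical Z = 1.645).
z_values = [2.1, 1.8, 2.5, 1.2, 2.9, 1.6]     # hypothetical per-study Z scores
k = len(z_values)
z_sum = sum(z_values)

fail_safe_n = (z_sum ** 2) / (1.645 ** 2) - k
print("Approximately %d additional null studies would be needed" % int(fail_safe_n))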
Apply (Considerations About Application of the Results of Systematic Reviews)

Both users and producers of systematic reviews have to make value-based judgements on three important issues viz., i) What does the evidence (accessed, assimilated, appraised and analyzed to answer the research question) show; ii) What is the quality of the overall evidence and the level of confidence that can be placed in it; and iii) Can the evidence be considered for use in clinical situations? Careful analysis of these three issues leads to the next and final step in evidence-informed healthcare practice viz., discussion of the evidence with individual patients by healthcare personnel with clinical expertise, to arrive at a shared decision.

Several new initiatives have been introduced to help systematic review users make better sense of the data presented. One of these is the Summary of Findings Table (SoFT) [31], which shows the absolute as well as relative effect of the intervention (including parameters like number needed to treat), the quantity of evidence, and the certainty of available evidence (which is an indirect measure of quality). SoFT are prepared for each of the key outcomes.

Another approach is to grade the evidence quality using an approach popularized by the acronym GRADE (Grading of Recommendations, Assessment, Development and Evaluation) [31]. This approach allows systematic review producers and users to apply semi-objective judgments on factors that may limit the quality of evidence in the review. The key factors used are study limitations (viz., risk of bias), inconsistency (due to heterogeneity), indirectness, imprecision, and publication bias. A detailed explanation of the GRADE approach is outside the scope of this article.

Often the various analyses in systematic reviews do not point in the same direction. A common situation is one wherein some measures of efficacy favor one treatment, whereas other measures do not. Further, sometimes efficacious interventions may be less safe, or there is insufficient data to confirm safety. Therefore, the overall decision on whether to use the intervention may need more information than that reported in a systematic review.

It should be emphasized that evidence-based practice is not the mere application of systematic review findings to patients (healthcare consumers). The best research evidence needs to be integrated with clinical expertise and patient values and preferences, to arrive at a shared decision (between the healthcare recipient and provider). Thus, paradoxically, a shared decision to not apply the findings of a systematic review, on account of issues related to clinical expertise and/or patient values, is also well-aligned with the principles of evidence-based healthcare.

Strengths, limitations and challenges of systematic reviews: Systematic reviews of well-designed and well-conducted studies are the keystone of high-quality research evidence. The information from systematic reviews can be included in the development of evidence-based guidelines and recommendations, health technology assessment, healthcare policy decisions, or health payment/reimbursement decisions. However, systematic reviews only provide research evidence on what works in research settings (referred to as efficacy), but not necessarily on what will work in real-world settings (referred to as effectiveness). The gap between efficacy versus effectiveness, and methods to plug it, are beyond the scope of this article. Second, users of systematic reviews look for answers to decision questions (exemplified by: Shall I use this intervention?) whereas producers of systematic reviews generate answers to research questions (exemplified by: Does this intervention work?). The difference between answers to research questions and decision questions needs to be clearly understood for appropriate use of systematic reviews in clinical practice.

Although systematic reviews include many methodological refinements to reduce bias, they are completely dependent on the quantity and quality of the primary studies available to answer the research question. This can lead to the piquant situation where an excellent systematic review finds limited (or no) evidence, and concludes the need for more research. Although this does not diminish the value of the systematic review, it may sometimes be unhelpful for decision-makers.

Despite attempts to minimize bias, certain forms of bias can creep into systematic reviews. These include publication bias, sponsorship bias (sponsored studies are published more often, especially when they show significant results), and intentional or unintentional emphasis of systematic review authors to highlight only some aspects of the systematic review [32]. Some of these anticipated biases can be addressed by ensuring that the conduct and reporting of the systematic review conform to guidelines established for the purpose.


Key Messages
• Systematic reviews involve the application of scientific methods to reduce bias in review of literature.
• The key components of systematic reviews can be summarized as: Ask, Access, Assimilate, Appraise, Analyze
and Apply.
• Meta-analysis is a statistical tool that provides pooled estimates of effect from the data extracted from individual
studies included in the review.

Such guidelines are exemplified by the PRISMA tool [33,34]. PRISMA is an acronym for 'Preferred Reporting Items for Systematic reviews and Meta-Analyses'. The checklist comprises 27 individual items that systematic review authors are expected to report. It also includes a flow chart summarizing the output of the literature search in terms of studies identified, screened (after removal of duplicate publications), eligible for inclusion, those excluded, and those actually included. Extensions of the original PRISMA tool include PRISMA-P for systematic review protocols, PRISMA-IPD for reviews with individual patient data, and PRISMA-NMA for network meta-analyses.

Finally, users of systematic reviews should not blindly believe everything presented in the review, but learn to critically appraise systematic reviews for validity, significance and applicability. Standard tools and checklists available for the purpose can be very helpful [35]. Last but not the least, readers of Indian Pediatrics may benefit from the Journal Club section wherein systematic reviews have been critically appraised from time to time.

Note: Additional material related to this paper is available with the online version at www.indianpediatrics.net

REFERENCES

1. Sackett D, Strauss S, Richardson W, et al. Evidence-Based Medicine: How to Practice and Teach EBM. 2nd ed. Churchill Livingstone; 2000.
2. Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: Synthesis of best evidence for clinical decisions. Ann Intern Med. 1997;126:376-80.
3. PennState Eberly College of Science. Lesson 4: Bias and Random Error. Accessed October 01, 2020. Available from: https://online.stat.psu.edu/stat509/node/26/
4. Comprehensive Meta-analysis. Accessed October 01, 2020. Available from: https://www.meta-analysis.com/pages/why_do.php?cart=
5. National Institute for Health Research. PROSPERO: International prospective register of systematic reviews. Accessed October 01, 2020. Available from: https://utas.libguides.com/SystematicReviews/Protocol
6. Mathew JL, Singh M. Evidence based child health: Fly but with feet on the ground! Indian Pediatr. 2008;45:95-8.
7. Virginia Commonwealth University. How to conduct a literature review (Health Sciences). Accessed October 01, 2020. Available from: https://guides.library.vcu.edu/health-sciences-lit-review/question
8. http://www.knowledge.scot.nhs.uk/k2atoolkit/source/identify-what-you-need-to-know/spice.aspx
9. Cooke A, Smith D, Booth A. Beyond PICO: The SPIDER tool for qualitative evidence synthesis. Qualitative Health Research. 2012;22:1435-43.
10. Booth A, Noyes J, Flemming K, et al. Formulating questions to explore complex interventions within qualitative evidence synthesis. Accessed October 01, 2020. Available from: https://library.nd.edu.au/evidencebasedpractice/ask/question
11. Infolibrarian. Bibliographic databases. Accessed October 01, 2020. Available from: http://infolibrarian.com/edb.html
12. E-Resources for China Studies. Accessed October 01, 2020. Available from: http://www.wanfangdata.com
13. LILACS, health information from Latin America and the Caribbean countries. Accessed October 01, 2020. Available from: https://lilacs.bvsalud.org/en/
14. OpenGrey. System for information on grey literature in Europe. Accessed October 01, 2020. Available from: http://www.opengrey.eu
15. Sterne JAC, Savović J, Page MJ, et al. RoB 2: A revised tool for assessing risk of bias in randomised trials. BMJ. 2019;366:l4898.
16. Cochrane Training. RevMan 5. Accessed October 01, 2020. Available from: https://training.cochrane.org/online-learning/core-software-cochrane-reviews/revman/revman-5-download
17. Wells GA, Shea B, O'Connell D, et al. The Newcastle-Ottawa Scale (NOS) for assessing the quality of non-randomised studies in meta-analyses. Accessed October 02, 2020. Available from: http://www.ohri.ca/programs/clinical-epidemiology/oxford.asp
18. Stang A. Critical evaluation of the Newcastle-Ottawa scale for the assessment of the quality of nonrandomized studies in meta-analyses. Eur J Epidemiol. 2010;25:603-5.
19. Sterne JA, Hernan MA, Reeves BC, et al. ROBINS-I: A tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016;355:i4919.
20. University of Bristol. QUADAS-2. Accessed October 01, 2020. Available from: https://www.bristol.ac.uk/population-health-sciences/projects/quadas/quadas-2/
21. OHAT Risk of Bias Rating Tool for Human and Animal Studies. Accessed February 27, 2021. Available from: https://ntp.niehs.nih.gov/ntp/ohat/pubs/riskofbiastool_508.pdf
22. Hooijmans CR, Rovers MM, De Vries RB, et al. SYRCLE's risk of bias tool for animal studies. BMC Med Res Methodol. 2014;14:1-9.
23. Cochrane Methods. About IPD meta-analyses. Accessed October 01, 2020. Available from: https://methods.cochrane.org/ipdma/about-ipd-meta-analyses


24. Borenstein M, Hedges L, Rothstein H. Meta-analysis: Fixed effect vs. random effects. Accessed October 01, 2020. Available from: https://www.meta-analysis.com/downloads/M-a_f_e_v_r_e_sv.pdf
25. Heterogeneity in Meta-analysis. Accessed October 01, 2020. Available from: https://www.statsdirect.com/help/meta_analysis/heterogeneity.htm
26. Dalton JE, Bolen SD, Mascha EJ. Publication bias: The elephant in the review. Anesth Analg. 2016;123:812-3.
27. Begg CB, Mazumdar M. Operating characteristics of a rank correlation test for publication bias. Biometrics. 1994;50:1088-101.
28. Egger M, Smith GD, Schneider M, et al. Bias in meta-analysis detected by a simple, graphical test. BMJ. 1997;315:629-34.
29. Duval S, Tweedie R. Trim and fill: A simple funnel-plot-based method of testing and adjusting for publication bias in meta-analysis. Biometrics. 2000;56:455-63.
30. Rosenthal R. The file drawer problem and tolerance for null results. Psychol Bull. 1979;86:638-41.
31. Schünemann HJ, Higgins JPT, Vist GE, et al. Chapter 14: Completing 'Summary of findings' tables and grading the certainty of the evidence. Accessed October 01, 2020. Available from: www.training.cochrane.org/handbook/current/chapter-14
32. Drucker AM, Fleming P, Chan AW. Research techniques made simple: Assessing risk of bias in systematic reviews. J Invest Dermatol. 2016;136:e109-e14.
33. Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA). Accessed October 01, 2020. Available from: http://www.prisma-statement.org
34. Moher D, Liberati A, Tetzlaff J, Altman DG, The PRISMA Group. Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement. PLoS Med. 2009;6:e1000097.
35. Critical Appraisal Skills Programme. 10 questions to help you make sense of a systematic review. Accessed October 01, 2020. Available from: https://casp-uk.net/wp-content/uploads/2018/01/CASP-Systematic-Review-Checklist_2018.pdf
