3 Understanding and Interpreting Systematic Review and Meta-Analysis Results

Cristiano Susin, Alex Nogueira Haas, and Cassiano Kuchenbecker Rösing

Core Message
›› Here, we provide an overview of the methods used to combine the results of several studies. Specifically, we discuss the application and interpretation of meta-analytic methods.

3.1 Introduction

Systematic reviews and meta-analyses have become the de facto gold standard in evidence-based health care. Nevertheless, most health care providers do not have a clear understanding of how systematic reviews and meta-analyses are conducted and how to interpret their results. This fact greatly hinders the application and dissemination of evidence that could have an important impact on population health. Frequently, evidence from systematic reviews reaches mainstream health care only when it is adopted or endorsed by professional associations/societies and governmental bodies. In an evidence-based era, it is interesting to note that some of the journals with higher impact in medicine and dentistry are still based on narrative reviews written by invited authorities. This underlines the fact that most health care providers have trouble understanding one of the most important sources of evidence. In this context, the aim of the present chapter is to provide an overview of the methods used to combine the results of several studies. We will focus on the application and interpretation of meta-analytic methods.

First, we would like to acknowledge that systematic reviews and meta-analyses are not easy topics for most readers. This is especially true for health care providers who focused most of their efforts on learning biology-related subjects instead of mathematical concepts. As a consequence, most researchers do not like statistics-related topics, most professionals do not use them in their appraisal of the medical literature, and the majority of students are not willing to learn them. This is an unfortunate truth with known causes and consequences. Our approach to explaining these concepts will be as intuitive as possible, and we will try to avoid the classic mathematical approach whenever possible.

It is beyond the scope of this chapter to review all steps of a systematic review. Thus, we will assume that the necessary steps to carry out a systematic review have been fulfilled (identification of the need for the review, preparation of a review protocol, identification and selection of the studies, quality assessment and data collection, etc.), and we will focus on the analysis and presentation of the results.

C. Susin (*)
Laboratory for Applied Periodontal and Craniofacial Regeneration, Department of Periodontics and Oral Biology, Medical College of Georgia School of Dentistry, 1459, Laney Walker Blvd, Augusta, GA 30912, USA
e-mail: [email protected]/[email protected]

A.N Haas, C.K Rösing
School of Dentistry, Federal University of Rio Grande do Sul, Rua Ramiro Barcelos, 2492, 90035-003, Porto Alegre, RS, Brazil
e-mail: [email protected]
e-mail: [email protected]

F. Chiappelli et al. (eds.), Evidence-Based Practice: Toward Optimizing Clinical Outcomes, DOI: 10.1007/978-3-642-05025-1_3, © Springer-Verlag Berlin Heidelberg 2010

3.1.1 Example: Studies Characteristics and Descriptive Results

To illustrate this chapter, let us imagine that we are conducting a systematic review about the effect of a new antiviral therapy for recurrent herpes labialis. For simplicity, our main outcome will be reduction in the number of days with pain, i.e., a continuous outcome. Also for simplicity, let us assume that all studies used placebo as the control group.

Most systematic reviews use tables to present the methodological characteristics and outcomes of the selected studies. For example, the success of secondary root canal treatment was investigated in a systematic review published by Ng et al. [7]. The search strategy identified 40 studies, of which 17 were included in the analyses. Table 3.1 describes the methodological characteristics and outcomes of the 17 included studies. The methodological characteristics of the studies help the reader interpret the meta-analysis results. Other study characteristics frequently reported are sample characteristics, randomization method, blinding of patients, therapists, and examiners, follow-up time, and dropout rate.

In addition to the methodological characteristics, most systematic reviews present the original results in descriptive form using tables and graphs. Table 3.2 combines study characteristics and results for our systematic review, describing 12 studies that tested the effect of our new antiviral therapy. The table presents the year of publication, total sample size, source of funding, sample size in each experimental group (n), estimate of the intervention effect (mean), an estimate of the intervention variability (standard deviation, SD), and the p-value.

When only descriptive tables are used to present results, the overall trend of the studies is used to suggest whether a given intervention is better than the standard treatment or no intervention. This approach is very intuitive and does not need any statistical expertise to be conducted. However, as you will see later in this chapter, it can be misleading for several reasons. An overall assessment of Table 3.2 indicates that between 1997 and 2002, mostly small studies were conducted, with large studies being published only in the last 4 years. This finding is consistent with most new therapy studies, since large, costly, time-consuming studies are only conducted after some evidence of positive effect and safety is available. A closer look at the results of the studies shows that within small studies the results are very inconsistent, with a few studies showing large positive or negative effects for the therapy when compared to placebo. In contrast, large studies do not show major differences between experimental groups. As expected, variability is larger in smaller studies due to the sample size effect on standard deviation estimates. Only the first two studies reached statistical significance, and in three other studies somewhat borderline results were found (p ~ 0.10).

3.2 Main Results: Overall Estimates of Effect

The treatment effect could be estimated by calculating an overall mean of the results, simply by summing up the individual results and dividing by the number of studies. This approach, although very intuitive, would not take into consideration the characteristics of the studies, with all studies contributing equally to the overall estimate. Looking at the estimates in Table 3.2, it is obvious that some studies have more precise estimates than others. Factors that may affect the precision of the estimates are various, including sample characteristics, sample size, measurement precision, and reliability. In the meta-analysis framework, sample size is often the most important factor to be taken into consideration. Thus, overall estimates should take into account the sample size, with larger studies contributing more than small studies. Mathematically, this can be accomplished by multiplying each study estimate by the sample size or, in other words, by weighting the estimates according to sample size. The sum of the weighted estimates can then be divided by the total sample size. Table 3.3 shows the weight of each study of our example according to sample size. Using this approach, the mean number of days with pain would be 2.8 days for the treatment group and 3.1 days for the control group. Thus, compared with placebo, the antiviral treatment reduced patients' symptoms by approximately 0.3 days. In essence, this is what is done in a meta-analysis to take into consideration the contribution of each study.

A similar strategy can be used to account not only for the sample size, but also for the variability in the estimates of the original studies. The overall weighted estimate is calculated by multiplying each study estimate by the inverse of the square of the standard error (inverse-variance weighting method), which is highly associated with the sample size of the study. Using this

Table 3.1 Methodological characteristics and outcomes of included studies by Ng et al. [7]

| Study authors | Operator | Design | Recall rate (%) | Sample size | Unit of measure | Assessment of success | Radiographic success criteria | ≥4 years after treatment | Calibration | Reliability test | Statistical analysis |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Grahnen | UG | R | 64 | 502 | Ro | C&R | S | ✓ | – | – | – |
| Engstrom | UG | R | 72 | 153 | T | C&R | L | ✓ | – | – | X2 |
| Selden | Sp | R | 20 | 52 | T | C&R | L | – | – | – | X2 |
| Bergneholtz | UG | R | 66 | 556 | Ro | C&R | S | – | – | ✓ | – |
| Pekruhn | Sp | R | 81 | 36 | T | C&R | S | – | – | – | X2 |
| Molven | UG | R | 50 | 226 | Ro | Ra | S | ✓ | ✓ | ✓ | X2 |
| Allen | – | R | 53 | 315 | T | C&R | S | – | ✓ | – | X2 |
| Sjogren | UG | R | 46 | 267 | Ro | C&R | S | ✓ | ✓ | ✓ | LR |
| Van Nieuwenhuysen | – | R | – | 612 | Ro | C&R | S | – | ✓ | ✓ | X2 |
| Friedman | Sp | C | 78 | 128 | T | C&R | S | – | – | – | X2 |
| Danin | Sp | RCT | 100 | 18 | T | Ra | L | – | ✓ | ✓ | X2 |
| Sundqvist | UG | C | 93 | 50 | T | C&R | S | ✓ | – | – | X2 |
| Chugal | PG | R | 75 | 85 | Ro | Ra | S | ✓ | – | – | LR |
| Hoskinson | Sp | R | 78 | 76 | Ro | C&R | S | ✓ | ✓ | ✓ | GEE |
| Farzaneh | Sp | C | 22 | 103 | T | C&R | S | – | ✓ | ✓ | LR |
| Gorni | PG | C | 94 | 452 | T | C&R | S | ✓ | ✓ | ✓ | M-W |
| Çaliskan | Sp | R | 96 | 86 | T | C&R | S | – | ✓ | ✓ | X2 |

"–" missing information; UG undergraduate students; PG postgraduate students; Sp specialist endodontists; R retrospective study; C prospective cohort study; RCT randomized controlled trial; T teeth; Ro root; C&R combined clinical and radiographic examination; Ra radiographic examination only; S strict criteria; L loose criteria; LR single level logistic regression; GEE generalized estimating equations; X2 chi-square test; M-W Mann-Whitney U-test

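Table 3.2 Description of study characteristics and original results

| Year of publication | Sample size | Source of funding | Treatment n | Treatment mean | Treatment SD | Control n | Control mean | Control SD | p-value |
|---|---|---|---|---|---|---|---|---|---|
| 1997 | 42 | Private | 21 | 2.0 | 1.7 | 21 | 3.9 | 2.1 | 0.003 |
| 1998 | 31 | Private | 16 | 1.8 | 2.4 | 15 | 3.8 | 2.7 | 0.04 |
| 1998 | 44 | Private | 22 | 2.1 | 2.6 | 22 | 3.5 | 2.8 | 0.09 |
| 1999 | 33 | Public | 18 | 3.3 | 2.7 | 15 | 1.8 | 2.4 | 0.11 |
| 2001 | 30 | Private | 14 | 3 | 3.2 | 16 | 2.1 | 2.9 | 0.43 |
| 2001 | 29 | Private | 13 | 2.1 | 2.9 | 16 | 3.2 | 3.2 | 0.35 |
| 2002 | 27 | Public | 13 | 2.9 | 2.9 | 14 | 2.5 | 2.5 | 0.70 |
| 2002 | 31 | Private | 15 | 2.5 | 2.5 | 16 | 3 | 2.9 | 0.61 |
| 2005 | 190 | Public | 96 | 2.9 | 2.1 | 94 | 3.1 | 2.5 | 0.55 |
| 2007 | 80 | Public | 39 | 3.5 | 2.6 | 41 | 3.1 | 2.2 | 0.46 |
| 2007 | 145 | Public | 73 | 3.2 | 2.4 | 72 | 2.9 | 1.9 | 0.41 |
| 2008 | 394 | Public | 198 | 2.8 | 1.7 | 196 | 3.1 | 1.8 | 0.09 |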
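The sample-size weighting described in Section 3.2 can be checked directly against the values in Table 3.2. The following is a minimal Python sketch, not the authors' own calculation; the names `STUDIES` and `weighted_mean` are illustrative choices made here.

```python
# Per-arm data transcribed from Table 3.2:
# (year, n_treatment, mean_treatment, n_control, mean_control)
STUDIES = [
    (1997, 21, 2.0, 21, 3.9),
    (1998, 16, 1.8, 15, 3.8),
    (1998, 22, 2.1, 22, 3.5),
    (1999, 18, 3.3, 15, 1.8),
    (2001, 14, 3.0, 16, 2.1),
    (2001, 13, 2.1, 16, 3.2),
    (2002, 13, 2.9, 14, 2.5),
    (2002, 15, 2.5, 16, 3.0),
    (2005, 96, 2.9, 94, 3.1),
    (2007, 39, 3.5, 41, 3.1),
    (2007, 73, 3.2, 72, 2.9),
    (2008, 198, 2.8, 196, 3.1),
]


def weighted_mean(pairs):
    """Sample-size-weighted mean of (n, mean) pairs."""
    pairs = list(pairs)
    total_n = sum(n for n, _ in pairs)
    return sum(n * mean for n, mean in pairs) / total_n


treatment_mean = weighted_mean((s[1], s[2]) for s in STUDIES)
control_mean = weighted_mean((s[3], s[4]) for s in STUDIES)

# Study weights (%) based on total sample size.
total_n = sum(s[1] + s[3] for s in STUDIES)
size_weights = [100 * (s[1] + s[3]) / total_n for s in STUDIES]

print(f"treatment: {treatment_mean:.1f} days, control: {control_mean:.1f} days")
for (year, n_t, _, n_c, _), weight in zip(STUDIES, size_weights):
    print(year, n_t + n_c, f"{weight:.1f}%")
```

Under these assumptions the weighted means come out at about 2.8 days (treatment) and 3.1 days (control), and the largest study (n = 394) carries roughly 36.6% of the weight.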

approach, the weighted mean difference (WMD) between treatments is 0.26 days in favor of the new antiviral therapy. Table 3.3 also shows the weight attributed to each study according to the inverse-variance method. It is clear that the study published in 2008 dominates the overall estimate not only because it has the largest sample size, but also because it has the greatest precision (smaller standard deviations and narrower confidence intervals). Studies with lower variability receive greater weight and therefore have greater influence in the estimate.

Table 3.3 Study weights according to sample size and inverse-variance methods

| Year of publication | Weight based on sample size (%) | Weight based on inverse-variance method (%) |
|---|---|---|
| 1997 | 3.9 | 4.6 |
| 1998 | 2.9 | 1.9 |
| 1998 | 4.1 | 2.4 |
| 1999 | 3.1 | 2.0 |
| 2001 | 2.8 | 1.3 |
| 2001 | 2.7 | 1.2 |
| 2002 | 2.5 | 1.5 |
| 2002 | 2.9 | 1.7 |
| 2005 | 17.7 | 14.2 |
| 2007 | 7.4 | 5.5 |
| 2007 | 13.5 | 12.4 |
| 2008 | 36.6 | 51.3 |
| Total | 100 | 100 |

Tables and graphs are popular ways of presenting the results of a meta-analysis. Table 3.4 presents the WMD and the 95% confidence interval for each study. The weighted mean provides an estimate and direction of the effect, and the confidence interval provides an assessment of the variability of the estimates. Confidence intervals also indicate the significance of the results: when the interval does not include zero (or 1 when the results are presented as odds ratios), the weighted mean difference is statistically significant.

3.3 Forest Plots

Figure 3.1 is a Forest plot of the results and has essentially the same information presented in Table 3.4. Studies are identified by their year of publication and sample size on the left side of the graph. The WMDs are presented in graphical form, with point estimates presented as dots or short vertical lines and confidence intervals as horizontal lines. The size of the plotting symbol for the estimate is proportional to the weight of each study in the meta-analysis. The actual estimates are also presented on the right side together with the weight of the study. The overall estimate and

Table 3.4 Meta-analysis result using the inverse-variance method

| Year | Sample size | Weighted mean difference | 95% CI lower | 95% CI upper | Weight (%) |
|---|---|---|---|---|---|
| 1997 | 42 | −1.9 | −3.056 | −0.744 | 4.6 |
| 1998 | 31 | −2 | −3.803 | −0.197 | 1.89 |
| 1998 | 44 | −1.4 | −2.997 | 0.197 | 2.41 |
| 1999 | 33 | 1.5 | −0.241 | 3.241 | 2.03 |
| 2001 | 30 | 0.9 | −1.297 | 3.097 | 1.27 |
| 2001 | 29 | −1.1 | −3.323 | 1.123 | 1.24 |
| 2002 | 27 | 0.4 | −1.649 | 2.449 | 1.46 |
| 2002 | 31 | −0.5 | −2.403 | 1.403 | 1.7 |
| 2005 | 190 | −0.2 | −0.857 | 0.457 | 14.21 |
| 2007 | 80 | 0.4 | −0.658 | 1.458 | 5.48 |
| 2007 | 145 | 0.3 | −0.404 | 1.004 | 12.38 |
| 2008 | 394 | −0.3 | −0.646 | 0.046 | 51.33 |
| Pooled weighted mean difference | | −0.257 | −0.504 | −0.009 | 100 |

Significance test of weighted mean difference = 0: p = 0.042
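The inverse-variance pooling behind Table 3.4 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' Stata analysis: the standard error of each mean difference is taken as sqrt(SD_t²/n_t + SD_c²/n_c), which is an assumption that reproduces the per-study intervals in Table 3.4, and the names `STUDIES`, `mean_difference`, and `fixed_effect` are introduced here for illustration.

```python
import math
from statistics import NormalDist

# (year, n_t, mean_t, sd_t, n_c, mean_c, sd_c), transcribed from Table 3.2
STUDIES = [
    (1997, 21, 2.0, 1.7, 21, 3.9, 2.1),
    (1998, 16, 1.8, 2.4, 15, 3.8, 2.7),
    (1998, 22, 2.1, 2.6, 22, 3.5, 2.8),
    (1999, 18, 3.3, 2.7, 15, 1.8, 2.4),
    (2001, 14, 3.0, 3.2, 16, 2.1, 2.9),
    (2001, 13, 2.1, 2.9, 16, 3.2, 3.2),
    (2002, 13, 2.9, 2.9, 14, 2.5, 2.5),
    (2002, 15, 2.5, 2.5, 16, 3.0, 2.9),
    (2005, 96, 2.9, 2.1, 94, 3.1, 2.5),
    (2007, 39, 3.5, 2.6, 41, 3.1, 2.2),
    (2007, 73, 3.2, 2.4, 72, 2.9, 1.9),
    (2008, 198, 2.8, 1.7, 196, 3.1, 1.8),
]


def mean_difference(n_t, mean_t, sd_t, n_c, mean_c, sd_c):
    """Return (difference in means, standard error of the difference)."""
    return mean_t - mean_c, math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)


def fixed_effect(effects):
    """Inverse-variance pooling of (estimate, standard error) pairs."""
    weights = [1 / se**2 for _, se in effects]
    total = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, effects)) / total
    pooled_se = math.sqrt(1 / total)
    return pooled, pooled_se, [100 * w / total for w in weights]


effects = [mean_difference(*study[1:]) for study in STUDIES]
pooled, pooled_se, weights = fixed_effect(effects)

z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
lower = pooled - z_crit * pooled_se
upper = pooled + z_crit * pooled_se
p_value = 2 * (1 - NormalDist().cdf(abs(pooled / pooled_se)))

print(f"WMD {pooled:.3f} (95% CI {lower:.3f} to {upper:.3f}), p = {p_value:.3f}")
```

With the data above, the sketch should land on the pooled row of Table 3.4 (about −0.257, 95% CI −0.504 to −0.009, p about 0.042), with the 2008 study receiving roughly 51% of the weight.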

confidence interval are marked by a diamond. A dotted vertical line is used to present the overall estimate.

Since Table 3.4 and Fig. 3.1 have the same information, most publications present only the Forest plot. Looking at Fig. 3.1, it is easier to observe that the first two studies had a significant large effect in favor of the new therapy, since both estimates are on the left side and their confidence intervals do not include zero. No clear tendency is seen in the next six studies, with half of them favoring therapy and the other half favoring control. It is important to note that the confidence intervals include zero for all of these studies. The overall WMD estimate is clearly dominated by the last study. The last information in Fig. 3.1 is the I-square (I²) statistic. The I² statistic represents the percentage of the total variability in the effect estimates that can be attributed to heterogeneity between studies rather than to chance. The I² statistic varies between 0 and 100% and can be interpreted as follows: low heterogeneity for <50%, moderate heterogeneity for ≥50% to <75%, and high heterogeneity for ≥75%. In this example, the I² statistic is approximately 53%, which indicates moderate heterogeneity. This finding can be explained by the inconsistent results of the small studies published between 1997 and 2002. The accompanying test for heterogeneity is statistically significant, with a p-value of 0.016, further indicating that there is heterogeneity in the results. The only information that the Forest plot does not present is the p-value for the overall estimate (p = 0.042).

3.4 Exploring Heterogeneity

To further explore heterogeneity, let us look into the sample size effect. We stratified the studies into small and large sample sizes. Figure 3.2 presents the Forest plot with estimates for each stratum. Small studies showed a significant effect in favor of the therapy, with antiviral treatment reducing the number of days in pain by 0.8 days (p = 0.01). In contrast, no significant effect was observed in large studies, since the confidence interval includes zero (p = 0.29). An overall test for heterogeneity between small and large studies is borderline significant (p = 0.054). It is interesting to notice that the I² statistic for the small sample size stratum shows moderate heterogeneity (56.8%, p = 0.02), indicating that other factors may further explain these results. We will address this finding later on.

Let us try to explore the heterogeneity of the data even more. Figure 3.3 is the Forest plot using a fixed-effect model stratified by funding source: public or private. For public-funded studies, the WMD is 0.10

Effect of antiviral treatment on pain reduction

Year n WMD (95% CI) Weight (%)

1997 42 −1.90 (−3.06, −0.74) 4.60


1998 31 −2.00 (−3.80, −0.20) 1.89
1998 44 −1.40 (−3.00, 0.20) 2.41
1999 33 1.50 (−0.24, 3.24) 2.03
2001 30 0.90 (−1.30, 3.10) 1.27
2001 29 −1.10 (−3.32, 1.12) 1.24
2002 27 0.40 (−1.65, 2.45) 1.46
2002 31 −0.50 (−2.40, 1.40) 1.70
2005 190 −0.20 (−0.86, 0.46) 14.21
2007 80 0.40 (−0.66, 1.46) 5.48
2007 145 0.30 (−0.40, 1.00) 12.38
2008 394 −0.30 (−0.65, 0.05) 51.33
Overall (I-squared = 52.8%, p = 0.016) −0.26 (−0.50, −0.01) 100.00

[x-axis: WMD from −4 to 4; left of 0: therapy reduces pain; right of 0: therapy does not reduce pain]

Fig. 3.1 Forest plot showing effect estimates and confidence intervals for individual studies and meta-analysis (fixed-effect model)
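The I-squared value reported on the overall line of Fig. 3.1 can be related to Cochran's Q, the weighted sum of squared deviations of the study estimates from the fixed-effect pooled estimate. The sketch below is a hedged illustration, not the authors' code: it assumes the usual definition I² = (Q − df)/Q and the same standard error formula as the inverse-variance weights; the function and variable names are hypothetical.

```python
import math

# (year, n_t, mean_t, sd_t, n_c, mean_c, sd_c), transcribed from Table 3.2
STUDIES = [
    (1997, 21, 2.0, 1.7, 21, 3.9, 2.1),
    (1998, 16, 1.8, 2.4, 15, 3.8, 2.7),
    (1998, 22, 2.1, 2.6, 22, 3.5, 2.8),
    (1999, 18, 3.3, 2.7, 15, 1.8, 2.4),
    (2001, 14, 3.0, 3.2, 16, 2.1, 2.9),
    (2001, 13, 2.1, 2.9, 16, 3.2, 3.2),
    (2002, 13, 2.9, 2.9, 14, 2.5, 2.5),
    (2002, 15, 2.5, 2.5, 16, 3.0, 2.9),
    (2005, 96, 2.9, 2.1, 94, 3.1, 2.5),
    (2007, 39, 3.5, 2.6, 41, 3.1, 2.2),
    (2007, 73, 3.2, 2.4, 72, 2.9, 1.9),
    (2008, 198, 2.8, 1.7, 196, 3.1, 1.8),
]


def mean_difference(n_t, mean_t, sd_t, n_c, mean_c, sd_c):
    """Return (difference in means, standard error of the difference)."""
    return mean_t - mean_c, math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)


def heterogeneity(effects):
    """Return Cochran's Q, its degrees of freedom, and I-squared in percent."""
    weights = [1 / se**2 for _, se in effects]
    total = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, effects)) / total
    q = sum(w * (est - pooled) ** 2 for w, (est, _) in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is truncated at zero when Q is smaller than its degrees of freedom.
    i_squared = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i_squared


effects = [mean_difference(*study[1:]) for study in STUDIES]
q, df, i_squared = heterogeneity(effects)
print(f"Q = {q:.2f} on {df} df, I-squared = {i_squared:.1f}%")
```

For the 12 studies this gives an I-squared close to the 52.8% shown in Fig. 3.1. The p-value printed next to it in the figure comes from comparing Q with a chi-square distribution on 11 degrees of freedom, which is left out here to keep the sketch free of third-party libraries.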

(p = 0.46, not reported in the Forest plot) in favor of the antiviral therapy, whereas for private-funded studies the antiviral therapy reduces pain, on average, by 1.29 days (p < 0.001, not reported in the Forest plot). The heterogeneity test between groups is highly significant, also indicating that funding is an important source of variability.

3.5 Fixed-Effects vs. Random-Effects

So far we have found two possible sources of heterogeneity, indicating that these studies may have different characteristics. We have used what is called a fixed-effect model to combine studies in a meta-analysis. When heterogeneity between studies exists, a different approach, called a random-effects model, should be used. The fixed-effect model assumes that the meta-analysis overall estimate represents the same underlying effect and that differences between studies are due to sampling error, i.e., individual studies have the same single effect. The random-effects model includes an estimate of between-study variability, assuming that the meta-analysis overall estimate is the mean effect around which individual studies have a normal distribution. In other words, random-effects models assume that the intervention is not the only explanation for the overall estimate, allowing for other factors (such as study design, sample characteristics, and treatment differences) to partly explain the results.

In practice, random-effects models yield more conservative estimates, with higher p-values and wider confidence intervals than fixed-effect models. Disparities in the overall WMD between treatments can also be seen due to the fact that random-effects models give greater weight to smaller studies than fixed-effect

Effect of antiviral treatment on pain reduction

Year n WMD (95% CI) Weight (%)
Small
1997 42 −1.90 (−3.06, −0.74) 4.60
1998 31 −2.00 (−3.80, −0.20) 1.89
1998 44 −1.40 (−3.00, 0.20) 2.41
1999 33 1.50 (−0.24, 3.24) 2.03
2001 30 0.90 (−1.30, 3.10) 1.27
2001 29 −1.10 (−3.32, 1.12) 1.24
2002 27 0.40 (−1.65, 2.45) 1.46
2002 31 −0.50 (−2.40, 1.40) 1.70
Subtotal (I-squared = 56.8%, p = 0.023) −0.80 (−1.41, −0.20) 16.59

Large
2005 190 −0.20 (−0.86, 0.46) 14.21
2007 80 0.40 (−0.66, 1.46) 5.48
2007 145 0.30 (−0.40, 1.00) 12.38
2008 394 −0.30 (−0.65, 0.05) 51.33
Subtotal (I-squared = 10.5%, p = 0.340) −0.15 (−0.42, 0.12) 83.41

Heterogeneity between groups: p = 0.054


Overall (I-squared = 52.8%, p = 0.016) −0.26 (−0.50, −0.01) 100.00

[x-axis: WMD from −4 to 4; left of 0: therapy reduces pain; right of 0: therapy does not reduce pain]

Fig. 3.2 Forest plot showing effect estimates and confidence intervals for individual studies and meta-analysis stratified by study
sample size (fixed-effect model)

models (Table 3.5). As can be seen in Fig. 3.4, the overall WMD using the random-effects model is slightly different from the estimate using the fixed-effect model (−0.30 vs. −0.26). However, the major difference can be seen in the confidence interval, which now includes zero and is therefore associated with a nonsignificant p-value (p = 0.22). In other words, when the heterogeneity is taken into consideration in the calculation of the estimates, no overall significant differences were observed between treatments with regard to pain reduction. This is in contrast to the conclusion that could be drawn from the fixed-effect model. Sometimes researchers present the Forest plot of the fixed-effect model and include the random-effects estimate for comparison (Fig. 3.5). This may be confusing for the inexperienced reader because different models may have opposite results. As a rule of thumb, if the I² statistic is moderate or high (≥50%) and the p-value is significant (p < 0.05), a random-effects model should be used. In our example, a random-effects model is warranted.

3.6 Meta-Regression

Stratified analysis is an important tool for detecting heterogeneity, but it has the same drawbacks as subgroup analysis in clinical trials. A better approach to evaluate between-group differences is to use a meta-regression

Effect of antiviral treatment on pain reduction

Year n WMD (95% CI) Weight (%)
Public
1999 33 1.50 (−0.24, 3.24) 2.03
2002 27 0.40 (−1.65, 2.45) 1.46
2005 190 −0.20 (−0.86, 0.46) 14.21
2007 80 0.40 (−0.66, 1.46) 5.48
2007 145 0.30 (−0.40, 1.00) 12.38
2008 394 −0.30 (−0.65, 0.05) 51.33
Subtotal (I-squared = 28.0%, p = 0.225) −0.10 (−0.37, 0.17) 86.90

Private
1997 42 −1.90 (−3.06, −0.74) 4.60
1998 31 −2.00 (−3.80, −0.20) 1.89
1998 44 −1.40 (−3.00, 0.20) 2.41
2001 30 0.90 (−1.30, 3.10) 1.27
2001 29 −1.10 (−3.32, 1.12) 1.24
2002 31 −0.50 (−2.40, 1.40) 1.70
Subtotal (I-squared = 19.2%, p = 0.288) −1.29 (−1.98, −0.61) 13.10

Heterogeneity between groups: p = 0.001


Overall (I-squared = 52.8%, p = 0.016) −0.26 (−0.50, −0.01) 100.00

[x-axis: WMD from −4 to 4; left of 0: therapy reduces pain; right of 0: therapy does not reduce pain]

Fig. 3.3 Forest plot showing effect estimates and confidence intervals for individual studies and meta-analysis stratified by source
of funding (fixed-effect model)
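The stratified analyses of Figs. 3.2 and 3.3 amount to pooling each subgroup separately and then testing whether the subgroup estimates differ. The following Python sketch illustrates this with fixed-effect pooling. Two points are assumptions made here and not stated in the chapter: the cutoff of 50 participants used to separate small from large studies, and the between-group test computed as the weighted squared distance of the subgroup estimates from the overall estimate. All names are illustrative.

```python
import math
from statistics import NormalDist

# (year, funding, n_t, mean_t, sd_t, n_c, mean_c, sd_c), transcribed from Table 3.2
STUDIES = [
    (1997, "private", 21, 2.0, 1.7, 21, 3.9, 2.1),
    (1998, "private", 16, 1.8, 2.4, 15, 3.8, 2.7),
    (1998, "private", 22, 2.1, 2.6, 22, 3.5, 2.8),
    (1999, "public", 18, 3.3, 2.7, 15, 1.8, 2.4),
    (2001, "private", 14, 3.0, 3.2, 16, 2.1, 2.9),
    (2001, "private", 13, 2.1, 2.9, 16, 3.2, 3.2),
    (2002, "public", 13, 2.9, 2.9, 14, 2.5, 2.5),
    (2002, "private", 15, 2.5, 2.5, 16, 3.0, 2.9),
    (2005, "public", 96, 2.9, 2.1, 94, 3.1, 2.5),
    (2007, "public", 39, 3.5, 2.6, 41, 3.1, 2.2),
    (2007, "public", 73, 3.2, 2.4, 72, 2.9, 1.9),
    (2008, "public", 198, 2.8, 1.7, 196, 3.1, 1.8),
]


def pool(rows):
    """Fixed-effect (inverse-variance) WMD and total weight for a set of rows."""
    weights, diffs = [], []
    for _, _, n_t, m_t, sd_t, n_c, m_c, sd_c in rows:
        weights.append(1 / (sd_t**2 / n_t + sd_c**2 / n_c))
        diffs.append(m_t - m_c)
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, diffs)) / total, total


def stratify(rows, key):
    """Pool each of two subgroups and test for a difference between them."""
    overall, _ = pool(rows)
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    if len(groups) != 2:
        raise ValueError("this sketch only handles exactly two subgroups")
    results = {name: pool(members) for name, members in groups.items()}
    q_between = sum(w * (est - overall) ** 2 for est, w in results.values())
    # With two subgroups Q has 1 degree of freedom, so its chi-square
    # p-value can be computed from the standard normal distribution.
    p_between = 2 * (1 - NormalDist().cdf(math.sqrt(q_between)))
    return results, p_between


# Hypothetical cutoff: fewer than 50 participants counts as a small study.
by_size, p_size = stratify(STUDIES, lambda r: "small" if r[2] + r[5] < 50 else "large")
by_funding, p_funding = stratify(STUDIES, lambda r: r[1])

for label, (results, p) in {"size": (by_size, p_size), "funding": (by_funding, p_funding)}.items():
    for name, (estimate, _) in results.items():
        print(label, name, f"WMD = {estimate:.2f}")
    print(label, f"between-group p = {p:.3f}")
```

Under these assumptions the subgroup estimates should match the subtotals in the two figures (about −0.80 and −0.15 by sample size, −0.10 and −1.29 by funding), with between-group p-values near 0.054 and 0.001.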

or meta-analysis regression. For those familiar with regression analysis, a meta-regression could be thought of as a regression analysis performed at the study level, i.e., using study-level data instead of individual-level data. Table 3.6 shows the result of the random-effects meta-regression using sample size and source of funding as explanatory variables. As observed before in the stratified analysis, both factors were significant sources of heterogeneity, and funding seems to have the biggest impact on the effect estimates.

Similarly, Pavia et al. [8] conducted a meta-analysis of observational studies about the contribution of fruit and vegetable intakes to the occurrence of oral cancer. They included 16 studies and found that each portion of fruit consumed per day significantly reduced the risk of oral cancer by 49% (pooled odds ratio 0.51, 95% CI 0.40–0.65). They found significant heterogeneity across studies. To additionally explore heterogeneity, a meta-regression analysis was performed. This meta-regression analysis examined the effect of certain variables, such as quality score, type of cancers included, citrus fruit and green vegetable consumption, population studied (men, women, or both), and time interval for dietary recall, on the role of fruit or vegetable consumption in the risk of oral cancer. Table 3.7 shows the results for the meta-regression analysis, demonstrating that the lower risk of oral cancer associated with fruit consumption was significantly influenced by the type of fruit consumed and by the time interval of dietary recall.

Table 3.5 Study weights according to fixed- and random-effects methods

| Year of publication | Fixed-effect model (%) | Random-effects model (%) |
|---|---|---|
| 1997 | 4.6 | 9.24 |
| 1998 | 1.9 | 5.16 |
| 1998 | 2.4 | 6.15 |
| 1999 | 2.0 | 5.43 |
| 2001 | 1.3 | 3.78 |
| 2001 | 1.2 | 3.71 |
| 2002 | 1.5 | 4.23 |
| 2002 | 1.7 | 4.75 |
| 2005 | 14.2 | 14.73 |
| 2007 | 5.5 | 10.14 |
| 2007 | 12.4 | 14.13 |
| 2008 | 51.3 | 18.54 |

3.7 Funnel Plots and Publication Bias

Another important issue in meta-analysis is publication bias. Publication bias arises from the fact that studies with statistically significant results are more likely to be reported by authors and accepted for publication. Consequently, there is a risk that meta-analysis estimates are positively biased. It should be remembered that some publication bias might be diminished during the search strategy, looking for grey literature (unpublished data). Graphical and statistical methods have been developed to assist in the identification of publication bias. The Funnel plot is the most commonly used graphic to investigate bias in meta-analysis. Funnel plots are scatterplots of each study treatment effect (i.e., WMD) by a measure of the study precision (i.e., standard error of the treatment effect). Figure 3.6 shows the Funnel plot of the present data. The WMD is plotted in the horizontal axis (x-axis) and the standard error

Effect of antiviral treatment on pain reduction

Year n WMD (95% CI) Weight (%)

1997 42 −1.90 (−3.06, −0.74) 9.24

1998 31 −2.00 (−3.80, −0.20) 5.16

1998 44 −1.40 (−3.00, 0.20) 6.15


1999 33 1.50 (−0.24, 3.24) 5.43

2001 30 0.90 (−1.30, 3.10) 3.78

2001 29 −1.10 (−3.32, 1.12) 3.71

2002 27 0.40 (−1.65, 2.45) 4.23

2002 31 −0.50 (−2.40, 1.40) 4.75

2005 190 −0.20 (−0.86, 0.46) 14.73

2007 80 0.40 (−0.66, 1.46) 10.14

2007 145 0.30 (−0.40, 1.00) 14.13

2008 394 −0.30 (−0.65, 0.05) 18.54

Overall (I-squared = 52.8%, p = 0.016) −0.30 (−0.77, 0.17) 100.00

NOTE: Weights are from random effects analysis

[x-axis: WMD from −4 to 4; left of 0: therapy reduces pain; right of 0: therapy does not reduce pain]

Fig. 3.4 Forest plot showing effect estimates and confidence intervals for individual studies and meta-analysis stratified using a
random-effects model
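The random-effects weights and overall estimate shown in Fig. 3.4 can be approximated with the DerSimonian and Laird method, which the "D+L" label in Fig. 3.5 suggests was used. The Python sketch below is offered under that assumption and is not the authors' Stata code; the function names are illustrative. It estimates the between-study variance (tau squared) from Cochran's Q and adds it to each study's own variance before weighting.

```python
import math
from statistics import NormalDist

# (year, n_t, mean_t, sd_t, n_c, mean_c, sd_c), transcribed from Table 3.2
STUDIES = [
    (1997, 21, 2.0, 1.7, 21, 3.9, 2.1),
    (1998, 16, 1.8, 2.4, 15, 3.8, 2.7),
    (1998, 22, 2.1, 2.6, 22, 3.5, 2.8),
    (1999, 18, 3.3, 2.7, 15, 1.8, 2.4),
    (2001, 14, 3.0, 3.2, 16, 2.1, 2.9),
    (2001, 13, 2.1, 2.9, 16, 3.2, 3.2),
    (2002, 13, 2.9, 2.9, 14, 2.5, 2.5),
    (2002, 15, 2.5, 2.5, 16, 3.0, 2.9),
    (2005, 96, 2.9, 2.1, 94, 3.1, 2.5),
    (2007, 39, 3.5, 2.6, 41, 3.1, 2.2),
    (2007, 73, 3.2, 2.4, 72, 2.9, 1.9),
    (2008, 198, 2.8, 1.7, 196, 3.1, 1.8),
]


def mean_difference(n_t, mean_t, sd_t, n_c, mean_c, sd_c):
    """Return (difference in means, variance of the difference)."""
    return mean_t - mean_c, sd_t**2 / n_t + sd_c**2 / n_c


def dersimonian_laird(effects):
    """Random-effects pooling of (estimate, variance) pairs."""
    w = [1 / var for _, var in effects]
    sum_w = sum(w)
    fixed = sum(wi * est for wi, (est, _) in zip(w, effects)) / sum_w
    q = sum(wi * (est - fixed) ** 2 for wi, (est, _) in zip(w, effects))
    df = len(effects) - 1
    scale = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / scale)  # between-study variance
    w_star = [1 / (var + tau2) for _, var in effects]
    sum_star = sum(w_star)
    pooled = sum(wi * est for wi, (est, _) in zip(w_star, effects)) / sum_star
    pooled_se = math.sqrt(1 / sum_star)
    return pooled, pooled_se, tau2, [100 * wi / sum_star for wi in w_star]


effects = [mean_difference(*study[1:]) for study in STUDIES]
pooled, pooled_se, tau2, weights = dersimonian_laird(effects)

z_crit = NormalDist().inv_cdf(0.975)
lower = pooled - z_crit * pooled_se
upper = pooled + z_crit * pooled_se
p_value = 2 * (1 - NormalDist().cdf(abs(pooled / pooled_se)))

print(f"tau^2 = {tau2:.3f}")
print(f"WMD {pooled:.2f} (95% CI {lower:.2f} to {upper:.2f}), p = {p_value:.2f}")
```

With the 12 studies this should give an overall WMD of about −0.30 with a 95% CI of roughly −0.77 to 0.17, and weights close to those in Table 3.5. The largest study drops from about 51% to about 18.5% of the total weight, which is why the random-effects estimate is less dominated by it.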

Effect of antiviral treatment on pain reduction

Year n WMD (95% CI) Weight (%) (I-V)

1997 42 −1.90 (−3.06, −0.74) 4.60


1998 31 −2.00 (−3.80, −0.20) 1.89
1998 44 −1.40 (−3.00, 0.20) 2.41
1999 33 1.50 (−0.24, 3.24) 2.03
2001 30 0.90 (−1.30, 3.10) 1.27
2001 29 −1.10 (−3.32, 1.12) 1.24
2002 27 0.40 (−1.65, 2.45) 1.46
2002 31 −0.50 (−2.40, 1.40) 1.70
2005 190 −0.20 (−0.86, 0.46) 14.21
2007 80 0.40 (−0.66, 1.46) 5.48
2007 145 0.30 (−0.40, 1.00) 12.38
2008 394 −0.30 (−0.65, 0.05) 51.33
I-V Overall (I-squared = 52.8%, p = 0.016) −0.26 (−0.50, −0.01) 100.00
D+L Overall −0.30 (−0.77, 0.17)

[x-axis: WMD from −4 to 4; left of 0: therapy reduces pain; right of 0: therapy does not reduce pain]

Fig. 3.5 Forest plot showing effect estimates and confidence intervals for individual studies and meta-analysis stratified using a
fixed-effect and random-effects model

Table 3.6 Meta-regression analysis using study sample size and source of funding as explanatory variables

| Variable | Coefficient | SE | p-value |
|---|---|---|---|
| Sample size | −0.66 | 0.28 | 0.04 |
| Funding | −2.06 | 0.52 | 0.003 |

is plotted in the vertical axis (y-axis). Larger studies will often concentrate in the upper part of the Funnel plot because their standard error is generally smaller than that of smaller studies. For instance, the standard error for the three largest studies (sample sizes: 145, 190 and 394) ranged between 0.18 and 0.36, whereas that for the three smallest (sample sizes: 27, 29 and 30) ranged between 1.05 and 1.13. A vertical solid line representing the overall WMD provides a reference for symmetry. A similar number of studies should be on both sides of this line. In our example, the same number of studies is plotted on the left and right sides of this reference line. The two dotted diagonal lines represent the 95% confidence limits for the Funnel plot. In the absence of bias and heterogeneity, 95% of the studies should lie within the confidence limit lines. Two out of 12 (17%) studies are outside the confidence limits, further providing evidence of heterogeneity and perhaps bias.

A clear example of an asymmetric Funnel plot using our data could be created by removing four studies with effects favoring the control treatment. In Fig. 3.7, it can be easily seen that small studies (generally shown on the bottom part of the plot) with negative results are missing, which may indicate that they were never reported or accepted for publication.

Formal approaches to test Funnel plot asymmetry have been proposed and implemented in statistical software. The Egger test uses a linear regression to draw a straight-line relationship between the WMD

and standard errors. When this regression line is plotted in the Funnel plot, it will appear as a vertical line, as can be seen in Fig. 3.8. If asymmetry is present, the regression line will be plotted away from the vertical and the slope of the line will indicate the direction of bias (Fig. 3.9). The Egger's bias coefficient provides a measure of the asymmetry. The Egger's bias coefficient for Fig. 3.8 is small and far from statistical significance (coefficient: −0.18, SE: 0.75, p = 0.81), indicating little evidence of bias. On the other hand, the bias coefficient for Fig. 3.9 is larger, with a p-value approaching significance (coefficient: −1.42, SE: 0.81, p = 0.13), indicating some evidence of bias. A negative bias coefficient indicates that the effect estimated from the smaller studies is smaller than the effect estimated from the larger studies. This may be interpreted as evidence that small sample size studies with nonsignificant results were not included in the meta-analysis. In general, bias tests for Funnel plots have low power; thus, their p-values should be interpreted carefully, especially when fewer than ten studies are included in the analysis.

Even though we have focused on publication bias, Funnel plot asymmetry can be explained by other reasons such as poor study quality, true study heterogeneity, and chance. As discussed before, study quality can be addressed during study selection and quality assessment, and heterogeneity can be evaluated by stratified analysis and meta-regression.

Table 3.7 Meta-regression conducted by Pavia et al. [8]

| Variable | Regression coefficient | SE | p |
|---|---|---|---|
| Fruit | | | |
| Only citrus fruit (no = 0; yes = 1) | −1.53 | 0.56 | 0.006 |
| Dietary recall (lifelong = 0, 2 years = 1, 1 year = 2) | 0.63 | 0.3 | 0.04 |
| Population studied | | | |
| Men and women = 0 | 0 | – | – |
| Only women = 1 | −1.06 | 1.07 | 0.33 |
| Only men = 2 | 0.01 | 0.56 | 0.99 |
| Study quality score (low = 0, high = 1) | −0.32 | 0.54 | 0.56 |
| Vegetables | | | |
| Only green vegetables | −0.23 | 0.43 | 0.59 |
| Dietary recall (lifelong = 0, 2 years = 1, 1 year = 2) | −0.03 | 0.21 | 0.88 |
| Population studied | | | |
| Men and women = 0 | 0 | – | – |
| Only women = 1 | 1.14 | 0.73 | 0.12 |
| Only men = 2 | 0.25 | 0.64 | 0.69 |
| Study quality score (low = 0, high = 1) | 0.23 | 0.47 | 0.63 |

[Plot title: Funnel plot with pseudo 95% confidence limits. x-axis: weighted mean difference, −3 to 3. y-axis: standard error of estimates, 0 to 1.5.]

Fig. 3.6 Funnel plot of the weighted mean difference (WMD) against its standard error showing a symmetric distribution of studies
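The two visual checks described for the Funnel plot, symmetry around the overall WMD and the number of studies outside the pseudo 95% confidence limits, can also be done numerically. The sketch below is a hedged illustration that assumes the limits are drawn at the fixed-effect pooled WMD plus or minus 1.96 standard errors; the variable names are introduced here and do not come from the chapter.

```python
import math

# (year, n_t, mean_t, sd_t, n_c, mean_c, sd_c), transcribed from Table 3.2
STUDIES = [
    (1997, 21, 2.0, 1.7, 21, 3.9, 2.1),
    (1998, 16, 1.8, 2.4, 15, 3.8, 2.7),
    (1998, 22, 2.1, 2.6, 22, 3.5, 2.8),
    (1999, 18, 3.3, 2.7, 15, 1.8, 2.4),
    (2001, 14, 3.0, 3.2, 16, 2.1, 2.9),
    (2001, 13, 2.1, 2.9, 16, 3.2, 3.2),
    (2002, 13, 2.9, 2.9, 14, 2.5, 2.5),
    (2002, 15, 2.5, 2.5, 16, 3.0, 2.9),
    (2005, 96, 2.9, 2.1, 94, 3.1, 2.5),
    (2007, 39, 3.5, 2.6, 41, 3.1, 2.2),
    (2007, 73, 3.2, 2.4, 72, 2.9, 1.9),
    (2008, 198, 2.8, 1.7, 196, 3.1, 1.8),
]

# One funnel-plot point per study: (year, WMD on the x-axis, SE on the y-axis).
points = [
    (year, m_t - m_c, math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c))
    for year, n_t, m_t, sd_t, n_c, m_c, sd_c in STUDIES
]

# Fixed-effect pooled WMD: the vertical reference line of the plot.
total_weight = sum(1 / se**2 for _, _, se in points)
pooled = sum(wmd / se**2 for _, wmd, se in points) / total_weight

left = sum(1 for _, wmd, _ in points if wmd < pooled)
right = sum(1 for _, wmd, _ in points if wmd > pooled)

# A study lies outside the pseudo 95% limits when its WMD is more than
# 1.96 standard errors away from the pooled estimate.
outside = [year for year, wmd, se in points if abs(wmd - pooled) > 1.96 * se]

print(f"left of reference line: {left}, right: {right}")
print(f"outside the limits: {len(outside)} of {len(points)}", outside)
```

For the present data this counts six studies on each side of the reference line and two studies outside the limits, in line with the description of Fig. 3.6 in the text.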
[Plot title: Funnel plot with pseudo 95% confidence limits. x-axis: weighted mean difference, −3 to 3. y-axis: standard error of estimates, 0 to 1.5.]

Fig. 3.7 Funnel plot of the WMD against its standard error showing an asymmetric distribution of studies (four studies with effects favoring the control treatment were removed)

[Plot title: Funnel plot with pseudo 95% confidence limits. x-axis: weighted mean difference, −3 to 3. y-axis: standard error of estimates, 0 to 1.5.]

Fig. 3.8 Funnel plot of the WMD against its standard error and Egger regression line
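The Egger bias coefficient can be sketched as an ordinary least-squares regression. The version below uses a common formulation of the test, in which each study's standardized effect (WMD divided by its standard error) is regressed on its precision (1 divided by the standard error) and the intercept is read as the bias coefficient. Whether the chapter's software uses exactly this formulation is an assumption, and the function name `egger` is illustrative. The p-value is omitted because it requires a t distribution, which is not in the Python standard library.

```python
import math

# (year, n_t, mean_t, sd_t, n_c, mean_c, sd_c), transcribed from Table 3.2
STUDIES = [
    (1997, 21, 2.0, 1.7, 21, 3.9, 2.1),
    (1998, 16, 1.8, 2.4, 15, 3.8, 2.7),
    (1998, 22, 2.1, 2.6, 22, 3.5, 2.8),
    (1999, 18, 3.3, 2.7, 15, 1.8, 2.4),
    (2001, 14, 3.0, 3.2, 16, 2.1, 2.9),
    (2001, 13, 2.1, 2.9, 16, 3.2, 3.2),
    (2002, 13, 2.9, 2.9, 14, 2.5, 2.5),
    (2002, 15, 2.5, 2.5, 16, 3.0, 2.9),
    (2005, 96, 2.9, 2.1, 94, 3.1, 2.5),
    (2007, 39, 3.5, 2.6, 41, 3.1, 2.2),
    (2007, 73, 3.2, 2.4, 72, 2.9, 1.9),
    (2008, 198, 2.8, 1.7, 196, 3.1, 1.8),
]


def mean_difference(n_t, mean_t, sd_t, n_c, mean_c, sd_c):
    """Return (difference in means, standard error of the difference)."""
    return mean_t - mean_c, math.sqrt(sd_t**2 / n_t + sd_c**2 / n_c)


def egger(effects):
    """Return the Egger bias coefficient (regression intercept) and its SE."""
    x = [1 / se for _, se in effects]      # precision
    y = [est / se for est, se in effects]  # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    se_intercept = math.sqrt(residual_var * (1 / n + mean_x**2 / sxx))
    return intercept, se_intercept


effects = [mean_difference(*study[1:]) for study in STUDIES]
bias, bias_se = egger(effects)
print(f"Egger bias coefficient = {bias:.2f} (SE {bias_se:.2f})")
```

Applied to all 12 studies, the intercept and its standard error should come out close to the coefficient of −0.18 and SE of 0.75 quoted in the text for the complete data set.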

[Plot title: Funnel plot with pseudo 95% confidence limits. x-axis: weighted mean difference, −3 to 3. y-axis: standard error of estimates, 0 to 1.5.]

Fig. 3.9 Funnel plot of the WMD against its standard error and Egger regression line (four studies with effects favoring the control treatment were removed)

3.8 Exploring Influential Studies

Sometimes a single study has a great impact on the estimates. Table 3.8 shows the WMD and 95% confidence intervals for the meta-analysis when one study is omitted at a time. Among the studies that showed a large positive effect for the antiviral therapy, the first study, published in 1997, has the greatest impact on the WMD. Omitting this study from the meta-analysis would change the WMD from −0.26 days to −0.18 days. A similar but opposite effect would be observed if the 2006 study were omitted. In this case, the WMD would change from −0.26 days to −0.34 days. The impact of a single study on the overall estimate depends on its effect size and sample size. The study with the largest influence on the confidence intervals (i.e., precision of the estimate) is the study published in 2007, owing to its large sample size. Excluding this study would widen the confidence interval by approximately 40%. The search for very influential studies should be done with caution, and more attention should be paid to influential small studies.

Table 3.8 Meta-analysis results after omitting one study at a time
Omitted study (year)    Sample size    Weighted mean difference    95% CI lower    95% CI upper
1997    42     −0.18    −0.43    0.08
1998    31     −0.22    −0.47    0.03
1999    44     −0.23    −0.48    0.02
1998    33     −0.29    −0.54    −0.04
2001    30     −0.27    −0.52    −0.02
2001    29     −0.25    −0.50    0.00
2002    27     −0.27    −0.52    −0.02
2002    31     −0.25    −0.50    0.00
2005    190    −0.27    −0.53    0.00
2008    80     −0.29    −0.55    −0.04
2006    145    −0.34    −0.60    −0.07
2007    394    −0.21    −0.57    0.14
Pooled weighted mean difference when all studies are included    −0.26    −0.50    −0.01
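The leave-one-out procedure behind Table 3.8 can be sketched in a few lines. The example below assumes a fixed-effect inverse-variance model and takes the means, standard deviations, and group sizes from the Forest plot in Fig. 3.10. Studies are identified by their position and total sample size rather than by year, and the function names are our own; this is an illustration, not the Stata routine used for the chapter.

```python
import math

# (total sample size, therapy mean, SD, n, control mean, SD, n), as listed in Fig. 3.10
STUDIES = [
    (42, 2.0, 1.7, 21, 3.9, 2.1, 21),
    (31, 1.8, 2.4, 16, 3.8, 2.7, 15),
    (44, 2.1, 2.6, 22, 3.5, 2.8, 22),
    (33, 3.3, 2.7, 18, 1.8, 2.4, 15),
    (29, 2.1, 2.9, 13, 3.2, 3.2, 16),
    (30, 3.0, 3.2, 14, 2.1, 2.9, 16),
    (27, 2.9, 2.9, 13, 2.5, 2.5, 14),
    (31, 2.5, 2.5, 15, 3.0, 2.9, 16),
    (190, 2.9, 2.1, 96, 3.1, 2.5, 94),
    (145, 3.2, 2.4, 73, 2.9, 1.9, 72),
    (80, 3.5, 2.6, 39, 3.1, 2.2, 41),
    (394, 2.8, 1.7, 198, 3.1, 1.8, 196),
]


def pooled_wmd(studies):
    """Fixed-effect inverse-variance pooled mean difference with 95% CI."""
    sum_w = 0.0
    sum_wy = 0.0
    for _, m1, sd1, n1, m2, sd2, n2 in studies:
        diff = m1 - m2                          # therapy minus control
        variance = sd1 ** 2 / n1 + sd2 ** 2 / n2
        weight = 1.0 / variance
        sum_w += weight
        sum_wy += weight * diff
    estimate = sum_wy / sum_w
    se = math.sqrt(1.0 / sum_w)
    return estimate, estimate - 1.96 * se, estimate + 1.96 * se


def leave_one_out(studies):
    """Re-run the meta-analysis omitting each study in turn."""
    results = []
    for i, study in enumerate(studies):
        remaining = studies[:i] + studies[i + 1:]
        results.append((study[0],) + pooled_wmd(remaining))
    return results


if __name__ == "__main__":
    print("all studies: %.2f (%.2f, %.2f)" % pooled_wmd(STUDIES))
    for size, wmd, low, high in leave_one_out(STUDIES):
        print("omit n=%-3d  %.2f (%.2f, %.2f)" % (size, wmd, low, high))
```

Comparing the width of each leave-one-out interval with the width of the overall interval shows directly which study contributes most to the precision of the estimate.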

3.9 The Cochrane Collaboration Forest Plot

We have used Stata [9] to perform this meta-analysis due to personal preference, but other software and statistical packages can be used with minor differences in the results. The Cochrane Collaboration provides the software Review Manager [5] for preparing systematic reviews and meta-analyses. The Forest plot generated by this software is presented in Fig. 3.10, which is very similar to Fig. 3.1.

3.10 Standardized Mean Differences

We have focused in this chapter on WMDs because they are more intuitive and easier to understand. With respect to continuous outcomes, the standardized mean difference can be used instead of the WMD when studies have measured the outcome in different units. However, standardized mean differences are usually difficult to interpret because these measures are not directly related to everyday outcomes. In this case, the reader should look for the interpretation given to the results by the authors. Usually, standardized mean differences can be presented as the proportion of patients benefiting from the intervention, or a measure of the minimal important difference can be provided to assist the reader. As a rule of thumb, standardized mean differences ≥0.7 may be considered large effects. For our data, the standardized mean difference would be −0.11 (95% confidence interval: −0.23 to 0.01, p = 0.07) using a fixed-effect model, and −0.12 (95% confidence interval: −0.32 to 0.08, p = 0.24) using a random-effects model (Table 3.9). These results indicate a small effect of the antiviral therapy, but they are difficult to translate into practical terms. Several methods to calculate the standardized mean difference have been proposed, such as the Glass, Cohen, and Hedges methods.

3.11 Dichotomous Outcomes

Similar meta-analysis methods can be used for dichotomous (odds ratios and risk ratios), ordinal (indices and scales), counts and rates (number of events), and time-to-event data (survival). We will briefly present below some differences with regard to dichotomous outcomes because they are frequently reported in the medical and dental fields. In addition to the inverse-variance method already discussed for continuous data, three other methods are available for meta-analysis of dichotomous outcomes: the Mantel–Haenszel and Peto methods for fixed-effect models and the DerSimonian and Laird method for random-effects models. The Mantel–Haenszel method is frequently used for fixed-effect models and is the default method in several statistical programs. The Forest plot is also used to present the results, with minor differences. Odds ratios and risk

Study or Subgroup    Therapy (Mean, SD, Total)    Control (Mean, SD, Total)    Weight    Mean difference IV, Fixed, 95% CI
1997 42     2, 1.7, 21       3.9, 2.1, 21     4.6%     −1.90 (−3.06, −0.74)
1998 31     1.8, 2.4, 16     3.8, 2.7, 15     1.9%     −2.00 (−3.80, −0.20)
1998 44     2.1, 2.6, 22     3.5, 2.8, 22     2.4%     −1.40 (−3.00, 0.20)
1999 33     3.3, 2.7, 18     1.8, 2.4, 15     2.0%     1.50 (−0.24, 3.24)
2001 29     2.1, 2.9, 13     3.2, 3.2, 16     1.2%     −1.10 (−3.32, 1.12)
2001 30     3, 3.2, 14       2.1, 2.9, 16     1.3%     0.90 (−1.30, 3.10)
2002 27     2.9, 2.9, 13     2.5, 2.5, 14     1.5%     0.40 (−1.65, 2.45)
2002 31     2.5, 2.5, 15     3, 2.9, 16       1.7%     −0.50 (−2.40, 1.40)
2005 190    2.9, 2.1, 96     3.1, 2.5, 94     14.2%    −0.20 (−0.86, 0.46)
2007 145    3.2, 2.4, 73     2.9, 1.9, 72     12.4%    0.30 (−0.40, 1.00)
2007 80     3.5, 2.6, 39     3.1, 2.2, 41     5.5%     0.40 (−0.66, 1.46)
2008 394    2.8, 1.7, 198    3.1, 1.8, 196    51.3%    −0.30 (−0.65, 0.05)
Total (95% CI)    538    538    100.0%    −0.26 (−0.50, −0.01)

Heterogeneity: Chi² = 23.29, df = 11 (p = 0.02); I² = 53%
Test for overall effect: Z = 2.03 (p = 0.04)
[Plot: horizontal axis, mean difference (−4 to 4); values to the left of 0 favor experimental, values to the right favor control.]

Fig. 3.10 Forest plot using the Cochrane Collaboration software (fixed-effect model)
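The summary lines reported under the Forest plot in Fig. 3.10 can be checked from the study-level data. The sketch below assumes the usual inverse-variance formulas: Cochran's Q as the weighted sum of squared deviations from the pooled estimate, I² = 100 × (Q − df)/Q, and a two-sided normal test for the overall effect. The function and variable names are our own, and the p value of the heterogeneity test is not computed because it requires the chi-squared distribution.

```python
import math

# (therapy mean, SD, n, control mean, SD, n) for the 12 trials in Fig. 3.10
TRIALS = [
    (2.0, 1.7, 21, 3.9, 2.1, 21), (1.8, 2.4, 16, 3.8, 2.7, 15),
    (2.1, 2.6, 22, 3.5, 2.8, 22), (3.3, 2.7, 18, 1.8, 2.4, 15),
    (2.1, 2.9, 13, 3.2, 3.2, 16), (3.0, 3.2, 14, 2.1, 2.9, 16),
    (2.9, 2.9, 13, 2.5, 2.5, 14), (2.5, 2.5, 15, 3.0, 2.9, 16),
    (2.9, 2.1, 96, 3.1, 2.5, 94), (3.2, 2.4, 73, 2.9, 1.9, 72),
    (3.5, 2.6, 39, 3.1, 2.2, 41), (2.8, 1.7, 198, 3.1, 1.8, 196),
]


def fixed_effect_summary(trials):
    """Pooled mean difference, Cochran's Q, I-squared, and overall Z test."""
    effects = [m1 - m2 for m1, _, _, m2, _, _ in trials]
    weights = [1.0 / (sd1 ** 2 / n1 + sd2 ** 2 / n2)
               for _, sd1, n1, _, sd2, n2 in trials]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    se = math.sqrt(1.0 / sum_w)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(trials) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    z = abs(pooled) / se
    p_overall = math.erfc(z / math.sqrt(2.0))   # two-sided normal p value
    return {"pooled": pooled, "q": q, "df": df, "i2": i2,
            "z": z, "p_overall": p_overall}


if __name__ == "__main__":
    s = fixed_effect_summary(TRIALS)
    print("WMD = %.2f, Q = %.2f (df = %d), I2 = %.0f%%, Z = %.2f, p = %.2f"
          % (s["pooled"], s["q"], s["df"], s["i2"], s["z"], s["p_overall"]))
```

An I² above 50% together with a borderline overall p value is the kind of pattern that usually prompts a comparison with the random-effects result.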

Table 3.9 Meta-analysis results using the standardized mean difference instead of the weighted mean difference
Year    Sample size    Standardized mean difference    95% CI lower    95% CI upper    Weight (%)
1997    42     −1.00    −1.64    −0.35    3.50
1998    31     −0.79    −1.52    −0.05    2.70
1999    44     −0.52    −1.12    0.08     4.00
1998    33     0.58     −0.12    1.28     2.95
2001    30     0.30     −0.43    1.02     2.78
2001    29     −0.36    −1.10    0.38     2.66
2002    27     0.15     −0.61    0.90     2.53
2002    31     −0.18    −0.89    0.52     2.90
2005    190    −0.09    −0.37    0.20     17.88
2008    80     0.17     −0.27    0.61     7.50
2006    145    0.14     −0.19    0.46     13.62
2007    394    −0.17    −0.37    0.03     36.97
Fixed-effect model       −0.11    −0.23    0.01    100.00
Random-effects model     −0.12    −0.32    0.08    100.00
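To make the standardized mean differences in Table 3.9 more concrete, the sketch below computes the Cohen, Hedges, and Glass versions for the first study (therapy: mean 2.0, SD 1.7, n = 21; control: mean 3.9, SD 2.1, n = 21, as listed in Fig. 3.10). The table does not state which method was used; the Cohen formula with its commonly used standard error is assumed here because it gives values close to those reported. Function names are our own.

```python
import math


def cohen_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (difference divided by the pooled SD) and its standard error."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                          / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    n = n1 + n2
    se = math.sqrt(n / (n1 * n2) + d ** 2 / (2 * (n - 2)))
    return d, se


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Hedges' g: Cohen's d multiplied by a small-sample correction factor."""
    d, _ = cohen_d(m1, sd1, n1, m2, sd2, n2)
    correction = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    return d * correction


def glass_delta(m1, m2, sd_control):
    """Glass's delta: difference divided by the control-group SD only."""
    return (m1 - m2) / sd_control


if __name__ == "__main__":
    d, se = cohen_d(2.0, 1.7, 21, 3.9, 2.1, 21)
    print("Cohen d  = %.2f (95%% CI %.2f to %.2f)"
          % (d, d - 1.96 * se, d + 1.96 * se))
    print("Hedges g = %.2f" % hedges_g(2.0, 1.7, 21, 3.9, 2.1, 21))
    print("Glass    = %.2f" % glass_delta(2.0, 3.9, 2.1))
```

The three methods differ only in the standard deviation used as the denominator and in the small-sample correction, so they tend to agree closely for large studies and diverge for small ones.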

ratios are frequently transformed using a natural log scale to facilitate analysis and presentation of the results (this transformation makes the scale symmetric). Thus, the horizontal axis of the Forest plot generally uses this scale, which may be misleading for the inexperienced reader. The same change in scale occurs for the Funnel plot. To test for Funnel plot symmetry in dichotomous data, Harbord et al., Peters et al., and Rücker et al. proposed alternative tests to the Egger test. Nevertheless, the same principles and interpretation of the results are still valid.

Needleman et al. [6] published a Cochrane review about guided tissue regeneration (GTR) for periodontal infra-bony defects compared to open flap debridement (control). The main outcome was clinical attachment gain, which was dichotomized using a cut-off point of two sites gaining less than 2 mm of attachment. The Forest plot below was adapted from their study to illustrate an analysis of a dichotomous outcome with the Mantel–Haenszel method to pool the results across studies (Fig. 3.11). Results from 5 out of 6 studies favored GTR, but only one (the study by Tonetti 1998) found a statistically significant difference compared to the control treatment. The meta-analysis demonstrated a final risk ratio of 0.54, indicating that the use of GTR for periodontal infra-bony defects significantly reduces by 46% the chance of having ≥2 sites gaining less than 2 mm. Additionally, it can be seen that they found some heterogeneity (I² = 44%) and, consequently, a random-effects model was applied.

3.12 Concluding Remarks

Before concluding this chapter, we would like to acknowledge that some of the concepts and statistics presented here have been simplified in order to improve understanding for a broader audience. Readers with a greater statistical background or who are planning to conduct a meta-analysis are encouraged to look for more specialized information on this subject [1–5]. An updated list of books and websites is provided in the references. We would also like to acknowledge that the data sometimes violated some statistical assumptions. These minor violations were necessary in order to build an interesting dataset that could be used to show several important steps in meta-analysis.

Systematic reviews and meta-analyses are an integral part of evidence-based health care practice. In this

Study    GTR n/N    Flap n/N    Weight (%)    Risk ratio M-H, random, 95% CI
Cortellini 1995    1/30     2/15     5.3     0.25 (0.02−2.54)
Cortellini 1996    0/24     2/12     3.4     0.10 (0.01−2.01)
Cortellini 2001    10/55    17/54    27.5    0.58 (0.29−1.15)
Mayfield 1998      10/18    11/20    31.2    1.01 (0.29−1.15)
Tonetti 1998       11/69    22/67    28.9    0.49 (0.26−0.92)
Zucchelli 2002     0/30     7/30     3.7     0.07 (0.00−1.12)
Total                                100.0   0.54 (0.31−0.96)

Total events: 32 (GTR), 61 (flap)
Heterogeneity: I² = 44%
Test for overall effect: p = 0.036
[Plot: horizontal axis, risk ratio on a log scale (0.2, 0.5, 1, 2, 5); values to the left of 1 favour GTR, values to the right favour control (flap).]

Fig. 3.11 Forest plot adapted from Needleman et al. [6]
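The individual risk ratios in Fig. 3.11 illustrate why dichotomous outcomes are analyzed on the natural log scale: the confidence interval is symmetric around log(RR) and becomes asymmetric only after transforming back. The sketch below assumes the standard large-sample formula for the standard error of a log risk ratio and uses the event counts shown in the figure. The function name is our own, and studies with zero events in one arm would need a continuity correction that is not handled here.

```python
import math


def risk_ratio(events_t, n_t, events_c, n_c):
    """Risk ratio with a 95% CI computed on the natural log scale."""
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR); undefined when either arm has zero events.
    se_log = math.sqrt(1.0 / events_t - 1.0 / n_t
                       + 1.0 / events_c - 1.0 / n_c)
    log_rr = math.log(rr)
    low = math.exp(log_rr - 1.96 * se_log)
    high = math.exp(log_rr + 1.96 * se_log)
    return rr, low, high


if __name__ == "__main__":
    # Event counts from Fig. 3.11 (GTR events/total, flap events/total).
    for label, counts in [("Cortellini 2001", (10, 55, 17, 54)),
                          ("Tonetti 1998", (11, 69, 22, 67))]:
        rr, low, high = risk_ratio(*counts)
        print("%-16s RR = %.2f (%.2f to %.2f)" % (label, rr, low, high))
    # On the log scale, halving and doubling the risk are equally far from 0.
    print(math.log(0.5), math.log(2.0))
```

A risk ratio of 0.5 and a risk ratio of 2 sit at equal distances from the line of no effect on a log axis, which is why the tick marks in the figure are not evenly spaced.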

context, we hope that this chapter will encourage more health care professionals to read and apply the evidence contained in systematic reviews and meta-analyses in their daily professional lives. Readers are also encouraged to remain updated, since new developments are likely to occur over the years.

As a final message, we would like to call the reader’s attention to the fact that we are approaching, at least in some areas of medicine and dentistry, a limit of how much information can be extracted from the current body of scientific evidence. Recent systematic reviews and meta-analyses have often been based on few studies of questionable quality, yielding inconclusive results. Perhaps it is time to stop being creative with our systematic reviews and time to produce new and better evidence.

References

1. Borenstein M, Hedges LV, Higgins JPT, Rothstein HR (eds) (2009) Introduction to meta-analysis. Wiley, New York, 450 pp
2. Centre for Reviews and Dissemination at University of York (2009) Systematic reviews – CRD’s guidance for undertaking reviews in health care, 3rd edn. Centre for Reviews and Dissemination: York Publishing Services Ltd, York, 282 pp
3. Egger M, Smith GD, Altman D (2001) Systematic reviews in health care: meta-analysis in context. BMJ Books, London, 512 pp
4. Hartung J, Knapp G, Sinha BK (2008) Statistical meta-analysis with applications. Wiley, Hoboken, 248 pp
5. Higgins JPT, Green S (eds) (2008) Cochrane handbook for systematic reviews of interventions. Wiley, Chichester, 672 pp
6. Needleman I, Worthington HV, Giedrys-Leeper E, Tucker R (2009) Guided tissue regeneration for periodontal infra-bony defects (Cochrane Review). In: The Cochrane Library, Issue 1
7. Ng YL, Mann V, Gulabivala K (2008) Outcome of secondary root canal treatment: a systematic review of the literature. Int Endod J 41(12):1026–1046
8. Pavia M, Pileggi C, Nobile CG, Angelillo IF (2006) Association between fruit and vegetable consumption and oral cancer: a meta-analysis of observational studies. Am J Clin Nutr 83(5):1126–1134
9. Sterne J (ed) (2009) Meta-analysis: an updated collection from the Stata Journal. Stata Press, College Station, 259 pp

Websites

The Cochrane Collaboration. http://www.cochrane.org/
The Cochrane Oral Health Group. http://www.ohg.cochrane.org/
The Centre for Reviews and Dissemination. http://www.york.ac.uk/inst/crd/index.htm
Comprehensive meta-analysis. http://www.meta-analysis.com/
The QUOROM statement (Quality of Reporting of Meta-analyses). http://www.consort-statement.org/
The GRADE working group (Grading of Recommendations Assessment, Development and Evaluation). http://www.gradeworkinggroup.org/
