A Comparison of Nine Scales To Detect
GLOSSARY
AUC = area under the curve; BDI = Beck Depression Inventory; CESD-R = Center for Epidemiologic Studies Depression Rating
Scale–Revised; GDS = Geriatric Depression Scale; H&Y = Hoehn and Yahr; HAM-D-17 = 17-item Hamilton Depression Rating
Scale; IDS-C = Inventory of Depressive Symptoms–Clinician; IDS-SR = Inventory of Depressive Symptoms–Patient; MADRS =
Montgomery-Åsberg Depression Rating Scale; MMSE = Mini-Mental State Examination; MOOD-PD = Methods of Optimal
Depression Detection in Parkinson’s Disease; NPV = negative predictive value; PD = Parkinson disease; PHQ-9 = Patient Health
Questionnaire-9; PPV = positive predictive value; ROC = receiver operating characteristic; SCID = Structured Clinical Interview
for DSM Disorders; UPDRS = Unified Parkinson’s Disease Rating Scale.
Depressive syndromes affect an estimated 40% of patients with Parkinson disease (PD).1 However, depressive syndromes in PD are often unrecognized or inadequately treated.2 Routine use of depressive symptom rating scales may improve detection of depression in PD. In 2009, the US Preventive Services Task Force recommended use of depression scales in primary care but did not recommend a specific scale.3 Similarly, no one scale is recommended in PD.4
Depression screening tools should be both sensitive and specific to the broad differential diagnosis of depressed mood in PD.5,6 Although not a substitute for a diagnostic evaluation, scales
Supplemental data at www.neurology.org
From the Food and Drug Administration (J.R.W., S.R.G.), Silver Spring, MD; Departments of Psychiatry and Behavioral Sciences (J.R.W., E.S.H.,
S.L., J.T.L., R.L.M., J.P., G.P., P.R., L.M.) and Neurology (S.R.G., S.G., H.W., L.M.), Johns Hopkins University School of Medicine, Baltimore,
MD; Department of Psychiatry (K.A.), University of Maryland Medical Center, Baltimore; Mental Health Care Line (A.L.B., L.M.), Michael E.
DeBakey Veterans Affairs Medical Center, Houston, TX; Menninger Department of Psychiatry and Behavioral Sciences and Department of
Neurology (A.L.B., L.M.), Baylor College of Medicine, Houston, TX; Parkinson’s and Movement Disorder Center of Maryland (S.R.G., S.G.),
Baltimore, MD; Department of Psychiatry (J.T.L.), Georgetown University Medical Center, Washington, DC; and Mental Health Service (J.T.L.),
Veterans Affairs Medical Center, Washington, DC.
Study funding: Funding information is provided at the end of the article.
Disclosure: Author disclosures are provided at the end of the article.
BDI = Beck Depression Inventory; CESD-R = Center for Epidemiologic Studies Depression Rating Scale–Revised; GDS-30 =
30-item Geriatric Depression Scale; IDS-SR = Inventory of Depressive Symptoms–Patient; MMSE = Mini-Mental
State Examination; PD = Parkinson disease; PHQ-9 = Patient Health Questionnaire-9.
positive predictive value (PPV), and negative predictive value (NPV).6 Indices were evaluated at cutoffs defined as the maximum sum of sensitivity and specificity for each scale. The maximum sum of specificity and sensitivity was chosen because these measures are not affected by prevalence rates for depression, unlike the PPV and NPV.6 Internal reliability was measured by the Cronbach α.6
Analyses were conducted using STATA statistical software (version 9.0). Between-group differences were evaluated using t tests and χ² tests as appropriate. χ² tests, as implemented by the ROCCOMP command, were used to evaluate between- and within-scale AUC differences. Within-scale analyses stratified the ROC curves by gender, H&Y stage (<2.5 vs ≥2.5), MMSE score (30–28 vs 27–26 vs 25–24), and tertiles of age, education (years), PD symptom duration (years), and UPDRS Motor score. Repeated-measures analysis of variance compared self-reported scale completion times. For post hoc pairwise AUC comparisons, α = 0.001 was chosen a priori. For all other analyses, α = 0.05 was chosen a priori.
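The cutoff rule described above (the maximum sum of sensitivity and specificity) and the prevalence dependence of the PPV can be illustrated with a minimal Python sketch. It is not the study's STATA analysis: the scores and diagnoses are made-up values, and the function names (`indices_at_cutoff`, `best_cutoff`, `ppv_at_prevalence`) are hypothetical. It assumes that a score at or above the cutoff counts as a positive screen.

```python
from dataclasses import dataclass


@dataclass
class CutoffIndices:
    cutoff: int
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def indices_at_cutoff(scores, depressed, cutoff):
    """Treat score >= cutoff as screen-positive and compare with the reference diagnosis."""
    pairs = list(zip(scores, depressed))
    tp = sum(1 for s, d in pairs if s >= cutoff and d)
    fn = sum(1 for s, d in pairs if s < cutoff and d)
    fp = sum(1 for s, d in pairs if s >= cutoff and not d)
    tn = sum(1 for s, d in pairs if s < cutoff and not d)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    npv = tn / (tn + fn) if tn + fn else float("nan")
    return CutoffIndices(cutoff, tp / (tp + fn), tn / (tn + fp), ppv, npv)


def best_cutoff(scores, depressed):
    """Return the indices at the cutoff with the largest sensitivity + specificity."""
    candidates = sorted(set(scores))
    return max(
        (indices_at_cutoff(scores, depressed, c) for c in candidates),
        key=lambda r: r.sensitivity + r.specificity,
    )


def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Bayes' rule: PPV for fixed sensitivity and specificity at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Made-up illustrative data (NOT study data): scale scores and reference diagnoses.
scores = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
depressed = [False, False, False, True, False, False, True, True, True, True]

best = best_cutoff(scores, depressed)
print(best)

# Same sensitivity and specificity, two prevalence rates: only the PPV moves.
ppv_high = ppv_at_prevalence(0.8, 0.9, 0.40)
ppv_low = ppv_at_prevalence(0.8, 0.9, 0.10)
print(round(ppv_high, 3), round(ppv_low, 3))
```

In this toy example the PPV falls as prevalence falls while sensitivity and specificity are held fixed, which is the property the cutoff rule relies on.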
Major depression > No active depression (all scales: p < 0.001). Nonmajor depression > No active depression (17-item
Hamilton Depression Rating Scale [HAM-D-17], Inventory of Depressive Symptoms–Clinician [IDS-C]: p < 0.001; Beck
Depression Inventory-II [BDI-II], Inventory of Depressive Symptoms–Patient [IDS-SR]: p < 0.002; Montgomery-Åsberg
Depression Rating Scale [MADRS]: p < 0.005; 30-item Geriatric Depression Scale [GDS-30]: p > 0.011; Center for
Epidemiologic Studies Depression Rating Scale–Revised [CESD-R]: p > 0.128; Patient Health Questionnaire-9 [PHQ-9]:
p > 0.291; Unified Parkinson’s Disease Rating Scale Depression [UPDRS-Dep]: p > 0.382). Major depression > Nonmajor
depression (MADRS, PHQ-9, UPDRS-Dep: p < 0.001; HAM-D-17, IDS-C: p < 0.002; GDS-30: p < 0.006; CESD-R: p <
0.009; BDI-II: p < 0.046; IDS-SR: p < 0.060). CI = confidence interval.
against a consensus panel depression diagnosis. All scales performed better than chance (AUC > 0.50) in detecting depressive disorders and were not influenced by patient characteristics. The GDS-30, with its favorable psychometric properties, short administration time, and lack of copyright protection, should have broad appeal as a depression screening tool in PD; however, the BDI-II and clinician-rated scales tested also had strong psychometric properties and may be useful for depression screening.
Although all scales performed better than chance, not all scales discriminated depressed patients equally well. Groupwise tests of the AUC revealed an inequality across all scales and all self-reported scales, but not across clinician-rated scales. In subsequent pairwise comparisons, all scales had similar AUCs except the CESD-R and UPDRS Depression.
Examining the scale questions provided a potential explanation for the AUC of the UPDRS Depression but not that of the CESD-R.17,35 The UPDRS Depression was not developed as a single-item self-report scale but rather as a clinician-rated scale.17 As such, its responses are not written in lay terms. For example, a patient may not understand differences between “sustained depression” and “sustained depression with vegetative symptoms.” In addition, a single item may not adequately assess all relevant phenomena. For example, the UPDRS Depression assumes that vegetative symptoms cannot occur without dysphoria and does not query for anhedonia.
An explanation for the AUC difference between the CESD-R and MADRS is not readily apparent, because the CESD-R has questions similar to those of other scales studied.9,10,17,22,26,34–36 However, the observed AUC difference is unlikely to be a type I error, because there was also a trend for differences between CESD-R and BDI-II, HAM-D-17, and IDS-C (p < 0.01).
Most scales were sensitive to differences in symptom severity between major and nonmajor depression; however, the IDS-SR did not differentiate nonmajor from major depression and the CESD-R, PHQ-9, and UPDRS-Depression did not differentiate nonmajor depression from no active depression. Lack of sensitivity to depression severity suggests that further study is needed before the IDS-SR, CESD-R, PHQ-9, or UPDRS Depression is used to assess minor depression. These results also question whether
[Table rows: BDI-II; CESD-R; GDS-30; IDS-SR; PHQ-9; UPDRS Depression; HAM-D-17; IDS-C; MADRS]
Abbreviations: AUC = area under the curve; BDI-II = Beck Depression Inventory-II; CESD-R = Center for Epidemiologic
Studies Depression Rating Scale–Revised; GDS-30 = 30-item Geriatric Depression Scale; HAM-D-17 = 17-item Hamilton
Depression Rating Scale; IDS-C = Inventory of Depressive Symptoms–Clinician; IDS-SR = Inventory of Depressive
Symptoms–Patient; MADRS = Montgomery-Åsberg Depression Rating Scale; MOOD-PD = Methods of Optimal Depression
Detection in Parkinson’s Disease; NPV = negative predictive value; PHQ-9 = Patient Health Questionnaire-9; PPV = positive
predictive value; UPDRS = Unified Parkinson’s Disease Rating Scale.
a The cutoff points that maximized the sum of sensitivity and specificity are presented for comparison with other studies, not as a recommendation for a cutoff score to be used in clinical practice.
b Results from Reijnders et al.40 and Naarding et al.12 were not included in the table because they reported on an expanded sample first reported by Leentjens et al.11
the aforementioned scales are appropriate to monitor treatment response; studies assessing sensitivity to change are needed to address this issue.
This study provided novel information on the time needed to complete self-report depression scales in PD. The CESD-R, GDS-30, PHQ-9, and UPDRS Depression took most subjects less than 3 minutes to complete, whereas the BDI-II took approximately 5 minutes and the IDS-SR took approximately 10 minutes. Scale length does not explain these differences because all the self-report scales except for the UPDRS Depression are composed of 20 to 30 questions.9,10,17,22,26,34–36 Differing response options are a more likely explanation. The BDI-II and IDS-SR are rated in terms of symptom severity, whereas the other scales rate the presence or frequency of symptoms.9,10,17,22,26,34–36 In addition, patients may have difficulty comprehending the IDS-SR, because study staff noted that subjects often asked clarifying questions when completing this scale.
The AUC and other psychometric indices for the GDS-30, HAM-D-17, MADRS, and UPDRS Depression were similar to those previously reported in PD, but cutoff scores based on the maximum sum of sensitivity and specificity differed across studies.11,12,14–16,24 In addition, cutoff scores were generally lower than suggested cutoff scores for primary care patients with major depression.34,35,38,39 This study is the first to evaluate the BDI-II, CESD-R, IDS-C, and IDS-SR in PD. One study evaluated the
BDI-II | CESD-R | GDS-30 | IDS-SR | PHQ-9 | UPDRS-Depression | HAM-D-17 | IDS-C
CESD-R | 0.01
Abbreviations: AUC = area under the curve; BDI-II = Beck Depression Inventory-II; CESD-R = Center for Epidemiologic
Studies Depression Rating Scale–Revised; GDS-30 = 30-item Geriatric Depression Scale; HAM-D-17 = 17-item Hamilton
Depression Rating Scale; IDS-C = Inventory of Depressive Symptoms–Clinician; IDS-SR = Inventory of Depressive
Symptoms–Patient; PHQ-9 = Patient Health Questionnaire-9; UPDRS-Depression = Unified Parkinson’s Disease Rating Scale.
a p Values for pairwise χ²(1) tests for AUC values are listed in table 2. A priori statistical significance was set at α = 0.001.
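As a reading aid for the AUC comparisons in this table, the following Python sketch shows one common way to compute an AUC: the proportion of depressed/nondepressed patient pairs in which the depressed patient has the higher score, with ties counted as one half. It uses made-up data and a hypothetical helper name (`auc`), and it does not reproduce the χ² test from the STATA ROCCOMP command used in the study. It also counts how many pairwise comparisons nine scales generate.

```python
from itertools import combinations


def auc(scores, depressed):
    """Fraction of (depressed, nondepressed) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, d in zip(scores, depressed) if d]
    neg = [s for s, d in zip(scores, depressed) if not d]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up illustrative data (NOT study data).
scores = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
depressed = [False, False, False, True, False, False, True, True, True, True]

informative_auc = auc(scores, depressed)
# A scale that gives every patient the same score carries no information:
uninformative_auc = auc([5] * len(depressed), depressed)
print(informative_auc, uninformative_auc)

# Nine scales compared two at a time.
scales = ["BDI-II", "CESD-R", "GDS-30", "IDS-SR", "PHQ-9",
          "UPDRS Depression", "HAM-D-17", "IDS-C", "MADRS"]
n_pairs = len(list(combinations(scales, 2)))
print(n_pairs)
```

The uninformative scale lands at exactly 0.50, the chance level against which all nine scales were judged, and the pair count gives a sense of why a stricter α was set for the post hoc pairwise comparisons.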
PHQ-9 as a diagnostic instrument in PD but not for depression screening, as used in this study.25
Sampling and methodologic differences may explain the variability in reported cutoff scores. One possibility is that this community-based sample had less psychopathology compared with the previously studied tertiary care samples. Thus, lower cutoff scores may be a function of a scale score distribution whose mean is shifted to the left. Different diagnostic methods are another potential source of variability. In the absence of an objective diagnostic test for depression, the gold standard is the clinician’s diagnosis; therefore, this and previous studies used psychiatrists’ diagnosis of depression according to DSM-IV-TR criteria as the standard. However, application of DSM-IV-TR criteria in PD has limits that can affect diagnostic reproducibility.5 In particular, the decision to attribute symptoms to PD or to a mood disorder and the thresholds used to decide when a sign or symptom is clinically relevant might vary among examiners. For this reason, the decision to exclude or include somatic items in the evaluation of patients with PD is controversial. An NIH workgroup recommended an inclusive approach, as used in this study, for symptom assessment and diagnosis to enhance sensitivity and reliability of diagnostic criteria.5 This approach is supported by evidence of similar psychometric properties for the BDI-I in PD when somatic items were compared with the affective and cognitive items.23 A final source of variability may be the explicit inclusion of minor depression in our case definition. However, this is not likely because exclusion of minor depression in sensitivity analyses did not change the psychometric indices appreciably.
Strengths of this study include its use of a community-based sample, a standardized diagnostic instrument (SCID), and expert consensus panel diagnoses. Limitations include the nonrandom sample potentially enriched for psychiatric morbidity. Patients were sequentially mailed recruitment letters inviting them to participate in a study to assess psychiatric scales. In addition, two-thirds of the subject pool elected not to participate. Although the prevalence of depression is within previously reported ranges, sampling biases are possible.1 Furthermore, exclusion of patients with MMSE scores ≤23 limits generalizations to patients with significant cognitive impairment. In addition, generalization of these results may be limited by the underrepresentation of racial minorities and the high educational attainment in the sample. In addition, completion of clinician-rated scales during the psychiatric interview might have introduced a bias. To limit potential biases, each case was presented to the consensus panel by a psychiatrist other than the interviewing psychiatrist, without any self-reported depression scale data or clinician-rated scale scores. Furthermore, the time to administer clinician-rated scales could not be estimated because they were rated in the context of the clinical interview. Finally, self-report scale administration times do not account for the time to score or interpret them.
The ability to differentiate depressed from nondepressed patients should be the primary consideration when a scale for screening in clinical practice or research is chosen, but ease of administration is also an important consideration.6 This study supports the use of select self-report scales as alternatives to clinician-rated scales. Self-report scales can be administered in waiting rooms and discussed with the patient during the clinical examination, a more practical approach for clinicians in comparison to the 15- to 20-minute interview required for clinician-rated scales.
Of the self-reported scales tested, the GDS-30 had a strong overall combination of psychometric and administration characteristics. Although the
AUTHOR CONTRIBUTIONS
Dr. Williams: study design, statistical analysis, interpretation of data, and drafting manuscript. E.S. Hirsch: acquisition of data, study coordination, interpretation of data, and revising manuscript. Dr. Anderson: expert panel member, interpretation of data, and revising manuscript. Dr. Bush: statistical analysis and revising manuscript. Dr. Goldstein: neurologic examination, interpretation of data, and revising manuscript. Dr. Grill: neurologic examination, interpretation of data, and revising manuscript. Dr. Lehmann: expert panel member, interpretation of data, and revising manuscript. Dr. Little: interviewing psychiatrist, expert panel member, interpretation of data, and revising manuscript. Dr. Margolis: expert panel member, interpretation of data, and revising manuscript. J. Palanci: acquisition of data, study coordination, interpretation of data, and revising manuscript. Dr. Pontone: interviewing psychiatrist, expert panel member, interpretation of data, and revising manuscript. Dr. Weiss: neurologic examination, interpretation of data, and revising manuscript. Dr. Rabins: expert panel member, interpretation of data, and revising manuscript. Dr. Marsh: study design, interviewing psychiatrist, expert panel member, study supervision, obtaining funding, interpretation of data, and revising manuscript.
STUDY FUNDING
The MOOD-PD study was funded by NIMH grant R01-MH069666. Coauthors were also supported by NINDS grant P50-NS58377 (the Morris K. Udall Parkinson’s Disease Research Center of Excellence at Johns Hopkins), NIA grant T32-AG027668, the Department of Veterans Affairs, the Donna Jeanne Gault Baumann Fund, and the Weldon Hall Trust. This article reflects the views of the authors and should not be construed to represent the Food and Drug Administration’s views or policies.
REFERENCES
1. Slaughter JR, Slaughter KA, Nichols D, Holmes SE, Martens MP. Prevalence, clinical manifestations, etiology, and treatment of depression in Parkinson’s disease. J Neuropsychiatry Clin Neurosci 2001;13:187–196.
2. Shulman LM, Taback RL, Rabinstein AA, Weiner WJ. Non-recognition of depression and other non-motor symptoms in Parkinson’s disease. Parkinsonism Relat Disord 2002;8:193–197.
3. O’Connor EA, Whitlock EP, Beil TL, Gaynes BN. Screening for depression in adult patients in primary care settings: a systematic evidence review. Ann Intern Med 2009;151:793–803.
4. Schrag A, Barone P, Brown RG, et al. Depression rating scales in Parkinson’s disease: critique and recommendations. Mov Disord 2007;22:1077–1092.
5. Marsh L, McDonald WM, Cummings J, Ravina B. Provisional diagnostic criteria for depression in Parkinson’s disease: report of an NINDS/NIMH Work Group. Mov Disord 2006;21:148–158.
6. Maruish ME. The Use of Psychological Testing for Treatment Planning and Outcomes Assessment. Mahwah, NJ: Lawrence Erlbaum Associates; 2004.
7. Ravina B, Camicioli R, Como PG, et al. The impact of depressive symptoms in early Parkinson disease. Neurology 2007;69:342–347.
Updated Information & Services, including high resolution figures, can be found at: http://www.neurology.org/content/78/13/998.full.html
Neurology ® is the official journal of the American Academy of Neurology. Published continuously since
1951, it is now a weekly with 48 issues per year. Copyright © 2012 by AAN Enterprises, Inc. All
rights reserved. Print ISSN: 0028-3878. Online ISSN: 1526-632X.