
The Long Case: Medical Education December 2004

The long case has traditionally been used to assess clinical competence but has weaknesses in reliability. It can assess integrated doctor-patient interaction with high face validity. However, reliability is limited by using a single case due to content specificity. Recent research found reliability improves with multiple cases. Better structuring and observation also increase validity. Substituting standardized for real patients may provide little additional benefit compared to using more real patient cases.



The long case

Article  in  Medical Education · December 2004


DOI: 10.1111/j.1365-2929.2004.01985.x · Source: PubMed



the metric of medical education

The long case


Val Wass1 & Cees van der Vleuten2

Medical Education 2004; 38: 1176–1180
doi:10.1111/j.1365-2929.2004.01985.x

BACKGROUND The long case has been gradually replaced by the objective structured clinical examination (OSCE) as a summative assessment of clinical skills. Its demise occurred against a paucity of psychometric research. This article reviews the current status of the long case, appraising its strengths and weaknesses as an assessment tool.

ISSUES There is a conflict between validity and reliability. The long case assesses an integrated clinical interaction between doctor and real patients and has high face validity. Intercase reliability is the prime problem. As most examinations traditionally used a single case only, problems of content specificity and standardisation were not addressed.

DISCUSSION Recent research suggests that testing across more cases does improve reliability. Better structuring of tests and direct observation increase validity. Substituting standardised cases for real patients may be of little benefit compared to increasing the sample of cases.

CONCLUSIONS Observed long cases can be useful for assessment depending on the sample size of cases and examiners. More research is needed into the exact nature of intercase and interexaminer variance and consequential validity. Feasibility remains a key problem. More exploration of combined assessments using real patients with OSCEs is suggested.

KEYWORDS education, medical undergraduate/*methods; educational measurement/standards; clinical competence/standards; reproducibility of results.

1 School of Primary Care, University of Manchester, Manchester, UK
2 Department of Educational Development and Research, University of Maastricht, Maastricht, The Netherlands

Correspondence: Val Wass, School of Primary Care, Rusholme Health Centre, Walmer Street, Manchester M14 5NP, UK. Tel: 00 44 161 256 3015 (ext 231); Fax: 00 44 161 256 1070; E-mail: [email protected].

© Blackwell Publishing Ltd MEDICAL EDUCATION 2004; 38: 1176–1180

INTRODUCTION

The search for the ideal mode of assessment of clinical competence for undergraduates, which is both valid and reliable, remains controversial. Having been increasingly replaced by objective structured clinical examinations (OSCEs) throughout the world, the long case is still mourned [1,2], and, perhaps, rightly so. The tensions that exist between the validity and reliability of this assessment method and the feasibility of its delivery are difficult to resolve, but are similar to those experienced with any other form of assessment. Adequate sampling across a range of content is essential for any test of competence. Yet, the long case has educational advantages and, as more focus is placed on performance-based assessment, can be undertaken in the workplace. This article balances the strengths and weaknesses of the long case and argues for more research in this area.

VALIDITY OF THE LONG CASE

The American educationalist Flexner (1910) stated: ‘There is only one sort of licensing test that is significant, i.e. a test that ascertains the practical ability of the students confronting a concrete case to collect all the relevant data and to suggest the positive procedures applicable to the conditions disclosed.’ [3] In the traditional long case, candidates are given uninterrupted and unobserved time, usually 30–45 minutes, to interview and examine a patient who has been selected from the wards or outpatient departments and who has had no training for examinations. Candidates then present their findings to the examiners as in an unstructured oral examination. The long case attempts to assess the integrated interaction between the doctor and a ‘real’ patient. An important aspect of the validity of a clinical examination is its
approximation to the real world. The long case has arguable validity in this respect because the assessment is based on a highly authentic task and comes very close to a candidate’s actual daily practice. Arguably, it is more valid than the tasks given to a candidate in a simulated and standardised situation such as in an OSCE. Studies investigating the construct validity are, however, lacking [4]. Moreover, little is known about the consequential validity of the long case in terms of its impact on students’ learning and preparation for examinations. Does the long case have greater consequential validity compared to the OSCE? This in itself is an interesting research question.

Overview

What is already known on this subject

Long cases are integrated in-depth assessments of clinical competence.

Observation adds to the reliability of the long case.

As for any measure of clinical competence, a single long case is unreliable.

What this study adds

Extending tests on long cases may successfully improve reliability.

Substituting standardised for real patients may add relatively little benefit.

Logistically, the resources required for long case testing limit its use.

Suggestions for further research

Future research should examine the relationships between intercase and interexaminer variance, the impact of using real rather than simulated patients, consequential validity, and the feasibility of combining long cases with other assessments.

RELIABILITY OF THE LONG CASE

Why has the long case fallen from favour? Interrater reliability is not the major factor [5]. The problem lies with intercase reliability. Content specificity is now widely recognised as the most crucial issue in the assessment of clinical competence [6,7]. Doctors and students do not perform consistently from task to task [8]. A good performance on a single long case would not predict a good performance on another [9,10]. Broad sampling across cases is essential to assess clinical competence reliably. Given the logistics of long case assessments, medical schools traditionally assessed students on a single case only. Implicit in this was the, perhaps rather naïve, assumption that experienced doctors had the skills to immediately identify good or weak students on a single patient interaction, and that this was predictive of any patient interaction. It is not surprising therefore, once the importance of context specificity was realised, that both undergraduate and postgraduate clinical assessments have moved to the multistation format of the OSCE [11]. However, concerns remain that by developing clinical assessments across a large number of contexts, little time is available on each case to fully assess the candidate’s competence. The validity of the OSCE is being questioned in this respect [12]. Depth of assessment, as argued for by Flexner, has been lost in order to gain breadth. Thirty years after its introduction, life beyond the OSCE is now under discussion [13].

We are looking for ways forward. Surprisingly, the move away from the long case occurred in the face of very little published psychometric evidence [4]. Yet the long case does have face validity and it would be unwise to abandon this form of testing without clearer statistical evidence of how it performs as a test. Can the reliability and validity of the long case be improved?

IMPROVING THE INTERCASE RELIABILITY OF THE LONG CASE

Increasing the number of long cases seen by each student should, theoretically, address the issue of content specificity. If infinite resources were available, how many cases would be necessary to achieve the intercase reliability needed for a high stakes assessment? By comparing final year medical student performance across 2 observed, modified history taking long cases using 2 pairs of different examiners we predicted, using generalisability theory, that 10 cases would achieve reliability of 0.8 (Cronbach’s alpha) [14]. The calculation assumed different examiners were used for each case. Thus, a large sample of examiner judgements was also achieved. In terms of
the testing time required for this increased sample, the reliability outcome is no better or worse than for any other measure of clinical competence [6]. More studies are needed to both replicate this finding and investigate the relative magnitude of case and examiner variance. So far there is no reason to believe that, provided sufficient cases and examiners are used, the long case would differ significantly from, for example, the OSCE [15–17]. Given sufficient testing time and a large patient and examiner resource, a reliable high stakes long case examination theoretically has potential.

The key difference between the long case and the OSCE is the unstandardised nature of the patients. Long case examinations can never be equivalent across a cohort of candidates. But does this matter? Efforts to standardise encounters and not use real patients may lead to relatively small gains compared to ensuring that sufficient encounters are assessed to overcome the problems of content specificity [18,19]. Logistically, this remains difficult. Hamdy et al. from Bahrain recently demonstrated that a 3-hour examination of 4 45-minute observed long cases had good reliability [20]. Real patients selected from a predetermined blueprint of common disease were used. In the USA, the mini-CEX (mini-clinical evaluation exercise) work-based assessments use limited observation of the history and examination of real patients to assess clinical competencies [21]. Durning et al. reported acceptable reliability across 7 such real patient cases [16]. These findings continue to challenge the assumption that standardisation of cases is essential for the reliable assessment of clinical competence.

Standardising patients does have great advantages. Real patients can be a liability [22]. Standardisation enables accurate blueprinting of the test. Yet it requires a high level of training and resource. This is feasible in some postgraduate examinations, where the cost can be covered by candidate fees, but it remains difficult in many undergraduate universities and countries with limited resources. Simulation moves the assessment away from the workplace. In our increasingly diverse society, it is difficult to create simulations that mirror the range of ethnicity. Circumstances such as those involving patients with limited language, the need to use interpreters, limited cultural understanding between doctor and patient, etc., present complex challenges for standardisation, which might be best addressed using a variety of real cases. As we strive for maximum authenticity, research to improve our understanding in this area is needed. Standardising encounters may not impact on reliability as much as was originally assumed. Content specificity appears to be the key issue and ‘noise’ associated with the authenticity of the patient presentation seems to subordinate this effect.

IMPROVING THE VALIDITY OF THE LONG CASE

Over the years attempts have been made to improve the validity of the long case format by increasing the authenticity of the assessment. It would seem logical that, rather than relying on a presentation alone, observation of the candidate while eliciting the history and carrying out the examination would be a more valid assessment of the candidate’s competencies. The use of observed long cases has been reported in some institutions [23,24]. Gleeson developed a more structured presentation of an unobserved long case, the objective structured long examination record (OSLER) [25], which includes some direct observation of the candidate interacting with the patient. Fraser developed it as a formative tool for assessment of both undergraduates and postgraduates within the Leicester Assessment Package [26].

A key question concerns whether observation adds to the validity of the assessment.

A recent study demonstrated that observation does measure a useful and distinctive component of history taking clinical competence over and above the contribution made by the presentation. We observed an undergraduate history taking long case and compared results of the observed and presentation component with performance on an OSCE undertaken at the same time [5]. More studies are needed to investigate the construct validity of such improvements of the long case examination.

FUTURE RESEARCH

This review raises more questions than it answers. It describes the current state of play of the long case. More research into the psychometrics of the long case is required. There is a wide range of literature available on the reliability of the OSCE. We need more information on the intercase and interexaminer variance of the long case. A key question concerns how different structures for the long case affect these variances. Whether the long case survives or vanishes from the assessment scene should not merely be based on opinionated arguments but on evidence originating from appropriate research.
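To make concrete how intercase and interexaminer variance bear on the number of cases needed, the following is a minimal sketch, not the calculation used in the cited study. It assumes the Spearman-Brown formula as a simplified stand-in for a full generalisability analysis, and a design in which different examiners rate each case (examiners nested within cases). The only figure taken from the text is the projection that 10 cases give a reliability of 0.8; the function names and all variance components are hypothetical and chosen purely for illustration.

```python
import math


def spearman_brown(single_case_reliability: float, n_cases: int) -> float:
    """Projected reliability when a test is lengthened to n_cases comparable cases."""
    r = single_case_reliability
    return n_cases * r / (1 + (n_cases - 1) * r)


def implied_single_case_reliability(target: float, n_cases: int) -> float:
    """Invert Spearman-Brown: single-case reliability implied by `target` at n_cases."""
    return target / (n_cases - (n_cases - 1) * target)


def cases_needed(single_case_reliability: float, target: float) -> int:
    """Smallest whole number of cases whose projected reliability reaches `target`."""
    r = single_case_reliability
    n = target * (1 - r) / (r * (1 - target))
    return math.ceil(n - 1e-9)  # small tolerance guards against float round-off


def g_coefficient(var_person: float, var_person_case: float,
                  var_residual: float, n_cases: int, n_examiners: int) -> float:
    """Generalisability coefficient with examiners nested within cases.

    Relative error = var_person_case / n_cases
                   + var_residual / (n_cases * n_examiners).
    """
    error = var_person_case / n_cases + var_residual / (n_cases * n_examiners)
    return var_person / (var_person + error)


# Figure taken from the text: 10 cases projected to give reliability 0.8.
r1 = implied_single_case_reliability(0.8, 10)
print(f"implied single-case reliability: {r1:.2f}")
print(f"cases needed for 0.8: {cases_needed(r1, 0.8)}")

# HYPOTHETICAL variance components, for illustration only (not from the study).
for n in (1, 4, 10):
    g = g_coefficient(var_person=1.0, var_person_case=2.0,
                      var_residual=1.0, n_cases=n, n_examiners=2)
    print(f"{n:>2} cases, 2 examiners per case: G = {g:.2f}")
```

Under these assumptions, a single case would have a reliability of only about 0.29, which is consistent with the statement that a single long case is unreliable. In the second function the case term is divided only by the number of cases, so adding examiners per case cannot offset a large person-by-case variance. Replacing the hypothetical components with estimated ones is exactly the kind of evidence the text calls for.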

Almost 100 years after Flexner’s observation, the long case continues to have undoubted validity. Critics of the OSCE may be justified in their concerns about the failure of this examination format to integrate the whole process of clinical assessment, from history taking through to the management of a particular case. The logistics of providing a sufficient number of long cases to achieve a reliable high stakes summative test of clinical competence challenges its feasibility. We encourage its use in low stakes (formative) assessment with emphasis on the importance of including observation in the process. Alternatively, a compromise could be reached if 1 or more long cases were to be combined with OSCE stations, to bring both depth and breadth to the assessment. But again, any such moves should be underpinned by good research.

It is time to fill the gaps in the long case literature. We need to know more about:

• Intercase variance: how many observed full long case assessments are necessary to produce a reliable assessment of clinical competence?
• Interexaminer variance: how much do examiners contribute to the reliability and, provided sufficient cases are tested, how many different examiners are needed?
• Real patient variance: to what extent does the use of real patients affect the reliability?
• Construct validity: what is the effect of making the long case as authentic as possible? What is the incremental validity of using real patients versus standardised patients? How can the long case examination be combined with other formats?
• Consequential validity: what sort of educational consequences does a (reliable) long case have when compared to the OSCE?

In the meantime, provided the stakes are not high, long cases remain a useful tool for teachers to observe students in action and give feedback. For summative purposes, issues of case specificity and effective blueprinting within the logistics of running a series of long cases confound feasibility. We would not recommend the use of the long case in high stakes assessments of clinical competence for this reason.

CONTRIBUTORS

Both authors reviewed and appraised the current literature, discussed areas for further research and contributed to the writing of this article.

ACKNOWLEDGEMENTS

None.

FUNDING

None.

CONFLICTS OF INTEREST

None.

ETHICAL APPROVAL

Not applicable.

REFERENCES

1 Meadow R. The structured exam has taken over. BMJ 1998;317:1329.
2 Fraser R. Does observation add to the validity of the long case? [Letter.] Med Educ 2001;35:1131–3.
3 Flexner A. Medical Education in the United States and Canada. Bethesda, Maryland: Science and Health Publications 1910.
4 van der Vleuten CPM. Making the best of the ‘long case’. Lancet 1996;347:704–5.
5 Wass V, Jolly B. Does observation add to the validity of the long case? Med Educ 2001;35:729–34.
6 Wass V, van der Vleuten CPM, Shatzer J, Jones R. Assessment of clinical competence. Lancet 2001;357:945–9.
7 van der Vleuten CPM. The assessment of professional competence: developments, research and practical implications. Adv Health Sci Educ 1996;1:41–67.
8 Swanson DB, Norman GR, Linn RL. Performance-based assessment: lessons learnt from the health professions. Educ Res 1995;24 (5):5–11.
9 Norcini JJ. The death of the long case? BMJ 2002;324:408–9.
10 Norcini JJ. Does observation add to the validity of the long case? [Letter.] Med Educ 2001;35:1131–3.
11 Harden RM, Gleeson FA. ASME Medical Educational Booklet, 8. Assessment of medical competence using an objective structured clinical examination (OSCE). J Med Educ 1979;13:41–54.
12 Hodges B. Validity and the OSCE. Med Teacher 2003;25:250–4.
13 Ben-David MF. Life beyond the OSCE. Med Teacher 2003;25:239–40.
14 Wass V, Jones R, van der Vleuten CPM. Standardised or real patients to test clinical competence? The long case revisited. Med Educ 2001;35:321–5.
15 Petrusa ER. Clinical performance assessments. In: Norman GR, van der Vleuten CPM, Newble DI, eds. International Handbook for Research in Medical Education. Dordrecht: Kluwer Academic Publisher 2002;673–709.
16 Durning SJ, Cation LJ, Markert RJ, Pangaro LN. Assessing the reliability and validity of the mini-clinical evaluation exercise for internal medicine residency training. Acad Med 2002;77:900–4.
17 Williams RG, Klamen DA, McGaghie WC. Cognitive, social and environmental sources of bias in clinical performance ratings. Teach Learn Med 2003;15:270–92.
18 Norman G. The long case versus objective structured clinical examinations. BMJ 2002;324:748–9.
19 Norcini JJ. The validity of long cases. Med Educ 2001;35:135–7.
20 Hamdy H, Prasad K, Williams R, Salim FA. Reliability and validity of the direct observation clinical encounter examination (DOCEE). Med Educ 2003;37:205–12.
21 Norcini JJ, Blank LL, Arnold GK, Kimball HR. The mini-CEX (clinical evaluation exercise): a preliminary investigation. Ann Intern Med 1995;123:795–9.
22 Sayer M, Bowman D, Evans D, Wessier A, Wood D. Use of patients in professional medical examinations: current UK practice and the ethicolegal implications for medical education. BMJ 2002;324:404–7.
23 Newble DI. The observed long case in clinical assessment. Med Educ 1994;25:369–73.
24 Price J, Byrne GJA. The direct clinical examination: an alternative method for the assessment of clinical psychiatric skills in undergraduate medical students. Med Educ 1994;28:120–5.
25 Gleeson F. The effect of immediate feedback on clinical skills using the OSLER. In: Rothman AI, Cohen R, eds. Proceedings of the Sixth Ottawa Conference of Medical Education. Toronto: University of Toronto Bookstore Custom Publishing 1994;412–5.
26 McKinley RK, Fraser RC, van der Vleuten C, Hastings AM. Formative assessment of the consultation performance of medical students in the setting of general practice using a modified version of the Leicester Assessment Package. Med Educ 2000;34:573–9.

Received 2 February 2004; editorial comments to authors 10 March 2004; accepted for publication 21 June 2004
