Instrument development and
psychometric validation
Roger Watson
Instrument development and psychometric validation
• Questionnaire design
• Questionnaire validation
• Content validation
• Screening questionnaires
• Receiver operating characteristics
• Predictive values
Questionnaire design
Designing a questionnaire
What do you want to find out?
How will you analyse the data?
Authenticity and directness
The balance between these will dictate the length and utility of your
questionnaire
If you need to ask it, ask it!
If you don’t need to ask it, don’t!
Avoid the ‘just one more question’ trap
Most items will be obvious and come early
Question every additional item
Response formats (contd.)
Points to consider:
• Have you included all the possible options where options are
provided?
• Have you provided a balanced spread of choices where choices
such as opinions are to be selected?
• Are the options mutually exclusive?
• Should you provide a neutral or mid-point response?
Standardised questions
Job satisfaction scale – 5-point Likert scale
Statements (each rated: Strongly disagree / Disagree / Neutral / Agree / Strongly agree):
• My job provides me with an opportunity to advance professionally
• My income is adequate for normal expenses
• My job provides an opportunity to use a variety of skills
• When instructions are inadequate, I do what I think is best
Demographic aspects
Gender: Male / Female
Age: _______ (Please specify)
Educational qualifications: ___________________________ (Please specify the type of degree)
Years of experience as a nurse: ______ yrs ________ mths
Current post is my ______ nursing job: first / second / other: ________________________ (Please specify)
The nature of current employment is: full time / part time: _____________________________ (Please specify the number of hours/week)
Presentation
THIS: What do you think about…?
NOT: WHAT DO YOU THINK ABOUT…?
Questionnaire validation
Instrument development and psychometric validation 030222
Reliability
the extent to which an instrument provides the same measure each time it is
used
Validity
the extent to which an instrument measures what it is supposed to measure
Establishing validity
Construct validity (unobtainable)
Construct validity is "the degree to which a test measures what it claims, or purports,
to be measuring." In the classical model of test validity, construct validity is one of
three main types of validity evidence, alongside content validity and criterion validity.
Modern validity theory defines construct validity as the overarching concern of validity
research, subsuming all other types of validity evidence.
Wikipedia
Content validity
Content validity
• Item validity
• I-CVI
• Scale validity
• S-CVI
• Content validity ratio (CVR)
Content validity index (I-CVI)
• I-CVI is computed as the number of experts giving a rating of “very
relevant” for each item divided by the total number of experts.
• Values range from 0 to 1 where:
• I-CVI > 0.79, the item is relevant
• between 0.70 and 0.79, the item needs revisions
• if the value is below 0.70 the item is eliminated
(Rodrigues IB, Adachi JD, Beattie KA & MacDermid JC, BMC Musculoskeletal Disorders 18, Article 540, 2017)
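The I-CVI computation and the thresholds above can be sketched as follows (a minimal illustration; the expert counts are hypothetical):

```python
def i_cvi(ratings):
    """Item-level content validity index: the proportion of experts
    rating the item 'very relevant' (coded True here)."""
    return sum(ratings) / len(ratings)

def interpret_i_cvi(value):
    # Thresholds from the slide: >0.79 relevant; 0.70-0.79 needs revision; <0.70 eliminate
    if value > 0.79:
        return "relevant"
    if value >= 0.70:
        return "needs revision"
    return "eliminate"

# Hypothetical panel: 8 of 10 experts rate the item 'very relevant'
ratings = [True] * 8 + [False] * 2
print(i_cvi(ratings))            # 0.8
print(interpret_i_cvi(0.8))      # relevant
```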
Content validity (S-CVI)
• Similarly, S-CVI is calculated using the number of items in a tool that have
achieved a rating of “very relevant”
• There are two methods of calculating S-CVI:
Universal Agreement (UA) among experts (S-CVI/UA):
• S-CVI/UA is calculated by adding all items with I-CVI equal to 1 divided
by the total number of items
• S-CVI/UA ≥ 0.8 = excellent content validity
Average CVI (S-CVI/Ave) (less conservative):
• S-CVI/Ave is calculated by taking the sum of the I-CVIs divided by the
total number of items
• S-CVI/Ave ≥ 0.9 = excellent content validity
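Both S-CVI methods above can be computed from the item-level I-CVIs; a minimal sketch with hypothetical values:

```python
def s_cvi_ua(i_cvis):
    """Universal agreement method: proportion of items whose
    I-CVI equals 1 (all experts rated them 'very relevant')."""
    return sum(1 for v in i_cvis if v == 1.0) / len(i_cvis)

def s_cvi_ave(i_cvis):
    """Average method (less conservative): mean of the item I-CVIs."""
    return sum(i_cvis) / len(i_cvis)

# Hypothetical I-CVIs for a 5-item scale
i_cvis = [1.0, 1.0, 0.9, 0.8, 1.0]
print(round(s_cvi_ua(i_cvis), 2))   # 0.6
print(round(s_cvi_ave(i_cvis), 2))  # 0.94
```

Note how the two methods diverge: the average method rates this scale as excellent (≥ 0.9), while universal agreement does not (< 0.8).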
Content validity ratio (CVR)
• CVR is computed to specify whether or not an item is necessary for operationalising a construct in a set of items
• For this, an expert panel rates each item on a 3-point scale: essential; useful but not essential; not necessary
• CVR = (Ne – N/2) / (N/2), where Ne is the number of panellists rating the item “essential” and N is the total number of panellists
• The numeric value of CVR ranges from -1 to 1 (Lawshe, 1975)
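Lawshe's formula is a one-liner; a sketch with hypothetical panel counts showing the full range of values:

```python
def cvr(n_essential, n_panellists):
    """Lawshe's content validity ratio: (Ne - N/2) / (N/2).
    Ranges from -1 (no one rates the item essential) to +1 (everyone does)."""
    half = n_panellists / 2
    return (n_essential - half) / half

print(cvr(8, 10))   # 0.6  (8 of 10 panellists say 'essential')
print(cvr(5, 10))   # 0.0  (exactly half)
print(cvr(0, 10))   # -1.0 (none)
```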
Criterion validity
Criterion validity
• …the extent to which an operationalization of a construct, such as a
test, relates to, or predicts, a theoretical representation of the
construct—the criterion. (Wikipedia)
Construct validity
Factorial validity
Factorial validity
• Factorial validity examines the extent to which the underlying
putative structure of a scale is recoverable in a set of test scores.
(Piedmont R.L. (2014) Factorial Validity. In: Michalos A.C. (ed.) Encyclopedia of
Quality of Life and Well-Being Research. Springer, Dordrecht.
https://doi.org/10.1007/978-94-007-0753-5_984)
Types of factor analysis
Exploratory (EFA)
• principal axis factoring
• maximum likelihood factoring
• principal components analysis (PCA)*
Confirmatory (CFA)
• structural equation modelling
* - not strictly EFA
(http://www.stat-help.com/factor.pdf)
Item response theory
Item response theory
• Item response theory (IRT), also known as latent response theory, refers to a
family of mathematical models that attempt to explain the relationship
between latent traits (unobservable characteristics or attributes) and their
manifestations (i.e. observed outcomes, responses or performance).
(https://www.publichealth.columbia.edu/research/population-health-methods/item-
response-theory)
05/04/2022 © The University of Sheffield / Department of Marketing and Communications
Item response theory (IRT)
• The unit of analysis in IRT:
• The item characteristic curve (ICC)
• Also known as:
• The item response curve (IRC)
• The item response function (IRF)
Item characteristic curves
[Figure: item characteristic curves for two items, plotting P(θ) against the latent trait θ]
• item 2 is more ‘difficult’ than item 1
• it represents more of the latent variable
• more difficult items will have lower mean scores on the latent variable
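The curves in the figure can be illustrated with a logistic item characteristic curve. The slides do not name a specific IRT model, so this sketch assumes the common two-parameter logistic (2PL) form, with hypothetical difficulty values:

```python
import math

def icc(theta, difficulty, discrimination=1.0):
    """Two-parameter logistic item characteristic curve:
    P(theta) = 1 / (1 + exp(-a * (theta - b)))
    where b is item difficulty and a is item discrimination."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

# Item 2 is more 'difficult' (higher b), so at the same theta the
# probability of a positive response is lower than for item 1.
theta = 0.0
print(round(icc(theta, difficulty=-1.0), 3))  # item 1: 0.731
print(round(icc(theta, difficulty=1.0), 3))   # item 2: 0.269
```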
Item response theory
• Rasch analysis
• Partial credit model
• Mokken scaling
Advantages of item response theory:
• only a specific set of items produces a given score on the latent variable
• therefore, you know what the score means
Screening questionnaires
Screening questionnaires
• How does a screening questionnaire work?
Screening questionnaires
Why use them?
Screening questionnaires
• Used to find out if someone has something, or is likely to have
something
• For example:
• Depression
• Problem drinking
• Eating disorder
• Bowel cancer
• Medications management risk
Screening questionnaires
• Why use them when a diagnosis is available?
• Many reasons:
• Speed and volume (many people quickly)
• Potential to save lives and prevent morbidity
• Appropriate use of resources
• Investigate or intervene only when necessary
• Lower risk of dangerous procedures
Screening questionnaires
• But how do we decide what the questionnaire is telling us?
• We need to attach a score on the questionnaire to the level of risk or
probability of diagnosis
• Problems:
• Sometimes the questionnaire will be wrong
• Some people at risk will be screened as ‘OK’
• Some people not at risk will be screened as ‘not OK’
Screening questionnaires
• Example: Bowel cancer screening (fictitious)
• Questions:
• Bloating yes/no
• Pain yes/no
• Changed bowel habit yes/no
• Blood yes/no
• Score range 0-4
• Is someone at risk of bowel cancer at 1, 2, 3 or 4?
How do we make decisions?
• True positives
• False negatives
• True negatives
• False positives
Parameters
Sensitivity and Specificity
Sensitivity
• In medical diagnosis, test sensitivity is the ability of a test to correctly
identify those with the disease (true positive rate), whereas
test specificity is the ability of the test to correctly identify those without
the disease (true negative rate).
• Example: Sensitivity = 0.66
Specificity
• Example: Sensitivity = 0.66; Specificity = 0.52
Sensitivity and specificity
• There is a ‘trade-off’ between sensitivity and specificity
• The more sensitive a test is, the less specific it will be: you will increase
the number of negative people diagnosed as positive
• The more specific a test is, the less sensitive it will be: you will miss
more people who are really positive
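Both parameters come straight from the confusion-matrix counts. A minimal sketch, using hypothetical counts chosen to reproduce the figures quoted above:

```python
def sensitivity(tp, fn):
    """True positive rate: of those who really have the condition,
    the proportion the test flags as positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: of those without the condition,
    the proportion the test correctly clears."""
    return tn / (tn + fp)

# Hypothetical screening counts
tp, fn, tn, fp = 66, 34, 52, 48
print(sensitivity(tp, fn))  # 0.66
print(specificity(tn, fp))  # 0.52
```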
SO HOW DO WE DECIDE WHAT IS BEST?
Receiver operating
characteristics curves
Receiver Operating Characteristics
The ROC curve was first
developed by electrical
engineers and radar
engineers during World
War II for detecting
enemy objects in
battlefields (Wikipedia)
Receiver Operating Characteristic (ROC)
curve
• The term “Receiver Operating Characteristic” has its roots in World War II.
ROC curves were originally developed by the British as part of the “Chain
Home” radar system.
• ROC analysis was used to analyze radar data to differentiate between
enemy aircraft and signal noise (e.g. flocks of geese).
• As the sensitivity of the receiver increased, so did the number of false
positives (in other words, specificity went down).
Receiver Operating Characteristic (ROC)
curve
• The ROC allows us to combine, graphically, sensitivity AND specificity
• The ROC allows us to find the optimal level of sensitivity and specificity
Receiver Operating Characteristic (ROC) curve
ROC plot of: Sensitivity against 1 – Specificity
• The diagonal of the plot is no better than guessing; points above it are
better than guessing, points below it are worse
• Area under the curve (AUC) > 0.7 indicates an acceptable ROC
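An ROC curve is built by sweeping the cut-off over the score range and recording (1 – specificity, sensitivity) at each step; the AUC is then the area under those points. A minimal pure-Python sketch with hypothetical screening scores (0–4) and true disease labels:

```python
def roc_points(scores, labels):
    """(1 - specificity, sensitivity) pairs, one per candidate cut-off,
    classifying score >= cut-off as positive."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        pts.append((fp / neg, tp / pos))
    return pts

def auc(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical scores on a 0-4 screening questionnaire, with true status
scores = [4, 3, 3, 2, 1, 0, 2, 1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
print(round(auc(roc_points(scores, labels)), 2))  # 0.84
```

With these made-up data the AUC exceeds 0.7, which by the rule of thumb above would count as acceptable.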
Receiver operating
characteristics curves – Example:
PAID questionnaire
PAID – Problem areas in diabetes questionnaire
• 20 items
• 5-point scale
• e.g.: ‘Feeling depressed when you think about living with diabetes?’
• ‘Not a problem’ (0) to ‘Serious problem’ (4)
Fig. 1 ROC curve of the PAID questionnaire score for screening for clinical depression
Fig. 2 ROC curve of the PAID questionnaire score for screening for subclinical depression
How do we decide the optimal levels of sensitivity and specificity?
• Youden index (J)
• J = Sensitivity + Specificity – 1
Youden index
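In practice the Youden index is computed at every candidate cut-off and the cut-off that maximises J is chosen. A minimal sketch; the (sensitivity, specificity) pairs per cut-off are hypothetical:

```python
def youden_j(sens, spec):
    """Youden index: J = sensitivity + specificity - 1."""
    return sens + spec - 1

# Hypothetical (sensitivity, specificity) at each questionnaire cut-off
cutoffs = {1: (0.95, 0.30), 2: (0.80, 0.60), 3: (0.66, 0.52), 4: (0.40, 0.90)}
best = max(cutoffs, key=lambda c: youden_j(*cutoffs[c]))
print(best)  # 2  (J = 0.40, the largest of the four)
```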
Sensitivity and Specificity vs PPV & NPV
PPV versus Sensitivity
• The definition of Positive Predictive Value is similar to the sensitivity of
a test and the two are often confused.
• However, PPV is useful for the patient, while sensitivity is more useful
for the physician.
• Positive predictive value tells you the odds of having the disease
if you have a positive result.
(https://www.statisticshowto.com/probability-and-statistics/statistics-
definitions/sensitivity-vs-specificity-statistics/)
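PPV and NPV condition on the test result rather than on the true status, so they are read across the other axis of the confusion matrix. A minimal sketch with hypothetical counts:

```python
def ppv(tp, fp):
    """Positive predictive value: probability of having the disease
    given a positive test result."""
    return tp / (tp + fp)

def npv(tn, fn):
    """Negative predictive value: probability of not having the disease
    given a negative test result."""
    return tn / (tn + fn)

# Hypothetical screening counts
tp, fn, tn, fp = 66, 34, 52, 48
print(round(ppv(tp, fp), 3))  # 0.579
print(round(npv(tn, fn), 3))  # 0.605
```

Unlike sensitivity and specificity, PPV and NPV depend on the prevalence of the condition in the screened population, which is why they are the more useful figures in real screening situations.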
Summary
• Diagnostic or screening questionnaires:
• Only work where you have a binary outcome (yes/no)
• Sensitivity and specificity are antagonistic
• Sensitivity and specificity can be combined to find an optimal level of each
• Receiver operating characteristic curves help us to optimise diagnostic and screening
questionnaires
• PPV & NPV are helpful in real screening situations
rwatson1955@gmail.com
0000-0001-8040-7625
@rwatson1955