my report
I. Introduction
Assessments are a crucial tool in various fields, including education, employment, and healthcare.
They help measure knowledge, skills, and abilities, informing decisions that can have a significant
impact on individuals and organizations. However, for assessments to be effective, they must be valid,
reliable, and free from bias. Testing an assessment for these essential qualities is critical to ensuring
that the results are accurate, fair, and meaningful.
II. Objectives
At the end of the topic, you should be able to:
express the importance of testing an assessment for validity, reliability, and bias.
What is reliability?
Reliability refers to how consistently a method measures something. If the same result can be
consistently achieved by using the same methods under the same circumstances, the
measurement is considered reliable.
You measure the temperature of a liquid sample several times under identical conditions. The
thermometer displays the same temperature every time, so the results are reliable.
Several doctors use the same symptom questionnaire to diagnose the same patient with a long-term
medical condition, but they give different diagnoses. This indicates that the questionnaire has low
reliability as a measure of the condition.
What is validity?
Validity refers to how accurately a method measures what it is intended to measure. If research
has high validity, that means it produces results that correspond to real properties,
characteristics, and variations in the physical or social world.
High reliability is one indicator that a measurement may be valid. If a method is not reliable, it
probably isn’t valid.
If the thermometer shows different temperatures each time, even though you have carefully
controlled conditions to ensure the sample’s temperature stays the same, the thermometer is
probably malfunctioning, and therefore its measurements are not valid.
If a symptom questionnaire results in a reliable diagnosis when answered at different times and
with different doctors, this indicates that it has high validity as a measurement of the medical
condition.
However, reliability on its own is not enough to ensure validity. Even if a test is reliable, it may
not accurately reflect the real situation.
The thermometer that you used to test the sample gives reliable results. However, the
thermometer has not been calibrated properly, so the result is 2 degrees lower than the true
value. Therefore, the measurement is not valid.
A group of participants take a test designed to measure working memory. The results are
reliable, but participants’ scores correlate strongly with their level of reading comprehension.
This indicates that the method might have low validity: the test may be measuring participants’
reading comprehension instead of their working memory.
Validity is harder to assess than reliability, but it is even more important. To obtain useful
results, the methods you use to collect data must be valid: the research must be measuring
what it claims to measure. This ensures that your discussion of the data and the conclusions
you draw are also valid.
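The miscalibrated thermometer example above can be sketched numerically. This is only an illustration under stated assumptions: the true temperature and the five readings are made-up values, and using the standard deviation of the readings as a stand-in for reliability and the average error as a stand-in for validity is a simplification, not a method taken from this report.

```python
from statistics import mean, pstdev

# Hypothetical values for illustration only; they are not data from this report.
TRUE_TEMPERATURE = 25.0                      # assumed true temperature of the sample
readings = [23.0, 23.1, 22.9, 23.0, 23.0]    # assumed repeated thermometer readings

# Reliability: do repeated readings agree with each other? (small spread = consistent)
spread = pstdev(readings)

# Validity: do the readings match the true value? (average error far from 0 = inaccurate)
bias = mean(readings) - TRUE_TEMPERATURE

print(f"spread of readings: {spread:.2f}")
print(f"average error: {bias:.2f}")
```

With these assumed numbers the readings barely differ from one another, yet they sit about 2 degrees below the true value: the measurement is consistent but not accurate.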
Types of reliability
Different types of reliability can be estimated through various statistical methods.
Test-retest reliability
What it assesses: The consistency of a measure across time: do you get the same results when you
repeat the measurement?
Example: A group of participants complete a questionnaire designed to measure personality traits. If
they repeat the questionnaire days, weeks or months apart and give the same answers, this indicates
high test-retest reliability.

Interrater reliability
What it assesses: The consistency of a measure across raters or observers: do you get the same
results when different people conduct the same measurement?
Example: Based on an assessment criteria checklist, five examiners submit substantially different
results for the same student project. This indicates that the assessment checklist has low interrater
reliability (for example, because the criteria are too subjective).

Internal consistency
What it assesses: The consistency of the measurement itself: do you get the same results from
different parts of a test that are designed to measure the same thing?
Example: You design a questionnaire to measure self-esteem. If you randomly split the results into two
halves, there should be a strong correlation between the two sets of results. If the two results are
very different, this indicates low internal consistency.
Types of validity
The validity of a measurement can be estimated based on three main types of evidence.
Each type can be evaluated through expert judgement or statistical methods.
Construct validity
What it assesses: The adherence of a measure to existing theory and knowledge of the concept being
measured.
Example: A self-esteem questionnaire could be assessed by measuring other traits known or assumed to
be related to the concept of self-esteem (such as social skills and optimism). Strong correlation
between the scores for self-esteem and associated traits would indicate high construct validity.

Content validity
What it assesses: The extent to which the measurement covers all aspects of the concept being
measured.
Example: A test that aims to measure a class of students’ level of Spanish contains reading, writing
and speaking components, but no listening component. Experts agree that listening comprehension is an
essential aspect of language ability, so the test lacks content validity for measuring the overall
level of ability in Spanish.

Criterion validity
What it assesses: The extent to which the result of a measure corresponds to other valid measures of
the same concept.
Example: A survey is conducted to measure the political opinions of voters in a region. If the results
accurately predict the later outcome of an election in that region, this indicates that the survey has
high criterion validity.
To assess the validity of a cause-and-effect relationship, you also need to consider internal
validity (the design of the experiment) and external validity (the generalizability of the
results).
Calculate and describe the properties of the following measures of variability: frequencies and
percentages, variance, and standard deviation.
Variance
Example:
Score (x)    Mean (x̄)    (x − x̄)    (x − x̄)²
7            12           -5          25
11           12           -1          1
8            12           -4          16
8            12           -4          16
19           12           7           49
15           12           3           9
7            12           -5          25
9            12           -3          9
9            12           -3          9
20           12           8           64
17           12           5           25
14           12           2           4
Σx = 144                              Σ(x − x̄)² = 252
Variance = Σ(x − x̄)² / N
Standard Deviation = √Variance
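As a check on the arithmetic, the example can be worked through in a few lines of Python. The scores are the twelve values from the table. The variable names are illustrative, and the cross-check against the standard library's `pvariance` and `pstdev` assumes the divide-by-N (population) formula given above.

```python
from math import sqrt
from statistics import pstdev, pvariance

scores = [7, 11, 8, 8, 19, 15, 7, 9, 9, 20, 17, 14]   # scores from the example table

n = len(scores)                                        # N = 12
mean_score = sum(scores) / n                           # 144 / 12
sum_sq_dev = sum((x - mean_score) ** 2 for x in scores)  # sum of squared deviations
variance = sum_sq_dev / n                              # divide by N, as in the formula
std_dev = sqrt(variance)                               # square root of the variance

print(mean_score, sum_sq_dev, variance, round(std_dev, 2))
# prints: 12.0 252.0 21.0 4.58

# Cross-check against the standard library's population formulas.
assert variance == pvariance(scores)
assert abs(std_dev - pstdev(scores)) < 1e-12
```

The variance is therefore 252 / 12 = 21 and the standard deviation is about 4.58. Note that `statistics.variance` and `statistics.stdev` divide by N − 1 (the sample formula) and would give slightly larger values.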