RELIABILITY AND VALIDITY
Reliability
Consistency = Reliability
In the psychometric sense, reliability refers only to something that is
consistent: not necessarily consistently good or bad, but simply
consistent.
Variance: reliability is commonly framed in terms of variance, that is, the
proportion of total score variance that reflects true differences among
testtakers rather than measurement error.
Validity
When a test is described as valid, what is really meant is that the test has
been shown to be valid for a particular use, with a particular population of
testtakers, at a particular time.
No test or measurement technique is “universally valid” for all time, for
all uses, with all types of testtaker populations.
Rather, tests may be shown to be valid within what we would
characterize as reasonable boundaries of a contemplated usage. If those
boundaries are exceeded, the validity of the test may be called into
question.
Further, because the validity of a test may diminish as the culture or the
times change, validity may have to be re-established with the same as well
as with other testtaker populations.
Types of validity
Content validity
Construct validity
Criterion validity
Face validity
Traditionally, construct validity has been viewed as the unifying concept for
all validity evidence (American Educational Research Association et al.,
1999).
All types of validity evidence, including evidence from the content- and
criterion-related varieties of validity, come under the umbrella of construct
validity.
The researcher investigating a test’s construct validity must formulate
hypotheses about the expected behavior of high scorers and low scorers
on the test.
These hypotheses give rise to a tentative theory about the nature of the
construct the test was designed to measure. If the test is a valid measure
of the construct, then high scorers and low scorers will behave as predicted
by the theory.
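A minimal sketch of this hypothesis-testing logic (all data below are simulated for illustration, and the anxiety/avoidance scenario is a hypothetical example): if high scorers on an anxiety scale are predicted to show more avoidance behavior than low scorers, a simple group comparison can check whether the prediction holds.
```python
# Hypothetical illustration: compare an observed behavior (avoidance ratings)
# between high scorers and low scorers on the test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
high_scorers = rng.normal(loc=7.0, scale=1.5, size=40)  # simulated avoidance ratings
low_scorers = rng.normal(loc=4.0, scale=1.5, size=40)

# If the construct theory is correct, high scorers should differ in the
# predicted direction; such a difference is one piece of evidence.
t, p = stats.ttest_ind(high_scorers, low_scorers)
print(f"t = {t:.2f}, p = {p:.4f}")
```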
Evidence of Construct Validity
Convergent validity
Divergent (discriminant) validity
Convergent validity:
Convergent validity shows whether a test that is designed to assess a
particular construct correlates with other tests that assess the same
construct.
We can analyze convergent validity by comparing the results of a test
with those of others that are designed to measure the same construct. If
there is a strong positive correlation between the results, then the test
can be said to have high convergent validity.
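As a minimal sketch (the scores below are simulated, not real test data), convergent evidence would be a strong correlation with an established measure of the same construct; for contrast, discriminant (divergent) evidence would be a weak correlation with a measure of an unrelated construct.
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
new_test = rng.normal(50, 10, size=100)                 # new measure of the construct
same_construct = new_test + rng.normal(0, 5, size=100)  # established measure, same construct
unrelated = rng.normal(50, 10, size=100)                # measure of an unrelated construct

r_convergent, _ = stats.pearsonr(new_test, same_construct)  # expected: strong, positive
r_discriminant, _ = stats.pearsonr(new_test, unrelated)     # expected: near zero
print(f"convergent r = {r_convergent:.2f}, discriminant r = {r_discriminant:.2f}")
```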
Criterion Validity
Criterion validity refers to how well scores on a test relate to some
criterion measure of interest.
For example:
A job applicant takes a performance test during the interview process. If
this test accurately predicts how well the employee will perform on the
job, the test is said to have criterion validity.
A graduate student takes the GRE. The GRE has been shown as an
effective tool (i.e. it has criterion validity) for predicting how well a
student will perform in graduate studies.
The first measure (in the above examples, the job performance test and
the GRE) is sometimes called the predictor variable or the estimator. The
second measure (job performance, success in graduate studies) is called
the criterion variable.
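In practice, the strength of the predictor-criterion relationship is summarized by a correlation, often called the validity coefficient. A minimal sketch with hypothetical numbers:
```python
import numpy as np

# Hypothetical data: selection-test scores (predictor) and later supervisor
# ratings of job performance (criterion) for ten employees.
predictor = np.array([62, 75, 58, 90, 70, 84, 66, 79, 55, 88])
criterion = np.array([3.1, 3.8, 2.9, 4.6, 3.5, 4.2, 3.0, 4.0, 2.7, 4.4])

# The validity coefficient is the correlation between predictor and criterion.
validity_coefficient = np.corrcoef(predictor, criterion)[0, 1]
print(f"validity coefficient r = {validity_coefficient:.2f}")
```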
Types of Criterion Validity
Concurrent Validity
If test scores are obtained at about the same time as the criterion
measures are obtained, measures of the relationship between the test
scores and the criterion provide evidence of concurrent validity.
Statements of concurrent validity indicate the extent to which test
scores may be used to estimate an individual’s present standing on a
criterion.
Concurrent validity measures how well a new test compares to a well-
established test.
If we create a new test for depression levels, we can compare its
performance to previous depression tests that have high validity.
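A brief sketch of that comparison (the scores are simulated and the depression scales are hypothetical): both measures are administered in the same session and the results are correlated.
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
established = rng.normal(20, 6, size=80)                   # established depression inventory
new_scale = 0.9 * established + rng.normal(0, 3, size=80)  # new scale, same testing session

r, p = stats.pearsonr(new_scale, established)
print(f"concurrent validity evidence: r = {r:.2f} (p = {p:.4f})")
```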
Predictive Validity
Test scores may be obtained at one time and the criterion measures
obtained at a future time, usually after some intervening event has
taken place.
The intervening event may take varied forms, such as training,
experience, therapy, medication, or simply the passage of time.
Measures of the relationship between the test scores and a criterion
measure obtained at a future time provide an indication of the predictive
validity of the test; that is, how accurately scores on the test predict
some criterion measure.
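A minimal sketch of estimating predictive validity (all numbers are hypothetical): admission-test scores collected at time 1 are related to a criterion, first-year GPA, measured a year later; a simple regression also shows how a test score can be used to predict the criterion.
```python
import numpy as np
from scipy import stats

# Hypothetical data: admission-test scores at selection and first-year GPA
# obtained one year later (the criterion measure).
test_scores = np.array([310, 325, 298, 340, 315, 332, 305, 320, 290, 345])
first_year_gpa = np.array([3.2, 3.6, 3.0, 3.9, 3.4, 3.7, 3.1, 3.5, 2.8, 4.0])

result = stats.linregress(test_scores, first_year_gpa)
print(f"predictive validity r = {result.rvalue:.2f}")
print(f"predicted GPA for a score of 330: {result.intercept + result.slope * 330:.2f}")
```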