Pa CH6
Learning Objectives:
A. Discuss the process of test design and construction.
B. Differentiate and discuss the various scaling techniques.
C. Discuss how to prepare and construct test items.
Lesson Proper
Category of the test according to domain
Maximum Performance Tests (Ability Tests)    Typical Performance Tests
1. Intelligence                              1. Personality
2. Aptitude                                  2. Interest
3. Achievement                               3. Attitude
c. Semantic Differential - this format asks the examinee to choose a position between
two opposite (bipolar) adjectives, usually on a 5- or 7-point scale, such as boring at
one end of the scale and interesting at the other.
How I feel when I am out and among other people
Warm __:__:__:__:__:__:__ Cold
Tense __:__:__:__:__:__:__ Relaxed
Weak __:__:__:__:__:__:__ Strong
Brooks Brothers suit __:__:__:__:__:__:__ Hawaiian shirt
d. Method of Paired Comparisons - the process of presenting the testtaker with pairs of
stimuli (e.g., statements, photographs) which they are asked to compare, then
select one based on a set of rules or criteria. E.g., EPI, EPPS, Love Language
Select the behavior that you think would be more justified:
a. cheating on taxes if one has a chance
b. accepting a bribe in the course of one’s duties
e. Guttman Scale - a scaling method that yields ordinal-level measures. Items
range from weaker to stronger expressions of the attitude, belief, or feeling being
measured. All respondents who agree with the stronger items will also agree with the
weaker or milder items.
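The cumulative property of a Guttman scale can be checked mechanically. The following is a minimal sketch in Python, assuming the items are already ordered from weakest to strongest and responses are coded 1 (agree) or 0 (disagree). The function names and sample response patterns are hypothetical and serve only to illustrate the idea.

```python
def is_guttman_consistent(responses):
    """Return True if a response pattern fits a perfect Guttman scale.

    `responses` is a list of 1 (agree) / 0 (disagree), with items ordered
    from the weakest to the strongest expression of the attitude.
    A consistent pattern never shows agreement with a stronger item
    after disagreement with a weaker one.
    """
    seen_disagree = False
    for r in responses:
        if r == 0:
            seen_disagree = True
        elif seen_disagree:
            # Agreed with a stronger item after rejecting a weaker one.
            return False
    return True


def guttman_score(responses):
    """Ordinal score: the number of items endorsed."""
    return sum(responses)


# Hypothetical response patterns (assumed data, not from any real test).
consistent = [1, 1, 1, 0]
inconsistent = [1, 0, 1, 0]

print(is_guttman_consistent(consistent), guttman_score(consistent))
print(is_guttman_consistent(inconsistent))
```

For a consistent pattern, the score alone tells us which items were endorsed, which is why the scale is described as ordinal.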
3. Item Writing
• An item pool is a reservoir or a group of items from which the items in the final
test will be drawn.
E.g., if a test called “Philippine History: 1950 to 1990” is to have 30 questions in its
final version, it would be useful to have as many as 60 items in the item pool.
• Item pooling may be done as follows:
a. Intelligence Tests
A pool of items that presumably measures some aspect of the construct
“intelligence” is assembled. The items may be constructed according to a specific theory
of intelligent behavior or simply with reference to the kinds of tasks that highly
intelligent people presumably perform more effectively than those of lower intelligence.
b. Aptitude Tests
Constructing an aptitude test for use in an industrial setting requires a task analysis or
job analysis, which consists of specifying the components of the job so that test items can
be devised to predict employee performance. The specifications may include critical
incidents, that is, behaviors that are critical to successful or unsuccessful performance. In
school settings, items that can predict school achievement or academic success must be
considered in the test construction.
c. Achievement Tests
Items are constructed based on the rationale and purpose of the test. The items that meet
the test's objectives are then written.
d. Personality Inventories and Scales
Personality inventories are constructed by combining theoretical, rational, and empirical
approaches.
➢ One popular classification of test items is the objective type. The crucial feature of an
objective test is not the form of the response, but rather how objectively the items can be
scored.
➢ The multiple-choice type is common among mental ability tests intended for group testing.
The test item is composed of the stem and the response options. The response options
include the correct answer and the distracters.
➢ Some forms of multiple-choice items are Classification, If-Then Conditions, Multiple
Conditions, Oddity, and Relations and Correlates.
3. Multiple Conditions- the examinee uses two or more conditions or statements listed
in the stem to draw a conclusion.
Example: Given that John’s score on a test is 60, the test mean is 59, and the standard
deviation is 2, what is John’s z score?
a. -2.00 c. .50
b. -.50 d. 2.00
4. Oddity- the examinee indicates which option does not belong with the others.
Example: Which of the following names does not belong with the others?
a. Alfred Adler c. Carl Jung
b. Sigmund Freud d. Carl Rogers
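The arithmetic behind the Multiple Conditions example (item 3 above) can be sketched in Python. This is only an illustration of the standard z-score formula, z = (raw score - mean) / standard deviation; the function name is an assumption made for this example.

```python
def z_score(raw_score, mean, sd):
    """Standard score: distance from the mean in standard deviation units."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - mean) / sd


# Values taken from the example item: score 60, mean 59, SD 2.
z = z_score(60, 59, 2)
print(f"{z:.2f}")  # prints 0.50, which corresponds to option c
```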
There are three types of constructed-response items: the completion item, the short answer, and the essay.
3. Beyond a paragraph or two, the item is more properly referred to as an essay item. We may
define an essay item as a test item that requires the testtaker to respond to a question by writing a
composition, typically one that demonstrates recall of facts, understanding, analysis, and/or
interpretation.
E.g., Compare and contrast definitions and techniques of classical and operant
conditioning. Include examples of how principles of each have been applied in clinical as
well as educational settings.
In tests that employ class or category scoring, testtaker responses earn credit toward placement
in a particular class or category with other testtakers whose pattern of responses is presumably
similar in some way. This approach is used by some diagnostic systems wherein individuals
must exhibit a certain number of symptoms to qualify for a specific diagnosis.
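Class or category scoring of this kind can be sketched as a simple threshold rule. The Python below is a minimal illustration, assuming a hypothetical checklist with placeholder symptom labels and an arbitrary cutoff of three; it does not reproduce the criteria of any real diagnostic system.

```python
def qualifies_for_category(symptoms_present, criteria, min_required):
    """Return True if enough of the category's criteria are present.

    Only symptoms that belong to the category's criteria list count
    toward the threshold.
    """
    matched = set(symptoms_present) & set(criteria)
    return len(matched) >= min_required


# Hypothetical criteria and cutoff (placeholders, not real diagnostic rules).
criteria = {"symptom_a", "symptom_b", "symptom_c", "symptom_d", "symptom_e"}
min_required = 3

testtaker_1 = ["symptom_a", "symptom_c", "symptom_e"]
testtaker_2 = ["symptom_b", "unrelated_symptom"]

print(qualifies_for_category(testtaker_1, criteria, min_required))
print(qualifies_for_category(testtaker_2, criteria, min_required))
```

Unlike a cumulative score, the result here is a placement (in or out of the category), not a position on a continuum.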
A third scoring model, ipsative scoring, departs radically in rationale from either cumulative or
class models. A typical objective in ipsative scoring is comparing a testtaker’s score on one scale
within a test to another scale within that same test.
E.g., a personality test called the Edwards Personal Preference Schedule (EPPS) is designed
to measure the relative strength of different psychological needs. The EPPS ipsative scoring
system yields information on the strength of various needs in relation to the strength of other
needs of the testtaker. The test does not yield information on the strength of a testtaker’s need
relative to the presumed strength of that need in the general population. Edwards constructed his
test of 210 pairs of statements in a way such that respondents were “forced” to answer true or
false or yes or no to only one of two statements. Prior research by Edwards had indicated that the
two statements were equivalent in terms of how socially desirable the responses were. Here is a
sample of an EPPS-like forced-choice item, to which the respondents would indicate which is
“more true” of themselves:
I feel depressed when I fail at something.
I feel nervous when giving a talk before a group.
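The logic of ipsative scoring with forced-choice pairs can be sketched as follows. This is a minimal Python illustration under stated assumptions: each item pair is keyed to two scales, the testtaker picks statement "a" or "b", and each pick adds one point to the corresponding scale. The item-to-scale keying and the answers are hypothetical and are not the actual EPPS scoring key.

```python
from collections import Counter


def ipsative_scores(item_pairs, answers):
    """Tally forced-choice selections per scale for one testtaker.

    `item_pairs` is a list of (scale_a, scale_b) tuples, one per item.
    `answers` is a list of "a" or "b", the statement chosen for each item.
    """
    if len(item_pairs) != len(answers):
        raise ValueError("one answer is required per item pair")
    scores = Counter()
    for (scale_a, scale_b), answer in zip(item_pairs, answers):
        # Register both scales so an unchosen scale still appears with 0.
        scores[scale_a] += 0
        scores[scale_b] += 0
        if answer == "a":
            scores[scale_a] += 1
        elif answer == "b":
            scores[scale_b] += 1
        else:
            raise ValueError("answer must be 'a' or 'b'")
    return dict(scores)


# Hypothetical keying of four item pairs (assumed for illustration only).
item_pairs = [
    ("achievement", "affiliation"),
    ("achievement", "order"),
    ("affiliation", "order"),
    ("achievement", "affiliation"),
]
answers = ["a", "b", "a", "a"]

scores = ipsative_scores(item_pairs, answers)
print(scores)
```

Because every item awards exactly one point, the scores always sum to the number of items. A high score on one scale therefore only means that need is stronger than the testtaker's other needs, not stronger than the same need in other people.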