ASSESSMENT OF LEARNING (DAY 3): LET
STANDARD QUESTIONS

1. What does a difficulty index of 0.20 indicate about a test item?
A. Too easy
B. Just right
C. Too difficult
D. Highly discriminating

2. A test item is answered correctly by 90% of students. What should be done?
A. Retain
B. Revise or discard
C. Use as benchmark
D. Keep for high achievers only

3. What does a discrimination index (DI) of -0.60 suggest?
A. Excellent item
B. Keep the item
C. Neutral item
D. Discard the item

4. Which formula is used for calculating the discrimination index?
A. D = RU - RL
B. DI = DU + DL
C. DI = DU - DL
D. DI = R - W

5. What does positive discrimination indicate?
A. High scorers got the item wrong
B. Low scorers guessed correctly
C. High scorers answered correctly
D. All students got it wrong

6. When both high and low groups answer an item equally, the discrimination is:
A. High
B. Zero
C. Negative
D. Valid

7. The item difficulty index is 0.30 and the DI is 0.55. What should be done?
A. Revise
B. Reject
C. Retain
D. Flag for review

8. Which combination of indices suggests the item should be retained?
A. DI below 0.20, difficulty 0.26–0.75
B. DI 0.20 or above, difficulty 0.26–0.75
C. DI below 0.20, difficulty below 0.25
D. DI above 0.50, difficulty 0.80 or above

9. A low-performing student gets an item right, but high performers do not. This shows:
A. High validity
B. Negative discrimination
C. Positive discrimination
D. Good reliability

10. Which item would MOST LIKELY need to be revised?
A. DI = 0.80, difficulty = 0.70
B. DI = 0.10, difficulty = 0.90
C. DI = -0.20, difficulty = 0.30
D. DI = 0.50, difficulty = 0.40

11. If a test item is too easy and not discriminating, what should be done?
A. Retain
B. Revise
C. Reward students
D. Use it again

12. Which principle focuses on a test being easy to give with clear directions?
A. Reliability
B. Administrability
C. Usability
D. Fairness

13. Which principle of assessment reflects scoring clarity?
A. Validity
B. Interpretability
C. Scoreability
D. Transparency

14. A teacher says, "I can interpret this score easily." This reflects:
A. Face validity
B. Scoreability
C. Interpretability
D. Standardization

15. The ability of a test to be scored and reused efficiently refers to:
A. Practicality
B. Objectivity
C. Economy
D. Accessibility
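The two indices used in items 1–10 can be illustrated with a short Python sketch. All counts and group sizes below are hypothetical, and the sketch assumes the class has already been split into equal-sized upper and lower groups:

```python
# Difficulty index: proportion of examinees who answered the item correctly.
# Discrimination index: DI = DU - DL, computed here from raw counts of
# correct answers in equal-sized upper and lower groups.
# All numbers below are hypothetical illustrations.

def difficulty_index(correct: int, total: int) -> float:
    """Proportion of the whole class answering the item correctly."""
    return correct / total

def discrimination_index(upper_correct: int, lower_correct: int,
                         group_size: int) -> float:
    """DI = DU - DL, with equal-sized upper and lower groups."""
    return (upper_correct - lower_correct) / group_size

# 10 of 50 students answered correctly: index 0.20, a difficult item.
print(difficulty_index(10, 50))          # 0.2

# Upper group 18/20 correct, lower group 6/20 correct: DI = 0.60.
print(discrimination_index(18, 6, 20))   # 0.6
```

With these conventions, a negative DI (as in item 3) simply means the lower group outscored the upper group on that item.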
16. The mean of 5 scores: 70, 75, 80, 85, 90 is:
A. 80
B. 82
C. 75
D. 85

17. The median of these values: 45, 50, 55, 60, 65 is:
A. 60
B. 50
C. 55
D. 65

18. Which is affected MOST by outliers?
A. Mode
B. Mean
C. Median
D. All equally

19. Which measure of central tendency is MOST appropriate for qualitative data?
A. Mean
B. Median
C. Mode
D. Range

20. What do you call a dataset with two modes?
A. Trimodal
B. Bimodal
C. Unimodal
D. Polymodal

21. When should the median be used over the mean?
A. Data are normally distributed
B. Many extreme scores exist
C. Equal weights are needed
D. Values are all identical

22. What do we call the most frequently occurring score in a data set?
A. Mean
B. Median
C. Mode
D. Range

23. An item with DI = 0.10 and difficulty index = 0.30 should be:
A. Retained
B. Rejected
C. Revised
D. Given extra credit

24. A test item with a discrimination index of 0.60:
A. Favors low performers
B. Has high discrimination
C. Should be removed
D. Has no value

25. A teacher wants to measure the current performance of students using two similar tests at once. What validity is being checked?
A. Face validity
B. Predictive validity
C. Concurrent validity
D. Construct validity

26. The predictive validity of a test is best shown when:
A. It reflects prior knowledge
B. It is checked with standardized tests
C. It forecasts future performance
D. It improves student confidence

27. In classroom testing, face validity refers to:
A. Alignment with learning outcomes
B. Clear appearance and design
C. Statistically proven validity
D. Equal difficulty level

28. A test score remains stable after two administrations. This shows:
A. Validity
B. Reliability
C. Objectivity
D. Practicality

29. A teacher splits a test into two halves (odd-even) to check reliability. What method is used?
A. Test-retest
B. Split-half
C. KR-20
D. Equivalent forms

30. KR-20 is most appropriate for:
A. Essay exams
B. Binary-scored objective tests
C. Observation rubrics
D. Oral recitations

31. A teacher uses parallel tests with different questions but the same objectives. What reliability method is this?
A. Test-retest
B. Split-half
C. Equivalent forms
D. Predictive method

32. A teacher gives a test and the same test again two weeks later. This is an example of:
A. Equivalent form reliability
B. Test-retest reliability
C. Internal consistency
D. Face validity

33. What is the main difference between validity and reliability?
A. Reliability measures difficulty; validity measures discrimination
B. Validity checks clarity; reliability checks scoreability
C. Reliability is about consistency; validity is about accuracy
D. Validity ensures fairness; reliability ensures speed

34. A test item was too difficult, and high achievers still failed to answer correctly. This item likely has:
A. High positive discrimination
B. High reliability
C. Negative discrimination
D. Low objectivity

35. What principle is violated if a test disadvantages learners from a specific background?
A. Fairness
B. Validity
C. Interpretability
D. Scoreability

36. What is the BEST reason for rejecting an item in an item analysis?
A. DI = 0.55 and difficulty = 0.65
B. DI = 0.10 and difficulty = 0.80
C. DI = 0.35 and difficulty = 0.40
D. DI = 0.20 and difficulty = 0.50

37. Which principle of assessment deals with ease of scoring the test?
A. Administrability
B. Scoreability
C. Validity
D. Objectivity

38. If both the DI and difficulty index are outside acceptable ranges, what should you do?
A. Retain
B. Use anyway
C. Revise
D. Reject

39. Which principle ensures scores can be clearly interpreted in terms of performance levels?
A. Transparency
B. Interpretability
C. Reliability
D. Practicality

40. The best way to improve the objectivity of an essay test is to:
A. Randomly score it
B. Use a scoring rubric
C. Use true-or-false questions
D. Allow student self-grading

41. When a teacher considers the cultural background of students when designing test items, this reflects:
A. Validity
B. Fairness
C. Reliability
D. Scoreability

42. What is the BEST use of the mode in data interpretation?
A. When you want to find trends in qualitative data
B. When the dataset is symmetrical
C. When looking for the average of numerical data
D. When the dataset is perfectly normal

43. Which is true about a trimodal distribution?
A. It has three or more modes
B. It has no mode
C. It has two modes
D. It is a symmetric bell curve

44. What is the most appropriate measure of central tendency when dealing with outliers?
A. Mean
B. Median
C. Mode
D. Average

45. Which index shows how well a test item distinguishes between strong and weak students?
A. Difficulty Index
B. Item Validity
C. Discrimination Index
D. Central Tendency

46. A test consistently gives high scores regardless of actual performance. This violates:
A. Validity
B. Reliability
C. Fairness
D. Usability

47. A test includes complicated instructions that confuse students. This test lacks:
A. Interpretability
B. Objectivity
C. Administrability
D. Practicality

48. A DI of 0.48 and a difficulty index of 0.70 suggest the item is:
A. Weak and difficult
B. Strong and moderately easy
C. Poorly discriminating
D. Too difficult

49. A test that includes varied types of questions to match diverse learners best reflects:
A. Economy
B. Validity
C. Fairness
D. Administrability

50. A classroom test matches the course objectives but is hard to implement. What is compromised?
A. Reliability
B. Validity
C. Practicality
D. Authenticity

51. The average score on a 10-item test is 8. This average represents the:
A. Median
B. Mode
C. Mean
D. Variance

52. When using the same test for pre-test and post-test over time, the method is:
A. Test-retest
B. Split-half
C. Equivalent forms
D. Internal validity

53. A test item is difficult, but only low-performing students get it right. This suggests:
A. Positive discrimination
B. Negative discrimination
C. High validity
D. Zero discrimination

54. The teacher wants to revise items based on student performance. Which process is being used?
A. Interpretation
B. Norm referencing
C. Item analysis
D. Content validation

55. A reliable test must be:
A. Consistent and objective
B. Valid and long
C. Fun and relevant
D. Easy and fast

56. What do you call a test that measures how well a student mastered a standard?
A. Norm-referenced test
B. Summative test
C. Criterion-referenced test
D. Placement test

57. In a class of 50, 10 students got an item correct. What is the difficulty index?
A. 0.10
B. 0.20
C. 0.25
D. 0.50
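The central-tendency items above (16–22 and 42–44) can be checked with Python's standard `statistics` module. The first two lists reuse the numbers from items 16 and 17; the bimodal list and the outlier list are invented for illustration:

```python
from statistics import mean, median, multimode

print(mean([70, 75, 80, 85, 90]))     # 80  (item 16)
print(median([45, 50, 55, 60, 65]))   # 55  (item 17)

# A dataset with two most-frequent values is bimodal (item 20).
print(multimode([1, 2, 2, 3, 3]))     # [2, 3]

# Hypothetical outlier: the mean shifts, the median does not (items 18, 44).
print(mean([45, 50, 55, 60, 200]))    # 82
print(median([45, 50, 55, 60, 200]))  # 55
```

This is why the key prefers the median over the mean when extreme scores exist (item 21) and the mode for qualitative data (item 19), where averaging is not meaningful.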
58. An item has a difficulty index of 0.60 and a DI of 0.51. What’s your action?
A. Retain
B. Revise
C. Reject
D. Flag

59. The most appropriate assessment method for evaluating real-world problem-solving skills is:
A. Multiple choice
B. Performance-based assessment
C. True-or-false
D. Fill-in-the-blanks

60. A test is aligned with objectives, easy to score, and reusable. It satisfies which principles?
A. Validity and economy
B. Fairness and reliability
C. Authenticity and usability
D. Transparency and difficulty

✅ ANSWER KEY WITH RATIONALES

No. Ans Rationale
1 C A 0.20 difficulty index means the item is too difficult.
2 B A 0.90 difficulty index is too easy, so revise or discard.
3 D A DI of -0.60 shows negative discrimination, so the item should be discarded.
4 C DI = DU - DL is the correct formula for the discrimination index.
5 C Positive discrimination means high-performing students got it right.
6 B Equal performance between high/low groups shows zero discrimination.
7 C Acceptable difficulty and high DI → retain the item.
8 B Best item condition: difficulty between 0.26–0.75 and DI ≥ 0.20.
9 B If low scorers get it right, that’s negative discrimination.
10 C Low DI and acceptable difficulty → revise or discard.
11 B Too easy and poor discrimination → needs revision.
12 B Administrability refers to ease of giving the test.
13 C Scoreability is about clarity in scoring.
14 C Easy-to-understand score = interpretability.
15 C Economy means it’s cost-effective and reusable.
16 A Mean = (70+75+80+85+90)/5 = 80.
17 C Ordered set median = 55 (middle value).
18 B Mean is most affected by outliers.
19 C Mode is best for qualitative data (e.g., most common response).
20 B Bimodal = two most frequent scores.
21 B Use the median when data have extreme scores (outliers).
22 C Most frequent score = mode.
23 C Difficulty is fine, but low DI = revise.
24 B A DI of 0.60 is high discrimination, so keep it.
25 C Two simultaneous tests → concurrent validity.
26 C Predictive validity shows future performance.
27 B Face validity is based on appearance and clarity.
28 B Score consistency = reliability.
29 B Splitting a test into halves = split-half reliability.
30 B KR-20 is used for binary-scored tests (right/wrong).
31 C Parallel tests = equivalent forms reliability.
32 B Re-administering the same test = test-retest reliability.
33 C Reliability is about consistency; validity is about measuring what should be measured.
34 C If high scorers fail, it’s negative discrimination.
35 A Disadvantaging students violates fairness.
36 B Difficulty of 0.80 (too easy) and low DI = reject.
37 B Ease of scoring = scoreability.
38 D Outside both index ranges = reject.
39 B Clear description of performance levels = interpretability.
40 B A scoring rubric enhances objectivity.
41 B Considering student diversity = fairness.
42 A Mode works best for trends in qualitative data.
43 A Trimodal = three or more modes.
44 B Median is unaffected by outliers.
45 C Discrimination index measures item effectiveness in distinguishing learners.
46 A If it doesn’t measure what it should, it lacks validity.
47 C Complicated instructions = poor administrability.
48 B 0.70 difficulty and 0.48 DI = strong, moderately easy item.
49 C Adapting to student diversity = fairness.
50 C Difficult to implement = low practicality.
51 C Average = mean.
52 A Using the same test twice = test-retest.
53 B If low performers get it right, that’s negative discrimination.
54 C Revising items based on performance = item analysis.
55 A Consistency and clear rules = reliable test.
56 C Measures against standards = criterion-referenced.
57 B 10/50 = 0.20 difficulty index.
58 A DI and difficulty are within the ideal range = retain.
59 B Real-world skills = performance-based assessment.
60 A Aligned + reusable = validity and economy.
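As a supplement to the reliability items (28–32), the KR-20 coefficient named in item 30 can be sketched in Python. The 5-student, 5-item response matrix below is invented for illustration; the formula is the standard KR-20, (k/(k−1))·(1 − Σpq/σ²), where σ² is the population variance of students' total scores:

```python
from statistics import pvariance

# Hypothetical binary response matrix: rows = students, columns = items,
# 1 = correct, 0 = incorrect. KR-20 applies only to such binary scoring.
responses = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

k = len(responses[0])                     # number of items
n = len(responses)                        # number of students
totals = [sum(row) for row in responses]  # each student's total score

# Sum of p*q over items, where p = proportion correct and q = 1 - p.
sum_pq = 0.0
for i in range(k):
    p = sum(row[i] for row in responses) / n
    sum_pq += p * (1 - p)

# KR-20 = (k / (k - 1)) * (1 - sum(pq) / variance of total scores)
kr20 = (k / (k - 1)) * (1 - sum_pq / pvariance(totals))
print(round(kr20, 2))                     # 0.84
```

A value near 1 indicates high internal consistency; the split-half and test-retest methods of items 29 and 32 estimate reliability from score halves and repeated administrations instead of item-level statistics.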