Module 14: Establishing Reliability
PRAYER
ENERGIZER
Pre-activity: What Am I?
• The class will be divided into five groups.
• Each group will have 5 minutes to
analyze a set of statistical symbols.
• The group will interpret and identify
each symbol.
• The first group to finish will earn 5 bonus
points in the assessment later.
[Symbol cards to identify: population mean, sample mean, sample standard deviation, summation]
Learning Objective:
At the end of this lesson, students must
be able to use procedures and statistical
analysis to establish test reliability with at
least 70% accuracy.
RELIABILITY
• Reliability refers to the
consistency and accuracy of the
test.
• A reliable test, therefore, should
yield essentially the same scores
when administered twice to the
same group of students.
FACTORS THAT AFFECT RELIABILITY:
• The number of items in a test - the more items a
test has, the higher its reliability is likely to be.
• Individual differences among participants - fatigue,
concentration, natural ability, perseverance, and motivation
all affect test performance.
• External environment - room temperature, noise
level, depth of instruction, exposure to materials, and quality
of instruction.
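The first factor can be illustrated numerically. The sketch below uses the general form of the Spearman-Brown formula, which predicts the reliability of a test lengthened with comparable items. The starting reliability of 0.60 and the function name `lengthened_reliability` are hypothetical, chosen only for illustration.

```python
def lengthened_reliability(r_current: float, length_factor: float) -> float:
    """General Spearman-Brown formula.

    Predicts the reliability of a test made `length_factor` times as long,
    assuming the added items are comparable to the existing ones.
    """
    return (length_factor * r_current) / (1 + (length_factor - 1) * r_current)


# Hypothetical starting point: a short test with reliability 0.60.
r_short = 0.60
r_doubled = lengthened_reliability(r_short, 2)  # twice as many items
r_tripled = lengthened_reliability(r_short, 3)  # three times as many items

print(round(r_doubled, 2), round(r_tripled, 2))
```

Under these assumptions the predicted reliability rises as items are added, but each additional block of items adds less than the one before.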
Interpreting Reliability Results
The following table is a standard followed by almost every university
in educational testing and measurement.
[Table: ranges of the reliability coefficient and their interpretations]
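The interpretation bands can be sketched as a simple lookup. The ranges and wordings below are taken from the assessment items at the end of this module; the pairing of each range with its description is an assumption, and the function name `interpret_reliability` is hypothetical.

```python
def interpret_reliability(r: float) -> str:
    """Map a reliability coefficient to an interpretation band.

    Assumption: the bands follow the ranges quoted in this module's
    assessment items (0.90 and above, 0.80-0.89, ..., 0.49 or lower).
    """
    bands = [
        (0.90, "Excellent reliability; at the level of standardized tests"),
        (0.80, "Very good for a classroom test"),
        (0.70, "Good for a classroom test; some items may need improvement"),
        (0.60, "Somewhat low; requires supplementary measures"),
        (0.50, "Needs revision unless the test is very short (10 or fewer items)"),
    ]
    for lower_bound, meaning in bands:
        if r >= lower_bound:
            return meaning
    return "Should not contribute heavily to the course grade; needs revision"


print(interpret_reliability(0.94))
print(interpret_reliability(0.69))
```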
METHODS OF ESTABLISHING
RELIABILITY
1. Test-retest
• In this method, we administer the test twice to the same group of
students with any time interval between tests. Generally, the longer
the interval between test administrations, the lower the correlation,
since students can be expected to change with the passage of time.
• The procedure is to correlate the two sets of test results using the
Pearson product-moment coefficient of correlation (r).
example:
Ms. Clara administered her statistics test to ten (10) first-year college
students. After a week, the same test was given to the same group of
students. Their scores in the first and the second test are shown below.
Compute the reliability of the test.
Score in the first test : 40 35 30 20 19 20 37 38 40 25
Score in the second test : 41 40 25 20 20 23 34 35 40 25
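One way to work this example is the raw-score form of the Pearson formula, r = [nΣxy − ΣxΣy] / sqrt([nΣx² − (Σx)²][nΣy² − (Σy)²]). The sketch below applies it to Ms. Clara's scores. The function name `pearson_r` is a hypothetical helper, not something named in the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation using the raw-score formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


first_test = [40, 35, 30, 20, 19, 20, 37, 38, 40, 25]
second_test = [41, 40, 25, 20, 20, 23, 34, 35, 40, 25]

r = pearson_r(first_test, second_test)
print(round(r, 2))  # 0.94
```

The intermediate sums are Σx = 304, Σy = 303, Σxy = 9843, Σx² = 9924, and Σy² = 9841, which can be checked by hand against the code.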
try this!
Ms. Kzura administered a science test to six (6) first-year
college students. A week later, the same test was given to
the same group of students. Their scores in the first and
second test are shown below. Compute the reliability of
the test using Pearson R Correlation.
score in the first test (x): 15 18 20 12 14 16
score in the second test (y): 14 17 19 13 19 15
2. Parallel or Alternate Forms
• In this method, we give two forms of a test, similar in
content, type of items, difficulty, and other respects, in close
succession to the same group of students.
• To establish the reliability of the test results for this
method, one should use the correlation technique
(Pearson r correlation).
example:
Mr. Alfredo administered two forms of a mathematics test to ten (10) third-year
students. The first form of the test was given in the morning and the
second form was given in the afternoon. Their scores in the first and
second forms are presented below. Determine the reliability of the test.
Score in the first form: 60 84 40 65 70 33 42 50 70 90
Score in the second form: 48 82 37 72 89 40 37 69 80 74
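Mr. Alfredo's data can be checked with the deviation-score form of Pearson r, which centers each score on its mean before multiplying. This is algebraically the same coefficient as the raw-score form. The helper name `pearson_r_from_deviations` is hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r_from_deviations(x, y):
    """Pearson r computed from deviations around each mean."""
    mean_x, mean_y = mean(x), mean(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    cross_products = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_sq_x = sum(a * a for a in dev_x)
    sum_sq_y = sum(b * b for b in dev_y)
    return cross_products / sqrt(sum_sq_x * sum_sq_y)


first_form = [60, 84, 40, 65, 70, 33, 42, 50, 70, 90]
second_form = [48, 82, 37, 72, 89, 40, 37, 69, 80, 74]

r_forms = pearson_r_from_deviations(first_form, second_form)
print(round(r_forms, 2))  # 0.81
```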
try this!
Mr. Santos gave a spelling test to five (5) Grade 3
students in the morning and then gave a similar
spelling test in the afternoon. Their scores in the
morning and afternoon tests are shown below.
Determine the reliability of the test using Pearson
R Correlation.
score in the first form (x): 8 6 9 7 5
score in the second form (y): 7 5 9 6 4
3. Split Half
• In this method, a test is administered once and its items are split
into two halves. The most common procedure is the "odd-even" division,
in which the scores on the odd-numbered items and the scores on the
even-numbered items are treated as two half-tests.
• To compute the reliability of the entire test, the reliability of the
half-test is determined first using a correlation coefficient formula
(Pearson r). Once r has been established, the reliability of the
whole test can be determined by using the Spearman-Brown formula below:
Rw = (2 × Rh) / (1 + Rh)
where
Rw is the correlation (reliability) of the whole test,
Rh is the correlation between the odd-even halves.
example:
Ms. Judith administered a 50-item science test to her ten (10) Grade 5
pupils. To find out the reliability of her test, she used the split-half
method. The scores of the pupils in odd and even item numbers
are presented below. Find the reliability of the whole test.
Odd (x): 14 19 17 15 20 11 24 16 15 15
Even (y): 19 18 18 13 15 9 20 15 15 13
If the correlation between the two halves is r = 0.69, what is the
reliability of the whole test?
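The two steps of the split-half method can be sketched as follows: first correlate the odd and even half-scores, then step the half-test correlation up with the Spearman-Brown formula for a doubled test, 2r / (1 + r). The helper names `pearson_r` and `spearman_brown` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation using the raw-score formula."""
    n = len(x)
    numerator = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    spread_x = n * sum(a * a for a in x) - sum(x) ** 2
    spread_y = n * sum(b * b for b in y) - sum(y) ** 2
    return numerator / sqrt(spread_x * spread_y)


def spearman_brown(r_half):
    """Reliability of the whole test from the half-test correlation."""
    return (2 * r_half) / (1 + r_half)


odd = [14, 19, 17, 15, 20, 11, 24, 16, 15, 15]
even = [19, 18, 18, 13, 15, 9, 20, 15, 15, 13]

r_half = pearson_r(odd, even)
r_whole_rounded_input = spearman_brown(0.69)  # using r = 0.69 as given
r_whole_exact_input = spearman_brown(r_half)  # using the unrounded r

print(round(r_half, 2))                 # 0.69
print(round(r_whole_rounded_input, 2))  # 0.82
print(round(r_whole_exact_input, 2))    # 0.81
```

The small gap between 0.82 and 0.81 comes only from rounding r to 0.69 before applying the formula. Either way the whole-test reliability is noticeably higher than the half-test correlation.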
try this!
Ms. Aira administered a 10-item math test to her 5
Grade 4 students. To find out the reliability of her
test, she used the split-half method. The scores of
the pupils on the odd and even items are
presented below. Find the reliability of the whole
test using Pearson R Correlation and Spearman
Brown Coefficient.
odd (x): 4 5 3 2 4
even (y): 3 4 2 3 4
4. Test of Internal Consistency: Kuder-Richardson Formula 21
• In this method, a test is conducted only once. It is a
statistical method used to measure the internal consistency
reliability of a test, especially when the items are
dichotomous (e.g., true/false or yes/no).
KR21 = [k / (k − 1)] × [1 − x̄(k − x̄) / (k × s²)]
where
x̄ = the mean of the obtained scores
s = standard deviation
k = total number of items
example:
Mr. Marvin administered a 50-item mathematics test to his Grade
5 pupils. The scores of his pupils are shown below. Find the
reliability of his test by using Kuder-Richardson Formula 21.
Pupils : A B C D E F G H I J
Score: 32 36 36 22 38 15 43 25 18 23
First compute the mean (x̄ = Σx / n, where n = total number of scores x)
and the standard deviation (s) of the scores.
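A minimal sketch of the KR-21 computation for Mr. Marvin's 50-item test follows. It assumes the sample standard deviation (dividing by n − 1), which is what `statistics.stdev` computes; the slides do not show which version is intended. With the population standard deviation the result would be about 0.86 instead. The function name `kr21` is hypothetical.

```python
from statistics import mean, stdev


def kr21(scores, k):
    """Kuder-Richardson Formula 21 for a test with k items.

    Assumption: s is the sample standard deviation (n - 1 denominator).
    """
    m = mean(scores)
    s = stdev(scores)
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * s ** 2))


scores = [32, 36, 36, 22, 38, 15, 43, 25, 18, 23]

mean_score = mean(scores)
sd_score = stdev(scores)
reliability = kr21(scores, k=50)

print(round(mean_score, 2))   # 28.8
print(round(sd_score, 2))     # 9.44
print(round(reliability, 2))  # 0.88
```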
try this!
Mr. Marvin administered a 10-item mathematics
test to his 5 Grade 5 pupils. The scores of his
pupils are shown below. Find the reliability of his
test using the Kuder-Richardson Formula 21 (KR-
21).
student: A B C D E
scores: 8 6 7 9 5
5. Inter Rater Reliability
• It is used to determine the consistency of
multiple raters when using rating scales and
rubrics to judge performance.
• Applicable when the assessment requires the
use of multiple raters.
Inter Rater Reliability example
Hypothesis: Extroverts earn more than introverts.
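The slides do not name a statistic for inter-rater consistency. Two common choices are simple percent agreement and Cohen's kappa, which corrects agreement for chance. The sketch below computes both for two raters. The rubric scores (1 to 4) are hypothetical data invented for illustration, and the function names are hypothetical as well.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of performances given the same rating by both raters."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    # Chance agreement: product of each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1-4) from two raters for ten performances.
rater_a = [3, 4, 2, 4, 1, 3, 2, 4, 3, 2]
rater_b = [3, 4, 2, 3, 1, 3, 2, 4, 2, 2]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)

print(round(agreement, 2))  # 0.8
print(round(kappa, 2))      # 0.72
```

Kappa is lower than raw agreement here because some matches would be expected even if both raters assigned scores at random.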
application
(transferring)
• The class will be divided into five groups.
• Each group will be provided with a set of
test scores.
• A representative from each group will
randomly pick a method for calculating test
reliability.
• Groups will have 5 minutes to collaborate
and compute the test's reliability based on
their assigned method.
assessment
Direction: Choose the letter of the correct answer.
1. It is a statistical method used to measure the internal consistency
reliability of a test, especially when the items are dichotomous.
A. Inter Rater Reliability
B. Pearson R Correlation
C. Internal Consistency
D. Kuder Richardson 21
2. It is used to determine the consistency of multiple raters when using
rating scales and rubrics to judge performance.
A. Test of Internal Consistency
B. Test-retest
C. Alternate Forms
D. Inter Rater Reliability
3. There are two versions of this test; each test version is called a
form.
A. Parallel forms
B. Split half
C. Alter forms
D. Assessment forms
4. It is a statistical method used to determine the reliability of the
whole test in Split half method.
A. Pearson R Correlation
B. Kuder Richardson 21
C. Spearman Brown Coefficient
5. In this method we administer the test twice to the same group of
students with any time interval between tests.
A. Retest-test Method
B. Test-retest Method
C. Parallel test
D. Re-testing Method
6. Which reliability value range indicates "excellent reliability" for a
test?
A. 0.50 – 0.59
B. 0.60 – 0.69
C. 0.90 and above
D. 0.70 – 0.79
7. A reliability score of 0.80 – 0.89 is best interpreted as:
A. Very good for classroom tests.
B. Good for classroom tests, with some room for improvement.
C. Questionable reliability; needs revision.
D. Somewhat low, requires supplementary measures.
8. What does a reliability score of 0.50 – 0.59 suggest about the test?
A. Excellent reliability at the level of standardized tests.
B. Needs revision unless it is a very short test (10 or fewer items).
C. Good for classroom tests, but some items may need improvement.
D. The test should not contribute heavily to the course grade and
needs revision.
9. If a test has a reliability score of 0.49 or lower, it means:
A. The test should not contribute heavily to grades and needs
revision.
B. The test is very good for classroom purposes.
C. The test requires only minor adjustments for improvement.
D. The test is excellent for standardized use.
10. Which reliability range indicates a test that is in the range of most
classroom tests, with possible minor improvements?
A. 0.60 – 0.69
B. 0.80 – 0.89
C. 0.70 – 0.79
THANK YOU