Notes on Educational Measurement and Evaluation 1

The document discusses the importance of assessment in the teaching-learning environment, emphasizing the need to differentiate between tests, measurement, and evaluation. It outlines various classifications of tests, including ability, personality, aptitude, and achievement tests, and highlights the functions of educational measurement and evaluation. Additionally, it contrasts teacher-made tests with standardized tests, detailing their preparation, administration, scoring, and purposes.

INTRODUCTION

In the teaching-learning environment, there is a constant need to gauge the outcome or the

quality of responsiveness of the teaching and learning process. This important symbiotic process

generally referred to as assessment, does not only occur after teaching but can also be undertaken

before teaching is effected or during the teaching process. More specifically, the concepts of test,

measurement, and evaluation continue to dominate educational practice around the world.

Though several scholars have advanced multiple interpretations, definitions and clarifications to

these important educational concepts, the temptation to misconstrue one construct for the other

has been a regular occurrence for student-teachers, educationists and even academics. In other

words, these concepts have more often than not been erroneously used synonymously by

practitioners to mean the same thing. As professional educators, this is unacceptable to the extent

that our ability to distinguish these concepts and appropriately apply one or more within a given

context is an important component of a teacher’s professional practice. More so, depending on

the nature and stage at which it is conducted, teachers have over the years applied different types

of assessments for varied purposes. Therefore, until classroom teachers have an appropriate

appreciation of the nature of tests, measurement, and evaluation, an effective educational

assessment will remain a mirage. Thus, these notes will attempt to provide an overview of tests,

measurement, and evaluation and explain the uses of these key co-dependent concepts in relation

to educational practice.

Functions of educational measurement and evaluation

Teaching, learning and evaluation are three interdependent aspects of the educative process.

Therefore, measurement and evaluation become an indispensable part of the teaching-learning

process. It involves measurement and assigning qualitative meaning through value judgments. It

is a means of determining the effectiveness of teaching methodologies, instructional materials

and other elements affecting the teaching-learning situation.

Measurement and evaluation play very important functions in education. Indeed there can be no

effective teaching without measurement and evaluation.

Mehrens and Lehman (1975) have broadly categorized the functions which measurement and

evaluation play in education into four categories as follows:

• Instructional functions

• Administrative functions

• Guidance functions

• Research functions (see the handwritten notes)

The concept of testing

What is a test?

A test is an instrument or a tool. It follows a systematic procedure for measuring a sample of

behavior by posing a set of questions in a uniform manner. It is an attempt to measure what

a person knows or can do at a particular point in time. Furthermore, a test answers the question

‘how well’ the individual performs either in comparison with others or in comparison with a

domain of performance tasks. Simply put, a test refers to a tool, technique or a method that is

intended to measure students’ knowledge or their ability to complete a particular task. Bachman

(1990) defines a test as “a measurement instrument designed to elicit a specific sample of an

individual’s behavior….a test necessarily quantifies characteristics of individuals according to

explicit procedures.”

More specifically, a test is considered to be a kind or class of measurement device typically used

to find out something about a person. Most of the time, when you finish a lesson or lessons in a

week, you give a test to find out if your objectives were achieved.

Testing, on the other hand, is the process of administering the test to the pupils. In other words,

the process of making you or letting you take the test in order to obtain a quantitative

representation of the cognitive or non-cognitive traits you possess is called testing. So the

instrument or tool is the test and the process of administering the test is testing.

Testing is one of the significant and most usable techniques in any system of examination or

evaluation. It envisages the use of instruments or tools for gathering information or data. In

written examinations, the question paper is one of the most potent tools employed for collecting and

obtaining information about pupils’ achievement.

Classification of Tests

Tests have been classified in a number of ways, some of which overlap. Here are some

classifications of tests in education:

1. Classification based on area of knowledge measured

a. Ability tests

Ability tests assess cognitive and motor skill sets that have been acquired over a long period of

time and that are not attributable to any specific program of instruction. These are tests that

measure unrestricted areas of knowledge. They are designed to measure a wide range of

functioning in terms of those mental abilities that are useful in almost any aspect of thinking.

Examples include:

Intelligence test

A test designed to determine the relative mental capacity of a person. An IQ test is an

assessment that measures a range of cognitive abilities and provides a score that is intended to

serve as a measure of an individual's intellectual abilities and potential. IQ tests are among the

most commonly administered psychological tests. There are a number of different intelligence

tests in existence and their content can vary considerably. Some are used with adults, but many

are specifically designed to be administered to children.

Some commonly used intelligence tests include:

• Kaufman Assessment Battery for Children

• Stanford-Binet Intelligence Scale

• Wechsler Adult Intelligence Scale

• Wechsler Intelligence Scale for Children

Personality Test

A personality test is a tool used to assess human personality. Personality testing and assessment

refer to techniques designed to measure the characteristic patterns of traits that people exhibit

across various situations. Personality tests can be used to help clarify a clinical diagnosis, guide

therapeutic interventions, and help predict how people may respond in different situations.

Aptitude tests

An aptitude test is an exam used to determine an individual's propensity to succeed in a given

activity. Aptitude tests assume that individuals have inherent strengths and weaknesses, and have

a natural inclination toward success or failure in specific areas based on their innate

characteristics. An aptitude test is used to determine an individual's skill or ability, assessing how

they perform in an area in which they have no prior training or knowledge. Schools use aptitude

tests to determine if students are inclined toward advanced placement classes or certain areas of

study, like engineering or a foreign language. In the work world, human resources departments at

some companies will use career assessment tests to learn about a potential candidate's strengths

and weaknesses.

Achievement tests

Achievement tests are developed to measure skills and knowledge learned in a given grade level,

usually through planned instruction, such as training or classroom instruction. Achievement tests

are often contrasted with aptitude tests. Typically, an achievement test is administered following

a period of instruction designed to teach the motor or cognitive skill to be examined. The

prototypical achievement test is the periodic classroom exam that is administered to determine

how much the student has learned.

Classification according to mode of administration

Oral tests

The oral test is a practice in many schools and disciplines in which an examiner poses questions

to the student in spoken form. The student has to answer the question in such a way as to

demonstrate sufficient knowledge of the subject to pass the test.

The need for development of oral skills and expressions, which are necessary in day-to-day

living, was stressed as far back as 1964 at the fifth Conference of Chairman and Secretaries of the

Boards of Secondary Education. But unless oral skills are tested in the external examinations or

certified in school-based assessment, these are not going to attract the needed attention of the

teachers in developing these skills. Subjectivity in assessment, the large number of examinees,

the time required, inter-examiner variance and subjective interpretation are a few of the many

problems and difficulties in using oral tests in external examinations. However, their use in the

instructional process continues to provide diagnosis and feedback, and they remain useful as an

instructional tool for readiness testing and review of lessons.

The purposes of oral examination are:

• To test oral skills that cannot be tested through written examinations.

• To confirm and probe further the evidence gathered through written examination whenever

desired.

• To judge the extent to which such skills are warranted by the nature of the subject; and

• To make a quick oral review for informal assessment of what the learners have learnt or

their deficiencies.

Written tests

Written tests are tests that are administered on paper or on a computer. A test taker who takes a

written test could respond to specific items by writing or typing within a given space of the test

or on a separate form or document.

Classification based on time limit

Speed test

Speed tests are tests designed to determine the rapidity with which a student completes some

given tasks. The emphasis is not on whether such tasks can be done but rather on how quickly the

tasks can be carried out.

Power test

Here the tests are constructed with items of varying difficulty levels but with such a liberal time

limit that all students may finish the task at varying levels of success. Power tests are therefore

designed to assess the full range of a student’s skills or abilities, with little or no time

limit.

Mastery test

In a speed test, the level of difficulty of items is low and the time is limited; in the power test, the

general level of difficulty is high and the time allowed is generous; while in the mastery test, the

items are fairly low in difficulty and the time is also liberal (Nworgu, 2015). Mastery tests are

designed to assess the knowledge and skills which every member of a class is expected to have

learnt. Therefore, it is assumed that virtually all pupils will perform perfectly well on the test.

Classification based on reference point

Norm-referenced testing

Norm-referenced refers to tests that are designed to compare and rank test takers in relation to

one another. Norm-referenced tests report whether test takers performed better or worse than a

hypothetical average student, which is determined by comparing scores against the performance

results of a statistically selected group of test takers, typically of the same age or grade level,

who have already taken the exam.

Criterion-referenced testing

Criterion-referenced tests and assessments are designed to measure student performance

against a fixed set of predetermined criteria or learning standards—i.e., concise, written

descriptions of what students are expected to know and be able to do at a specific stage of their

education.
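
The difference between the two reference points can be illustrated with a short Python sketch. The scores, the norm group and the cut score of 60 are invented for illustration, and the percentile-rank convention used here (percentage of the norm group scoring below the student) is only one of several in common use; none of these figures come from the notes.

```python
def percentile_rank(score, norm_group):
    """Norm-referenced view: percentage of the norm group scoring below `score`."""
    below = sum(1 for s in norm_group if s < score)
    return 100.0 * below / len(norm_group)


def meets_criterion(score, cut_score):
    """Criterion-referenced view: has the fixed standard been reached?"""
    return score >= cut_score


# Hypothetical scores of a norm group that has already taken the test.
norm_group = [35, 42, 48, 50, 55, 58, 60, 63, 70, 82]
student_score = 58
cut_score = 60  # hypothetical predetermined standard

print(percentile_rank(student_score, norm_group))  # 50.0
print(meets_criterion(student_score, cut_score))   # False
```

The same raw score of 58 is therefore read in two ways: relative to other test takers it sits in the middle of the group, while against the fixed standard it falls short.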

Classification based on Scoring and interpretation of results

(a) Subjective. (b) Objective.

Subjective tests aim to assess areas of students’ performance that are complex and qualitative,

using questioning which may have more than one correct answer or more ways to express it.

Different readers can rate identical responses differently, and the same reader can rate the same

paper differently over time.

Objective tests aim to assess a specific part of the learner’s knowledge using questions which

have a single correct answer. They are so definite and so clear that a single, definite answer is

expected.

Classification based on quality

Standardized tests

Standardised tests are carefully constructed tests which have uniformity of procedure in scoring,

administering and interpreting the test results. A standardised test is generally made by a

professional tester or a group of testers.

Standardized tests are not restricted to use in one school or a few schools but are meant for a larger population,

so that many schools can use such tests to assess their own performance in relation

to others and to the general population for which the test has been standardized.

A standardised test is one that has been carefully constructed by experts in the light of acceptable

objectives or purposes; procedures for administering, scoring and interpreting scores are specified

in detail so that, no matter who gives the test or where it may be given, the results should be

comparable; and norms or averages for different age or grade levels have been pre-determined.

Characteristics of standardized achievement tests

(i) Standardization of the content and questions:

Due weightage is given to the content and objectives. Items are to be prepared according to the

blue-print. Relevant items are included and irrelevant items are omitted, giving due consideration

to item difficulty and discriminating value. Internal consistency is also taken into account.

(ii) Standardization of the method of administration:

Procedure of test administration, conditions for administration, time allowed for the test etc., are

to be clearly stated.

(iii) Standardization of the scoring procedure:

To ensure objective and uniform scoring, an adequate scoring key and detailed instructions on the

method of scoring are to be provided. Standardized tests are always accompanied by manuals.

(iv) Standardization of interpretation:

Adequate norms are to be prepared to interpret the results. The test is administered to a large

representative sample. Test scores are interpreted with reference to norms. Derivation of

norms is an integral part of the process of standardization.

Some characteristics of these tests are:

1. They consist of items of high quality. The items are pretested and selected on the basis of

difficulty value, discrimination power, and relationship to clearly defined objectives in

behavioural terms.

2. As the directions for administering, exact time limit, and scoring are precisely stated, any

person can administer and score the test.

3. A manual is supplied that explains the purposes and uses of the test, describes briefly how it

was constructed, provides specific directions for administering, scoring, and interpreting results,

contains tables of norms and summarizes available research data on the test.

Uses of Standardized Tests:

1. A standardized test assesses the rate of development of a student’s ability. It provides a basis for

ascertaining the level of intellectual ability and the strengths and weaknesses of the pupils.

2. It checks and ascertains the validity of a teacher-made test.

3. These tests are useful in diagnosing the learning difficulties of the students.

4. It helps the teacher to know the causal factors of the students’ learning difficulties.

5. It provides information for curriculum planning and for providing remedial coaching to

educationally backward children.

I. Teacher-made tests

Teacher-made tests are normally prepared and administered for testing classroom achievement of

students, evaluating the method of teaching adopted by the teacher and other curricular

programmes of the school. The teacher-made test is one of the most valuable instruments in the hands

of the teacher to serve his or her purpose. It is designed to address the problems or requirements of the

class for which it is prepared.

Differences between teacher-made tests and standardized tests

Basis for comparison | Teacher-made test | Standardized test
Preparation and construction | The same person (the classroom teacher) acts as instructor, test developer, and evaluator | Done by a team of experts
Administration | No uniform procedures | Standard, uniform procedures
Content and objectives coverage | Determined by the teacher in the classroom | Determined by the ministry of education or an examination board, following existing curricula and syllabi
Scoring and judgment | Subjective and usually biased | Objective; usually accompanied by an evaluative scoring manual and can be machine-scored
Purpose and use | Measures particular objectives and is used to make intra-class comparisons | Measures broad objectives and is used to make inter-class, school, and national comparisons

Items commonly used for Tests of Achievement

Two major types of items have been identified as far as achievement tests are concerned:

1. Constructed Response / Supply items (essay items)

2. Structured Response / Select items

Constructed Response / Supply items

Constructed response assessments are typically characterized by lengthy responses to questions

posed. In supply-type items, the question is framed so that the examinee has to supply or

construct the answer in his or her own words. Constructed response tests are most commonly referred to as

essays.

Essays can be used to require learners to:

• analyse and/or integrate different ideas or points of view

• contrast or compare theories or ideas

• develop a logical argument

• evaluate views or ideas

• demonstrate creativity

• apply what has been learnt in real life situations

• substantiate their own views.

Essay questions can be classified into two types:

Restricted response essays: These limit both the content and the response, as indicated within the question.

The restricted response essay addresses a limited sample of the learning outcomes. It

demands a specific, precise response. The boundaries of the response are clear; such questions use words like

list, define, explain. For example: list and explain the three types of evaluation.

Extended response essays: These essays provide freedom of response to a question and

assess the ability to research a topic, creatively organize, integrate and evaluate ideas, and

construct an argument. Extended response allows considerable freedom in determining the form

and scope of answers. For example: write a composition about your first day in school.

2. Structured Response / Select items

In select-type items, as the name suggests, the examinee is required to select the correct

answer from amongst the given or structured options. They are often called objective items.

They include:

Alternative Response

Multiple-choice

Matching

Completion

Alternative Response Items

An alternative response item, by definition, is one that offers two options to choose from. Such items

often consist of a declarative statement that the examinee is asked to mark true or false, right or

wrong, correct or incorrect, yes or no, agree or disagree, or the like. Incomplete sentences

providing two options to choose from to fill in the blank also fall in this category. A very common

use of such items is to test knowledge of grammar, such as the appropriate use of tense, as well as the

contextual meaning of words or the spelling of words that sound alike. In each case there are

only two possible answers. The most common form is the true-false question.

Uses of True-False Items

The most common use of the true-false item is in measuring the examinee’s ability to identify the

correctness of statements of fact, definitions of terms, statements of principles, and the like, also

to distinguish fact from opinion.

Another aspect of understanding that can be measured by the true-false item is the ability to

recognize cause-and-effect relationships. This type of item usually contains two true propositions

in one statement, and the examinee is to judge whether the relationship between them is true or

false.

Short-Answer / Completion Items

The short-answer item and the completion item are both supply-type test items. Yet, they are

included here for their simplicity. They can be answered by a word, phrase, number, or symbol.

The short-answer item uses a direct question whereas the completion item consists of an

incomplete statement. The short-answer item is especially useful for measuring problem-solving

ability in science and mathematics. Complex interpretations can be made when the short-

answer item is used to measure the ability to interpret diagrams, charts, graphs, and pictorial

data.

When short-answer items are used the question must be stated clearly and concisely. It should be

free from irrelevant clues, and require an answer that is both brief and definite.

Multiple Choice Questions

What is a Multiple Choice Item?

The multiple choice item (MCQ) consists of two distinct parts:

1. The first part, which contains the task or problem, is called the stem of the item. The stem

may be presented either as a question or as an incomplete statement. The form makes no

difference as long as it presents a clear and specific problem to the examinee.

2. The second part presents a series of options or alternatives. Each option represents a possible

answer to the question. In the standard form, one option is the correct or best answer, called the

keyed response, and the others are misleads or foils, called distracters.

The number of options used differs from one test to the other. An item must have at least three

answer choices to be classified as a multiple choice item. The typical pattern is to have four or

five choices to reduce the probability of guessing the answer. In a good item, all the

presented options should look like plausible answers, at least to those examinees who do not know the

answer.
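
The effect of the number of options on guessing can be shown with simple arithmetic. The sketch below is in Python and assumes blind guessing, with every option equally likely to be picked; the 40-item test length is a made-up figure for illustration.

```python
def chance_of_correct_guess(n_options):
    """Probability of picking the keyed response by blind guessing."""
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    return 1 / n_options


def expected_chance_score(n_items, n_options):
    """Average number of items a blind guesser would get right."""
    return n_items / n_options


# Compare true-false items (2 options) with 3-, 4- and 5-option MCQs
# on a hypothetical 40-item test.
for n_options in (2, 3, 4, 5):
    p = chance_of_correct_guess(n_options)
    score = expected_chance_score(40, n_options)
    print(f"{n_options} options: P(correct guess) = {p:.2f}, "
          f"expected chance score = {score:.1f} out of 40")
```

Moving from two options to five lowers the chance of a lucky guess from one in two to one in five, which is why four or five choices are typical.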

Terminology: Multiple Choice Questions

1. Stem: presents the problem

2. Keyed Response: correct or best answer

3. Distracters: appear to be reasonable answers to the examinee who does not know the content

4. Options: include the distracters and the keyed response.

Desired characteristics of items

Desirable difficulty level

Ability to discriminate between high and low performers

Effective distracters
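
Difficulty level and discriminating ability are usually checked numerically after a test has been tried out. The Python sketch below uses two indices that are common in item analysis but are not defined in these notes, so treat the formulas as assumptions: difficulty is the proportion of examinees answering correctly, and discrimination is the difference between that proportion in the highest-scoring and lowest-scoring groups (27% of examinees in each group, a conventional choice). All scores are invented.

```python
def difficulty_index(item_scores):
    """Proportion of examinees who got the item right (1 = correct, 0 = wrong)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores, total_scores, group_fraction=0.27):
    """Upper-group minus lower-group proportion correct on one item."""
    n = len(total_scores)
    group_size = max(1, round(n * group_fraction))
    # Rank examinees from highest to lowest total test score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    p_upper = sum(item_scores[i] for i in upper) / group_size
    p_lower = sum(item_scores[i] for i in lower) / group_size
    return p_upper - p_lower


# Hypothetical data for ten examinees: total test scores and
# whether each examinee answered one particular item correctly.
total_scores = [95, 90, 85, 80, 70, 60, 55, 50, 40, 30]
item_scores = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

print(f"difficulty = {difficulty_index(item_scores):.2f}")                      # 0.60
print(f"discrimination = {discrimination_index(item_scores, total_scores):.2f}")  # 1.00
```

Here six of the ten examinees answer correctly, and the item separates the top three examinees (all correct) from the bottom three (all wrong), so it discriminates well between high and low performers.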

Advantages of MCQ

Wide sampling of content

The problem or the task is well structured or clearly defined.

Flexible Difficulty Level

Efficient scoring of items

Objective scoring

Limitations of MCQ

The multiple choice items, despite having advantages over other items, have some serious

limitations as well.

MCQs take time to construct.

They are susceptible to guessing.

They do not provide any diagnostic information.

Matching Exercises

A matching exercise consists of two parallel columns, with each word, number, or symbol in one

column being matched to a word, sentence, or phrase in the other column.

Items in the column for which a match is sought are called premises, and the items in the

column from which the selection is made are called responses.

Uses of Matching Exercises

When you have a number of questions of the same type (homogeneous), it is advisable to frame a

matching item in place of a number of similar MCQs. Whenever learning outcomes emphasize

the ability to identify the relationship between two things and a sufficient number of

homogeneous premises and responses can be obtained, a matching exercise seems most

appropriate.

Summary

• Testing is one of the significant and most usable techniques in any system of examination or

evaluation. It envisages the use of instruments or tools for gathering information or data.

• A test of educational achievement is one designed to measure knowledge, understanding, or

skills in a specified subject or group of subjects.

• Tests of educational achievement differ from those of intelligence in that (1) the former are

concerned with the quantity and quality of learning attained in a subject of study, or group of

subjects, after a period of instruction and (2) the latter are general in scope and are intended for

the measurement and analysis of psychological processes.

• Most educational achievement tests are devoted largely to the measurement of the amount of

information acquired or the skills and techniques developed.

• The principles upon which tests of educational achievement are standardized are the same as

those of the other types already presented; the same principles of definition of aim, sampling,

validity and reliability apply here as elsewhere. A standardized test of educational achievement

should be based upon a careful analysis of materials taught in a given field.

Characteristics of a good test

A good test should possess the following qualities.

• Objectivity. A test is said to be objective if it is free from personal biases in interpreting its scope as well as

in scoring the responses. The test should be based on pre-determined objectives, and the test setter should

have a definite idea about the objective behind each item.

• Comprehensiveness. The test should cover the whole syllabus.

• Validity. A test is said to be valid if it measures what it is intended to measure.

• Reliability. Reliability of a test refers to the degree of consistency with which it measures what

it is intended to measure.
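
The idea of consistency can be made concrete with a number. One common approach, which these notes do not prescribe and which is used here only as an assumption, is to give the same test twice to the same pupils and correlate the two sets of scores. The Python sketch below computes the Pearson correlation by hand; the scores are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


# Hypothetical scores of five pupils on two administrations of one test.
first_sitting = [10, 12, 15, 18, 20]
second_sitting = [11, 13, 14, 19, 20]

r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation = {r:.2f}")
```

A value close to 1 means that pupils keep roughly the same standing on both occasions, which is what a consistent test would be expected to show.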

The concept of Educational Assessment

The term assess is derived from the Latin word “assidere”, meaning “to sit by” in judgment.

Assessment is an ongoing process aimed at understanding and improving student learning. It

involves making expectations explicit and public; setting appropriate criteria and high standards

for learning quality; systematically gathering, analyzing, and interpreting evidence to determine

how well performance matches those expectations and standards, and using the resulting

information to document, explain, and improve performance. There are many definitions and

explanations of assessment in education. Let us look at a few of them:

Palomba and Banta (1999) define assessment as the systematic collection, review, and use of

information about educational programs undertaken to improve learning and development.

Freeman and Lewis (1998): To assess is to judge the extent of students’ learning.

Rowntree (1977): Assessment in education can be thought of as occurring whenever one person,

in some kind of interaction, direct or indirect, with another, is conscious of obtaining and

interpreting information about the knowledge and understanding of abilities and attitudes of that

other person. To some extent or other, it is an attempt to know the person.

Erwin, in Brown and Knight (1994): Assessment is a systematic basis for making inferences

about the learning and development of students… the process of defining, selecting, designing,

collecting, analyzing, interpreting and using information to increase students’ learning and

development.

Assessment is a human activity. Assessment involves interaction, which aims at seeking to

understand what the learners have achieved. Assessment can be formal or informal. Assessment

may be descriptive rather than judgmental in nature. Its role is to increase students’ learning and

development. It helps learners to diagnose their problems and to improve the quality of their

subsequent learning. In a nutshell, assessment refers to all procedures and techniques that are used

for the systematic collection, review, and use of quantitative and qualitative data on what

students can do and how much knowledge they possess.

Summary

Assessment is an ongoing process that aims to understand and improve student learning. It involves

systematic data collection, analysis, and interpretation in order to:

• determine whether learning meets expectations and standards; and

• document, explain, and/or improve performance.

Aspects of Assessment

Comprehensiveness is the significant factor in the assessment of the whole child: the child’s total

development should form the basis of assessment.

Assessment of Scholastic Aspects

Tests are tools in the assessment of scholastic aspects.

Assessment of Non-Scholastic Aspects

This would include assessment of the following aspects;

1. Physical health, covering basic understanding about nutrition and health, physical fitness,

development of positive attitudes etc.

2. Habits like health habits, study habits and work habits.

3. Interests in artistic, scientific, musical, literary and social service activities.

4. Attitudes towards students, teachers, class-mates, programmes, school property etc.

5. Character-building values like cleanliness, truthfulness, industriousness, cooperation, equality

etc.

6. Participation in games, sports, gymnasium, literary, scientific, cultural, social and community

service activities.

The concept of Measurement

This is a broad term that refers to the systematic determination of outcomes or characteristics by

means of some sort of assessment device. It is a systematic process of obtaining the quantified

degree to which a trait or an attribute is present in an individual or object. In other words it is a

systematic assignment of numerical values or figures to a trait or an attribute in a person or

object.

Generally, to measure and show the weight, length and volume of an object in definite units is

called measurement; for example, to show the weight of a person in kilograms, length of cloth in

metres and volume of milk in litres. But the field of measurement is very wide. It includes

defining any characteristic of any object, person or activity in words, symbols or units.

As far as explaining the qualities of objects, persons and activities is concerned, it has been in

vogue since ancient times, though without any definite basis of measurement. In the

present times, the bases of most of the qualities of objects, persons and activities have been

defined; their standards and units have been specified; measuring tools and methods have been

devised and methods to demonstrate the results of measurement in brief have been decided.

Now, a characteristic of an object, person or activity is described in definite words, symbols and

units in brief. Many scholars have attempted to delimit the definition of this process. Most

scholars are in agreement with the definition given by James M. Bradefield. In his words:

Measurement is the process of assigning symbols to the dimension of phenomenon in order

to characterise the status of phenomenon as precisely as possible.

--- James M. Bradefield

Factors of Measurement

The above definition of measurement shows that there are four factors of measurement:

(1) The object, person or activity whose characteristic has to be measured.

(2) The characteristic of that object, person or activity which has to be measured.

(3) The tools and devices of measuring such characteristic.

(4) The person who measures it.

Qualitative Measurement

Perceiving the characteristics of an object, person or activity in the form of a quality is called

qualitative measurement; for example, describing a student as very intelligent or dull is

qualitative measurement.

Quantitative Measurement

Measuring the characteristics of an object, person or activity in the form of quantity is called

quantitative measurement; for example, measuring the I.Q. (Intelligence Quotient) of a student as

140, 120 or 110 is quantitative measurement.

Scales of Measurement

The basis of educational measurement is data. Whatever the type of measurement (physical,

social, economic or psychological), it is necessary to gather data. For convenience, the

available data are placed into four levels, which are arranged in a definite order.

Measurement at a lower level is easier to carry out, but its results are less precise.

Measurement at a higher level is more complex, but the inferences drawn from it are more

accurate. Thus, the accuracy of measurement depends on its level.

Measurement scales are used to categorize or quantify variables. Measurement has the following

four chief levels:

1. Nominal scale.

2. Ordinal scale.

3. Interval scale.

4. Ratio scale.

Properties of Measurement Scales

Each scale of measurement satisfies one or more of the following properties of measurement.

• Identity: Each value on the measurement scale has a unique meaning.

• Magnitude: Values on the measurement scale have an ordered relationship to one another.

That is, some values are larger and some are smaller.

• Equal intervals: Scale units along the scale are equal to one another. This means, for example,

that the difference between 1 and 2 would be equal to the difference between 19 and 20.

• Absolute zero: The scale has a true zero point, below which no values exist.
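
The four properties above can be laid side by side with the four levels listed earlier. The following is a minimal Python sketch of the usual classification, in which each higher level keeps the properties of the level below it and adds one more. The names SCALE_PROPERTIES, has_property and scales_with are illustrative only and are not part of the notes.

```python
# Usual classification of the four scales by the properties they satisfy.
# Each higher level keeps the properties of the lower one and adds one more.
SCALE_PROPERTIES = {
    "nominal": {"identity"},
    "ordinal": {"identity", "magnitude"},
    "interval": {"identity", "magnitude", "equal intervals"},
    "ratio": {"identity", "magnitude", "equal intervals", "absolute zero"},
}


def has_property(scale, prop):
    """Return True if the given scale satisfies the given property."""
    return prop in SCALE_PROPERTIES[scale]


def scales_with(prop):
    """List the scales (lowest level first) that satisfy a property."""
    return [scale for scale, props in SCALE_PROPERTIES.items() if prop in props]


if __name__ == "__main__":
    for scale in SCALE_PROPERTIES:
        print(scale, "->", sorted(SCALE_PROPERTIES[scale]))
    print("Scales with a true zero:", scales_with("absolute zero"))
```

Only the ratio scale appears in the last line of output, which is why it is treated as the highest level.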

Nominal Scale

This is the lowest level of measurement. Some people also call it the classification level.

Under this scale, the measured objects or events are classified into separate groups on the

basis of certain attributes, and each group is given a separate name, number or code for

easy identification. The chief feature of such grouping is that all elements or individuals

within a group are similar to each other with respect to the attribute, but differ from those

of another group. Labelling at the nominal scale serves only for identification, so no

comparison of magnitude is possible, e.g. labelling boys as 1 and girls as 2 in a research sample. Some more

examples at this scale include the numbering of houses in a street, footballers in a team etc. This

level is not important from the viewpoint of research, because the only statistical operation or

technique involved is the counting of frequencies.
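
As a rough illustration of why counting is the only operation available at this level, the sketch below codes a small sample in the way described above (1 for boys, 2 for girls). The sample data and the names sample_codes, LABELS and count_by_group are hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical research sample: the codes are labels only (1 = boy, 2 = girl).
sample_codes = [1, 2, 2, 1, 2, 1, 2, 2]
LABELS = {1: "boy", 2: "girl"}


def count_by_group(codes, labels):
    """Count how many individuals fall into each labelled group."""
    counts = Counter(codes)
    return {labels[code]: n for code, n in counts.items()}


group_counts = count_by_group(sample_codes, LABELS)

# The most frequent group (the mode) is a legitimate summary of nominal data.
most_common_group = max(group_counts, key=group_counts.get)

if __name__ == "__main__":
    print(group_counts)
    print("Most common group:", most_common_group)
    # Averaging the codes would produce a number, but it would describe
    # nothing, because 1 and 2 are names rather than quantities.
```

Swapping the codes (girls as 1, boys as 2) would leave the counts unchanged, which shows that the numbers carry identity and nothing else.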

Ordinal Scale

In the arrangement of scales, the ordinal scale is second from the bottom. In this

scale, objects, individuals, events, characteristics or responses are arranged in hierarchical order,

ascending or descending, on the basis of certain attributes. After that, they are

given ranks. Examples include giving first, second or third position to students on the basis of their scores,

giving preference in employment to candidates on the basis of eligibility and experience,

awarding trophies to players on the basis of their performance, and selecting Miss World or Miss

Universe on the basis of beauty.

Though this scale is used more than the nominal scale, from the standpoint of research it is

not accepted as very valid and reliable. Under this scale, the median, percentiles,

correlation coefficient (r) etc. can be used to show that two individuals differ, but

the scale does not clarify the actual size of the difference between the two. This is its chief limitation.

The ordinal scale has the properties of both identity and magnitude. Each value on the ordinal

scale has a unique meaning, and it has an ordered relationship to every other value on the scale.
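
The limitation described above can be seen in a small Python sketch. The student names and scores are hypothetical, and rank_students is an illustrative helper that assumes no two students have the same score.

```python
def rank_students(scores):
    """Return {name: rank}, where rank 1 is the highest score.

    Assumes there are no tied scores.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical scores for three students.
scores = {"Student A": 92, "Student B": 91, "Student C": 60}
ranks = rank_students(scores)

# The ranks step evenly (1, 2, 3), but the underlying gaps are very unequal.
gap_first_second = scores["Student A"] - scores["Student B"]
gap_second_third = scores["Student B"] - scores["Student C"]

if __name__ == "__main__":
    print(ranks)
    print("Gap between ranks 1 and 2:", gap_first_second, "marks")
    print("Gap between ranks 2 and 3:", gap_second_third, "marks")
```

The ranks alone say who is ahead of whom, but not by how much. That information exists only in the scores from which the ranks were derived.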

Interval Scale

This is the third level of measurement. This scale endeavours to do away with the limitations of

the above two scales. Under this scale, the difference between any two classes,

individuals or objects is expressed by means of scores, and the distance between adjacent scale points is equal.

The lack of an exact zero point is a shortcoming of this scale, due to which measurement on

this scale is relative and not absolute; that is, if a student obtains zero marks on this

scale, then it should not be concluded that the student is fully ignorant of the given subject. Some

examples of this scale are the thermometer, hour, minute, week, month, year etc. On a thermometer,

the normal temperature of an individual is taken as 98.4°F. If, for some reason, the

temperature reads 97°F, this shows that the person has no fever, but it

should not be concluded that the individual’s body has no heat or temperature at all. A

thermometer is the most appropriate example of this scale. On a thermometer marked from 98°F to

108°F, the distance between 98°F and 99°F is the same as that between 107°F and 108°F. Under

this scale, several statistical calculations can be used, such as mean, percentiles, standard

deviation etc. Though there is no exact or absolute zero point in the interval scale, the scores

arranged at equal distances are considered the constant unit of this scale.
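
The thermometer example can be checked with a few lines of Python. This is only a sketch: the temperatures 50°F and 100°F are chosen for convenience and do not come from the notes, and fahrenheit_to_celsius is the standard conversion formula.

```python
import math


def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32) * 5 / 9


# Equal intervals: one degree is the same step anywhere on the scale,
# and this stays true after changing from Fahrenheit to Celsius.
low_step_c = fahrenheit_to_celsius(99) - fahrenheit_to_celsius(98)
high_step_c = fahrenheit_to_celsius(108) - fahrenheit_to_celsius(107)

# No true zero: a ratio of two readings depends on the unit chosen,
# so saying that 100°F is "twice as hot" as 50°F has no meaning.
ratio_in_fahrenheit = 100 / 50
ratio_in_celsius = fahrenheit_to_celsius(100) / fahrenheit_to_celsius(50)

if __name__ == "__main__":
    print("Steps equal:", math.isclose(low_step_c, high_step_c))
    print("Ratio in Fahrenheit:", ratio_in_fahrenheit)
    print("Ratio in Celsius:", round(ratio_in_celsius, 2))
```

The two steps agree, which is the equal-interval property. The two ratios disagree, which is the practical consequence of having an arbitrary zero point.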

Ratio Scale

This is the highest level of measurement. This scale comprises all the features of the other scales.

The presence of an exact or true zero point is the chief feature of this scale. This zero point is not

an arbitrary point; rather, it corresponds to the zero amount of a certain attribute or feature. Physical

measurement in units such as the metre, kilometre, gram, litre and millimetre always has an absolute zero point.

Measurement of height, length, weight or distance starts from the zero point. In the ratio scale,

the true zero point is considered the initial point of the scale. So, we can find the ratio

between any two distances and say with certainty how many times farther

one place is than another. Thus, if Manka, Yaya and Beri are awarded 10, 20 and 40 marks on the

basis of a certain attribute, then, as per this scale, the attribute exists in Yaya in

double the measure, and in Beri in four times the measure, in which it exists in Manka. All

basic arithmetic operations can be applied in this scale.
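
The Manka, Yaya and Beri example can be worked through in Python. The sketch takes the example at face value, assuming that a mark of 0 really means none of the attribute is present. The helper ratio_to and the rescaling factor 2.5 are illustrative only.

```python
# Marks from the example above, treated as ratio-scale measurements
# (assumption: a mark of 0 means the attribute is entirely absent).
marks = {"Manka": 10, "Yaya": 20, "Beri": 40}


def ratio_to(values, name, baseline):
    """How many times the baseline's amount the named person has."""
    return values[name] / values[baseline]


yaya_vs_manka = ratio_to(marks, "Yaya", "Manka")
beri_vs_manka = ratio_to(marks, "Beri", "Manka")

# Changing the unit (multiplying every value by a constant) keeps the
# ratios intact, because the zero point does not move.
rescaled = {name: value * 2.5 for name, value in marks.items()}
yaya_vs_manka_rescaled = ratio_to(rescaled, "Yaya", "Manka")

if __name__ == "__main__":
    print("Yaya has", yaya_vs_manka, "times Manka's measure")
    print("Beri has", beri_vs_manka, "times Manka's measure")
    print("After rescaling, Yaya still has", yaya_vs_manka_rescaled, "times")
```

This is the contrast with the interval scale: a change of unit there also shifts the zero, so ratios do not survive it.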

The concept of Evaluation

We are aware that measurement is used to express a trait of an object, person or activity in

standard words, symbols or units. In evaluation, these results are analysed and this analysis is

done on the basis of certain social, cultural or scientific standards (Norms) and by this analysis,

the relative condition of the trait of the object, person or activity is clarified.

James M. Bradefield has defined this process of evaluation in the following words: Evaluation

is the process in which the analysis of the result obtained from measurement of a trait of an

object, person or activity is done on the basis of certain social, cultural or scientific

standards (Norms), and the relative position of the object, person or activity is determined with

respect to that trait.

According to Tuckman (1975), evaluation is a process wherein the parts, processes, or outcomes

of a programme are examined to see whether they are satisfactory, particularly with reference to

the stated objectives of the programme, our own expectations, or our own standards of

excellence.

According to Cronbach et al. (1980), evaluation means the systematic examination of events

occurring in and consequent on a contemporary programme. It is an examination conducted to

assist in improving this programme and other programmes having the same general purpose. For

Thorpe (1993), evaluation is the collection, analysis and interpretation of information about

training as part of a recognized process of judging its effectiveness, its efficiency and any other

outcomes it may have.

If you study these definitions carefully, you will note that evaluation, as an integral part of the

instructional process, involves three steps.

These are:

1) Identifying and defining the intended outcomes.

2) Constructing or selecting tests and other evaluation tools relevant to the specified

outcomes.

3) Using the evaluation results to improve learning and teaching.

You will also note that evaluation is a continuous process. It is essential in all fields of teaching

and learning activity where judgment needs to be made.

Types of Evaluation

Evaluation can be classified into different types. The most popular classifications are based on

the purpose and timing of evaluation and on the object of evaluation.

Types of Evaluation based on purpose

The different types of evaluation based on purpose are: placement, formative, diagnostic and

summative evaluations.

Placement Evaluation

This is a type of evaluation carried out in order to place students in the appropriate group or

class. In some schools, for instance, students are assigned to classes according to their subject

combinations, such as science, technical, arts, commercial etc. Before this is done, an

examination is carried out, in the form of a pretest or aptitude test. It can also be a type of

evaluation made by the teacher to find out the entry behaviour of his/her students before

teaching starts. This may help the teacher to adjust the lesson plan. Tests like readiness tests, ability

tests, aptitude tests and achievement tests can be used.

Formative Evaluation

This is a type of evaluation designed to help both the student and teacher to pinpoint areas where

the student has failed to learn so that this failure may be rectified. It provides feedback to the

teacher and the student and thus helps in estimating teaching success, e.g. weekly tests, terminal

examinations etc.

Diagnostic Evaluation

This type of evaluation is carried out most of the time as a follow-up to formative

evaluation. As a teacher, you have used formative evaluation to identify some weaknesses in

your students. You have also applied some corrective measures which have not shown success.

What you will now do is design a diagnostic test, which is applied during instruction to

find out the underlying cause of students’ persistent learning difficulties. These diagnostic tests

can be in the form of achievement tests, performance tests, self-rating, interviews, observations,

etc.

Summative Evaluation

This is the type of evaluation carried out at the end of the course of instruction to determine the

extent to which the objectives have been achieved. It is called a summarizing evaluation because

it looks at the entire course of instruction or programme and can pass judgment on the teacher

and students, the curriculum and the entire system. It is used for certification.

Types of evaluation based on what is being evaluated

Student Evaluation

Achievement is one of the variables on which a student is assessed; other major variables include

aptitude, intelligence, personality, attitudes and interests. In order to assess achievement, tests,

both standardized and teacher-made, are administered; projects, procedures and oral

presentations are rated; and formal and informal observations are made. A teacher uses

performance data not only to evaluate student progress but also to evaluate his/her own

instruction. In other words, the process of evaluating students provides feedback to the teacher.

Feedback on current student progress also gives direction to future instructional activities.

Curriculum evaluation

Curriculum evaluation is an attempt to throw light on two questions: Do planned courses,

programs, activities, and learning opportunities as developed and organized actually produce

desired results? How can the curriculum offerings best be improved?

Curriculum evaluation involves the evaluation of any instructional program or instructional

materials, and includes evaluation of such factors as instructional strategies, textbooks,

audiovisual materials, and physical and organizational arrangements. Curriculum evaluation may

involve evaluation of a total package or evaluation of one small aspect of a total curriculum, such

as a film. Although ongoing programs are subject to evaluation, curriculum evaluation is usually

associated with innovation, a new or different approach; the approach may be general or specific

to a given area. Curriculum evaluation usually involves both internal and external criteria and

comparisons. Internal evaluation is concerned with whether the new process or product achieves

its stated objectives, that is, whether it does what it purports to do, as well as with evaluation of

the objectives themselves. External evaluation is concerned with whether the process or the

product does whatever it does better than some other process or product.

The tasks of curriculum evaluation

Curriculum evaluation has four tasks:

1) Evaluation of the teachers’ use of the document

2) Evaluation of the curriculum design

3) Evaluation of student performances

4) Evaluation of the process

School evaluation

Evaluation of a school involves evaluation of the total educational program of the school and

entails the collection of data on all aspects of its functioning. The purpose of the school

evaluation is to determine the degree to which school objectives are being met and to identify

areas of strength and weakness in the total program.

Evaluation of personnel

Evaluation of personnel (staff evaluation) includes evaluation of all persons responsible, either

directly or indirectly, for educational outcomes, i.e., teachers, administrators, counsellors and so

forth. This area of evaluation has been found to be very complicated, because it is difficult to

determine what behaviours are to be evaluated. The best solution to the problem of personnel

evaluation is to collect the best and most data possible, from as many sources as possible.
