Measurement and Evaluation
in Psychology and Education
Thorndike Thorndike-Christ
Eighth Edition

Pearson New International Edition

ISBN 978-1-292-04111-7
Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England

and Associated Companies throughout the world

Visit us on the World Wide Web at: www.pearsoned.co.uk

© Pearson Education Limited 2014

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted
in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the
prior written permission of the publisher or a licence permitting restricted copying in the United Kingdom
issued by the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS.

All trademarks used herein are the property of their respective owners. The use of any trademark
in this text does not vest in the author or publisher any trademark ownership rights in such
trademarks, nor does the use of such trademarks imply any affiliation with or endorsement of this
book by such owners.

ISBN 10: 1-292-04111-0
ISBN 13: 978-1-292-04111-7

ISBN 10: 1-269-37450-8
ISBN 13: 978-1-269-37450-7

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library

Printed in the United States of America


PEARSON CUSTOM LIBRARY

Table of Contents

1. Fundamental Issues in Measurement
Robert M. Thorndike/Tracy Thorndike-Christ
2. Giving Meaning to Scores
Robert M. Thorndike/Tracy Thorndike-Christ
3. Qualities Desired in Any Measurement Procedure: Reliability
Robert M. Thorndike/Tracy Thorndike-Christ
4. Qualities Desired in Any Measurement Procedure: Validity
Robert M. Thorndike/Tracy Thorndike-Christ
5. Assessment and Educational Decision Making
Robert M. Thorndike/Tracy Thorndike-Christ
6. Principles of Test Development
Robert M. Thorndike/Tracy Thorndike-Christ
7. Performance and Product Evaluation
Robert M. Thorndike/Tracy Thorndike-Christ
8. Attitudes and Rating Scales
Robert M. Thorndike/Tracy Thorndike-Christ
9. Aptitude Tests
Robert M. Thorndike/Tracy Thorndike-Christ
10. Standardized Achievement Tests
Robert M. Thorndike/Tracy Thorndike-Christ
11. Interests, Personality, and Adjustment
Robert M. Thorndike/Tracy Thorndike-Christ
12. Appendix: Percent of Cases Falling Below Selected Values on the Normal Curve
Robert M. Thorndike/Tracy Thorndike-Christ
References
Robert M. Thorndike/Tracy Thorndike-Christ
Index
CHAPTER 1
Fundamental Issues in Measurement

Introduction
A Little History
    The Early Period
    The Boom Period
    The First Period of Criticism
    The Battery Period
    The Second Period of Criticism
    The Age of Accountability
Types of Decisions
Measurement and Decisions
    The Role of Values in Decision Making
Steps in the Measurement Process
    Identifying and Defining the Attribute
    Determining Operations to Isolate and Display the Attribute
    Quantifying the Attribute
Problems Relating to the Measurement Process
Some Current Issues in Measurement
    Testing Minority Individuals
    Invasion of Privacy
    The Use of Normative Comparisons
    Other Factors That Influence Scores
    Rights and Responsibilities of Test Takers
Summary
Questions and Exercises
Suggested Readings

PART ONE Technical Issues

From Chapter 1 of Measurement and Evaluation in Psychology and Education, 8/e. Robert M. Thorndike.
Tracy Thorndike-Christ. Copyright © 2010 by Pearson Education. All rights reserved.

INTRODUCTION

Societies and individuals have always had to make decisions. Making decisions is a
requirement of daily life for everyone. We all decide when to get up in the morning, what to
have for breakfast, and what to wear; we make some kinds of decisions with such regularity
that we hardly think about the process. Other decisions that we make less frequently
require careful thought and analysis: Which college should I attend? What should I major in?
Should I accept a job with XYZ Enterprises? Decisions of this kind are best made based on infor-
mation about the alternative choices and their consequences: Am I interested in a particular col-
lege? What job prospects will it open for me? What are the chances that I will succeed? What are
the benefits and working conditions at XYZ Enterprises?
Other decisions are made by individuals acting for the larger society. Daily, teachers must
make decisions about the best educational experiences to provide for students, based on an as-
sessment of their students’ current knowledge and abilities. A school psychologist may have to
decide whether to recommend a special educational experience for a child who is having diffi-
culty in reading or mathematics. School, district, and state educational administrators must
make decisions about educational policy and often have to produce evidence for state and local
school boards and state legislatures on the achievement of students. Employers must decide
which job applicants to hire and which positions they should fill. A college counselor must de-
cide what action to take with a student who is having difficulty adjusting to the personal freedom
that a college environment provides. The list of decisions that people must make is as long as the
list of human actions and interactions.
We generally assume that the more people know about the factors involved in their deci-
sions, the better their decisions are likely to be. That is, more and better information is likely to
lead to better decisions. Of course, merely having information is no guarantee that it will be used
to the best advantage. The information must be appropriate for the decision to be made, and the
decision maker must also know how best to use the information and what inferences it does and
does not support. Our purpose in this book is to present some of the basic concepts, tools, prac-
tices, and methods that have been developed in education and psychology to aid in the decision-
making process. With these tools and methods, potential decision makers will be better prepared
to obtain and use the information needed to make sound decisions.

A LITTLE HISTORY

Although educational measurement has gained considerable prominence in recent decades, par-
ticularly with the No Child Left Behind Act and the rise in interest in accountability, the formal
evaluation of educational achievement and the use of this information to make decisions have
been going on for centuries. As far back as the dawn of the common era, the Chinese used com-
petitive examinations to select individuals for civil service positions (DuBois, 1970; R. M.
Thorndike, 1990a). Over the centuries, they developed a system of checks and controls to elim-
inate possible bias in their testing—procedures that in many ways resembled the best of modern
practice. For example, examinees were isolated to prevent possible cheating, compositions were
copied by trained scribes to eliminate the chance that differences in penmanship might affect
scores, and each examination was evaluated by a pair of graders, with differences being resolved
by a third judge. Testing sessions were extremely rigorous, lasting up to 3 days. Rates of passing
were low, usually less than 10%. In a number of ways, Chinese practice served as a model for de-
veloping civil service examinations in western Europe and America during the 1800s.
Formal measurement procedures began to appear in Western educational practice during
the 19th century. For several centuries, secondary schools and universities had been using essays
and oral examinations to evaluate student achievement, but in 1897, Joseph M. Rice used some
of the first uniform written examinations to test spelling achievement of students in the public
schools of Boston. Rice wanted the schools to make room in the curriculum for teaching science
and argued that some of the time spent on spelling drills could be used for that purpose. He
demonstrated that the amount of time devoted to spelling drills was not related to achievement
in spelling and concluded that this time could be reduced, thus making time to teach science.
His study represents one of the first times tests were used to help make a curricular decision.
Throughout the latter half of the 19th century, pioneering work in the infant science of psy-
chology involved developing new ways to measure human behavior and experience. Many mea-
surement advances came from laboratory studies such as those of Hermann Ebbinghaus, who in
1896 introduced the completion test (fill in the blanks) as a way to measure mental fatigue in
students. Earlier in the century, the work of Ernst Weber and Gustav Fechner on the measure-
ment of sensory processes had laid the logical foundation for psychological and educational
measurement. Other important advances, such as the development of the correlation coefficient
by Sir Francis Galton and Karl Pearson, were made in the service of research on the distribution
and causes of human differences. The late 1800s were characterized by DuBois (1970) as the
laboratory period in the history of psychological measurement. This period has also been called
the era of brass instrument psychology because mechanical devices were often used to collect mea-
surements of physical or sensory characteristics.
Increasing interest in measuring human characteristics in the second half of the 19th cen-
tury can be traced to the need to make decisions in three contexts. First, enactment of mandatory
school attendance laws resulted in a growing demand for objectivity and accountability in
assessing student performance in the public schools. These laws brought into the schools for the
first time a large number of students who were of middle or lower socioeconomic background
and were unfamiliar with formal education. Many of these children performed poorly and were
considered by some educators of the time to be “feebleminded” and unable to learn. The devel-
opment of accurate measurement methods and instruments was seen as a way to differentiate
children with true mental handicaps from those who suffered from disadvantaged backgrounds.
Second, the medical community was in the process of refining its ideas about abnormal behavior.
Behavioral and psychological measurements were seen as a way to classify and diagnose patients.
Third, businesses and government agencies began to replace patronage systems for filling corpo-
rate and government jobs with competitive examinations to assess prospective employees’ abili-
ties. Tests began to be used as the basis of employee selection.
Not until the first years of the 20th century did well-developed prototypes of modern edu-
cational and psychological measurements begin to appear. Although it is difficult to identify a
single critical event, the 1905 publication of the Binet-Simon scales of mental ability is often con-
sidered to be the beginning of the modern era in what was at the time called mental measure-
ment. The Binet-Simon scales, originally published in French but soon translated into English
and other languages, have been hailed as the first successful attempt to measure complex mental
processes with a standard set of tasks of graded complexity. These scales were designed to help
educators identify students whose mental ability was insufficient for them to benefit from stan-
dard public education. On the basis of the mental measurement, a decision was then made
whether to place these students in special classes. Subsequent editions of the scales, published in
1908 and 1911, contained tasks that spanned the full range of abilities for school-age children
and could be used to identify students at either extreme of the ability continuum. (See R. M.
Thorndike, 1990a, for a more complete description of these scales.)
At the same time that Binet and Simon were developing the first measures of intelligence,
E. L. Thorndike and his students at Teachers College of Columbia University were tackling problems
related to measuring school abilities. Their work ranged from theoretical developments on
the nature of the measurement process to the creation of scales to assess classroom learning in
reading and arithmetic and also level of skill development in tasks such as handwriting. The era
of mental testing had begun.
It is convenient to divide the history of mental testing in the 20th century into six periods:
an early period, a boom period, a first period of criticism, a battery period, a second period of
criticism, and a period of accountability.

The Early Period


The early period, which comprises the years before American entry into World War I, was a pe-
riod of tentative exploration and theory development. The Binet-Simon scales, revised twice by
Binet, were brought to the United States by several pioneers in measurement. The most influen-
tial of these was Lewis Terman of Stanford University. In 1916, Terman published the first ver-
sion of a test that is still one of the standards by which measures of intelligence are judged: the
Stanford-Binet Intelligence Scale. (The fifth edition of the Stanford-Binet was released in 2003.)
Working with Terman, Arthur Otis began to explore the possibility of testing the mental ability
of children and adults in groups. In Australia, S. D. Porteus prepared a maze test of intelligence
for use with people with hearing or language handicaps.
In 1904 Charles Spearman published two important theories relating to the measurement of
human abilities. The first was a statistical theory that proposed to describe and account for the
inconsistency in measurements of human behavior. The second theory claimed to account for
the fact that different measures of cognitive ability showed substantial consistency in the ways in
which they ranked people. The statistical theory to describe inconsistency has developed into the
concept of reliability that we will discuss in Chapter 4. Spearman’s second theory, that there is a
single dimension of ability underlying most human performance, played a major role in deter-
mining the direction that measures of ability took for many years and is still influential in
theories of human cognitive abilities. Spearman proposed that the consistency of people’s per-
formance on different ability measures was the result of the level of general intelligence that they
possessed. We will discuss modern descendants of this theory and the tests that have been
developed to measure intelligence in Chapter 12.

The Boom Period


American involvement in World War I created a need to expand the army very quickly. For the
first time, the new science of psychology was called on to play a part in a military situation. This
event started a 15-year boom period during which many advances and innovations were made in
the field of testing and measurement. As part of the war effort, a group of psychologists led by
Robert Yerkes expanded Otis’s work to develop and implement the first large-scale group testing
of ability with the Army Alpha (a verbal test) and the Army Beta (a test using mazes and puzzles
similar to Porteus’s that required no spoken or written language). The Army Alpha was the first
widely distributed test to use the multiple-choice item form. The first objective measure of per-
sonality, the Woodworth Personal Data Sheet, was also developed for the army to help identify
those emotionally unfit for military service. The Alpha and Beta tests were used to select officer
trainees and to remove those with intellectual handicaps from military service.
In the 12 years following the war, the variety of behaviors that were subjected to measure-
ment continued to expand rapidly. E. K. Strong and his students began to measure vocational
interests to help college students choose majors and careers consistent with their interests.
Measurements of personality and ability were developed and refined, and the use of standardized
tests for educational decisions became more widespread. In 1929, L. L. Thurstone proposed
ways to scale and measure attitudes and values. Many people considered it only a matter of a few
years before accurate measurement and prediction of all types of human behavior would be
achieved.
The period immediately following World War I was also a low point for the mental testing
movement. High expectations about what test scores could tell us about people’s abilities and
character led test developers and users to place far too much reliance on the correctness of test
scores. The results of the U.S. Army testing program revealed large score differences between
White American examinees and those having different ethnic backgrounds. Low test scores for
African Americans and immigrants from southern and eastern Europe were interpreted as re-
vealing an intellectual hierarchy, with people of northern European ancestry (“Nordics”) at the
top. Members of the lowest scoring ethnic groups, particularly those of African ancestry, were
labeled “feebleminded.” A number of critics, most notably Walter Lippmann, questioned both
the tests themselves and the conclusions drawn from the test scores.

The First Period of Criticism


The 1930s saw a crash not only in the stock market but also in the confidence in and expec-
tations for mental measurement. This time covered a period of criticism and consolidation. To be
sure, new tests were published, most notably the original Kuder scales for measuring vocational
interests, the Minnesota Multiphasic Personality Inventory, and the first serious competitor for
the Stanford-Binet, the Wechsler-Bellevue Intelligence Scale. Major advances were also made in
the mathematical theory underlying tests, particularly L. L. Thurstone’s refinements of the statis-
tical procedure known as factor analysis. However, it was becoming clear that the problems of
measuring human behavior had not all been solved and were much more difficult than they had
appeared to be during the heady years of the 1920s.
The rapid expansion in the variety of tests being produced, the increasing use of test scores
for decision making, and the criticisms of testing in the press led a young psychologist named
Oscar Buros to call on the professional psychological and educational testing community to po-
lice itself. Buros observed that many tests had little or no objective evidence to support the uses
to which they were being put. In 1935, he initiated the Mental Measurements Yearbook (MMY) as
a place where critical reviews of tests and testing practices could be published. His objective was
to obtain reviews of tests from the leading experts in testing, which in turn would cause test pro-
ducers to provide better tests and more evidence supporting specific uses of tests. As we shall see
in Chapter 6, the MMY publications remain one of the best sources of information about tests.

The Battery Period


In the 1940s, psychological measurement was once again called on for use in the military ser-
vice. As part of the war effort, batteries of tests were developed that measured several different
abilities. Based on the theory developed by Thurstone and others that there were several distinct
types or dimensions of abilities, these test batteries were used to place military recruits in the po-
sitions for which they were best suited. The success of this approach in reducing failure rates in
various military training programs led the measurement field into a period of emphasis on test
batteries and factor analysis. For 25 years, until about 1965, efforts were directed toward analyzing
the dimensions of human behavior by developing an increasing variety of tests of ability and
personality. Taxonomies of ability, such as those of Bloom (1956) and Guilford (1985), were offered
to describe the range of mental functioning.
During the 1950s, educational and psychological testing grew into a big business. The use of
nationally normed, commercially prepared tests to assess student progress became a common
feature of school life. The Scholastic Aptitude Tests (SAT, now called the Scholastic Assessment
Tests) or the American College Testing Program (ACT Assessment) became almost universally re-
quired as part of a college admissions portfolio. Business, industry, and the civil service system
made increasing use of measurements of attitudes and personality, as well as ability, in hiring and
promotion decisions. The General Aptitude Test Battery (GATB) was developed by the U.S. Em-
ployment Service, and other test batteries were developed by private testing companies to assist
individuals and organizations in making career and hiring decisions. Patients in mental institu-
tions were routinely assessed through a variety of measures of personality and adjustment. In
1954, led by the American Psychological Association, the professional testing community pub-
lished a set of guidelines for educational and psychological tests to provide public standards for
good test development and testing practice. Testing became part of the American way of life. The
widespread use—and misuse—of tests brought about a new wave of protests.

The Second Period of Criticism


The beginning of a second period of criticism was signaled in 1965 by a series of congressional
hearings on testing as an invasion of privacy. The decade of the 1960s was also a time when the
civil rights movement was in full swing and women were reacting against what they perceived to
be a male-dominated society. Because the ability test scores of Blacks were generally lower than
those of Whites, and the scores of women were lower than those of men in some areas (although
they were higher than men’s scores in other areas), tests were excoriated as biased tools of White
male oppression. Since that time, debate has continued over the use of ability and personality
testing in public education and employment. A major concern has been the possible use of tests
to discriminate, intentionally or otherwise, against women or members of minority groups in ed-
ucation and employment. As a result of this concern, the tests themselves have been very care-
fully scrutinized for biased content, certain types of testing practices have been eliminated or
changed, and much more attention has been given to the rights of individuals. The testing in-
dustry responded vigorously to the desire to make tests fair to all who take them, but this has not
been sufficient to forestall both legislation and administrative and court decisions restricting the
use of tests. A recent example of an administrative decision is the proposal to eliminate performance
on the SAT as a tool in making admissions decisions at institutions in the University of California
system. This situation is unfortunate because it deprives decision makers of some of the best
information on which to base their actions. In effect, we may have thrown out the baby with the
bath water in our efforts to eliminate bias from the practice of testing. In Chapter 8 we will take
a closer look at the controversies surrounding educational and psychological uses of tests.

The Age of Accountability


At the same time that public criticism of testing was on the rise, governments were putting
greater faith in testing as a way to determine whether government-funded programs were achiev-
ing their objectives. With passage of the 1965 Elementary and Secondary Education Act (ESEA),
federally funded education initiatives began to include a requirement that programs report some
form of assessment, often in the form of standardized test results. In 2002, President George
W. Bush signed the No Child Left Behind (NCLB) Act into law. The primary goal of NCLB is
to ensure that all public school students achieve high standards of performance in the areas of
reading-language arts, mathematics, and science. To meet this goal, each state was required to
generate a set of rigorous achievement standards and to develop assessments measuring student
proficiency relative to those standards. States must hold schools and school districts accountable
for the performance of their students. Low-performing schools are subject to a variety of inter-
ventions ranging from technical assistance to sanctions and restructuring. Many states have also
enacted laws requiring students to pass standardized tests in order to earn a high school
diploma. While this high-stakes use of testing is not new (Chinese practices of 1000 years ago
had greater social consequences and were far more demanding), the number of states using
mandatory exams as a condition of graduation is steadily increasing. Scores on the Washington
Assessment of Student Learning, for example, are used to meet the accountability requirements
of NCLB and to determine, beginning with the graduating class of 2008, which students meet
the achievement standards necessary for a high school diploma. Nationwide, NCLB also requires
new teachers to pass standardized tests to earn certification and demonstrate the subject matter
competence necessary to be considered “highly qualified.” Thus, at the same time that wide-
spread criticism of tests, particularly standardized tests used to make educational decisions, has
arisen, the role of such tests in ensuring that schools are accountable for the learning of their
students has steadily increased.

TYPES OF DECISIONS

Educational and psychological evaluation and measurement have evolved to help people make
decisions related to people, individually or in groups. Teachers, counselors, school administra-
tors, and psychologists in business and industry, for example, are continuously involved in
making decisions about people or in helping people make decisions for and about themselves.
The role of measurement procedures is to provide information that will permit these decisions to
be as informed and appropriate as possible.
Some decisions are instructional; many decisions made by teachers and school psychologists
are of this sort. An instructional decision may relate to a class as a whole: For example, should
class time be spent reviewing “carrying” in addition? Or does most of the class have adequate
competency in this skill? Other decisions relate to specific students: For example, what reading
materials are likely to be suitable for Mary, in view of her interests and level of reading skill? If
such decisions are to be made wisely, it is important to know, in the first case, the overall level of
skill of the class in “carrying” and, in the second, how competent a reader Mary is and what her
interests are.
Some decisions are curricular. A school may be considering a curricular change such as in-
troducing computer-assisted instruction (CAI) to teach the principles of multiplying fractions or
a Web-based module on African geography. Should the change be made? A wise decision hinges
on finding out how well students progress in learning to multiply fractions using CAI or African
geography from Web materials rather than the conventional approaches. The evidence of
progress, and hence the quality of the decision, can only be as good as the measures of mathe-
matics competence or geography knowledge we use to assess the outcomes of the alternative
instructional programs.
Some decisions are selection ones made by an employer or a decision maker in an educa-
tional institution. A popular college must decide which applicants to admit to its freshman class.

PART ONE Technical Issues

Criteria for admission are likely to be complex, but one criterion will usually be that each admit-
ted student is judged likely to be able to successfully complete the academic work that the
college requires. When combined with other sources, such as high school grades, standardized
tests of academic ability such as the SAT or ACT Assessment can add useful information about
who is most likely to succeed at the college. Selection decisions also arise in employment. The
employer, seeking to identify the potentially more effective employees from an applicant pool,
may find that performance in a controlled testing situation provides information that can im-
prove the accuracy and objectivity of hiring decisions, resulting in improved productivity and
greater employee satisfaction.
Sometimes decisions are placement, or classification, decisions. A high school may have to
decide whether a freshman should be put in the advanced placement section in mathematics or
in the regular section. An army personnel technician may have to decide whether a recruit should
be assigned to the school for electronic technicians or the school for cooks and bakers. A family
doctor makes a classification decision when he or she diagnoses a backache to be the result of
muscle strain or a pinched nerve. For placement decisions, the decision maker needs informa-
tion to help predict how much the individual will learn from or how successful the candidate
will be in each of the alternative programs. Information helps the person making a classification
decision to identify the group to which the individual most likely or properly belongs.
Finally, many decisions can best be called personal decisions. They are the choices that each
individual makes at the many crossroads of life. Should I plan to go on for a master’s degree or to
some other type of postcollege training? Or should I seek a job at the end of college? If a job,
what kind of job? In light of this decision, what sort of program should I take in college? Guid-
ance counselors frequently use standardized tests to help young adults make decisions like these.
The more information people have about their own interests and abilities, the more informed
personal decisions they can make.

MEASUREMENT AND DECISIONS

Educational and psychological measurement techniques can help people make better decisions
by providing more and better information. Throughout this book, we will identify and describe
properties that measurement devices must have if they are to help people make sound decisions.
We will show the form in which test results should be presented if they are to be most helpful to
the decision maker. As we look at each type of assessment technique, we will ask, “For what
types of decisions can the particular technique contribute valuable information?” We must be
concerned with a variety of factors, including poor motivation, emotional upset, inadequate
schooling, and atypical linguistic or cultural background, all of which can distort the information
provided by a test, questionnaire, or other assessment procedure. We will also consider
precautions that need to be observed in using the information for decision making.

The Role of Values in Decision Making


Measurement procedures do not make decisions; people make decisions. At most, measurement
procedures can provide information on some of the factors that are relevant to the decision. The
SAT can provide an indication of how well Grace is likely to do in college-level work. Combined
with information about how academically demanding the engineering program is at Siwash

CHAPTER 1 Fundamental Issues in Measurement

University, the test score can be used to make a specific estimate of how well Grace is likely to do
in that program. However, only Grace can decide whether she should go to Siwash and whether
she should study engineering. Is she interested in engineering? Does she have a personal reason
for wanting to go to Siwash rather than to some other university? Are economic factors of
concern? What role does Grace aspire to play in society? Maybe she has no interest in further
education and would rather be a competitive surfer.
This example should make it clear that decisions about courses of action to take involve
values as well as facts. The SAT produces a score that is a fact, and that fact may lead to a predic-
tion that Grace has five chances in six of being admitted to Siwash and only one chance in six of
being admitted to Stanford. But, if she considers Stanford to be 10 times more desirable than
Siwash, it still might be a sensible decision for her to apply to Stanford despite her radically
lower chance of being admitted. The test score provides no information about the domain of
values. This information must be supplied from other sources before Grace can make a sensible
decision. One of the important roles that counselors play is to make people aware of the values
they bring to any decision.
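One way to make the role of values concrete is to weight the desirability of each outcome by its probability, as in the following minimal sketch. The probabilities come from the example above; treating desirability as a number on a ratio scale, setting the desirability of Siwash to 1, and the name `expected_value` are all our assumptions for illustration, not something a test score can supply.

```python
from fractions import Fraction

def expected_value(p_admit, desirability):
    """Weight how much an outcome is valued by its chance of occurring."""
    return p_admit * desirability

# Probabilities from the example; desirability units are hypothetical.
siwash = expected_value(Fraction(5, 6), 1)     # 5/6
stanford = expected_value(Fraction(1, 6), 10)  # 10/6 = 5/3

print(siwash, stanford, stanford > siwash)
```

Under these assumptions the Stanford application has twice the expected value of the Siwash application, even though admission is five times less likely. A different set of values would reverse the conclusion (if Stanford were only 3 times as desirable, Siwash would come out ahead), which is the point: the numbers that express value must come from Grace.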
The issue of values affects institutional decision makers as well as individuals. An aptitude
test may permit an estimate of the probability of success for a Black or a Hispanic or a Native
American student, in comparison with the probability of success for a White or Asian student, in
some types of academic or professional training. However, an admission decision would have to
include, explicitly or implicitly, some judgment about the relative value to society of adding more
White or Asian individuals to a profession in comparison with the value of having increased
Black, Hispanic, or Native American representation. Concerns about social justice and the role of
education in promoting equality, both of opportunity and of outcome, have assumed an increas-
ing importance in decision making over the last 35 years. However, in recent years three states,
California, Washington, and Michigan, have passed laws prohibiting the use of race in educational
decision making.
Issues of value are always complex and often controversial, but they are frequently deeply
involved in decision making. Very few decisions are value neutral. It is important that this fact be
recognized, and it is also important that assessment procedures that can supply better informa-
tion not be blamed for the ambiguities or conflicts that may be found in our value systems. We
should not kill the messenger who brings us news we do not want to hear, nor should we cover
our eyes or ears and refuse to receive the news. Rather, we should consider policies and proce-
dures that might change the unwelcome facts.
As we suggested at the beginning of this chapter, all aspects of human behavior involve mak-
ing decisions. We must make decisions. Even taking no action in a situation is a decision. In most
cases, people weigh the evidence on the likelihood of various outcomes and the positive and nega-
tive consequences and value implications of each possible outcome. The role of educational and
psychological assessment procedures can be no more than to provide some of the information on
which certain kinds of decisions may be based. The evidence suggests that, when properly used,
these procedures can provide useful information that is more accurate than that provided by
alternative approaches. Your study of educational and psychological assessment should give you an
understanding of the tools and techniques available for obtaining the kinds of information about
people that these measures can yield. Beyond that, such study should provide criteria for evaluat-
ing the information that these tools offer, for judging the degree of confidence that can be placed in
the information, and for sensing the limitations inherent in that information.
After California voters chose to end the use of ethnic identity as a factor in university
admissions, state education officials proposed to eliminate the use of SAT scores in

admissions decisions. The apparent reason for the decision not to use the test scores is that
White and Asian American students typically earn higher scores on these tests than do Black and
Hispanic students, which gives the higher-scoring groups a better chance of admission to the
state’s universities. If
proportional representation by all ethnic groups is a valued objective for system administrators,
then using test scores has negative value because it would tend to produce unequal admissions
rates. On the other hand, legislators in Washington State wished to be able to reward schools and
districts whose students showed higher than average achievement of the state’s educational ob-
jectives, as measured by the state’s assessment tests. Like the SAT, the Washington State tests also
show different levels of achievement for different ethnic groups, but the high value placed on re-
warding achievement outweighed the negative outcome of revealing ethnic differences. The two
state education establishments reached contradictory conclusions about the use of standardized
tests because of the different values each was trying to satisfy.
So far, we have considered practical decisions leading to action. Measurement is also impor-
tant in providing information to guide theoretical decisions. In these cases, the desired result is
not action but, instead, understanding. Do girls of a certain age read at a higher level than boys?
A reading test is needed to obtain the information on which to base a decision. Do students who
are anxious about tests perform less well on them than students who are not anxious? A ques-
tionnaire on “test anxiety” and a test of academic achievement could be used to obtain informa-
tion helpful in reaching a decision on this issue. Even a question as basic as whether the size of
reward a rat receives for running through a maze affects the rat’s running speed requires that the
investigator make measurements. Measurement is fundamental to answering nearly all the ques-
tions that science asks, not only in the physical sciences but also in the behavioral and biological
sciences. The questions we choose to ask and how we ask them, however, are guided and limited
by our assumptions about the nature of reality and the values we bring to the task.

STEPS IN THE MEASUREMENT PROCESS

In this book, we discuss the measurement of human abilities, interests, and personality traits. We
need to pause for a moment to look at what is implied by measurement and what requirements
must be met if we are legitimately to claim that a measurement has been made. We also need to
ask how well the available techniques for measuring the human characteristics of interest do in
fact meet these requirements.
Measurement in any field involves three common steps: (1) identifying and defining the
quality or the attribute that is to be measured, (2) determining the set of operations by which the
attribute may be isolated and displayed for observation, and (3) establishing a set of procedures
or definitions for translating our observations into quantitative statements of degree or amount.
An understanding of these steps, and of the difficulties that each presents, provides a sound
foundation for understanding the procedures and problems of measurement in psychology and
education.

Identifying and Defining the Attribute


We never measure a thing or a person. We always measure a quality or an attribute of the thing
or person. We measure, for example, the length of a table, the temperature of a blast furnace, the
durability of an automobile tire, the flavor of a soft drink, the intelligence of a schoolchild, or the

emotional maturity of an adolescent. Psychologists and educators frequently use the term
construct to refer to the more abstract and difficult-to-observe properties of people, such as
their intelligence or personality.
When we deal with simple physical attributes, such as length, it rarely occurs to us to won-
der about the meaning or definition of the attribute. A clear meaning for length was established
long ago in the history of both the species and the individual. The units for expressing length
and the operations for making the property manifest have changed over the years (we no longer
speak of palms or cubits); however, the underlying concepts have not. Although mastery of the
concepts of long and short may represent significant accomplishments in the life of a preschool
child, they are automatic in adult society. We all know what we mean by length.
The construct of length is one about which there is little disagreement, and the operations
by which length can be determined are well known. However, this level of construct agreement
and clarity of definition do not exist for all physical attributes. What do we mean by durability in
an automobile tire? Do we mean resistance to wear and abrasion from contact with the road? Do
we mean resistance to puncture by pointed objects? Do we mean resistance to deterioration or
decay with the passage of time or exposure to sunlight? Or do we mean some combination of
these three and possibly other factors? Until we can reach some agreement on what we mean by
durability, we can make no progress toward measuring it. To the extent that we disagree on what
durability means (i.e., on a definition of the construct), we will disagree on what procedures are
appropriate for measuring it. If we use different procedures, we are likely to get different results
from our measurements, and we will disagree on the value that we obtain to represent the dura-
bility of a particular brand of tire.
The problem of agreeing on what a given construct means is even more acute when we start
to consider those attributes of concern to the psychologist or educator. What do we mean by intel-
ligence? What kinds of behavior shall we characterize as intelligent? Shall we define the construct
primarily in terms of dealing with ideas and abstract concepts? Or will it include dealing with
things—with concrete objects? Will it refer primarily to behavior in novel situations? Or will it
include responses in familiar and habitual settings? Will it refer to speed and fluency of response
or to level of complexity of the response without regard to time? Will it include skill in social in-
teractions? What kinds of products result from the exercise of intelligence: a theory about atomic
structures, a ballet, or a snowman? We all have a general idea of what we mean when we charac-
terize behavior as intelligent, but there are many specific points on which we may disagree as we
try to make our definition sufficiently precise to allow measurement. This problem of precisely
defining the attribute is present for all psychological constructs—more for some than for others.
The first problem that psychologists or educators face as they try to measure attributes is arriving
at clear, precise, and generally accepted definitions of those attributes.
Of course, we must answer another question even before we face the problem of defining
the attribute. We must decide which attributes are relevant and important to measure if our
description is to be useful. A description may fail to be useful for the need at hand because the
chosen features are irrelevant. For example, in describing a painting, we might report its height,
breadth, and weight with great precision and reach high agreement about the amount of each
property the painting possesses. If our concern were to crate the picture for shipment, these
might be just the items of information that we would need. However, if our purpose were to
characterize the painting as a work of art, our description would be useless; the attributes of the
painting we just described would be irrelevant to its quality as a work of art.
Similarly, a description of a person may be of little value for our purpose if we choose the
wrong attributes to describe. A company selecting employees to become truck drivers might test

their verbal comprehension and ability to solve quantitative problems, getting very accurate
measures of these functions. Information on these factors, however, is likely to be of little help in
identifying people who have low accident records and would be steady and dependable on the
job. Other factors, such as eye–hand coordination, depth perception, and freedom from uncon-
trolled aggressive impulses, might prove much more relevant to the tasks and pressures that a
truck driver faces. The usefulness of a measuring procedure for its intended purpose is called
its validity.
Consider a high school music teacher who thoroughly tested the pupils’ knowledge of such
facts as who wrote the “Emperor Concerto” and whether andante is faster than allegro. The
teacher would obtain a dependable appraisal of the students’ knowledge about music and musi-
cians without presenting them with a single note of actual music, a single theme or melody, a sin-
gle interpretation or appraisal of living music. As an appraisal of musical appreciation, such a test
seems almost worthless because it uses bits of factual knowledge about music and composers in
place of information that would indicate progress in appreciation of the music itself. One of the
pitfalls to which psychologists and educators occasionally are prone is to elect to measure some
attribute because it is easy to measure rather than because it provides the most relevant informa-
tion for making the decision at hand. It is important to measure traits that are relevant to the
decisions to be made rather than merely to measure traits that are easy to assess. We will discuss
the issue of relevance again when we cover validity in Chapter 5.

Determining Operations to Isolate and Display the Attribute


The second step in developing a measurement procedure is finding or inventing a set of opera-
tions that will isolate the attribute of interest and display it. The operations for measuring the
length of an object such as a table have been essentially unchanged for many centuries. We con-
vey them to the child early in elementary school. The ruler, the meter stick, and the tape mea-
sure are uniformly accepted as appropriate instruments, and laying one of them along an object
is an appropriate procedure for displaying to the eye the length of the table, desk, or other object.
But the operations for measuring length, or distance, are not always that simple. By what opera-
tions do we measure the distance from New York to Chicago? From the earth to the sun? From
our solar system to the giant spiral galaxy in Andromeda? How shall we measure the length of a
tuberculosis bacillus or the diameter of a neutron? Physical science has progressed by developing
both instruments that extend the capabilities of our senses and indirect procedures that make
accessible to us amounts too great or too small for the simple, direct approach of laying a mea-
suring stick along the object. Some operations for measuring length, or distance, have become
indirect, elaborate, and increasingly precise. These less intuitive methods (such as the shift of
certain wavelengths of light toward the red end of the spectrum) are accepted because they give
results that are consistent, verifiable, and useful, but their relationship to the property of interest
is far from obvious.
Returning to the example of the durability of the automobile tire, we can see that the opera-
tions for eliciting or displaying that attribute will depend on and interact with the definition that
we have accepted for the construct. If our definition is in terms of resistance to abrasion, we need
to develop some standard and uniform procedure for applying an abrasive force to a specimen
and gauging the rate at which the rubber wears away, that is, a standardized simulated road test.
If we have indicated puncture resistance as the central concept, we need a way of applying grad-
uated puncturing forces. If our definition is in terms of resistance to deterioration from sun, oil,
and other destructive agents, our procedure must expose the samples to these agents and provide

some index of the resulting loss of strength or resilience. A definition of durability that
incorporates more than one aspect requires that each aspect be assessed and that the results be
combined with appropriate weights. If our definition includes resistance to abrasion, punctures,
and deterioration, for example, then a measure of durability must assess all three properties and
combine them in some way to give a single index of durability.
Many of the constructs we wish to measure in education and psychology are similar to our con-
struct of durability in that the global construct includes a combination of more or less indepen-
dent simpler constructs. How to combine a person’s ability to answer questions that require
reasoning with unfamiliar material, their short-term memory, their knowledge of culturally
salient facts, and many other relatively narrowly defined constructs into a global construct of
intelligence, or whether to combine them at all, is a hotly debated topic in both psychology and
education.
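The effect of the weights can be seen in a small sketch. The two tires, their scores on a 0 to 100 scale, and both weighting schemes below are invented for illustration; `composite_index` is a hypothetical helper that computes an ordinary weighted mean.

```python
def composite_index(scores, weights):
    """Weighted mean of component scores; weights need not sum to 1."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight

# Hypothetical component scores (0-100, higher means more resistant).
tire_a = {"abrasion": 90, "puncture": 50, "deterioration": 60}
tire_b = {"abrasion": 60, "puncture": 80, "deterioration": 80}

equal_weights = {"abrasion": 1, "puncture": 1, "deterioration": 1}
wear_weights = {"abrasion": 3, "puncture": 1, "deterioration": 1}

for label, weights in [("equal", equal_weights), ("wear-heavy", wear_weights)]:
    a = composite_index(tire_a, weights)
    b = composite_index(tire_b, weights)
    print(label, round(a, 1), round(b, 1), "A" if a > b else "B")
```

With equal weights Tire B earns the higher index; when abrasion counts three times as much, Tire A does. The component measurements are identical in both cases, so the disagreement lies entirely in how the global construct is defined.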
The definition of an attribute and the operations for eliciting it interact. On the one hand,
the definition we have set up determines what we will accept as relevant and reasonable opera-
tions. Conversely, the operations that we can devise to elicit or display the attribute constitute in
a very practical sense the definition of that attribute. An attribute defined by how it is measured
is said to have an operational definition. The set of procedures we are willing (or forced by
our lack of ingenuity) to accept as showing the durability of an automobile tire becomes the
operational definition of durability for us and may limit what we can say about it.
The history of psychological and educational measurement during the 20th century and
continuing into the 21st century has largely been the history of the invention of instruments and
procedures for eliciting, in a standard way and under uniform conditions, the behaviors that
serve as indicators of the relevant attributes of people. The series of tasks devised by Binet and
his successors constitute operations for eliciting behavior that is indicative of intelligence, and
the Stanford-Binet and other tests have come to provide operational definitions of intelligence.
The fact that there is no single, universally accepted test and that different tests vary somewhat in
the tasks they include and the order in which they rank people on the trait is evidence that we
do not have complete consensus on what intelligence is or on what the appropriate procedures
are for eliciting it. This lack of consensus is generally characteristic of the state of the art in psy-
chological and educational measurement. There is enough ambiguity in our definitions and
enough variety in the instruments we have devised to elicit the relevant behaviors that different
measures of what is alleged to be the same trait may rank people quite differently. This fact requires
that we be very careful not to overgeneralize or overinterpret the results of our measurements.
The problem of developing an operational definition for the characteristic of interest is also
present in the classroom. Teachers regularly face the problem of assessing student performance,
but what we will call performance is closely linked with the way we assess it. Only to the extent
that teachers agree on their operational definitions of student achievement will their assessments
have comparable meaning. If one teacher emphasizes quick recall of facts in his assessments and
another looks for application of principles in hers, they are, to some undetermined extent, eval-
uating different traits, and their assessments of their students’ accomplishments will mean differ-
ent things. Here, we do not suggest that this difference in emphasis is inappropriate, only that it
is a fact of which educators should be aware. This variability in definitions also provides a major
impetus for standardized achievement tests, because such tests are seen as providing a common
definition of the attribute to be measured and the method of measurement. The definition pro-
vided by the test may not exactly represent what either teacher means by achievement in Subject X,
but such definitions usually are developed to provide an adequate fit to the definitions most
teachers espouse.

Quantifying the Attribute


The third step in the measurement process, once we have accepted a set of operations for elicit-
ing an attribute, is to express the result of those operations in quantitative terms. Measurement
has sometimes been defined as assigning numbers to objects or people according to a set of rules. The
numbers represent how much of the attribute is present in the person or thing.
Using numbers has several advantages, two of which concern us. First, quantification makes
communication more efficient and precise. We know much more from the statement that Ralph
is 6 feet tall and weighs 175 pounds than we could learn from an attempt to describe Ralph’s size
in nonquantitative terms. We will see in Chapter 3 that much of the meaning that numbers have
comes from the context in which the measurements are made (i.e., the rules used to guide the as-
signment), but, given that context, information in quantitative form is more compact, more easily
understood, and generally more accurate than is the same information in other forms, such as
verbal descriptions or photographs. In fact, we are so accustomed to communicating some types
of information, such as temperature or age, in quantitative terms that we would have difficulty
using another framework.
A second major advantage of quantification is that we can apply the power of mathematics
to our observations to reveal broader levels of meaning. Consider, for example, trying to describe
the performance of a class on a reading test or the accomplishments of a batter in baseball. In
either case, we are accustomed to using the average as a summary of several individual perfor-
mances. For many purposes, it is useful to be able to add, subtract, multiply, or divide to bring
out the full meaning that a set of information may have. Some of the mathematical operations
that are useful to educators and psychologists in summarizing their quantitative information are
described in Chapter 2.
The critical initial step in quantification is to use a set of rules for assigning numbers that
allows us to answer the question “How many?” or “How much?” The set of rules is called a scale.
In the case of the length of a table the question becomes “How many inches?” or “How many
meters?” The inch, or the meter, represents the basic unit, and the set of rules includes the mea-
suring instrument itself and the act of laying the instrument along the object to be measured.
We can demonstrate that any inch equals any other inch by laying two inch-long objects
side by side and seeing their equality. Such a demonstration is direct and straightforward proof of
equality that is sufficient for some of the simplest physical measures. For measuring devices such
as the thermometer, units are equal by definition. Thus, we define equal increases in temperature
to correspond to equal amounts of expansion of a volume of mercury. One degree centigrade is
defined as 1/100 of the difference between the freezing and boiling points of water. Long experi-
ence with this definition has shown it to be useful because it gives results that relate in an orderly
and meaningful way to many other physical measures. (Beyond a certain point—the boiling
point of mercury—this particular definition breaks down. However, other procedures that can
be used outside this range can be shown to yield results equal to those of the mercury thermo-
meter. The same principle allows educators to use a graded series of tests to assess student progress
over several years.)
None of our psychological attributes have units whose equality can be demonstrated by di-
rect comparison in the way that the equality of inches or pounds can. How will we demonstrate
that arithmetic Problem X is equal, in amount of arithmetic ability that it represents, to arith-
metic Problem Y? How can we show that one symptom of anxiety is equal to another anxiety in-
dicator? For the qualities of concern to the psychologist or educator, we always have to fall back
on a somewhat arbitrary definition to provide units and quantification. Most frequently, we

consider one task successfully completed—a word defined, an arithmetic problem solved, an
analogy completed, or an attitude statement endorsed—equal to any other task in the series and
then use the total number of successes or endorsements for an individual as the value represent-
ing the person on the particular attribute. This count of tasks successfully completed, or of
choices of a certain type, provides a plausible and manageable definition of amount, but we
have no adequate evidence of the equivalence of different test tasks or different questionnaire
responses. By what right or evidence do we treat a number series item such as “1, 3, 6, 10, 15,
___?___, ___?___” as showing the same amount of intellectual ability as, for example, a verbal
analogies item such as “Hot is to cold as wet is to ___?___?”
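A minimal sketch of number-correct scoring shows what this convention assumes. The three-item answer key and the two examinees are hypothetical; the series and analogy items are the ones quoted above, and the vocabulary item is invented.

```python
def number_correct(responses, key):
    """Count the items answered correctly, treating every item as one equal unit."""
    return sum(1 for item, answer in key.items() if responses.get(item) == answer)

# Hypothetical answer key.
key = {
    "series": (21, 28),     # 1, 3, 6, 10, 15, ?, ?  (differences grow by 1)
    "analogy": "dry",       # hot : cold :: wet : ?
    "vocabulary": "large",  # invented synonym item
}

examinee_1 = {"series": (21, 28), "analogy": "damp", "vocabulary": "large"}
examinee_2 = {"series": (20, 26), "analogy": "dry", "vocabulary": "large"}

print(number_correct(examinee_1, key), number_correct(examinee_2, key))
```

Both examinees receive a score of 2, although one solved the number series and the other the analogy. The equal scores follow from the scoring rule, not from any evidence that the two items demand the same amount of ability.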
The definition of equivalent tasks and, consequently, of units for psychological tests is shaky
at best. When we have to deal with a teacher’s rating of a student’s cooperativeness or a supervi-
sor’s evaluation of an employee’s initiative, for example, where a set of categories such as “supe-
rior,” “very good,” “good,” “satisfactory,” and “unsatisfactory” is used, the meaningfulness of the
units in which these ratings are expressed is even more suspect. The problem of arbitrary metrics
in psychology has been discussed recently by Blanton and Jaccard (2006). We explore ways to
report the results of measurements that yield approximately equal units in Chapter 3, but, as we
shall see, even when units are equal they may only convey information about relative position,
not absolute amount.

Problems Relating to the Measurement Process


In psychological and educational measurement, we encounter problems in relation to each of the
three steps just described.
First, we have problems in selecting the attributes of concern and in defining them clearly,
unequivocally, and in mutually agreeable terms. Even for something as straightforward as read-
ing ability, we can get a range of interpretations. To what extent should a definition include each
of the following abilities?
1. Reads quickly
2. Converts visual symbols to sounds
3. Obtains direct literal meanings from the text
4. Draws inferences that go beyond what is directly stated
5. Is aware of the author’s bias or point of view
As we deal with more complex and intangible concepts, such as cooperativeness, anxiety, adjust-
ment, or rigidity, we may expect even more diversity in definition.
Second, we encounter problems in devising procedures to elicit the relevant attributes. For
some attributes, we have been fairly successful in setting up operations that call on the individ-
ual to display the attribute and permit us to observe it under uniform and standardized condi-
tions. We have had this success primarily in the domain of abilities, where standardized tests
have been assembled in which the examinee is called on, for instance, to read with understand-
ing, to perceive quantitative relationships, or to identify correct forms of expression. However,
with many attributes we clearly have been less successful. By what standard operations can we
elicit, in a form in which we can assess it, a potential employee’s initiative, a client’s anxiety, or a
soldier’s suitability for combat duty? With continued research and with more ingenuity, we may
hope to devise improved operations for making certain of these attributes or qualities manifest,
but identifying suitable measurement operations for many psychological attributes will remain a
problem.

Finally, even our best psychological units of measure leave much to be desired. Units are set
equal by definition; the definition may have a certain plausibility, but the equality of units cannot
be established in any fundamental sense. For this reason, the addition, subtraction, and compar-
ison of scores will always be somewhat suspect. Furthermore, the precision with which the at-
tribute is assessed—the consistency from one occasion to another or from one appraiser to
another—is often discouragingly low.
In spite of the problems involved in developing educational and psychological measuring
devices, the task has proven worthwhile. The procedures now available have developed a
record of usefulness in helping individuals make decisions in a wide variety of human contexts.
Efficiency of education and equality of opportunity are enhanced by the proper use and
interpretation of tests. Use of interest and personality inventories has led to people having
greater self-understanding and reduced psychological discomfort. Measures of human abilities
have been used to provide access to educational and occupational positions without regard to
ethnic or racial background.
Critics of testing are quick to point out that inequalities in test performance still exist, but,
in the last 25 years, the users of tests have become much more cautious in their interpretations
and much more sensitive to the rights of the test taker. Generally, the information provided by
tests is more accurate than that available from other sources. When we acknowledge that deci-
sions must be made and that accurate information coupled with a clear understanding of our
values generally leads to the best decisions, we will find continuing growth in the importance of
measurement and assessment processes in psychology and education.

SOME CURRENT ISSUES IN MEASUREMENT

Educational and psychological assessments are far from perfect. Since the earliest attempts to de-
velop measurement techniques in a systematic way, the procedures have been a target for a wide
spectrum of critics. Much of this criticism has been a justified response to the naive enthusiasm
of measurement proponents and some of their ill-conceived applications and interpretations of
measurement results. In subsequent chapters in our discussion of test interpretation and use, we
will try to be sensitive to earlier criticisms and to the more technical questions that arise con-
cerning the reliability and validity of test results. At this point, however, we will identify and
comment briefly on some of the issues that have been of special concern.

Testing Minority Individuals


For many years, the use and interpretation of tests within ethnic minority and other groups whose
experiences and cultures may differ from those typical of the general population for which a test was
designed have received a great deal of attention. There are, of course, all sorts of subgroups in
American society, differing from one another in a variety of ways. Ethnic and linguistic minorities are
probably the most clear-cut of these; they are the ones for whom the appropriateness of tests and
questionnaires designed to reflect the values and experiences of typical middle-class, White
Americans is most open to question. In recent years, the number of Americans for whom English
is not the preferred language has increased to the point where many ability and achievement tests are
available in Spanish translations. Additionally, feminist groups have complained that test material is
often male oriented. Major test publishers now go to considerable lengths to ensure that their test
items do not present an unfair challenge to women or to members of ethnic or linguistic minority
groups. Test users (teachers, counselors, and school psychologists, for example) are also required to
have training in recognizing and, to the extent possible, ameliorating the potentially adverse
consequences of using particular instruments with people for whom they may not be appropriate.
Some questions arise concerning achievement tests that attempt to assess what a student has
learned to do. In part, these questions center on the degree to which the same educational objec-
tives hold for groups from minority cultures. Is the ability to read standard English with under-
standing an important goal for African American, Hispanic, or Native American children? Is
knowledge about the U.S. Constitution as relevant for these groups as for the middle-class,
White eighth-grader? One senses that as far as the basic skills of dealing with language and num-
bers are concerned, many of the same objectives would apply, but as one moves into the content
areas of history and literature, more divergence might be appropriate.
In part, the questions focus on the specific materials through which basic skills are exhibited.
Is a reading passage about the life of the Zulu as appropriate in a reading test for a
Hispanic or a Native American youngster as it would be for a child of White middle-class
background? Or should test materials, and perhaps instructional materials as well, be specifically
tailored to the life and experiences of each ethnic group? We know too little about the
importance of specific content for performance in areas such as reading comprehension, and
further research is needed to make informed decisions.
The motivation of minority groups to do well on tests in school is also an issue. Some minority
groups, such as recent immigrants from Southeast Asia, place great emphasis on individual academic
achievement. Others, such as some Native American groups, place much more value on communal
accomplishments and group identity. In many cases of academic difficulty, we must ask whether
unfortunate experiences with testing in school have soured the students on the whole enterprise so
that they withdraw from the task and do not try. It is perhaps a challenge to the whole pattern of
education, not just to the choice of testing instruments, to provide a reasonable mixture of satisfying
and success-enhancing experiences in school with all types of school tasks. As far as possible, tasks
and tests should be adapted to the present capabilities and concerns of the individual student, but
accountability concerns also dictate that this should not come at the expense of achievement.
Many more questions—perhaps more serious ones—are raised when tests are used as a basis
for deciding what an individual can learn to do, that is, as aptitude measures. An inference from a
test score obtained at Time 1 concerning what a person can learn to do by Time 2 is clearly a more
questionable inference than one that merely states what that person can do at Time 1. There are
many intervening events that can throw the prediction off—and there are correspondingly more
possibilities of biasing factors coming in to systematically distort the prediction for minority group
members. Present facts that imply one prediction for the majority group may imply a different
prediction for a minority group member whose experiences before testing were far from typical.
Remembering that decisions must be made, the problem is to learn what types of inferences can
appropriately be made for individuals with differing backgrounds and what types of adjustments
or modifications need to be built into the system to permit the most accurate and equitable infer-
ences and decisions for all people. The best decision-making systems are likely to be those that
employ a monitoring process to assess the progress of each student and contain enough flexibility
that a correction can be made if the original decision appears to be in error.

Invasion of Privacy
A second concern often expressed involves invasion of privacy. What kind of information is
reasonable to require individuals to give about themselves and under what circumstances?
This issue arises not only in relation to testing, but also in relation to all types of information
from and about a person. What types of records should appropriately be kept? And to whom
should they be made available? At one end of the spectrum are tests of job knowledge or skill,
such as clerical tests, to which few would object when the skill is clearly and directly relevant
to the position for which the person is applying and access to the information is limited to
potential employers. It is hard to argue that a test of the ability to use a word processor should
not be used to select the best candidate for a job where the chief duties will involve word
processing. At the other end of the spectrum are self-descriptive instruments that lead to
inferences about emotional stability or a scattering of tests that try to assess honesty under
temptation to lie or steal; in these latter tests, individuals are led to give information about
themselves without being aware of what information they are giving or how it will be inter-
preted. The use of these instruments seems most open to question. In between are instruments
that involve varying degrees of self-revelation and that appear to have varying degrees of
immediate relevance to a decision.
The issue is not only what information is being obtained, but also the purpose for which it
is being obtained. Is the information being obtained at the individual’s request to provide help
with personal or vocational problems or for use in a counseling relationship? The fact that the
person has come for help implies a willingness on his or her part to provide the information
needed for help to be given; here, invasion of privacy becomes a matter of relatively minor con-
cern. However, when the information is obtained to further institutional objectives—that is,
those of an employer, of an educational institution, or of “science”—then concern for the indi-
vidual’s right to privacy mounts, and some equitable balance must be struck between individual
values and rights and social ones. For example, more students were willing to allow the use of a
questionnaire to verify the emotional stability of an airline pilot, who is responsible for many
lives, than to verify that of a bank clerk (75% vs. 34%). The rights of the individual are not
absolute, but the feeling is often expressed that these rights have received too little consideration
in the past.
To an increasing extent, the courts are taking a role in deciding what information is allow-
able. Several court cases have required a demonstration of the validity or relevance of test scores
or personality profiles to job performance before such instruments can be used for employee se-
lection. These court decisions have affected the type of information that may be collected and
who may have access to that information. As the security of information stored in computer files
becomes more doubtful, privacy concerns also take on increased salience.

The Use of Normative Comparisons


A somewhat different type of issue has been raised concerning the emphasis, in test interpreta-
tion, on comparing the performance of one person with norms representing the typical perfor-
mance of a national, or sometimes a local, sample of people. The point being made with increasing
fervor is that many of the decisions for which tests are used, especially instructional ones, do not
call for—and are only confused by—comparisons with other people. The essential information is
whether the person can perform a specified task; this information should guide decisions on
what should be taught or what tasks should be undertaken next. Of course, there are settings in
which comparison with others is essential to sound judgment and decision making: Is the latest
applicant at the personnel office a good keyboard operator? How do we define “good keyboard
operator” except in terms of the performance of other job applicants? A rate of 60 words per
minute with two errors per minute is meaningless unless we know that the average graduate
from a commercial school can enter about 70 words per minute with perhaps one error per
minute. The employer wants an employee who comes up to a standard of typical performance,
and that standard is set by the performance of others. In the same way, a school system trying to
evaluate the reading achievement of its sixth-graders as “good,” “satisfactory,” or “needing im-
provement” needs some benchmark against which to compare its own performance. No absolute
standard of reading performance exists. Whether it is reasonable to expect the sixth-graders in
Centerville to read and understand a particular article in Reader’s Digest can only be judged by
knowing whether representative groups of sixth-graders nationwide are able to read and under-
stand the same or similar articles.
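
One common way to express such a comparison with other people is a percentile rank, a kind of norm taken up in Chapter 3. The short Python sketch below illustrates the idea under stated assumptions: the function name percentile_rank and the ten typing rates in norm_wpm are hypothetical, invented for illustration (chosen only so that their mean is 70 words per minute), and are not data from this book. The function counts scores tied with the applicant as half below, which is one conventional definition among several.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below `score`, counting ties as half."""
    if not norm_scores:
        raise ValueError("norm_scores must not be empty")
    below = sum(1 for s in norm_scores if s < score)
    ties = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * ties) / len(norm_scores)


# Hypothetical norm group: typing rates (words per minute) of ten
# commercial-school graduates. Invented values, with a mean of 70.
norm_wpm = [55, 60, 62, 66, 70, 72, 74, 78, 80, 83]

applicant_wpm = 60
print(percentile_rank(applicant_wpm, norm_wpm))
```

Against a different norm group the same 60 words per minute would receive a different percentile rank, which is the sense in which the standard is set by the performance of others.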
In the past, normative comparisons with an outside reference group have often been used to
guide decisions for which they had little or no relevance. Greater emphasis is now being given to
criterion-referenced and mastery tests and to performance assessments. These procedures have
an important place in making some sorts of decisions, but not all. We need to know which type
of comparison is useful for which situation. When should we ask if a student can satisfactorily
perform a specific task? And when should we ask how that student compares in relation to other
students? This issue is discussed in more detail in Chapter 3.

Other Factors That Influence Scores


An issue that has concerned measurement professionals and consumers for many years is the ef-
fect of extraneous factors on performance. One such factor that has received considerable study
is the effect of anxiety. Does anxiety raise or lower performance on achievement tests? Are some
groups more prone to this effect than others? If test anxiety does have a systematic effect, what
can or should the teacher or examiner do to minimize the effect?
Other factors, such as the nutritional status of students or their ability to concentrate on the
tasks at hand, may also affect scores. Two particular factors that have received considerable theo-
retical attention are (1) the racial, ethnic, or gender relationship between examiner and examinee
and (2) the effect of coaching on test performance. Although the effects of these extraneous
factors are not clear, public concern over their possible effects has prompted educators and
measurement professionals to give them increasing attention.

Rights and Responsibilities of Test Takers


Researchers in all areas of science where living organisms are used in research, including psy-
chologists and educational researchers, have grown increasingly sensitive to the rights of their re-
search participants. Psychologists and counselors in professional practice have also developed a
greater awareness of their responsibilities toward their clients. On the research side, this trend
has led to the development of institutional research review boards (IRBs) at universities and
funding agencies that review planned research on humans and animals to make sure that the
rights of the study participants are respected.
In the area of educational and psychological measurement, the awareness of the rights of
examinees has found voice in several publications by the professional organizations most
concerned with tests and testing practice. The most influential publication has been a series of sets
of guidelines for educational and psychological tests, first developed by the American
Psychological Association in 1954. Since then, the guidelines have been revised four times. The
most recent edition, Standards for Educational and Psychological Testing (Standards), was
published in 1999 as a joint project of the American Educational Research Association (AERA),
the American Psychological Association (APA), and the National Council on Measurement in
Education (NCME). It explains the current standards for the practice of test construction,
administration, and interpretation to protect the rights of test takers.
The APA maintains a Web page devoted to issues in the fair use and interpretation of tests at
http://www.apa.org/science/testing.html. Here one can read the APA statement on “Rights and
Responsibilities of Test Takers,” a statement that points out that both the person giving the test
and the test taker have obligations in a testing situation. For example, the test taker has the right
to an explanation of the procedures he or she will encounter and how they will be used, but test
takers also have a responsibility to ask questions about aspects of the testing session they do not
understand and to participate responsibly in the testing enterprise. This site has links to a wide
array of resources about testing and provides an order form to obtain a copy of the current
version of the Standards.
The APA has published a number of articles on high-stakes testing (testing in which the
outcome will have important life consequences for the examinees) and related issues in all levels of
American public education that may be accessed at the APA Web site (http://www.apa.org). Other
organizations, such as the American Counseling Association, the National Association of School
Psychologists, the NCME, and the Society for Industrial and Organizational Psychology also
maintain Web sites that may from time to time contain information about tests and testing. Links to all
of these Web sites from the APA Web site mentioned in the preceding paragraph can be found
under the heading “links to other testing-related sites.” A particularly useful site that we will visit in
more detail in Chapter 6 is maintained by the Buros Institute of Mental Measurements.

SUMMARY
The objective of this book is to improve the knowledge and understanding of decision makers by
giving them a better basis for evaluating different measurement procedures and for knowing what
each signifies and how much confidence can be placed on them. To this end, we will describe the
process of preparing test exercises, develop the general criteria of validity and reliability by which all
types of measures are evaluated, provide a familiarity with the different ways of reporting test scores,
and describe and evaluate a number of the techniques and instruments commonly used for appraising
human characteristics. Our success will be measured by the extent to which our readers use
measurement results with wisdom and restraint in their decision making.

QUESTIONS AND EXERCISES


1. List some instances, preferably recent, of decisions you made about yourself or others made
   about you in which results from some kind of educational or psychological measurement
   played a part. Classify each decision as (1) instructional, (2) selection, (3) placement or
   classification, or (4) personal.
2. From your personal experience of a decision made, describe one or more instances for
   which an educational or psychological measurement could have been helpful but was
   unavailable. On what basis was the decision actually made?
3. What are some alternatives to educational or psychological measurements as guides for each
   of the following decisions?
   a. How much time should be spent on phonics in the first-grade reading program?
   b. Which 5 of 15 applicants should be hired as computer programmers?
   c. Should Charles Turner be encouraged to realize his desire to go to college and law
      school and to become a lawyer?
4. What are the advantages of the alternatives you proposed in Question 3, in comparison with
   some type of test or questionnaire? What are the disadvantages?
5. To what extent and in what way might values be involved in each of the decisions stated in
   Question 3?
6. Give an example of each of the following types of tests:
   a. A criterion-referenced achievement test
   b. A norm-referenced achievement test
   c. An aptitude test
   d. A measure of likes and preferences
   e. A measure of personality or adjustment
   f. A measure of a trait or construct
7. For one of the attributes listed here, describe how you might (1) define the attribute, (2) set
   up procedures to make that attribute observable, and (3) quantify the attribute.
   a. Critical thinking
   b. Friendliness
   c. “Good citizenship” in an elementary school pupil
   d. Competence as an automobile driver
8. The usefulness of tests for making decisions involving minority group members depends on
   the type of decision involved. For what sorts of decisions would a standardized test be most
   defensible? For what sorts would one be most questionable?
9. Which of the following practices would you consider acceptable? Which would you
   consider to be an invasion of privacy? What factors influence your opinion?
   a. Requiring medical school applicants (1) to take an achievement test in chemistry,
      (2) to fill out a questionnaire designed to assess emotional stability, or (3) to take
      a scale of attitudes toward socialized medicine
   b. Requiring applicants for a secretarial job (1) to take a test of general intelligence,
      (2) to take a test of keyboarding speed and accuracy, or (3) to fill out a questionnaire
      designed to appraise dependability
   c. Giving a 10-year-old boy whose reading achievement is at the 8-year-old level (1) a
      nonverbal test of intellectual ability, (2) an interview focused on the conditions in his
      home, or (3) a series of diagnostic tests of specific reading skills
10. Why is it important for test users to know and adhere to professional recommendations for
    appropriate test use?
11. Write down some rights or responsibilities that you believe test users and test takers have.
    Then go to the American Psychological Association Web site
    (http://www.apa.org/science/testing.html) and read the rights and responsibilities statement
    mentioned earlier in this chapter. Are factors listed that were not on your list? Did you
    include any factors that the statement did not mention?

SUGGESTED READINGS
Alexander, L., & James, H. T. (1987). The nation’s report card: Improving the assessment of student
achievement. Washington, DC: National Academy of Education.
American Educational Research Association, American Psychological Association, & National Council on
Measurement in Education. (1999). Standards for educational and psychological testing. Washington,
DC: American Psychological Association.
Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River, NJ:
Prentice Hall.
Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61, 27–41.
Brennan, R. L. (2006). Perspectives on the evolution and future of measurement. In R. L. Brennan (Ed.),
Educational measurement (4th ed., pp. 1–16). Westport, CT: Praeger.
Cohen, R. J., & Swerdlik, M. E. (1999). Psychological testing and assessment: An introduction to tests
and measurements (4th ed.). Mountain View, CA: Mayfield.
Cronbach, L. J. (1975). Five decades of public controversy over mental testing. American
Psychologist, 30, 1–14.
DuBois, P. H. (1970). A history of psychological testing. Boston: Allyn & Bacon.
Gottfredson, L. S., & Sharf, J. C. (1988). Fairness in employment testing: A special issue of the
Journal of Vocational Behavior, 33(3). Duluth, MN: Academic Press.
Gregory, R. J. (1996). Psychological testing: History, principles, and applications (2nd ed.). Boston:
Allyn & Bacon.
Haney, W. (1981). Validity, vaudeville, and values: A short history of social concerns over standardized
testing. American Psychologist, 36, 1021–1034.
Hartigan, J. A., & Wigdor, A. K. (Eds.). (1989). Fairness in employment testing: Validity generalization,
minority issues, and the General Aptitude Test Battery. Washington, DC: National Academy Press.
Howard, G. S. (1985). The role of values in the science of psychology. American Psychologist, 40,
255–265.
Jones, L. V. (1971). The nature of measurement. In R. L. Thorndike (Ed.), Educational measurement
(2nd ed., pp. 335–355). Washington, DC: American Council on Education.
Koretz, D. M., & Hamilton, L. S. (2006). Testing for accountability in K-12. In R. L. Brennan (Ed.),
Educational measurement (4th ed., pp. 471–516). Westport, CT: Praeger.
Linn, R. L. (1989). Current perspectives and future directions. In R. L. Linn (Ed.), Educational
measurement (3rd ed., pp. 1–12). New York: Macmillan.
Murphy, K. R., & Davidshofer, C. O. (2001). Psychological testing: Principles and applications
(5th ed.). Upper Saddle River, NJ: Prentice Hall.
Rogers, T. B. (1995). The psychological testing enterprise. Pacific Grove, CA: Brooks/Cole.
Thorndike, R. M. (1990). A century of ability testing. Chicago: Riverside.
Vold, D. J. (1985). The roots of teacher testing in America. Educational Measurement: Issues and
Practice, 4(3), 5–8.
Wigdor, A. K., & Garner, W. R. (Eds.). (1982). Ability testing: Uses, consequences, and controversies: Pt. 1.
Report of the committee. Washington, DC: National Academy Press.

CHAPTER 3 Giving Meaning to Scores

The Nature of a Score
Frames of Reference
Domains in Criterion- and Norm-Referenced Tests
Criterion-Referenced Evaluation
Norm-Referenced Evaluation
Grade Norms
Age Norms
Percentile Norms
Standard Score Norms
Normalizing Transformations
Stanines
Interchangeability of Different Types of Norms
Quotients
Profiles
Criterion-Referenced Reports
Norms for School Averages
Cautions in Using Norms
A Third Frame of Reference: Item Response Theory
Summary
Questions and Exercises
Suggested Readings

THE NATURE OF A SCORE

Quadra Quickly got a score of 44 on her spelling test. What does this score mean, and how
should we interpret it?
Standing alone, the number has no meaning at all and is completely uninter-
pretable. At the most superficial level, we do not even know whether this number rep-
resents a perfect score of 44 out of 44 or a very low percentage of the possible score,
such as 44 out of 100. Even if we do know that the score is 44 out of 80, or 55%, what
then?
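
The arithmetic behind these three readings of the same raw score can be sketched in a few lines of Python. The function name percent_correct is an illustrative choice, not something defined in this book; the test lengths of 44, 100, and 80 items are the ones mentioned in the paragraph above.

```python
def percent_correct(raw_score, n_items):
    """Convert a raw score to the percentage of items answered correctly."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return 100.0 * raw_score / n_items


# The same raw score of 44 under the three test lengths discussed in the text.
for n_items in (44, 100, 80):
    print(n_items, percent_correct(44, n_items))
```

Converting to a percentage removes only the ambiguity about test length. It still says nothing about how difficult the items were or how other people performed on them.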
Consider the two 20-word spelling tests in Table 3–1. A score of 15 on Test A would
have a vastly different meaning from the same score on Test B. A person who gets only 15
correct on Test A would not be outstanding in a second- or third-grade class. Have a few
friends or classmates take Test B. You will probably find that not many of them can spell 15
of these words correctly. When this test was given to a class of graduate students, only 22%
spelled 15 or more of the words correctly. A score of 15 on Test B is a good score among
graduate students of education or psychology.
As it stands, then, knowing that Quadra spelled 44 words correctly, or even that she
spelled 55% correctly, has no direct meaning or significance. The score has meaning only
when we have some standard with which to compare it, some frame of reference within which to
interpret it.

From Chapter 3 of Measurement and Evaluation in Psychology and Education, 8/e. Robert M. Thorndike.
Tracy Thorndike-Christ. Copyright © 2010 by Pearson Education. All rights reserved.

Table 3–1
Two 20-Word Spelling Tests

Test A              Test B

bar     feet        baroque         feasible
cat     act         catarrh         accommodation
form    rate        formaldehyde    inaugurate
jar     inch        jardiniere      insignia
nap     rent        naphtha         deterrent
dish    lip         discernible     eucalyptus
fat     air         fatiguing       questionnaire
sack    rim         sacrilegious    rhythm
rich    must        ricochet        ignoramus
sit     red         citrus          accrued

Frames of Reference
The way that we derive meaning from a test score depends on the context or frame of reference
in which we wish to interpret it. This frame of reference may be described using three basic di-
mensions. First, there is what we might call a temporal dimension: Is the focus of our concern
what a person can do now or what that person is likely to do at some later time? Are we inter-
ested in describing the current state or in forecasting the future?
A second dimension involves the contrast between what people can do and what they would
like to do or would normally do. When we assess a person’s capacity, we determine maximum per-
formance, and when we ask about a person’s preferences or habits, we assess typical performance.
Maximum performance implies a set of tasks that can be judged for correctness; there is a “right”
answer. With typical performance there is not a right answer, but we may ask whether one indi-
vidual’s responses are like those of most people or are unusual in some way.
A third dimension is the nature of the standard against which we compare a person’s behav-
ior. In some cases, the content of the test itself may provide the standard; in some cases, it is the
person’s own behavior in other situations or on other tests that provides the standard; and in still
other instances, it is the person’s behavior in comparison with the behavior of other people.
Thus, a given measurement is interpreted as being either oriented in the present or oriented in
the future; as measuring either maximum or typical performance; and as relating the person’s
performance to a standard defined by the test itself, to the person’s own scores on this or other
measures, or to the performance of other people.
Many instructional decisions in schools call for information about what a student or group
of students can do now. Wakana Watanabe is making a good many mistakes in her oral reading.
To develop an instructional strategy that will help her overcome this difficulty, we need to deter-
mine the cause of her problem. One question we might ask is whether she can match words with
their initial consonant sounds. A brief test focused on this specific skill, perhaps presented by the

24
Another random document with
no related content on Scribd:
Hoy usamos la misma palabra para designar una pequeña heredad, ó
campito, con su rancho. Debemos á Las Casas la descripción de cómo
el indígena preparaba sus tierras para la siembra de sus yucas, ajes y
batatas. “Hacían los indios, narra el célebre clérigo sevillano, unos
montones de tierra, levantados del suelo como una vara de medir, é
tenían en contorno nueve ó doce pies: un montón estaba apartado
del otro dos ó tres pies: todos por su orden: rengleras de mil é dos
mil é diez mil de luengo: é otros tantos de anchura, según la cantidad
que determinaban poner.”[193]. Ya el ilustrado cubano don Alvaro
Reinoso presentó al Congreso Internacional de Americanistas de
Madrid, el año de 1882, un interesantísimo trabajo sobre el cultivo
en camellones, como dato de la agricultura de los indígenas de Cuba
y Haytí en la época precolombina.[194] Y los boriqueños estaban en
todo más adelantados que los siboneyes de Cuba y en el arte lítico
más que los haytianos.
El cultivo de la yucubía se extendía en el Boriquén á grandes
plantíos; á veces, de más de diez mil montones de matas. A los cinco
ó seis meses los sembrados presentaban un bonito aspecto. Al año ya
se cosechaba la raíz, ó fruto, llamado yuca; y se podían explotar los
yucales hasta tres años.
The Boriqueño, helped by the women, processed the poisonous tuber of
the yucubía to obtain its nourishing flour. Once the yuca had been
washed and its outer skin scraped off with a small clam shell, called
a caguará, they reduced it to a coarse meal, the catibía, by grating
the tuber on the rough surface of an oblong board of yagua palm
studded with small siliceous stones, which they called a guayo.[195]
The Boriqueños collected the yuca meal in a place or trough, called a
guarikitén, as they grated the tubers. Next they put this mealy mass
into a small sack of plaited palm, called a sibucán, which they hung
from a tree; and two Indian men or women, by means of a pole hooked
to the other end of the sleeve, as Las Casas relates,[196] or helped
by the weight of large stones, as Oviedo says, squeezed the sack to
extract from the yuca the poisonous juice, called naiboa. With the
deadly juice removed, they took the farinaceous product and sifted it
in the jibi, a kind of sieve made of very fine reed canes, thus
obtaining very good flour, which they spread in round loaves two
fingers thick on an earthenware griddle or flat plate, called a
burén, set over the fire on stones, turning the cakes with a small
paddle, called a küisa, until the casabí bread was done. With the
finest flour of the yuca they made a select, very white casabe, which
they called xau-xau. They also knew how to extract starch from yuca
flour, a product they called anaiboa and used in their meals. It was
a whole bread-making industry, as complicated as, or more complicated
than, that of our own day with wheat flour, whose origin is likewise
lost in the night of prehistoric times.[197]
The natives of the little island of La Mona[198] planted a great deal
of yuca and made a great deal of casabe, and when Juan Ponce de León
first came to Boriquén, in 1508, he called at that islet in passing
and was able to provision himself there with casabí bread for his
men; later, from San Juan, he sent the caravel under the command of
his lieutenant don Juan Gil Calderón so that the natives of La Mona
might again furnish him with a supply of casabe for the fifty men of
his expedition.
The Boriqueño drew two other products from the tuber of the yucubía.
He used to make a vinegar for his stews by boiling thoroughly the
poisonous juice of the yuca, the venomous naiboa, so that the deadly
poison would evaporate, and after boiling this juice he stored it so
that it would turn sour.[199] The other product was the drink uikú,
which he obtained by setting pieces of casabe to ferment in vessels
full of water, adding some pieces of the same casabe chewed by young
Indian women, so as to use the saliva as a fermenting agent.
The aborigine also grew maize, twice a year. The Spanish word maíz
comes from the Indo-Antillean maisí. The Boriqueño ate maize toasted,
and it also served him to make a fermented drink, xixa,[200] of which
he was very fond, as he was of the other drink obtained from casabe,
uikú. Our Indian also needed the diastase of saliva to bring about
the fermentation of the maize, so he set young Indian women to chew
grains of maisí and cast them, soaked in saliva, into the large jars
in which his prized native beer was to be prepared.
The Boriqueño did not know how to make maize bread, as some natives
of Tierra Firme did, which proves that in Venezuela this advance in
diet came after the separation of the Arawak tribes that invaded the
Antillean Archipelago. It is a phenomenon often repeated in the
history of mankind, for man produces nothing complete all at once.
The Aryan crushed the grain of wheat but did not know the hand mill;
the Indo-European reached this advance when he settled on the
alluvial lands of the Volga basin.[201]
The aborigine grew the batata on a large scale, and history has
preserved for us the names of some of its varieties. He called the
white one guanaguax; the purple one guanagüey; and the one that was
white and purple, guanaraca. The fruit today called boniato the
natives named aje; the purple kind they called aniguamá, and the red,
xaxagüeyú. In their plots they also planted the tasty lirén and the
maní, rival of the hazelnut. And they made use, as chance offered, of
the yahutía, the mapüey, the imocona, the guayaru, and other edible
tubers, but without troubling to plant them. The Boriqueños also
gathered, without cultivation, the wild fruits of the woods and
thickets: the mamey, the guayaba, the anón, the jobo, the guanábana,
the pitajaya, the guamá, the tuna, the jicaco, the caimito, the
cajuil, the guiabaras or sea grapes, the pineapple, which they called
yayama, and the now forgotten guaba, ausubo, and yagruma.[202]
There were three other plants that the aborigine took careful pains
to replant near his bohío: the ají, the cojibá or tobacco, and the
purgative ben. Of the ají he had two principal kinds, one sweet and
one hot, and a few varieties. Only the native name of the hot one
survives, which he called guaguao; to all the other varieties our
country people have given fanciful names.
Our Indian grew tobacco, his cojibá, for two purposes, one everyday
and one religious. After supper some chewed the nicotian leaf;[203]
others knew how to make mosquetes, or badly rolled cigars, which they
called tabacos, and inhaled their intoxicating smoke. There was no
lack of natives who, after the nicotine intoxication, threw up all
they had eaten. This led those who first observed the custom to
believe that the Indian used the plant as an emetic and purgative, an
error into which other modern writers have since fallen. According to
Dr. Crévaux, the Oyampis of the Guianas use tobacco fumigation
against colic, the smoke being blown directly onto the painful spot
by the healer himself; this treatment agrees with the blowing and
rubbing of which the Chroniclers tell us, and it is reasonable to
accept it. For religious ceremonies the bohique, as augur of the
tribe, prepared from the cojibá a kind of snuff, which was taken
through the nostrils with its own liturgical and ceremonial
instrument made for the purpose. The act of taking these powders was
called the cojoba. The bohique sorcerer performed it before
entreating the beneficent zemí, and also in certain assemblies or
councils of chiefs, together with the caciques and nitaynos.
The third plant that the Boriqueño took care to grow near his hut, as
Las Casas carefully notes, was the tau-túa, with which he was to dose
himself. In Puerto Rico we have three small trees that yield
purgative seeds. The first is the tau-túa, or jatropha gossypifolia,
which the French call grand ben purgative and avelines purgatives,
and the English bastard french physic-nut or spanish physic-nut. The
second is the tártago, or jatropha curcas, which in Cuba and Santo
Domingo is called piñón; the French name it grand pignon d’Inde and
noix de Barbades, and the English know it as Barbados seeds. And
finally there is don Tomás, or jatropha multifida, which the French
call medicinier a fleurs scarlates and the English french multifid.
Las Casas, as we noted above, gives us the historical proof of our
assertion, describing the ben and the care with which the native
sought to plant it beside his house. It is probable that the
Boriqueño used these purgative seeds interchangeably, as well as
certain lianas.[204] Naturally, the Indian's illness was always
regarded by the healer-augur, or bohique, as harm done by evil
spirits, or maboyas.
Although the Boriqueño did not cultivate cotton, majagua, or maguey,
which grew wild in abundance everywhere, he harvested them for use.
The Indian woman spun cotton, the sorobei, fairly well, and wove from
it the short skirts for married women and the loincloths for the men.
She also made of cotton the hammocks and a kind of band for the arms
and ankles, some masks for the idols, and other small things; and the
men wove their fishing nets of sorobei. Within his limited aesthetic
taste the aborigine already had a knowledge of dyeing, and some of
these cotton objects he dyed with the juice of the jagua, giving them
blackish tints, or with the juice of the jiquilete, the wild indigo,
beautifying them with blue, or colored them yellow with bija, our
common achiote, a word of Aztec origin that has prevailed in the
language in place of the Boriqueño bija. He also used to combine
these colors in bands or paired stripes.
The majagua, whose strong bark yields a long fiber, and the maguey,
whose white threads are tough, were used by the Boriqueños for
rope-making; with them they made very good cords, which they called
cabuyas, and round baskets, called jabas, which the Indian was
accustomed to carry on his shoulder, at the ends of a pole, when he
traveled, a custom that still survives among our country people.
We see, then, that our native was a craftsman as well as a farmer.
The working of cotton, majagua, and maguey were already nascent
industries, to which particular persons would have had to devote
themselves, with the corresponding apprenticeship. In prehistoric
ages the art of chipping and polishing stone, pottery, rope-making,
and the rudimentary weaving of some cloths, as well as hunting,
fishing, herding, and agriculture, must have been entrusted to
particular individuals, who made it the exclusive preserve first of
their families, the right later passing to certain tribes.
Working stone must have required a laborious apprenticeship, and
polishing it great patience. Suitable quarries must have been chosen,
and families of workers, devoted constantly to this industry, must
have settled nearby. Through barter, which is the dawn of commerce,
they would have provided themselves with provisions to live on and in
turn disposed of their stone wares. The Boriqueño had chosen quarries
for making his stone utensils. The grotto of Miraflores, in Arecibo,
is a clear example of what we say. We call this quarry the
Indo-Boriqueño workshop. When the cavern is examined, the first thing
that catches the investigator's attention is a stone-pillar, or
monolith, half finished. The artist got as far as carving the eyes
and mouth, and then began to shape the pillar and separate it from
the rock. On the left side his work reaches about 40 centimeters,
forming the column; on the right side, about 25 centimeters. The eye
sockets of the figure are worked slightly oblique, but with an
obliquity running from below and outside upward, so that if the inner
angles of these sockets were extended they would meet in the center
of the forehead. This obliquity is completely different from that of
fig. 45 of the Látimer collection in the Smithsonian Institution
museum in Washington, which is related to a stone mask in our
Collection and to another face engraved on one of the walls of the
cavern. This half-finished monolith is in the main archway of the
grotto, which faces east, on the observer's right. Nearly all the
works are in this same spot. The second figure of importance is a
face with oblique eyes, in the Mongol style, but with the obliquity
very pronounced. The eyes, nose, and mouth are outlined, and work had
begun on fixing the oval of the face. At this stage the work was
abandoned by the artist. A short distance away there is another face,
also very interesting. It has round, parallel eyes, joined eyebrows,
a small mouth, and a triangular outline to the face, with a low
forehead and a pronounced chin. We have in our Collection a very
similar specimen, which leads us to believe that this stone face of
the Indo-Boriqueño workshop was also in the course of being made.
Lower down, on the same rock wall, eyes and mouths are engraved, but
without ovals: incipient works at the stage of being started. On one
of the walls along the path leading to the grotto there is also a
small face that has been begun. It was, then, undoubtedly a working
quarry, a true lithic workshop for stone objects, and not a temple,
as some travelers have mistakenly recorded. It is the first workshop
found on our Island of which an account has been given. Ratzel[205]
was right to affirm that “the ancient stone sculptures of Puerto Rico
show a special skill in working stone that we find nowhere else in
the West Indies.” The Siboneys, the Yucayos, and the Jamaicans did
not know the art of working stone. The natives of Boriquén were more
advanced than those of Haytí in this industry. In lithic work the
Boriqueño craftsman was the foremost of the Antillean Archipelago.
The stone object most necessary to the Boriqueño was the axe, or
manaya. In our Collection we have four principal sizes, but the
native undoubtedly had a complete range from larger to smaller,
according to his needs. The manaya served him to fell the tree, the
güé-güé; to hollow out the massive trunks of cedars and ceibas; and
to make the almadía, the canoa, a word that has been naturalized in
our languages. The Boriqueño had a small canoe, in which two persons
scarcely fitted, and others of greater capacity, which could carry a
squad of men. The Admiral saw craft of this kind, which he called
almadías, holding seventy to eighty Indians; Fernando Colón mentions
one with room for 150 persons; and Las Casas says that he saw canoes
that could carry fifty to a hundred natives. The Boriqueño used the
small canoe for fishing, and the large one for the gradual
development of his incipient trade among the islands, a traffic
carried on chiefly with the natives of La Mona and those of Higüey,
in neighboring Haytí.
Let us picture, for a moment, the activity displayed in the quarry,
in the Indo-Boriqueño workshop. Some are cutting axes; others, small
mortars; here, dujos; there, necklaces; over here, men carving
monoliths for the boundaries of their ball courts; over there,
mammiform idols of Yukiyu, the beneficent god of Boriquén. Each is
devoted to his own lithic specialty. There would be gifted hands for
certain tasks. The gnawing worm of professional envy would sink its
poisonous tooth in the Indo-Boriqueño workshop too. The simplest jobs
would be entrusted to the apprentices, such as the guayos, which
required only fixing small pieces of flint into a board of yagua palm
to make the most useful grater. The apprentices would also be given
the toilsome labor of polishing axes once they had been cut. For the
carving and ornamentation the master would step in to give the
finishing touches. The colesibí, or necklace of small marble-like
stones, and the tatagua, or earring, already reveal an incipient
artistic taste. With what childlike pride the artist of the age of
polished stone must have gazed upon his finished stone object!
There would be equal activity in pottery. The first things to be made
with care would be the buréns, for baking casabí bread over the fire.
These pans had to be well tempered so that they would not crack in
the strong heat to which they had to be subjected. Others would
devote themselves to making casseroles; others, canaris for water, or
large jars for the uikú and the xixa. The refinement of decorating
the handles of the casseroles and little pots would fall to those
most expert in the art. The selection of the clay and its handling to
make it fit for modeling would require intelligent direction. The
grotesque figurines of baked clay, which were the native's household
gods, would have special workers, so as not to depart from the model
of the tutelary zemí. With what positive certainty our mind can carry
itself back to those distant times and form an exact idea of the
unfolding of the Neolithic age! Not in vain does Paleontology
advance, wresting its secrets from the past!
The Boriqueño also worked wood with care. On the Island of
Guanabo[206] dujos, trays, spoons, and other objects were exquisitely
made from a black wood, which we suppose to have been caoba or maga,
or perhaps the blackish root of the old mangrove. The chronicler
Pedro Mártir relates that fourteen of those curious seats, called
dujos, wrought with marvelous art, were presented by Bojekio, cacique
of Jaragua, in Haytí, to don Bartolomé Colón when the Adelantado
visited the cacicazgo of Anacaona's celebrated brother, and that he
also presented him with sixty clay utensils suitable for table
service.[207]
From rods of cupey[208] the aborigine made his javelins, and from the
bark of the yagua palm[209] his macanas, stout war clubs four spans
long. As an offensive weapon he also had the arrow. He made the bow,
the paira, from a thick liana, and with shoots of wild cane he
prepared the arrow, at whose tip he set a fish spine or a flint
point.[210] He used the ripe fruit of the higüera[211] to make spoons
and vessels useful in the household. He knew how to use small fish
vertebrae to fasten colored feathers in his thick hair, and from
certain fish bones he also made small fishhooks.
Good archers, more skilled than their neighbors the Quisqueyans and
Siboneys, the Boriqueños hunted the yaguasa and other seabirds on the
coasts, and in the woods and savannas the guaraguao, the mukáru, the
iguana, the sasabí (the parrot), and turtledoves in abundance. They
were also able fishermen, and with their cotton nets, bone hooks, and
other ingenious devices they supplied themselves with dajaos, lisas,
anguilas, biajacas, jureles, guabinas, pargos, mojarras, manatees,
cazones, and a multitude of other fish that so abound in our rivers
and seas. Those living along the Abacoa (Río Grande de Arecibo) and
along the Camuy and Manatí had setí in abundance at the full moons of
August, September, and October. Among the crustaceans, the natives
had the carey (sea turtle), the jicotea (freshwater turtle), the juey
(mangrove crab), the jaiba (freshwater crab), and the buruquena
(small river crab), as well as lobsters and shrimp.
Thus the Boriqueño, with his incipient agriculture and industry, had
advanced in his rudimentary civilization, in harmony with the period
of the age of polished stone in which he found himself, and in
keeping also with the environment at his disposal. A prisoner on a
lonely rock in the open sea, he outstripped the Siboneys, Yucayos,
and Jamaicans and rivaled the Haitians and Quisqueyans, while holding
at bay the daring Caribs, who from time to time made piratical raids
on Boriquén.[212]
CHAPTER IX.

The Boriqueño language.—The general Indo-Antillean language.—Dialects.
—Data from Columbus's Diary.—His letter from Lisbon to the
Catholic Monarchs.—The dialect of Macorix.—Fray Román
Pane.—Cristóbal Rodríguez.—Data from Bernal Díaz del
Castillo.—Reports of Father Raymond Breton.—The
impossibility for the first missionaries of recording the
Indo-Antillean language.—The relics of the general language
of the Antilles in rivers, mountains, trees, fruits, places,
ports, capes, etc.—The same in birds, fish, and household
objects.—An occasional word in the Chroniclers.—Two or
three phrases.—The error of Juan Ignacio de Armas and other
writers in their manner of explaining Indo-Antillean
words.—The Indo-Antillean language was formed with the
passage of time, for the separation of the Arawak tribes
that invaded the Archipelago was very remote, to the point
that the memory of it had been lost.—The link between
continental Arawak speech and the Indo-Antillean
language.—Abundant data on the maps.—Modern
travelers.—Sagot.—The Herrnhut Brethren of Zittau.—The
missionary Schulz.—The link between Boriqueño speech and
insular Carib speech.—Its continental origin.—The Boriqueño
language was rich in vowels and very sweet in conversation.
—The aborigine had an aspiration similar to that of Arabic.
—The Chroniclers fixed it in words with an h.—Proofs of
agglutination and polysynthesis.—The study of the remains
of the Indo-Antillean language has given us conclusive
proof that the origin of the Indo-Boriqueño lies in the
Arawak of South America.
The Boriqueño used a language at the stage of agglutination, with
polysynthesis, and without writing to fix its words.[213] The same
was true throughout the Antillean Archipelago; and since the language
was in perpetual ferment, neologisms were bound to arise on each
island, which were bound to alter the common tongue somewhat.

From the Diary of the Admiral's first voyage we see that the Indians
whom Columbus took in Guanahaní to serve him as interpreters
fulfilled their task on all the islands of the Lucayan group at which
the Discoverer called, and also in Cuba and Haytí. The Admiral, a
shrewd observer, entered in his logbook, under the date of October
16, the following words, which confirm the unity of language: “The
inhabitants of this island Fernandina[214] resemble those of the
others; they speak the same language and have the same customs.” When
the great Navigator reached the island of Cuba, the envoys, or
ambassadors, Rodrigo de Jerez and the Jew Luis de Torres, the latter
well versed in languages, could not make themselves understood by the
cacique of Camagüey. The polyglot Torres, believing that they had
reached the kingdom of the Great Khan on the Asian continent,
addressed the Cuban chieftain first in Hebrew, then in Chaldean, and
finally in Arabic, and had to resort to the interpreter from
Guanahaní, who gave the cacique of Camagüey and his astonished
subjects a fiery description of the power of the Spaniards.
The Admiral then went on to the island of Haytí, which he christened
La Española, and quickly entered into easy dealings and friendly
relations with the aborigines. This is corroborated by the letter he
wrote at sea to the Catholic Monarchs, sent from Lisbon in March
1493, in which he told them among other things: “In all these islands
I did not see much diversity in the appearance of the people, nor in
their customs, nor in their language; rather, they all understand one
another, which is a very singular thing.”
On returning to Spain Columbus took with him ten natives, some of
whom served as interpreters on the second hazardous enterprise;
through them it was learned, by virtue of the conversations held with
the Boriqueño women, captives picked up on the island of Guadalupe,
that the Carib Indians were warlike and cannibals.[215] When the
Columbian expedition reached the port of Navidad, the Indian women of
Boriquén who had been taken aboard arranged their escape with the
brother of the cacique Guacanagarí, and carried it out by night.
These facts prove that Yucayos, Haitians, and Boriqueños used one and
the same language.
In the Memorial that the Admiral gave to the pilot Antonio de Torres,
at La Isabela, on January 30, 1494, for delivery to the Catholic
Monarchs, we read: “As these people of one island have little
converse with those of another, there is some difference among them
in their languages, according as they are nearer or farther
apart.”[216] So that, as Columbus himself confirmed from his own
experience, there was no more than some difference among the
languages. In the Archipelago there existed, therefore, a general
language; and owing to insular isolation and to neologisms, the
Yucayo, Siboney, Haitian, Boriqueño, and Jamaican dialects were
taking shape. In La Española, in the district of Macorix, a dialect
was spoken that Fray Román Pane came to master. The general, or
common, language of the island of Haytí, according to Las Casas, was
known well only by a sailor from Palos de Moguer named Cristóbal
Rodríguez.
We have proved, from the purest historical tradition we possess, how
Columbus's Indian interpreters communicated very well with the
natives of the Lucayan islands and with those of Cuba, Haytí, and
Boriquén. With respect to Jamayca the proof will be equally
conclusive. Bernal Diaz del Castillo[217] relates that on landing
with Juan de Grijalba on the island of Cozumel “there came a young
Indian woman, of good appearance, and she began to speak the language
of the island of Jamayca... and as many of our soldiers and I
understood that language very well, which is that of Cuba, we were
amazed and asked her how she came to be there.” It turned out that
the wreck of a canoe of fishermen from Jamayca had carried her to the
island of Cozumel. Our assertion of a general language in the
Antillean Archipelago, with the exception of the islands occupied by
the Caribs, is thus fully confirmed. And with respect to these
Windward Islands, Father Raymond Breton relates in his Carib-French
Dictionary, as told to him by Indian chiefs of the island of
Dominica, “that at the time of the conquest of the islands the Carib
chief had exterminated all the natives of the country, sparing only
the women, who have always retained many things of their
language.”[218]
Now, this common Indo-Antillean language, as well as its derivatives
or dialects, has been lost. The religious orders that held sway in
the Greater Antilles could not devote themselves to preserving them
by means of vocabularies and lexicons, as they had the glory of doing
in other parts of America. The struggle of the advancing conquest and
the contest of ideas over whether the Indian should remain under
encomienda or be given absolute liberty hampered the Christian work
of the missionaries, and all the more the literary labor of studying
and preserving the Indo-Antillean language, especially when the
Dominicans leaned in favor of the natives and the Franciscans in
favor of the Encomenderos, a rivalry that obliged them to send their
representatives before the King. It did not happen so on the
Continent, nor in the Lesser Antilles. Once the disturbance caused by
the clash of two antithetical races had passed, and almost all the
Indies had been subdued, the missions were able to work in peace and
to devote their intelligent men to the study of the languages of the
aborigines.
To Father Raymundo Breton, of the Order of Preachers, we owe our
ability to study the Carib language. To the Dominican Santo Thomás
and the Jesuit González Holguín we owe our knowledge of the Kechúa of
Peru. To the manuscripts of the missionaries of Bogotá we owe the
Chibcha language, which Uricoechea[219] now makes known to us, as
well as that of the Paos, or Indians of the interior of
Colombia.[220] To the Jesuit Bertonio[221] we owe Aymara. To the
missionaries Vega, Valdivia, and Santisteban and to the Jesuit Andrés
Febres[222] we are indebted for having the Araucanian language of
Chile. To Fathers Anchieta[223] and Figueira[224], and to Ruiz de
Montoya of Lima, we owe Tupí-Guaraní.[225] To the missionaries of
French Guiana we owe Galibi, which Celedon, Brinton, Coudreau, and
Crevaux have now made better known to us. To the Franciscans and
Jesuits of Mexico we owe Aztec, or Nahuatl.[226] And so on.
Everywhere the missions recorded the language of the Indians. Only
the Indo-Antilleans, for the reasons noted, were left unable to
bequeath their sweet language to posterity.
Only the relics of this general language of the Archipelago remain.
In Cuba, Santo Domingo, and Puerto Rico many Indo-Antillean words are
preserved in rivers, mountains, trees, fruits, places, ports, capes,
etc., and likewise in birds, fish, and objects of household use. In
the Chroniclers themselves we have found some words with their
Castilian equivalents, and even an occasional phrase. The hand of
time preserves these remains like pearls lost from a rich jewel. We
have been patiently gathering them so that they might help us
discover the origin of the Indo-Antillean people. And indeed, thanks
to them and to Philology, we have been able to see clearly that the
native of the Antilles came from the southern Continent, which fully
explains to us the evolutionary process of the Arawak tribes in the
islands once the memory of their immigration had been lost.
The Cuban writer don Juan Ignacio de Armas holds[227] that no general
language ever existed in the Antilles, which stands in open
opposition to what we affirm on the basis of our study of the
Chroniclers. It is true that the Antilles are islands scattered in
the Ocean, as Señor Armas says; but this was no obstacle to their
being peopled by individuals coming from the same tribes and of
identical ethnic origin, who afterward always maintained certain
relations with one another with the aid of their canoes. The Majorcan
dialect and the Catalan dialect derive from the Limousin tongue, and
although the Balearic Islands are separated from Catalonia, the two
dialects preserve the unity of the mother tongue.
There is no lack, in our day, of those who say, for example, that
lucayo comes from the Castilian word cayo; that the Spaniards saw the
natives' huts in the shape of a cone and called them conucos; that
caney comes from cana; that maíz originates in mahizo; that the words
ají and cacique come from Arabic; cocuyo and seboruco from Latin;
Anacaona and Baracoa from Basque, and so on. Those who hold such
opinions drink at the springs of Juan Ignacio de Armas. It is a very
convenient and peculiar way of carrying out etymological studies on
words some of which are corrupted by use, while others show a
likeness of identical syllables and of similar pronunciation to words
of our own languages.
Let us demolish such a mode of reasoning with three examples. Those
same etymologists, on reading our Boriquén geography and coming upon
the word Caguas, would declare emphatically that this word
undoubtedly derives from the Castilian agua. On leafing through the
history of Mexico and stumbling on Cuernavaca, they would firmly
assert that it is a word compounded of two pure Castilian words,
cuerno and vaca. And on hearing our country people call a local fruit
tallote, they would flatly maintain that the neologism had originated
in the genuine Castilian tallo. In making such statements they fail
to consider that the conquerors and settlers of the Indies adapted
native words to their own language as seemed best to them and as made
pronunciation easiest.
Here are the proofs of our assertion. In the report or extract of a
letter that the conquistador Diego Velazquez, lieutenant governor in
Cuba, wrote to Their Highnesses on the government of the island in
the year 1514[228], we read the following passage, in which it is
clearly seen that Caguas is an Indo-Antillean word: “And that the
captain of all the aforesaid was an Indian of the island Española,
servant and interpreter of the cacique Yacahüey, who was called
Caguax, who is now dead.” And to confirm that the same word is
Boriqueño, see the distribution made by Ponce de León of the caciques
and fields of Boriquén, in 1510, to meet the expenses of the
incipient colony, and it will be seen that the cacique Caguax, with
his village beside the Turabo River, fell by sale to Francisco de
Robledo and Juan de Castellanos.[229]

To be sure that the word Cuernabaca is nothing more than the
Castilian evolution and crystallization, through corruption, of a
Mexican word, read the third letter of Hernán Cortés to the Emperor
Charles V[230], where a town will be found designated by the name
Coadnabaced. And Captain Bernal Diaz del Castillo notes in his
celebrated chronicles:[231] “and the next day we went on toward
another better and larger town, which is called Coadalbaca, and we
now commonly corrupt that word and call it Cuernabaca.” Others wrote
Quanabuac, the true Aztec name being Quahaunahuatl.
And as for tallote not coming from tallo, as the likeness of
syllables would mistakenly lead one to believe, all doubt vanishes on
reading the following passage, which we also take from the same
Bernal Diaz del Castillo and which reveals the Mexican origin of the
word. The captain-narrator relates that after the battle of Otumba
“we went along very cheerful, gathering some gourds that they call
ayotes, and eating and marching toward Tlascala.”[232] And when the
same author recounts the journey on which he accompanied Hernán
Cortés in the exploration of Honduras, he notes: “we found four
houses full of maize and many beans, and some thirty hens, and melons
of the land, which are called ayotes.” With the passage of words and
expressions from the Antillean islands to the American Continent, and
vice versa, the said word was imported and applied to the fruit of
the sechium edule. And from ayotes the settlers derived chayotes,
tayotes, and finally tallotes.[233]
On the other hand, such etymologists would take as indigenous, for
want of a homologous orthographic or orthoepic form, the words
plátano, which is Greek; dita, which comes from Latin; zafra, which
is Arabic; cobija, which is Castilian; fotuto, of Italian origin;
guarapo, which comes from Quechua; etc. All of which shows us that
the field of etymological research on Indo-Antillean words must be
entered with slow step and the utmost caution, so as not to fall into
regrettable errors.
If the red color of his skin entitles the indigenous American to
constitute an ethnic type, a stock, the unique polysynthetism of his
languages firmly supports that thesis. Agglutination must not be
confused with polysynthetism: from the former one arrives at the
latter, and only the Indo-American languages are polysynthetic.[234]

The Indo-Antilleans, descended from the Aruaca tribes of the southern
continent, had lost all notion of this origin, which is convincing
proof that the separation was of very remote date. It was a breaking
away of tribes which, with the passage of time, gave birth to the
Indo-Antillean people and to a language of their own. The soil is a
great factor in the genesis of a people. Languages need the slow
action of time to take shape. Consider how Spanish, French, Limousin
and Italian have gradually derived from Latin. In the same way the
general Indo-Antillean speech was formed in the islands of the
Columbian Mediterranean, retaining, however, a philological link with
continental Aruaca.
Language is the faithful image of reality and leaves deep traces
wherever it passes. In the territories of Venezuela we are going to
investigate some of the traces of the language of the Aruacas and
interpret them in harmony with the remains of Indo-Antillean speech
that we possess, which will prove their connection.
The Caribs called the great Venezuelan river Orinoco. The Aruacas
called it Huyaparí. But this word already appears corrupted in the
chronicles of the Conquest. Most of the Aruaca words of Venezuela
that have been preserved have, by phonetic accident, exchanged the
letter b for the letter p, and likewise the letter n for the letter
r, a very natural thing in a language whose words were not fixed in
any writing. Huyaparí is therefore a corruption of Huyabaní, that is,
place of much water. Philological explanation: Huy for juy,
equivalent to guay, an exclamation of surprise, as if we said behold!
Ya, site or place, for yara. Through polysynthetism neither two y's
nor the ra of yara appear in the word. The same happens with ba for
bana, great, much. Ní, water. The word under study, without
polysynthetic encapsulation, would be Huy-yara-bana-ní, or
Huyyarabananí; and with polysynthetism, Huyabaní, the word being
corrupted into Huyaparí.
The name Maracapana is a corruption of Maracabana, that is,
Maraca-bana, equivalent to Higüera grande (large calabash). The word
maraca exists in the Boriqueño and Aruaca languages applied to one
and the same object: a higüera emptied of its endocarp and other
inner substance and filled with small stones, which make it a rattle,
and which served the natives as a musical instrument. So the province
of Maracapana was, translated into Spanish, the territory of Higüera
Grande. Thus we have in Puerto Rico, for example, Sabána Grande,
designating a Puerto Rican town, the Indian word Sabána, plain,
having been joined to the Spanish word grande. Today the Indian word
is no longer pronounced with the accent on the penultimate syllable;
it has been made proparoxytone, confusing it with the Castilian
Sábana. In Cuba we also have Xagua la Grande: an indigenous word with
a Spanish one. This reminds us of the construction of Greco-Latin
words.
Maracaibo is an Aruaca word, and all its roots are in the Boriqueño
language. The word Maracaibo means place of higüera and water.
Philological explanation: Maraca, higüera; i for ní, water; bo for
abo, place. The word without polysynthetic encapsulation is
Maracaniabo; with polysynthetism, Maracaibo.
Aruaca, as the word is recorded by the chroniclers, is a corruption
of Aragua. In this form the original name of that great people is
still found fixed in various parts of its land, designating an islet
at the mouth of the Orinoco, one of the channels of the Delta and its
outlet to the sea, a river and some mountain ranges. The word Aruaca
has prevailed in the chronicles, and for that reason we have used it
instead of the legitimate Aragua. Readers not well versed in these
matters may believe that we change names to make the philological
explanation easier. That is not the case. Such transformations in
words are very common, as are the elision of letters and syllables,
and metathesis. Zaragoza comes from Cesárea augusta; Lima arises from
Rimac; and on our own little island, quite recently, we have the word
Ciales, applied to a town of the interior, which is a corruption of
es Lacy, the surname of a celebrated Spanish general shot in Palma de
Mallorca.
Let us now study the word Aragua philologically. We have ara for
yara, site, place; and gua, as a suffix equivalent to behold; as if
we said: behold the site. In our modern language, the home, the
homeland. When Pelayo, after defeating the Moors, came down from
Covadonga to the plain and began the reconquest of Spanish soil, his
people asked him where they should found a settlement, and the
chieftain answered: ubi edo, which means in Latin where I am; from
this comes the present-day Oviedo.
Cumaná, a province of Aruacas, means large, flat place: from cu for
cua or coa, place; ma, flat; and ná for bana, great. And Cumanacoa is
a word derived from Cumaná: Cumaná-coa, site of Cumaná.
Cariaco, a word applied to a gulf and to a river, means place of
water. Ca for gua, behold; ri for ní, water; and aco for coa, place.
The metathesis of aco for coa is frequent.
Those who wish to go deeper into this study can take a map of
Venezuela and will immediately see everywhere a series of names
applied to islands, gulfs, rivers, valleys and mountains whose roots
and components are the same as those of the Indo-Antillean language.
Seiba, Guayabal, Guayo, Cocuisa, Yguana, Yaya, Sipao (for Cibao),
Guarico, Guariquén, Yaruma, Guanaja, Caguas, Guiria, etc., are words
also found in the Boriqueño language, derived of course from Aruaca;
words which it is logical to conjecture passed from the southern
continent to the Antillean archipelago, because they are also found
among the Haytians, Quisqueyans, Siboneys and Boriqueños.
We owe to Doctor Sagot,[235] to the Herrnhut Brethren of Zittau[236]
and especially to the missionary Theodoro Schulz[237] very fine
linguistic works on the Aruaca language. Naturally, the time elapsed
since the tribes that invaded the islands separated from the
continental tribes, perhaps many centuries, has produced very radical
changes through the daily ferment of words among a people of the
Neolithic period, to the point of forming a speech of their own, the
Indo-Antillean language.
The indigenous word also undergoes a certain variation according to
the nationality of the writer who records it. A simple example will
prove what we mean. An Englishman who hears the syllable gua will
note it down as wa; a Frenchman who picks it up will write goua; a
German, wa; and a Spaniard or Portuguese, gua. Thus the explorers
have compiled the vocabularies of the Indian languages according to
the phonetic training of their ears and the value of the letters in
their respective languages.
We still find in the works of Sagot, the Herrnhut Brethren of Zittau
and Schulz the words cabuya, calichi, burén, conuco, hamaca, maisí,
siba, ní, ají, maraca, canoa, iguana, manaca, and many others meaning
the same as in the Boriqueño language. Thus Philology comes to the
aid of the chronicles and accounts of Columbus, Ojeda, Las Casas,
Oviedo, Bastidas and Rodrigo de Figueroa in establishing that the
autochthonous Indo-Antilleans descended from the Aruacas of the
southern continent.
We shall also prove, with the aid of Philology, that the Boriqueño
people had a common continental origin with the Carib people of the
Windward Islands, despite their being two peoples who hated each
other to the death, and whose hatred and state of perpetual war they
brought to the islands from the neighboring continent, where their
ancestors had lived in the same way.
Let us delve into the dark depths of the prehistory of each of them.
The Boriqueños called their tutelary god zemí; the Antillean Caribs
applied the same name to theirs. Father Raymond Breton[238] writes
chemij, but this ch should be pronounced like z or c. The final j is
the consequence of a word in phonetic ferment: even today we hear
Madrí, Madrid, Madriz, a corruption of the Latin Madritum, which in
turn comes from Matritum, the Mantua Carpetanorum of the Romans. So
the two peoples, the Boriqueño and the Carib, still preserved the
same word to signify their household god, undoubtedly brought from
the neighboring continent.
The Boriqueños called their healer-augur bohique. The Caribs did the
same. The chroniclers write boyez, but it is without any doubt the
same word, already corrupted. Bo-y-ez corresponds perfectly to
bo-hi-ques. The time elapsed since the separation of these tribes
kept imprinting phonetic transformation on the morphology of the
words, so much so that they came to have completely different
languages, and on the southern continent itself an infinity of
dialects.
The Caribs called their protecting god, according to the same Father
Raymond Breton,[239] Icheiri; according to Father Labat,[240] Akambú;
and according to Champlain, Laborde and Souvestre,[241] Loucuo. We
are of the opinion that these three words originate from the
primitive Guaraní Yuká, like the Haytian Yukajú and the Boriqueño
Yukiyu.
Let us analyze the word Ycheiri given by Father Breton: Ycheiri >
Ychei-ri > Yquei-ri > Y-ki-ri > Yu-ki-ru > Yuki-yu > Yukiyu, the
beneficent god of Boriquén.
