Classroom Assessment in the Context of Learning Theory and Research
By: Heidi L. Andrade
Edited by: James H. McMillan
Book Title: SAGE Handbook of Research on Classroom Assessment
Chapter Title: "Classroom Assessment in the Context of Learning Theory and Research"
Pub. Date: 2013
Publishing Company: SAGE Publications, Inc.
City: Thousand Oaks
Print ISBN: 9781412995870
Online ISBN: 9781452218649
DOI: https://dx.doi.org/10.4135/9781452218649.n2
Print pages: 17-34
© 2013 SAGE Publications, Inc. All Rights Reserved.
Classroom assessment (CA) has long been influenced by program evaluation (Scriven, 1967), psychometrics
(Bloom, 1956), and statistics (Shepard, 2000). Only in the past decade or two have we begun to acknowledge
that a careful study of the relationship between learning theory and assessment can serve at least two
purposes: (1) to inform the design of assessment processes that are grounded in research on how students
represent knowledge and develop competence (Bransford, Brown, & Cocking, 2000) and (2) to provide
an interpretive lens on the intended and unintended consequences of assessment, including its effects on
achievement and motivation.
In 2000, Lorrie Shepard described the ways in which new developments in learning theory had the potential
to transform assessment theory and practice. At the time, the consensus among scholars in the field was that
the vast majority of what was known about cognition and learning had yet to be applied to CA (Pellegrino,
Chudowsky, & Glaser, 2001). Arguing that assessment must reflect cognitive, constructivist, and sociocultural
theories in order to enhance learning as well as (or instead of) measuring it, Shepard described the ways in
which its form, content, and uses should change to align with and support social-constructivist pedagogy.
This chapter extends Shepard's seminal work via a selective review of current research on learning and
assessment. Beginning with an acknowledgment of the need for a model of cognition, the first section
discusses recent developments in learning progressions and their implications for assessment in the content
areas. The chapter then explores the lessons to be learned by framing assessment in terms of the regulation
of learning, a general megatheory of sorts that comprises everything from goal setting to metacognition,
progress monitoring, feedback, and adjustments to learning and teaching. Recommendations for research
are integrated into each section of the chapter and summarized at the end.
Although learning progressions are often designed with state and federal standards in mind, they are more
detailed than most standards, which do not include the significant intermediate steps within and across grade
levels that lead to attainment of the standards (Heritage, 2011). Detailed descriptions of typical learning serve
as representations of models of cognition that can inform instruction as well as the design and interpretation
of the results of assessment. As is shown in Figure 2.1, learning progressions can also indicate common preconceptions and misconceptions.
Learning progressions can provide teachers with a blueprint for instruction and assessment because they
represent a goal for summative assessment, indicate a sequence of activities for instruction, and can inform
the design of formative assessment processes that provide indicators of students' understanding (Corcoran,
Mosher, & Rogat, 2009; Songer, Kelcey, & Gotwals, 2009). The value of learning progressions for CA lies in
the information they provide about what to assess and when to assess it. Teachers and districts can design
summative assessments with a learning progression in mind, as well as formative assessments that move
learning ahead. Questions that target common misconceptions can be designed in advance and delivered
verbally or in writing, to individuals or to groups. For example, at a particular point in a unit on the earth and the
solar system, a teacher can ask questions designed to reveal student thinking in relation to a specific learning
goal in a progression, such as “How long does it take the earth to go around the sun, and how do you know?”
The students' responses to the questions provide insight into their learning and can guide the teacher's next
pedagogical steps.
Diagnostic questions can also be implemented in the form of multiple-choice items (Ciofalo & Wylie, 2006;
Wylie, Ciofalo, & Mavronikolas, 2010). Briggs, Alonzo, Schwab, and Wilson (2006) have demonstrated
that multiple-choice items based on construct maps—that is, learning progressions—can provide diagnostic
information to teachers about student understanding. When each of the possible answer choices in an item
is linked to developmental levels of student understanding, as in the example in Figure 2.2, an item-level
analysis of student responses can reveal what individual students and the class as a whole understand. For
example, if one quarter of the students in a class choose option D, which suggests that they believe that
darkness is caused by the earth moving around the sun once a day, the teacher might decide to provide
opportunities for structured small group discussions between students who understand the day–night cycle
and students who are still developing this understanding. Briggs et al. (2006) described the more intensive
interventions that may be implemented for the portion of the class who scored at Level 2 or below by selecting
options A, C, or E:
Students who chose option E believe that the Sun and Moon switch places to create night. These students are
likely not to believe that the Moon is visible during the day, so daytime observations of the Moon may serve
as a catalyst for them to engage in scaffolded reconsideration of this idea. On the other hand, students who
chose option C believe that the Sun moves around the Earth once per day. This is not something which can be
easily countered by observation, and a more direct approach may be necessary, for example, having students
act out the relative motion of the Earth and Sun. As illustrated here, information from a class summary of a
single item can help the teacher see whether there are general patterns in how her students respond to very
specific concepts, and might help the teacher plan a subsequent instruction. (pp. 49–51)
Figure 2.1 Excerpt from Construct Map for Student Understanding of Earth in the
Solar System
Figure 2.2 Diagnostic Item Based on Construct Map for Student Understanding of
Earth in the Solar System
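To make the item-level analysis that Briggs et al. (2006) describe concrete, the short sketch below tallies one class's responses to a single diagnostic item by answer option and by construct-map level. It is only an illustration: the option-to-level mapping and the response data are invented (though consistent with the text's description of options A, C, and E sitting at Level 2 or below), not taken from Figure 2.2.

```python
from collections import Counter

# Hypothetical mapping of the item's answer options to construct-map levels;
# the actual links appear in Figure 2.2 and are not reproduced here.
OPTION_LEVELS = {"A": 1, "B": 4, "C": 2, "D": 3, "E": 1}

def summarize_item(responses):
    """Tally one class's responses to a diagnostic item by option and by level."""
    n = len(responses)
    by_option = Counter(responses)
    by_level = Counter(OPTION_LEVELS[r] for r in responses)
    print("Option choices:")
    for option in sorted(by_option):
        print(f"  {option}: {by_option[option]:2d} ({by_option[option] / n:.0%})")
    print("Construct-map levels:")
    for level in sorted(by_level):
        print(f"  Level {level}: {by_level[level]:2d} ({by_level[level] / n:.0%})")

# Invented responses from a class of 24 students.
summarize_item(list("DDBBCADDEBBDCCBBDABBDDEB"))
```

A summary like this is what allows a teacher to see at a glance, for instance, that a quarter of the class chose option D and to plan instruction accordingly.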
Briggs et al. (2006) noted that while diagnostic items based on a model of cognition represent an improvement
over tests consisting of traditional multiple-choice items, they complement but do not replace rich, open-
ended performance tasks. However, recent evidence suggests that such items are actually better than
open-ended items at eliciting responses similar to the understanding that students express in think-alouds
and interviews, perhaps because the items probe students' understanding by offering plausible response
alternatives (Steedle & Shavelson, 2009).
If empirical studies support the popular belief that pedagogy based on learning progressions has the potential
to increase student achievement and motivation, further research and development on teacher professional
development will be needed. Research indicates that classroom teachers often do not know how to use
learning progressions as the basis for their assessments (Heritage, Kim, Vendlinski, & Herman, 2009), nor
how to use assessment information to adjust instruction to address student learning needs (Ruiz-Primo & Li,
2011; Schneider & Gowan, 2011). This should come as no surprise, given the documented lack of attention
to assessment by most teacher preparation programs (Popham, 2009).
Heritage (2011) has worked with teachers to write learning progressions and map evidence of learning
to them. She argued in favor of teacher-developed progressions, while Popham (2011) questioned the
usefulness of the practice. Research focused on the role of teachers in the design of learning progressions
could be illuminating.
Finally, Steedle and Shavelson (2009) pointed out the difficulties of interpreting results from a collection of
items when a student does not reason consistently about the concept being assessed:
Valid and simple interpretations of learning progression level diagnoses are only possible when students
select responses reflecting a single learning progression level with some consistency. When students reason
inconsistently, an accurate score report would have to say something like, “The student has scientifically
accurate understanding in problem context A, but he or she has a certain problematic belief in contexts B and
C and a different problematic belief in context D.” Such a characterization of student understanding is more
difficult to interpret than the small, coherent set of ideas encompassed by a learning progression level, but it
may be more accurate. (p. 702)
Careful research on how and how well teachers can apply and interpret formative and summative assessment
data based on learning progressions is needed.
In general, the regulation of learning involves four main processes: (1) goal setting, (2) the monitoring of
progress toward the goal, (3) interpretation of feedback derived from monitoring, and (4) adjustment of
goal-directed action including, perhaps, redefining the goal itself (Allal, 2010). Research and theory on CA
emphasize very similar regulatory goals and processes. Defined as a process of collecting, evaluating, and
using evidence of student learning in order to monitor and improve learning (see McMillan, Chapter 1 of
this volume), effective CA articulates the learning targets, provides feedback to teachers and students about
where they are in relation to those targets, and prompts adjustments to instruction by teachers as well as
changes to learning processes and revision of work products by students. Drawing on Sadler (1989), Hattie
and Timperley (2007) summarized this regulatory process in terms of three questions to be asked by students:
(1) Where am I going? (2) How am I going? and (3) Where to next?
Nicol and Macfarlane-Dick's (2006) review of the literature on self-regulated learning (SRL) and feedback led them to conclude that
good feedback practice is “anything that might strengthen the students' capacity to self-regulate their own
performance” (p. 205). Reasoning that if formative assessment is exclusively in the hands of teachers then
students are less likely to become empowered and develop the self-regulation skills needed to prepare them
for learning outside of school and throughout life, Nicol and Macfarlane-Dick positioned the research on
formative assessment and feedback within Butler and Winne's (1995) model of feedback and SRL. Figure 2.3
is an adaptation of Nicol and Macfarlane-Dick's model. The main modifications are the heightened emphasis
on other-regulation via feedback from teachers, peers, technologies, and others (H), the inclusion of the
processes of interpreting feedback (I), and the closing of the feedback loop for teachers (J).
Following Butler and Winne (1995) and Nicol and Macfarlane-Dick (2006), a key feature of the model in
Figure 2.3 is that students occupy a central and active role in all feedback processes, including and especially
monitoring and regulating their progress toward desired goals and evaluating the efficacy of the strategies
used to reach those goals. Processes internal to the learner, including activating motivation and knowledge of
the domain and relevant strategies; setting goals; selecting learning strategies; and regulating learning, affect,
and cognition are depicted inside the shaded area. External feedback from teachers and others must also be
interpreted by the student if it is to have a significant influence on subsequent learning.
Current views of SRL acknowledge that learning is not just self-regulated by students, but also co-regulated
and shared (Hadwin et al., 2011). Regulated learning is as much a social as a solo phenomenon. Black and
Wiliam (1998) argued that assessment is also social and that, in fact, “all the assessment processes are, at
heart, social processes, taking place in social settings, conducted by, on, and for social actors” (p. 56). In this
section of the chapter, I will examine the theoretical and research bases for conceiving of CA as the regulation
of learning by students themselves as well as by their peers, their teachers, and assessment technologies.
The section is organized in terms of how research on CA is related to the four main regulation processes
identified by Allal (2010): (1) goal setting, (2) the monitoring of progress toward the goal, (3) interpretation of
feedback derived from monitoring, and (4) adjustment of goal-directed action.
Hattie (2009) defined effective goal setting by teachers as setting appropriately challenging goals, developing
commitment on the part of teachers and students (especially those with special needs) to attain them, and
intending to implement strategies to achieve them. When goals are determined by the teacher, it is necessary
to share them with students, who can use them to begin to answer the question, “Where am I going?” For
example, Seidel, Rimmele, and Prenzel (2005) found a positive effect of physics teachers' goal clarity and
coherence on German students' perceptions of supportive learning conditions, motivation, and competence
development as measured by tests on electric circuits and force concepts. The increase in competence
corresponded to an increase of more than one standard deviation.
The level of challenge of a goal is quite important (Hill & Rowe, 1998). According to Hattie (2009), there
is a direct linear relationship between the degree of goal difficulty and performance, with difficult goals
outperforming “do your best” goals. With an average effect size of d = 0.66, the support for challenging goals
was compelling enough to spur Hattie to recommend that “any school with the motto ‘do your best’ should
immediately change it to ‘face your challenges’ or ‘strive to the highest’” (p. 164). Ideally, of course, the
learning goals set for and/or by students must lie within their zone of proximal development (ZPD) (Vygotsky,
1978) in order to ensure that they are appropriately challenging. Learning progressions can play a part in the
selection of appropriately challenging goals by teachers and students.
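For readers unfamiliar with the metric, the effect sizes reported here (e.g., d = 0.66 for challenging versus "do your best" goals) are standardized mean differences. A common formulation is Cohen's d with a pooled standard deviation; the exact computations underlying any particular meta-analysis may differ:

```latex
d = \frac{\bar{X}_{\text{treatment}} - \bar{X}_{\text{comparison}}}{s_{\text{pooled}}},
\qquad
s_{\text{pooled}} = \sqrt{\frac{(n_{1} - 1)s_{1}^{2} + (n_{2} - 1)s_{2}^{2}}{n_{1} + n_{2} - 2}}
```

On this scale, d = 0.66 indicates that the average score in the treatment condition sits about two thirds of a pooled standard deviation above the comparison-group mean.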
Of course, students also set their own learning goals (step C in the figure), particularly achievement goals.
Brookhart (Chapter 3 of this volume) discusses the relationships between students' achievement goals,
motivation, and performance. More research is needed, however, on the relationship between unit-, lesson-,
and task-specific goal setting by students and achievement, particularly since students' own goals commit
them to a particular standard or outcome (Hadwin et al., 2011). Belgrad (Chapter 19 of this volume) reviews
the research on portfolios, which tend to highlight goal setting by students. Although quite limited, that
research suggests a positive relationship between goal setting and students' performance (Church, Elliot, &
Gable, 2001). Additional rigorous research on the effects of student goal setting (with or without portfolios) on
self-regulation and achievement is warranted.
Additional research is also needed on the effects on learning outcomes when teachers explicitly share their
learning goals with students. Although there have been some recent attempts to encourage teachers to
communicate learning goals or targets to students, research indicates that teachers generally do not do so
without support (Moss et al., 2011). As with any new development in CA, once the effectiveness of shared
learning goals is well defined and understood, research on related professional development for teachers
should ensue.
Success Criteria
In contrast with learning goals, which tend to be broad, success criteria describe the qualities of excellent
student work on a particular assignment. Success criteria can be communicated to students in a variety of
ways. Worked examples, which typically consist of a sample problem and the appropriate steps to its solution,
imply success criteria. Hattie's (2009) meta-analysis resulted in an overall effect size of worked examples of
d = 0.52.
More direct expressions of the success criteria include rubrics and checklists. My colleagues and I (Andrade,
Du, & Mycek, 2010; Andrade, Du, & Wang, 2009) engaged elementary and middle school students in
reading a model essay, discussing its qualities, and generating a list of criteria that were then included in the
rubric they used to self-assess drafts of their essays. This process was more effective than self-assessment
according to a rubric that was simply handed out after reviewing and discussing the model essay. Thinking
about the qualities of an effective essay and cocreating the success criteria for their own essays appears to
make a difference in the potency of self-assessment.
Ross and Starling (2008) also ensured that the Grade 9 geography students in their study understood and
could apply the criteria and standards of assessment to their own work. Before asking students to self-
assess their projects, they involved them in defining assessment criteria by co-constructing a rubric and
taught students how to apply the criteria through teacher modeling. After controlling for the effects of pretest
self-efficacy, students in the self-assessment group scored higher than students in the control group on all
achievement measures.
It is important to note that the studies discussed in this section involved self-assessment as well as
transparent success criteria. My search of the literature revealed only one study that examined the effect of
success criteria alone: It was a study I conducted (2001) of the effects of simply providing a rubric to eighth-
grade writers before they began to write. Of the three essays students wrote for the study, only one resulted in
significant differences between the treatment and comparison groups. Given the results of that study, it seems
reasonable to assume that sharing success criteria with students should be part of a more comprehensive
process of actively engaging them in assessment by, for example, cocreating success criteria and monitoring
their own progress through peer and/or self-assessment. Brown and Harris (Chapter 21 of this volume) draw
a similar conclusion.
The quality of the success criteria makes a difference, of course. Citing Moreland and Jones (2000), Brookhart
(2007) noted that formative assessment and the instructional decisions based on it can actually thwart the
learning process if the success criteria are trivial (e.g., focused on social and managerial issues such as
taking turns at the computer) rather than substantive (e.g., focused on procedural and conceptual matters
related to learning about computers). Brookhart rightly warned us to “be careful about criteria, which construct
the way forward” (p. 51). Informal Google searches of K–12 rubrics strongly suggest that teachers need
guidance regarding the identification of high quality success criteria. Research on the effectiveness of
professional development would be most welcome.
For example, Mathan and Koedinger (2005) showed that a computer-based program can explicitly model
metacognitive skills and use the models to scaffold student performance. Using a less high-tech approach,
Allen and Hancock (2008) examined the effects of having students in Grades 4, 5, and 6 metacognitively
reflect on their personal profiles of strengths and weaknesses related to reading comprehension. They found
that sharing individualized profiles based on the Woodcock–Johnson III and regularly asking students to
reflect in writing and make judgments of learning regarding their individual cognitive strengths resulted in
greater learning gains as compared to another treatment and a control group. Such programs are excellent
examples of using metacognition as a means to an end—better performance—but they typically do not
assess metacognition itself.
White and Frederiksen (2005) have developed intelligent software agents that scaffold metacognition and
have consistently found that helping students adopt metacognitive roles as they work together on projects
can promote learning and foster self-regulation. Unfortunately, metacognition is rarely taught, much less assessed, in most classrooms. The inclusion of a metacognitive knowledge category in the revised Taxonomy of Educational Objectives (Anderson & Krathwohl, 2001) could have spurred the creation and dissemination of ways to assess metacognition, but that has not happened. The variety of measures of metacognition
used by researchers, including self-report questionnaires, self-ratings, think-aloud protocols, written tests, and interviews (Desoete, 2008), has not been adapted for widespread classroom use. There are a variety of explanations for the lack of attention to the assessment of metacognition, including the fact that metacognition
is not explicitly tested by large-scale examinations, so there is little impetus to teach it, as well as the
challenges of assessing a process that is completely internal to the learner.
The good news is that classroom-based assessment of metacognition is a topic flush with opportunities for
innovative research. There are at least two categories of inquiry with potential: (1) research on the effects of
assessment of metacognition on metacognition itself as well as on learning and (2) research on the effects
of CA on metacognition. An obvious starting point for the first category is to study what happens when teachers include goals for teaching metacognitive knowledge and strategies in their regular unit planning, share those goals with students, and teach and assess metacognition along with other content knowledge.
Work in this area could build on White and Frederiksen's (2005) research by sharing with students the
information about their metacognition gleaned from measures used for research purposes as well as by
providing feedback on metacognitive processing from students themselves and, if possible, their teachers
and peers.
Research on the effects of assessment of metacognition on metacognition itself as well as on learning can
also be done via portfolios. Portfolios provide a rich context for the assessment of metacognition, as they offer
students the opportunity to reflect on their work and their approaches to it (see Chapter 19 of this volume).
Studies of the development of metacognitive self-knowledge through portfolio assessment seem like a natural
extension of existing research.
Inquiry into the effects of CA on metacognition—the second category previously listed—could focus on
the relationships between peer or self-assessment and metacognition. Links between self-assessment and
metacognition are likely to be found, since the essence of self-assessment is the ability to know what and how
one is learning (Jones, 2007). Links between peer assessment and metacognition are also possible, given
that students influence each other through co-regulation (Kramarski & Dudai, 2009; White & Frederiksen,
2005).
Excellent reviews of research on feedback have recently been published (Brookhart, 2004; Hattie, 2009;
Hattie & Timperley, 2007; Lipnevich & Smith, 2008; Shute, 2008; Wiliam, 2010; also see Chapters 12 and
13 of this volume). In general, the research on feedback shows that it tends to be associated with learning
and achievement but that not all kinds of feedback are equally effective. Feedback is most effective when it
is the right kind (e.g., detailed and narrative, not graded), delivered in the right way (supportive), at the right
time (sooner for low-level knowledge but not so soon that it prevents metacognitive processing and later for
complex tasks), and to the right person (who is in a receptive mood and has reasonably high self-efficacy).
Feedback can come from a variety of sources, including teachers, students, and technology. Each source has
a substantive research base that is briefly overviewed next.
The design of Lipnevich and Smith's (2008) study included three conditions: (1) no feedback, (2) detailed feedback perceived by
participants to be provided by the course instructor, and (3) detailed feedback perceived by participants
to be computer generated. The conditions were crossed with two factors of grade (receiving a grade or
not) and praise (receiving praise or not). Detailed, narrative feedback on individual students' first drafts
was found to be strongly related to improvement in essay scores: Students who did not receive detailed
feedback obtained substantially lower final exam scores than those who received detailed feedback from
either the computer or the instructor. There were no differences in students' performance between computer
and instructor conditions. Differences between the no-feedback condition and the instructor- or computer-
generated feedback conditions showed effect sizes of between 0.30 and 1.25, depending on the presence of
grade and praise.
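As a reading aid, the sketch below simply enumerates the crossed conditions described above; the cell labels are illustrative, not Lipnevich and Smith's (2008) own coding scheme.

```python
from itertools import product

# Three feedback conditions fully crossed with two levels each of grade and
# praise, as described in the text: 3 x 2 x 2 = 12 cells.
feedback = ["no feedback", "detailed feedback (instructor)", "detailed feedback (computer)"]
grade = ["grade shown", "grade not shown"]
praise = ["praise", "no praise"]

cells = list(product(feedback, grade, praise))
print(f"{len(cells)} conditions")
for f, g, p in cells:
    print(f"- {f} | {g} | {p}")
```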
The influence of grades and praise was more complex. There was a significant difference in the final exam
score between students in the grade condition and those in the no-grade condition. Students who were shown
the grade they received for their first draft performed less well on the final version than those who were
not shown their grade. Lipnevich and Smith (2008) noted that this effect should be viewed in the context
of two significant interactions involving grades: Under the grade condition, scores were higher when praise
was presented than when it was not. For the no-grade condition, scores were higher when praise was not
presented than when praise was presented.
Lipnevich and Smith (2008) concluded that overall, detailed feedback was most effective when given alone,
unaccompanied by grades or praise. Their findings echo earlier findings by Butler (1987; Butler & Nisan,
1986), which showed that students who received grades and no comments showed no learning gains, those
who received grades and comments also showed no gains, but the students who received comments and no
grades showed large gains. Additional research on the relationship between summative grades, achievement,
and motivation in K–12 classrooms is needed. If the results indicate that grades do indeed interfere with
learning and achievement, the hard work of figuring out how to minimize or even eliminate that negative effect
within the context of our grade-addicted society could begin in earnest.
Brown and Harris (Chapter 21 of this volume) survey current research on self-assessment, including
investigations of the relationship between it and self-regulation. They conclude that there is evidence of a
link between self-assessment and better self-regulation skills, “provided such self-evaluation involves deep
engagement with the processes affiliated with self-regulation (i.e., goal setting, self-monitoring, and evaluation
against valid, objective standards)” (p. 386). For obvious reasons, simply grading one's own work appears to
be less likely to lead to SRL than thoughtful reflection.
Brown and Harris raise an interesting question about whether or not the accuracy of students' self-
assessments is a determinant of the effectiveness of the process. They argue that accuracy is a condition
of valid student self-assessment, which can be flawed because students, especially younger and less
academically proficient students, are often unrealistically optimistic about their own abilities, believe they are
above average, neglect crucial information, and have deficits in their knowledge base. On the other hand,
insisting on or even simply attending to the accuracy of students' self-assessments could be counterproductive,
given the well-known pitfalls of social response bias (Paulhus, 1991); the potential emphasis on a score or
grade rather than on forward-looking feedback (Lipnevich & Smith, 2008); and issues of trust, particularly in
terms of the quality of the relationship between student and teacher and the student's willingness to be honest
in his or her self-assessment (Raider-Roth, 2005). The value of the accuracy of self-assessment by students
is an empirical question open to debate and inquiry. Researchers interested in conducting research on this
issue are first tasked with creating methods of collecting students' self-assessments that do not also influence them.
Computer-based assessment programs integrate the management of learning (e.g., organizing student
assignments, assessments, and performance), curricular resources, embedded assessments, and detailed
student-level and aggregate-level reporting of strengths and weaknesses. Perhaps the greatest advantage of
these computerized systems is the degree to which they help students and teachers monitor progress. Many
programs harness the flexible, adaptive capabilities of artificial intelligence to respond to each student's work
with detail and immediacy.
Examples of computer-based programs that feature assessments include ASSISTments, ALEKS, DreamBox
Learning, Time To Know, Compass-Learning Odyssey, Wowzers, Carnegie Learning, SuccessMaker, and
WriteToLearn. Some programs such as DreamBox Learning and Time To Know integrate instruction and
assessment into one platform. Others such as WriteToLearn have a more exclusive focus on assessment.
WriteToLearn is an example of an assessment technology with research support. WriteToLearn promotes
reading comprehension and writing skills by providing students with immediate, individualized feedback
(Landauer, Lochbaum, & Dooley, 2009). The program is designed for students in Grades 4 through 12 and
comprises two components: (1) Summary Street, where students read and summarize articles or book
excerpts, and (2) the Intelligent Essay Assessor, where students write topic-prompted essays.
The research on WriteToLearn is promising. One study used a counterbalanced design to find a positive
relationship between the use of Summary Street and student summary scores after just 2 weeks of using
the program (Wade-Stein & Kintsch, 2004). The researchers also found that students spent significantly
more time on generating summaries than students not using the program, which suggests the program may
promote motivation and engagement. Another study, using an experimental design, found that eighth-grade
students who used Summary Street scored significantly higher on a test of comprehension than students
who did not use the program (Franzke, Kintsch, Caccamise, Johnson, & Dooley, 2005). Student writing in the
treatment group was also judged as being of higher quality than the writing of students in the control group.
Innovative research on the efficacy of a computerized assessment for learning (AFL) system named Adaptive
Content with Evidence-Based Diagnosis (ACED) suggests that even traditional tests can provide feedback
that promotes learning (Shute, Hansen, & Almond, 2008). Shute et al. found that the system could enhance
student learning by providing test takers with elaborated, task-level feedback without compromising the
technical quality of the assessment. The authors concluded that state-mandated tests might be augmented
to support student learning with instructional feedback without jeopardizing the primary purpose of the
assessment. Since such an augmentation to tests would go a long way toward making them more effective in
promoting learning and growth, more research on the potential applications to CA could be quite productive.
Russell (2010) noted that many of the studies of the efficacy of assessment technologies use small samples of
students or classrooms and/or were conducted by researchers who were closely linked with the development
of the tools. The obvious implications for future research in this area are to expand sample sizes and to
involve researchers who do not have a vested interest in the outcome of the studies.
Noting that the more traditional finding of greater helplessness among girls was evident only when the
evaluators were adults, Dweck et al. (1978) interpreted these findings to mean that boys and girls have
learned to interpret and respond differently to feedback from different sources. Future research on the
differential effects of feedback from teachers, peers, technologies, and students themselves would be useful,
particularly if it included measures of attributions, self-efficacy, motivation, and performance.
Draper (2009) has developed a theoretical argument that stresses how students' interpretations of ambiguous
feedback determine whether that feedback is useful or not. His perspective can inform future research on the
subject. He postulates at least six possible interpretations of feedback:
1. Technical knowledge or method (e.g., concluding that one did not use the best information or
method for the task, both of which can be improved)
2. Effort (e.g., deciding that one did not leave enough time to do a task well)
3. Method of learning about a task (e.g., realizing that one did not seek out the right information or did
not understand the criteria for the task)
4. Ability (e.g., believing that one does not have the necessary aptitude to succeed at a task)
5. Random (e.g., assuming nothing was done incorrectly so success is possible next time without
adjustment or revision)
6. The judgment process was wrong (e.g., determining that the feedback was incorrect)
Students' self-regulatory responses to feedback might be determined by which of the previous six
interpretations are brought to bear on any given instance of feedback. Assuming that interpretations 1, 2, and
3 are generally (though not always) more productive than 4, 5, and 6, Draper contends that teachers should
help students construct appropriate interpretations of feedback by offering clear, often very simple, cues. The
cues should indicate which interpretation of feedback is correct and constructive—for example, “This is a
simple technical issue: You did not use the correct formula to solve this problem” (1: Technical method),
or “Have you spent enough time and effort on this to do a good job?” (2: Effort), or “It might be helpful to review
your method of learning about this task. How did you interpret the third criterion on the rubric?” (3: Method
of learning). Research that tests this or related theories and the ways in which CA can influence students'
interpretations of feedback is needed.
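A minimal sketch of how Draper's six interpretations might be operationalized as teacher cues appears below. The cue wording beyond the examples quoted in the text, and the decision to redirect students toward the first three interpretations, are illustrative assumptions rather than part of Draper's (2009) account.

```python
# Illustrative mapping of Draper's (2009) six feedback interpretations to cues.
# Wording beyond the chapter's own examples is hypothetical.
INTERPRETATION_CUES = {
    "technical method": "This is a simple technical issue: you did not use the correct formula to solve this problem.",
    "effort": "Have you spent enough time and effort on this to do a good job?",
    "method of learning": "Review how you learned about this task. How did you interpret the third criterion on the rubric?",
    "ability": None,             # generally a less productive interpretation
    "random": None,              # generally a less productive interpretation
    "judgment was wrong": None,  # generally a less productive interpretation
}

def cue_for(interpretation: str) -> str:
    """Return a constructive cue, redirecting less productive interpretations."""
    cue = INTERPRETATION_CUES.get(interpretation)
    return cue or "Focus on your method, your effort, or how you went about learning this task."

print(cue_for("effort"))
print(cue_for("ability"))
```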
All theories of SRL emphasize learners' adjustments to their goals, strategies, and outcomes in response
to feedback from themselves and others about their progress: This is step F in Figure 2.3. But what do
we know about the adjustments to goal-directed action that students make in light of CA? Very little. The
lack of information about what students actually do in response to feedback reflects the fact that research
has tended to employ measures of outcomes and products rather than of the processes of learning and
revision. Since this limits our ability as a field to construct a meaningful theory of change, research is needed
on the cognitive and behavioral adjustments that students make (if any) in response to both formative and
summative assessment.
The literature on CA tends to emphasize teachers' adjustments to instruction, which is represented by step J
in Figure 2.3. Wiliam (2010) has championed the view that the most useful assessments are those that yield
insights that are instructionally tractable. In other words, assessments are only as good as the insights they
provide into the next steps in instruction that are likely to be most effective. Unfortunately, there is plentiful
evidence that teachers do not know how to adapt instruction in light of evidence of a lack of student learning
(Heritage & Heritage, 2011; Ruiz-Primo & Li, 2011; Schneider & Gowan, 2011). For example, Heritage et al.
(2009) and Schneider and Gowan (2011) found empirical evidence that teachers struggle to determine next
instructional steps after reviewing student work. Ruiz-Primo, Furtak, Yin, Ayala, and Shavelson (2010) and
Fitzpatrick and Schulz (2010) found limited or no evidence that CAs were used directly for formative purposes.
Hoover and Abrams (in press) found that 64% of the teachers they studied reported that instructional pacing
prohibited reteaching of concepts.
The assessment community should better understand teachers' skills with CA practices so that professional
development and instructional materials can better support them and their students in raising student
achievement. Schneider and Andrade (2012) propose focal research questions to guide such work.
This chapter is peppered with recommendations for future research—particularly research that illuminates the cognitive mechanisms of learning from assessment. The recommendations are summarized in the following categories:
• Learning progressions: What are the effects of instruction and assessment based on learning
progressions? Do they differ for students who do not follow a typical learning path? Is it useful to
have students use learning progressions to set goals, monitor their progress toward them, and make
adjustments to the processes and products of their learning? If so, what would that look like, and how
effective is it in promoting learning, motivation, and self-regulation?
• SRL: What are the effects of student goal setting on self-regulation and achievement? Similarly,
what are the effects when teachers explicitly share their learning targets with students? What are
the effects of the assessment of metacognition on metacognition itself as well as on learning? What
are the effects of peer and self-assessment on metacognition? How do students' interpretations
of feedback influence learning and achievement? How does CA affect students' interpretations of
feedback? What cognitive and behavioral adjustments do students make in response to formative
and summative assessment?
• Peer and self-assessment: What is the relationship among accuracy in self-assessment, self-
regulation, and achievement? What is the relationship between peer assessment and SRL?
• Summative assessment: What is the relationship among grades, achievement, and motivation? Can
computer-based summative assessments promote learning by providing instant feedback to students
without compromising the psychometric qualities of the test?
• Sources of feedback: Are there differential effects of feedback from teachers, peers, technologies,
and students themselves? Does the gender of the assessor matter?
• Teacher professional development: What is the best way to involve teachers in the development, use,
and interpretation of learning progressions? How do teachers use formative assessment practices,
analyze student work, and use evidence of learning to adapt instruction? What are the implications
of research findings for professional development and instructional materials to support teachers?
The methods used to investigate questions like those listed here should be varied, rigorous, and appropriate
to the questions being asked, of course. Because randomization is generally seen as the gold standard
method for making strong inferences about treatment effects (Cook, 2006) but is often difficult to implement
with classroom-based studies, researchers might consider using a randomized block design based on
matched pairs. Research designs can also be enhanced with the use of sophisticated modern graphics
that facilitate highly informative interpretations of results, including causal inferences about treatment effects
(Pruzek & Helmreich, 2009). Graphics like those found in R, a statistical software package that is freely
available on the Internet, can reveal where assessment interventions appear to work especially well or poorly
and how and how much results vary across contexts.
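To illustrate the matched-pairs suggestion, the sketch below pairs classrooms on a pretest mean and then randomly assigns one member of each pair to the treatment condition. The classroom identifiers and pretest values are invented for the example; it is a sketch of the design logic, not a prescription.

```python
import random

# Randomized block design based on matched pairs: classrooms are sorted by a
# blocking variable (here, an invented pretest mean), paired with their nearest
# neighbor, and one classroom per pair is randomly assigned to treatment.
classrooms = {"C1": 62.0, "C2": 64.5, "C3": 71.0, "C4": 70.2, "C5": 55.8, "C6": 57.1}

ordered = sorted(classrooms, key=classrooms.get)
pairs = [ordered[i:i + 2] for i in range(0, len(ordered), 2)]

random.seed(42)  # reproducible example
assignment = {}
for pair in pairs:
    treated = random.choice(pair)
    for room in pair:
        assignment[room] = "treatment" if room == treated else "comparison"

for room, condition in sorted(assignment.items()):
    print(f"{room}: pretest {classrooms[room]:.1f} -> {condition}")
```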
1 Thanks go to Fei Chen for her assistance with the search of the literature for this chapter.
References
Allal, L. (2010). Assessment and the regulation of learning. In P. Peterson, E. Baker, & B. McGaw (eds.), International
Encyclopedia of Education (Vol. 3, pp. 348–352). Oxford: Elsevier.
Allal, L., & Lopez, L. M. (2005). Formative assessment of learning: A review of publications in French. In
J. Looney (ed.), Formative assessment: Improving learning in secondary classrooms (pp. 241–264). Paris:
Organisation for Economic Cooperation and Development.
Allen, K., & Hancock, T. (2008). Reading comprehension improvement with individualized cognitive profiles
and metacognition. Literacy Research and Instruction, 47, 124–139.
Anderson, L., & Krathwohl, D. (eds.). (2001). A taxonomy for learning, teaching, and assessing: A revision of
Bloom's Taxonomy of Educational Objectives (Complete edition). New York: Longman.
Andrade, H. G. (2001). The effects of instructional rubrics on learning to write. Current Issues in Education,
4(4). Retrieved from http://cie.ed.asu.edu/volume4/number4
Andrade, H. L. (2010). Students as the definitive source of formative assessment: Academic self-assessment
and the self-regulation of learning. In H. L. Andrade & G. J. Cizek (eds.), Handbook of formative assessment (pp.
90–105). New York: Routledge.
Andrade, H. L., Du, Y., & Mycek, K. (2010). Rubric-referenced self-assessment and middle school students’
writing. Assessment in Education, 17(2), 199–214.
Andrade, H. L., Du, Y., & Wang, X. (2009). Putting rubrics to the test: The effect of a model, criteria generation, and rubric-referenced self-assessment on elementary school students' writing. Educational Measurement: Issues and Practice.
Green, S., & Johnson, R. (2010). Assessment is essential. New York: McGraw-Hill.
Hacker, D., Dunlosky, J., & Graesser, A. (1998). Metacognition in educational theory and practice. Mahwah,
NJ: Lawrence Erlbaum.
Hadwin, A., Järvelä, S., & Miller, M. (2011). Self-regulated, co-regulated, and socially shared regulation of
learning. In B. Zimmerman & D. Schunk (eds.), Handbook of self-regulation of learning and performance (pp.
65–86). New York: Routledge.
Hattie, J. (2009). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. New York:
Routledge.
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77, 81–112.
Heritage, M. (2009). The case for learning progressions. San Francisco: Stupski Foundation.
Heritage, M. (2011). Developing learning progressions. Paper presented at the annual conference of the
American Educational Research Association, New Orleans, LA.
Heritage, M., & Heritage, J. (2011). Teacher questioning: The epicenter of instruction and assessment. Paper
presented at the annual meeting of the American Educational Research Association, New Orleans, LA.
Heritage, M., Kim, J., Vendlinski, T., & Herman, J. (2009). From evidence to action: A seamless process in
formative assessment? Educational Measurement: Issues and Practice, 28(3), 24–31.
Hill, P. W., & Rowe, K. J. (1998). Modeling student progress in studies of educational effectiveness. School
Effectiveness and School Improvement, 9(3), 310–333.
Hinton, C., Fischer, K., & Glennon, C. (2012). Student centered learning: A mind, brain and education
perspective: The Students at the Center series. Boston: Jobs for the Future. Retrieved from
http://www.studentsatthecenter.org/papers/assessing-learning
Hollander, E., & Marcia, J. (1970). Parental determinants of peer orientation and self-orientation among
preadolescents. Developmental Psychology, 2(2), 292–302.
Hoover, N., & Abrams, L. (in press). Teachers’ instructional use of student assessment data. Applied
Measurement in Education.
Jones, D. (2007). Speaking, listening, planning and assessing: The teacher's role in developing metacognitive
awareness. Early Child Development and Care, 6/7, 569–579.
Kegel, C., Bus, A. G., & van Ijzendoorn, M. H. (2011). Differential susceptibility in early literacy instruction
through computer games: The role of the dopamine D4 receptor gene (DRD4). Mind, Brain and Education,
5(2), 71–78.
Kramarski, B., & Dudai, V. (2009). Group-metacognitive support for online inquiry in mathematics with
differential self-questioning. Journal of Educational Computing Research, 40(4), 377–404.
Landauer, T., Lochbaum, K., & Dooley, S. (2009). A new formative assessment technology for reading and
writing. Theory Into Practice, 48(1).
Lipnevich, A., & Smith, J. (2008). Response to assessment feedback: The effects of grades, praise, and
source of information (Research report RR-08-30). Princeton, NJ: Educational Testing Service.
Mathan, S., & Koedinger, K. (2005). Fostering the intelligent novice: Learning from errors with metacognitive
tutoring. Educational Psychologist, 40(4), 257–265.
McMillan, J. (2011). Classroom assessment: Principles and practice for effective standards-based instruction.
New York: Pearson.
Moreland, J., & Jones, A. (2000). Emerging assessment practices in an emergent curriculum: Implications for
technology. International Journal of Technology and Design Education, 10, 283–305.
Moss, C., Brookhart, S., & Long, B. (2011). School administrators’ formative assessment leadership practices.
Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA.
National Research Council. (2007). Taking science to school: Learning and teaching science in grades K-8.
Washington, DC: National Academies Press.
Nicol, D., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: a model and
seven principles of good feedback practice. Studies in Higher Education, 31(2), 199–218.
Nitko, A., & Brookhart, S. (2011). Educational assessment of students (6th ed.). New York: Pearson.
Paulhus, D. (1991). Measurement and control of response bias. In J. Robinson, P. Shaver, & L. Wrightsman (eds.), Measures of personality and social psychological attitudes (Vol. 1, pp. 17–59). San Diego, CA: Academic Press.
Pellegrino, J., Chudowsky, N., & Glaser, R. (eds.). (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academies Press.
Wiliam, D. (2010). An integrative summary of the research literature and implications for a new theory of
formative assessment. In H. L. Andrade & G. J. Cizek (eds.), Handbook of formative assessment (pp. 18–40).
New York: Routledge.
Winne, P. (2011). A cognitive and metacognitive analysis of self-regulated learning. In B. Zimmerman & D. Schunk (eds.), Handbook of self-regulation of learning and performance (pp. 15–32). New York: Routledge.
Wylie, C., Ciofalo, J., & Mavronikolas, E. (2010). Documenting, diagnosing and treating misconceptions:
Impact on student learning. Paper presented at the annual meeting of the American Educational Research
Association, Denver, CO.
Keywords: metacognition, peer assessment, feedback, self-assessment, self-regulation, assessment, goal setting