An Introduction To Design-Based Research
Abstract This chapter arose from the need to introduce researchers, including Master
and PhD students, to design-based research (DBR). In Sect. 16.1 we address key
features of DBR and differences from other research approaches. We also describe
the meaning of validity and reliability in DBR and discuss how they can be improved.
Section 16.2 illustrates DBR with an example from statistics education.
The purpose of this chapter is to introduce researchers, including Master and PhD
students, to design-based research. In our research methods courses for this audi-
ence and in our supervision of PhD students, we noticed that students considered
key publications in this field unsuitable as introductions. These publications have
mostly been written to inform or convince established researchers who already have
considerable experience with educational research. We therefore see the need to
write for an audience that does not have that level of experience, but may want to
know about design-based research. We do assume a basic knowledge of the main
research approaches (e.g., survey, experiment, case study) and methods (e.g., inter-
view, questionnaire, observation).
Compared to other research approaches, educational design-based research
(DBR) is relatively new (Anderson and Shattuck 2012). This is probably the reason
that it is not discussed in most books on qualitative research approaches. For exam-
ple, Creswell (2007) distinguishes five qualitative approaches, but these do not
include DBR (see also Denscombe 2007). Yet DBR is worth knowing about, espe-
cially for students who will become teachers or researchers in education: Design-
based research is claimed to have the potential to bridge the gap between educational
practice and theory, because it aims both at developing theories about domain-
specific learning and the means that are designed to support that learning. DBR thus
produces both useful products (e.g., educational materials) and accompanying sci-
entific insights into how these products can be used in education (McKenney and
Reeves 2012; Van den Akker et al. 2006). It is also said to be suitable for addressing
complex educational problems that should be dealt with in a holistic way (Plomp
and Nieveen 2007).
In line with the other chapters in this book, Sect. 16.1 provides a general theory
of the research approach under discussion and Sect. 16.2 gives an example from
statistics education on how the approach can be used.
Another way to characterize DBR is to contrast it with other approaches on the fol-
lowing two dimensions: naturalistic vs. interventionist and open vs. closed.
Naturalistic studies analyze how learning takes place without interference by a
researcher. Examples of naturalistic research approaches are ethnography and sur-
veys. As the term suggests, interventionist studies intervene in what naturally hap-
pens: Researchers deliberately manipulate a condition or teach according to
particular theoretical ideas (e.g., inquiry-based or problem-based learning). Such
studies are necessary if the type of learning that researchers want to investigate is
not present in naturalistic settings. Examples of interventionist approaches are
experimental research, action research, and design-based research.
Research approaches can also be more open or closed. The term open here refers
to little control of the situation or data whereas closed refers to a high degree of
control or a limited number of options (e.g., multiple choice questions). For example,
surveys by means of questionnaires with closed questions or responses on a Likert
scale are more closed than surveys by means of semi-structured interviews.
Likewise, an experiment comparing two conditions is more closed than a DBR
project in which the educational materials or ways of teaching are emergent and
adjustable. Different research approaches can thus be positioned in a two-by-two
table as in Table 16.1. DBR thus shares an interventionist nature with experiments
and action research. We therefore continue by comparing DBR with experiments
(16.1.2.5) and with action research (16.1.2.6).
16 An Introduction to Design-Based Research with an Example From Statistics… 433
Table 16.1 Naturalistic vs. interventionist and open vs. closed research approaches
         Naturalistic                                    Interventionist
Closed   Survey: questionnaires with closed questions    Experiment (randomized controlled trial)
Open     Survey: interviews with open questions          Action research
         Ethnography                                     Design-based research
start every lesson with a warm-up activity (e.g., a puzzle). Apparently it had been
proven by means of an RCT that student scores were significantly higher in the
experimental condition in which lessons started with a warm-up activity. The nega-
tive effect in teaching practice, however, was that teachers ran out of good ideas for
warm-up activities, and that these often had nothing to do with the topic of the
lesson. Effectively, teachers therefore lost five minutes of every lesson. Better
insight into how and why warm-up activities work under particular conditions could
have improved the situation, but the comparative nature of RCT had not provided
this information because only the variable of starting the lesson with or without
warm-up activity had been manipulated.
A second argument why RCT has its limitations is that a new strategy has to be
designed before it can be tested, just like a Boeing airplane cannot be compared
with an Airbus without a long tradition of engineering and producing such airplanes.
In many cases, considerable research is needed to design innovative approaches.
Design-based research emerged as a way to address this need of developing new
strategies that could solve long-standing or complex problems in education.
Two discussion points in the comparison of DBR and RCT are the issues of gen-
eralization and causality. The use of random samples in RCT allows generalization
to populations, but in most educational research random samples cannot be used. In
response to this point, researchers have argued that theory development is not just
about populations, but rather about propensities and processes (Frick 1998). Hence
rather than generalizing from a random sample to a population (statistical general-
ization), many (mainly qualitative) research approaches aim for generalization to a
theory, model or concept (theoretical or analytic generalization) by presenting find-
ings as particular cases of a more general model or concept (Yin 2009).
Whereas RCTs can indicate that the intervention or treatment is the cause of better
learning, DBR cannot claim causality with the same rigor. This
is not unique to DBR: All qualitative research approaches face the challenge of
making causal claims. In this regard it is helpful to distinguish two views on
causality: a regularity, variance-oriented understanding of causality versus a realist,
process-oriented understanding of causality (Maxwell 2004). People adopting the
first view think that causality can only be proven on the basis of regularities in larger
data sets. People adopting the second view make it plausible on the basis of circum-
stantial evidence of observed processes that what happened is most likely caused by
the intervention (e.g., Nathan and Kim 2009). The first view underlies the logic
of RCT: If we randomly assign subjects to an experimental and a control condition,
treat only the experimental group and find a significant difference between the two
groups, then it can only be attributed to the difference in condition (the treatment).
However, if we were to adopt the same regularity view on causality we would never
be able to identify the cause of singular events, for example why a driver hit a tree.
From the second, process-oriented view, if a drunk driver hits a tree we can judge
the circumstances and judge it plausible that his drunkenness was an important
explanation because we know that alcohol can impair control, slow reaction
time, et cetera. Similarly, explanations for what happens in classrooms should be
possible according to a process-oriented position based on what happens in response
to particular interventions. For example, particular student utterances are very
unlikely if not deliberately fostered by a teacher (Nathan and Kim 2009). Table 16.2
summarizes the main points of the comparison of RCT and DBR.
Like action research, DBR typically is interventionist and open, involves a reflective
and often cyclic process, and aims to bridge theory and practice (Opie 2004). In both
approaches the teacher can also be the researcher. In action research, the researcher is not
an observer (Anderson and Shattuck 2012), whereas in DBR s/he can be an observer.
Furthermore, in DBR design is a crucial part of the research, whereas in action
research the focus is on action and change, which can but need not involve the design
of a new learning environment. DBR also more explicitly aims for instructional theo-
ries than does action research. These points are summarized in Table 16.3.
Table 16.3 Commonalities and differences between DBR and action research
                DBR                              Action research
Commonalities   Open, interventionist, researcher can be participant, reflective cyclic process
Differences     Researcher can be observer       Researcher can only be participant
                Design is necessary              Design is possible
                Focus on instructional theory    Focus on action and improvement of a situation
436 A. Bakker and D. van Eerde
In its relatively brief history, DBR has been presented under different names.
Design-based research is the name used by the Design-Based Research Collective
(see special issues in Educational Researcher 2003; Educational Psychologist
2004; Journal of the Learning Sciences 2004). Other terms for similar approaches are:
• Developmental or development research (Freudenthal 1988; Gravemeijer 1994;
Lijnse 1995; Romberg 1973; Van den Akker 1999)
• Design experiments or design experimentation (Brown 1992; Cobb et al. 2003a;
Collins 1992)
• Educational design research (Van den Akker et al. 2006)
The reasons for these different terms are mainly historical and rhetorical. In the
1970s Romberg (1973) used the term development research for research accompa-
nying the development of curriculum. Discussions on the relation between research
and design in mathematics education, especially on didactics, mainly took place in
Western Europe in the 1980s and the 1990s, particularly in the Netherlands (e.g.,
Freudenthal 1988; Goffree 1979), France (e.g., Artigue 1988, cf. Artigue Chap. 17)
and Germany (e.g., Wittmann 1992). The term developmental research is a transla-
tion of the Dutch ontwikkelingsonderzoek, which Freudenthal introduced in the
1970s to justify the development of curricular materials as belonging to a university
institute (what is now called the Freudenthal Institute) because it was informed by
and led to research on students’ learning processes (Freudenthal 1978;
Gravemeijer and Koster 1988; De Jong and Wijers 1993). The core idea was that
development of learning environments and the development of theory were inter-
twined. As Goffree (1979, p. 347) put it: “Developmental research in education as
presented here, shows the characteristics of both developmental and fundamental
research, which means aiming at new knowledge that can be put into service in
continued development.” At another Dutch university (Twente University), the term
ontwerpgericht (design-oriented) research was more common, but there the focus
was more on the curriculum than on theory development (Van den Akker 1999).
One disadvantage of the terms ‘development’ and ‘developmental’ is their association
with developmental psychology and research on children’s development of concepts.
This might be one reason that these terms are hardly used anymore.
In the United States, the terms design experiment and design research were more
common (Brown 1992; Cobb et al. 2003a; Collins 1992; Edelson 2002). One advan-
tage of these terms is that design is more specific than development. One possible
disadvantage of the term design experiment can be explained by reference to a criti-
cal paper by Paas (2005) titled Design experiment: Neither a design nor an experi-
ment. The confusion that his pun refers to is two-fold. First, in many educational
research communities the term design is reserved for research design (e.g., compar-
ing an experimental with a control group), whereas the term in design research
refers to the design of learning environments (Sandoval and Bell 2004). Second, for
many researchers, also outside the learning sciences, the term experiment is reserved
for “true” experiments or RCTs. In design experiments, hypotheses certainly play
an important role, but they are not fixed and tested once. Instead they may be
We have already stated that theory typically has a more central role in DBR than in
action research. To address the role of theory in DBR, it is helpful to summarize
diSessa and Cobb’s (2004) categorization of different types of theories involved in
educational research. They distinguish:
• Grand theories (e.g., Piaget’s phases of intellectual development; Skinner’s
behaviorism)
• Orienting frameworks (e.g., constructivism, semiotics, sociocultural theories)
• Frameworks for action (e.g., designing for learning, Realistic Mathematics
Education)
• Domain-specific theories (e.g., how to teach density or sampling)
• Hypothetical Learning Trajectories (Simon 1995) or didactical scenarios (Lijnse
1995; Lijnse and Klaassen 2004) formulated for specific teaching experiments
(explained in Sect. 16.1.3).
As can be seen from this categorization, there is a hierarchy in the generality of
theories. Because theories developed in DBR are typically tied to specific learning
environments and learning goals, they are humble and hard to generalize. Similarly,
it is very rare that a theoretical contribution to aerodynamics will be made in the
design of an airplane; yet innovations in airplane design occur regularly. The use of
grand theoretical frameworks and frameworks for action is recommended, but
researchers should be careful to manage the gap between the different types of the-
ory on the one hand and design on the other (diSessa and Cobb 2004). If handled
with care, DBR can then provide the basis for refining or developing theoretical
concepts such as meta-representational competence, sociomathematical norms
(diSessa and Cobb 2004), or whole-class scaffolding (Smit et al. 2013).
So far we have characterized DBR in terms of its predictive and advisory aim, par-
ticular way of handling hypotheses, its engineering nature and differences from
other research methods. Here we summarize five key characteristics of DBR as
identified by Cobb et al. (2003a):
1. The first characteristic is that its purpose is to develop theories about learning
and the means that are designed to support that learning. In the example provided
in Sect. 16.2 of this chapter, Bakker (2004a) developed an instruction
theory for early statistics education and instructional means (e.g. computer tools
DBR typically consists of cycles of three phases each: preparation and design,
teaching experiment, and retrospective analysis. One might argue that the term
‘retrospective analysis’ is pleonastic: All analysis is in retrospect, after a teaching
experiment. However, we use it here to distinguish it from analysis on the fly, which
takes place during a teaching experiment, often between lessons.
A design and research instrument that proves useful during all phases of DBR is
the hypothetical learning trajectory (HLT), which we regard as an elaboration of
Freudenthal’s thought experiment. Simon (1995) defined the HLT as follows:
The hypothetical learning trajectory is made up of three components: the learning goal that
defines the direction, the learning activities, and the hypothetical learning process—a pre-
diction of how the students’ thinking and understanding will evolve in the context of the
learning activities. (p. 136)
Simon used the HLT for one or two lessons. Series of HLTs can be used for lon-
ger sequences of instruction (also see the literature on didactical scenarios in Lijnse
1995). The HLT is a useful research instrument to manage the gap between an
instruction theory and a concrete teaching experiment. It is informed by general
domain-specific and conjectured instruction theories (Gravemeijer 1994), and it
informs researchers and teachers how to carry out a particular teaching experiment.
After the teaching experiment, it guides the retrospective analysis, and the interplay
between the HLT and empirical results forms the basis for theory development. This
means that an HLT, after it has been mapped out, has different functions depending
on the phase of the DBR and continually develops through the different phases. It
can even change during a teaching experiment.
The development of an HLT starts with an analysis of how the mathematical topic of
the design study is elaborated in the curriculum and the mathematical textbooks, an
analysis of the difficulties students encounter with this topic, and a reflection on what
they should learn about it. These analyses result in the formulation of provisional
mathematical learning goals that form the orientation point for the design and
redesign of activities in several rounds. While designing mathematical activities the
learning goals may become better defined. During these design processes the
researcher also starts formulating hypotheses about students’ potential learning and
about how the teacher would support students’ learning processes. The confrontation
of a general rationale with concrete tasks often leads to a more specific HLT, which
means that the HLT gradually develops during the design phase (Drijvers 2003).
An elaborated HLT thus includes mathematical learning goals, students’ starting
points with information on relevant pre-knowledge, mathematical problems, and
assumptions about students’ potential learning processes and about how the teacher
could support these processes.
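The components of an elaborated HLT listed above can be recorded in a small data
structure, which may make it easier to document how the HLT changes between design
rounds and lessons. The following Python sketch is illustrative only: the class name,
the field names, and the example content are hypothetical and are not part of Simon's
(1995) definition or of any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HypotheticalLearningTrajectory:
    """Illustrative container for the components of an elaborated HLT.

    All field names are hypothetical; they only mirror the components
    named in the text.
    """
    learning_goals: List[str]
    starting_points: List[str]        # relevant pre-knowledge of students
    activities: List[str]             # mathematical problems or tasks
    conjectured_learning: List[str]   # assumptions about students' learning
    teacher_support: List[str]        # how the teacher could support it
    revisions: List[str] = field(default_factory=list)

    def revise(self, note: str) -> None:
        """Log an adjustment so that changes can be reported later."""
        self.revisions.append(note)


# Made-up example content, for illustration only.
hlt = HypotheticalLearningTrajectory(
    learning_goals=["Reason about the shape of a distribution"],
    starting_points=["Students can read a dot plot"],
    activities=["Predict a dot plot for a growing sample"],
    conjectured_learning=["Students describe where most data points lie"],
    teacher_support=["Ask students to compare their predictions"],
)
hlt.revise("Lesson 3: task too difficult, added an intermediate step")
print(len(hlt.revisions))
```

Keeping a list of revisions is one possible way to meet the requirement that changes
to the HLT be reported; a research diary or versioned document would serve equally well.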
During the teaching experiment, the HLT functions as a guideline for the teacher
and researcher for what to focus on in teaching, interviewing, and observing. It may
happen that the teacher or researcher feels the need to adjust the HLT or instruc-
tional activity for the next lesson. As Freudenthal wrote (1991, p. 159), the cyclic
alternation of research and development can be more efficient the shorter the cycle
is. Minor changes in the HLT are usually made because of incidents in the class-
room such as student strategies that were not foreseen, activities that were too dif-
ficult, and so on. Such adjustments are generally not accepted in comparative
experimental research, but in DBR, changes in the HLT are made to create optimal
conditions and are regarded as elements of the data corpus. This means that these
changes have to be reported well and the information is stronger when changes are
supported by theoretical considerations. The HLT can thus also change during the
teaching experiment phase.
It is evident that the relevant present knowledge about a topic should be studied first.
Gravemeijer (1994) characterizes the design researcher as a tinkerer or, in French, a
bricoleur, who uses all the material that is at hand, including theoretical insights and
practical experience with teaching and designing.
In the first design phase, it is recommended to collect and invent a set of tasks
that could be useful and discuss these with colleagues who are experienced in
designing for mathematics education. An important criterion for selecting a task is
its potential role in the HLT towards the mathematical end goal. Could it possibly
lead to types of reasoning that students could build upon towards that end goal?
Would it be challenging? Would it be a meaningful context for students?
There are several design heuristics, principles, and guidelines. In Sect. 16.2 we
explain heuristics from the theory of Realistic Mathematics Education.
The notion of a teaching experiment arose in the 1970s. Its primary purpose was to
experience students’ learning and reasoning first-hand, and it thus served the pur-
pose of eliminating the separation between the practice of research and the practice
of teaching (Steffe and Thompson 2000). Over time, teaching experiments proved
useful for a broader purpose, namely as part of DBR. During a teaching experiment,
researchers and teachers use activities and types of instruction that according to the
HLT seem most appropriate at that moment. Observations in one lesson and theo-
retical arguments from multiple sources can influence what is done in the next les-
son. Observations may include student or teacher deviations from the HLT.
Hence, this type of research is different from experimental research designs in
which a limited number of variables are manipulated and effects on other variables
are measured. The situation investigated here, the learning of students in a new
context with new tools and new end goals, is too complicated for such a set-up.
Moreover, a different type of knowledge is sought, as pointed out earlier in
this chapter: We do not want to assess innovative material or a theory, but we need
prototypical educational materials that could be tested and revised by teachers and
researchers, and a domain-specific instruction theory that can be used by others to
formulate their own HLTs suiting local contingencies.
During a teaching experiment, data collection typically includes student work,
tests before and after instruction, field notes, audio recordings of whole-class dis-
cussions, and video recordings of every lesson and of the final interviews with stu-
dents and teachers. We further find ‘mini-interviews’ with students, lasting from
about twenty seconds to four minutes, very useful provided that they are carried out
systematically (Bakker 2004a).
We describe two types of analysis useful in DBR: a task-oriented analysis and a
more overall, longitudinal, cyclic approach. The first is to compare data on students’
actual learning during the different tasks with the HLT. To this end we find the data
analysis matrix (Table 16.4) described in Dierdorp et al. (2011) useful. The left part
of the matrix summarizes the HLT and the right part is filled with excerpts from
relevant transcripts, clarifying notes from the researcher, and a quantitative
impression of how well the assumed learning as formulated
in the HLT matched the observed learning. With such analysis it is possible to give an
overview, as in Table 16.5, which can help to identify problematic sections in the
educational materials. Insights into why particular learning takes place or does not
take place help to improve the HLTs in subsequent cycles of DBR. This iterative
process allows the researcher to improve the predictive power of HLTs across
subsequent teaching experiments.
Table 16.4 Data analysis matrix for comparing HLT and actual learning trajectory (ALT)
Hypothetical learning trajectory                  Actual learning trajectory
Task     Formulation   Conjecture of how          Transcript   Clarification   Match between HLT and ALT:
number   of the task   students would respond     excerpt                      quantitative impression of how
                                                                               well the conjecture and actual
                                                                               learning matched (e.g., −, 0, +)
Table 16.5 ALT result compared with HLT conjectures for the tasks involving a particular type of
reasoning
+ x x x x x x x x x x x x x
± x x x
− x x x
Task: 5d 5f 6a 6c 7 8 9c 9e 10b 11c 15 17 23b 23c 24a 24c 25d 34a 42
Note: an x indicates how well the conjecture accompanying that task matched the observed learning
(− refers to confirmation for up to 1/3 of the students, and + to at least 2/3 of the students)
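The quantitative impression in the last column of the data analysis matrix can be
derived from simple counts. The Python sketch below assumes that, for each task, the
researcher has counted how many students' responses confirmed the conjecture. The
thresholds for − and + follow the note to Table 16.5; treating everything in between
as ± is an assumption, as are the function name and the example counts.

```python
from fractions import Fraction


def match_code(confirming: int, total: int) -> str:
    """Return '-', '±' or '+' for one task.

    '-' : conjecture confirmed for up to 1/3 of the students
    '+' : conjecture confirmed for at least 2/3 of the students
    '±' : anything in between (an assumption, not stated in the table note)
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= confirming <= total:
        raise ValueError("confirming must lie between 0 and total")
    share = Fraction(confirming, total)  # exact, avoids rounding at 1/3 and 2/3
    if share <= Fraction(1, 3):
        return "-"
    if share >= Fraction(2, 3):
        return "+"
    return "±"


# Hypothetical counts: (students confirming the conjecture, students observed).
observed = {"5d": (20, 24), "6a": (12, 24), "9c": (8, 24)}
codes = {task: match_code(c, n) for task, (c, n) in observed.items()}
print(codes)
```

Exact fractions are used so that a task on which precisely one third or two thirds of
the students confirm the conjecture is classified consistently.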
An elaborated HLT would include assumptions about students’ potential learning
and about how the teacher would support students’ learning processes. In the
task-oriented analysis above, no information is included about the role of the teacher.
If there are crucial differences between students’ assumed and observed learning
processes or if the teaching has been observed to diverge radically from what the
researcher had intended, the role of the teacher should be included in the analysis
in search of explanations for these discrepancies.
A comparison of HLTs and observed learning is very useful in the redesign pro-
cess, and allows answers to research questions that ask how particular learning
goals could be reached. However, in our experience additional analyses are often
needed to gain more theoretical insights into the learning process. An example of
such additional analysis is a method inspired by the constant comparative method
(Glaser and Strauss 1967; Strauss and Corbin 1998) and Cobb and Whitenack’s
(1996) method of longitudinal analyses. Bakker (2004a) used this type of analysis
in his study in the following way. First, all transcripts were read and the videotapes
were watched chronologically episode-by-episode. With the HLT and research
questions as guidelines, conjectures about students’ learning and views were gener-
ated and documented, and then tested against the other episodes and other data
material (student work, field notes, tests). This testing meant looking for confirma-
tion and counter-examples. The process of conjecture generating and testing was
repeated. Seemingly crucial episodes were discussed with colleagues to test whether
they agreed with our interpretation or perhaps could think of alternative interpreta-
tions. This process is called peer examination.
For the analysis of transcripts or videos it is worth considering computer soft-
ware such as Atlas.ti (Van Nes and Doorman 2010) for coding the transcripts and
other data sources. As in all qualitative research, data triangulation (Denscombe
2007) is commonly used in design-based research.
Researchers want to analyze data in a reliable way and draw conclusions that are
valid. Therefore, validity and reliability are important concerns. In brief, validity
concerns whether we really measure what we intend to measure. Reliability is about
independence of the researcher. A brief example may clarify the distinction. Assume
a researcher wants to measure students’ mathematical ability. He gives everyone 7
out of 10. Is this a valid way of measuring? Is this a reliable way?
It is a very reliable way because the instruction “give all students a 7” can be
reliably carried out, independently of the researcher. However, it is not valid,
because there is most likely variation between students’ mathematical ability, which
is not taken into account with this way of measuring.
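The example can be made concrete with a few lines of Python. The sketch assumes two
hypothetical raters who both follow the instruction "give all students a 7" and a
made-up list of student abilities; it only illustrates the distinction and is not a
measurement procedure.

```python
from statistics import pvariance

# Hypothetical "true" abilities of five students (made-up numbers).
true_ability = [4, 6, 7, 8, 9]

# Two raters independently follow the rule "give all students a 7".
rater_a = [7 for _ in true_ability]
rater_b = [7 for _ in true_ability]

# Reliability: the two raters agree on every student.
agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(true_ability)

# Validity: the scores do not vary, so they cannot reflect differences in ability.
score_variance = pvariance(rater_a)
ability_variance = pvariance(true_ability)

print(agreement, score_variance, ability_variance)
```

Full agreement between the raters together with zero variance in the scores, while the
abilities themselves do vary, is the numerical counterpart of "reliable but not valid".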
We should emphasize that validity and reliability are complex concepts with
multiple meanings in different types of research. In qualitative research the
meanings of validity and reliability are slightly different than in quantitative
research. Moreover, there are so many types of validity and reliability that we
cannot address them all. In this chapter we have focused on those types that
seemed most relevant to us in the context of DBR. The issues discussed in this
section are inspired by guidelines of Maso and Smaling (1998) and Miles and
Huberman (1994), who distinguish between internal and external validity and
reliability.
Internal validity refers to the quality of the data and the soundness of the reasoning
that has led to the conclusions. In qualitative research, this soundness is also labeled
as credibility (Guba 1981). In DBR, several techniques can be used to improve the
internal validity of a study.
• During the retrospective analysis conjectures generated and tested for specific
episodes are tested for other episodes or by data triangulation with other data
material, such as field notes, tests, and other student work. During this testing
stage there is a search for counterexamples to the conjectures.
• The succession of different teaching experiments makes it possible to test the
conjectures developed in earlier experiments in later experiments.
Theoretical claims are substantiated where possible with transcripts to provide a
rich and meaningful context. Reports about DBR tend to be long due to the thick
descriptions (Geertz 1973) required. For example, the paper by Cobb et al. (2003b)
is 78 pages long!
Internal reliability refers to the degree to which the data are collected and analyzed
independently of the researcher. It can be improved with several methods. Data
collection by objective devices such as audio and video recordings contributes to
the internal reliability. During his retrospective analysis Bakker (2004a) ensured
reliability by discussing the critical episodes, including those discussed in
Sect. 16.2, with colleagues for peer examination. For measuring interrater reliability,
the agreement among independent researchers, it is advised to calculate not only
the percentage of agreement but also Cohen’s kappa or another measure that
takes into account the probability of agreement by chance (e.g., Krippendorff’s
alpha). It is not necessary for a second coder to code all episodes, but the
random sample that is coded should be of sufficient size: The larger the number of possible
codes, the larger the sample required (Bakkenes et al. 2010; Cicchetti 1976). Note
that the term internal reliability can also refer to the consistency of responses on a
questionnaire or test, often measured with help of Cronbach’s alpha.
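To see why the percentage of agreement alone can be misleading, both measures can be
computed for the same pair of coders. The following Python sketch uses the standard
formula for Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
agreement and p_e the agreement expected by chance given each coder's code
frequencies. The function names, the code labels, and the example codings are
hypothetical.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(a: Sequence[str], b: Sequence[str]) -> float:
    """Share of episodes to which both coders assigned the same code."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty codings of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa for two coders and nominal codes."""
    p_o = percent_agreement(a, b)
    n = len(a)
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: product of the coders' marginal proportions per code.
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a.keys() | freq_b.keys()) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when both coders use the same single code")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes assigned by two coders to ten episodes.
coder_1 = ["center", "center", "spread", "spread", "center",
           "center", "center", "spread", "center", "center"]
coder_2 = ["center", "center", "spread", "center", "center",
           "center", "center", "spread", "spread", "center"]

print(percent_agreement(coder_1, coder_2))
print(round(cohens_kappa(coder_1, coder_2), 3))
```

In this made-up example the coders agree on 80% of the episodes, yet kappa is only
about 0.52, because both coders use one code far more often than the other and would
therefore often agree by chance.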
this DBR project and then focus on one design idea, that of growing samples, to
illustrate how it is related to different layers of theory and how it was analyzed.
Finally we discuss the issue of generalizability. In the appendix we provide a
structure of a DBR project with examples from Sect. 16.2.
Bakker’s initial research question was: How can students with little statistical back-
ground develop a notion of distribution? In trying to answer this question in grade
7, however, Bakker came to include a focus on other statistical key concepts such as
data, center, and sampling because these are so intricately connected to that of dis-
tribution (Bakker and Derry 2011). The concept of distribution also proved hard for
seventh-grade students. The initial research question was therefore reformulated for
grade 8 as follows: How can coherent reasoning about distribution be promoted in
relation to data, variability, and sampling in a way that is meaningful for students
with little statistical background?
Our point here is that research questions can change during a research project.
Indeed, the better and sharper your research question is in the beginning of the proj-
ect, the better and more focused your data collection will be. However, our experi-
ence is that most DBR researchers, due to progressive insight, end up with slightly
different research questions than they started with.
As pointed out in Sect. 16.1, DBR typically draws on several types of theories.
Given the importance of graphical representations in statistics education, it made
sense for Bakker to draw on semiotics as an orienting framework. Within semiotics,
he came to focus on Peirce’s ideas on diagrammatic reasoning. The domain-specific
theory of Realistic Mathematics Education proved a useful framework for
action in the design process even though it had hardly been applied in statistics
education.
The learning goal was that distribution would become an object-like entity.
Theories on reification of concepts (Sfard and Linchevski 1992) and the relation
between process and concept (cf. Tall et al. 2000, on procept) were drawn upon.
One theoretical question unanswered in the literature was what the process nature
of a distribution could be. It is impossible to make sense of graphs without having
appropriate conceptual structures, and it is impossible to communicate about con-
cepts without any representations. Thus, to develop an instruction theory it is
necessary to investigate the relation between the development of the meaning of
graphs and concepts. After studying several theories in this area, Bakker deployed
Peirce’s semiotic theory on diagrammatic reasoning (Bakker 2007; Bakker and
Hoffmann 2005). For Peirce, a diagram is a sign that is meant to represent rela-
tions. Diagrammatic reasoning involves three steps:
1. The first step is to construct a diagram (or diagrams) by means of a representa-
tional system such as Euclidean geometry, but we can also think of diagrams in
computer software or of an informal student sketch of a statistical distribution.
Such a construction of diagrams is supported by the need to represent the rela-
tions that students consider significant in a problem. This first step may be called
diagrammatization.
2. The second step of diagrammatic reasoning is to experiment with the diagram (or
diagrams). Any experimenting with a diagram is executed within a not necessarily
perfect representational system and is a rule- or habit-driven activity. Contemporary
researchers would stress that this activity is situated within a practice. What makes
experimenting with diagrams important is the rationality immanent in them
(Hoffmann 2002). The rules define the possible transformations and actions, but
also the constraints of operations on diagrams. Statistical diagrams such as dot
plots are also bound by certain rules: a dot has to be put above its value on the
x-axis, and this remains true even if, for instance, the scale is changed. Peirce stresses
the importance of doing something when thinking or reasoning with diagrams:
Thinking in general terms is not enough. It is necessary that something should be DONE. In
geometry, subsidiary lines are drawn. In algebra, permissible transformations are made.
Thereupon the faculty of observation is called into play. (CP 4.233—CP refers to Peirce’s
collected papers, volume 4, section 233)
In the software used in this research, students can do something with the data
points such as organizing them into equal intervals or four equal groups.
3. The third step is to observe the results of experimenting. We refer to this as the
reflection step. As Peirce wrote, the mathematician observing a diagram “puts
before him an icon by the observation of which he detects relations between the
parts of the diagram other than those which were used in its construction” (Peirce
1976 III, p. 749). In this way he can “discover unnoticed and hidden relations
among the parts” (Peirce CP 3.363; see also CP 1.383). The power of diagram-
matic reasoning is that “we are continually bumping up against hard fact. We
expected one thing, or passively took it for granted, and had the image of it in our
minds, but experience forces that idea into the background, and compels us to
think quite differently” (Peirce CP 1.324).
Diagrammatic reasoning, in particular the reflection step, is what can introduce
the ‘new’. New implications within a given representational system can be found, but
possibly the need is felt to construct a new diagram that better serves its purpose.
As pointed out by diSessa and Cobb (2004), grand theories and orienting frame-
works do not tell the design researcher how to design learning environments. For
this purpose, frameworks for action can be useful. Here we discuss Realistic
Mathematics Education (RME).
Our research took place in the tradition of RME as developed over the last 40
years at the Freudenthal Institute (Freudenthal 1991; Gravemeijer 1994; Treffers
1987; van den Heuvel-Panhuizen 1996). RME is a theory of mathematics education
that offers a pedagogical and didactical philosophy on mathematical learning and
teaching as well as on designing educational materials for mathematics education.
RME emerged from research and development in mathematics education in the
Netherlands in the 1970s and it has since been used and extended, also in other
countries.
The central principle of RME is that mathematics should always be meaningful
to students. For Freudenthal, mathematics was an extension of common sense, a
system of concepts and techniques that human beings had developed in response to
phenomena they encountered. For this reason, he advised a so-called historical
phenomenology of concepts to be taught, a study of how concepts had been devel-
oped in relation to particular phenomena. The insights from such a study can be
input for the design process (Bakker and Gravemeijer 2006).
The term ‘realistic’ stresses that problem situations should be ‘experientially
real’ for students (Cobb et al. 1992). This does not necessarily mean that the problem
situations are always encountered in daily life. Students can experience an abstract
mathematical problem as real when the mathematics of that problem is meaningful
16.2.5 Methods
The absence of the type of learning aimed for is a common reason to carry out
design research. For Bakker’s study in statistics education, descriptive, compara-
tive, or evaluative research did not make sense because the type of learning aimed
for could not be readily observed in classrooms. Considerable design and research
effort first had to be taken to foster specific innovative types of learning. Bakker
therefore had to design HLTs with accompanying educational materials that supported
the desired type of learning about distribution. Design-based research offers
a systematic approach to doing that while simultaneously developing domain-specific
theories about how to support such learning, here in the domain
of statistics. In general, DBR researchers first need to create the conditions in which
they can develop and test an instruction theory, but to create those conditions they
also need research.
Teaching experiment. Bakker designed educational materials with accompanying
HLTs in several cycles. Here we focus on the last cycle, involving a teaching
experiment in grade 8. Half of the lessons were carried out in a computer lab and,
as part of these lessons, students used two minitools (Cobb et al. 1997), simple Java
applets with which they analyzed data sets on, for instance, battery life span, car colours,
and salaries (Fig. 16.3). The researcher was responsible for the educational materials
and the teacher was responsible for the teaching, though they discussed both the
materials and an appropriate teaching style in advance on a weekly basis. Three
preservice teachers served as assistants and helped with videotaping and interviewing
students and with analyzing the data.
In the example that we elaborate we focus on the fourth of a series of ten lessons,
each 50 min long. In this specific lesson, students reasoned about larger and larger
samples and about the shape of distributions.
Subjects. The teaching experiment was carried out in an eighth-grade class with
30 students in a state school in the center of a Dutch city. The students in this study
Fig. 16.2 Jeans data with four equal groups option in Minitool 2
ysis by data triangulation. Conjectures that were confirmed remained in the list;
conjectures that were refuted were removed from the list. Then the whole generat-
ing and testing process was repeated. The aforementioned examples were all con-
firmed throughout this analysis.
To get a sense of the interrater reliability of the analysis, about one quarter of the
episodes, including those discussed in this chapter, and the conjectures belonging to
these episodes were judged by the three assistants who attended the teaching
experiment. The amount of agreement among judges was very high: all four judges agreed
on 33 out of 35 codes. A code was only accepted if all judges agreed after
discussion. We give an example of a code that was finally rejected and one that was
accepted. This example stems from the seventh lesson in which two students used
the four equal groups option in Minitool 2 for a revised version of the jeans activity.
Their task was to advise a jeans factory about frequencies of jeans sizes to be pro-
duced (Fig. 16.2).
Sofie Because then you can best see the spread, how it is distributed.
Int. How it is distributed. And how do you see that here [in this graph]?
What do you look at then? (…)
Sofie Well, you can see that, for example, if you put a [vertical] line here,
here a line, and here a line. Then you see here [two lines at the right]
that there is a very large spread in that part, so to speak.
In the first line, Sofie seems to use the terms spread and distributed as almost
synonymous. This line was therefore coded with C7, which states that “students’
notions of spread, distribution, and density are not yet distinguished. When explain-
ing how data are spread out, they often describe the distribution or the density in
some area.” In the second line, Sofie appears to look at spread very locally, hence it
was coded with C2, which states that “students either characterize spread as range
or look very locally at spread.”
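The agreement figure reported above (all four judges agreeing on 33 out of 35 codes) is a simple proportion of unanimous judgments. The following Python sketch shows one way such a proportion could be computed. The function name and the rating data are hypothetical: the data are constructed only to reproduce the 33-of-35 count and are not taken from the coding sheets of the study.

```python
from fractions import Fraction


def unanimous_agreement(ratings):
    """Return the fraction of items on which all judges gave the same rating."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    unanimous = sum(1 for item in ratings if len(set(item)) == 1)
    return Fraction(unanimous, len(ratings))


# Hypothetical data: 35 code assignments, each judged by four judges
# (True = the judge agrees that the code applies to the episode).
ratings = [(True, True, True, True)] * 33 + [(True, True, False, True)] * 2

agreement = unanimous_agreement(ratings)
print(f"{agreement} = {float(agreement):.1%}")  # 33/35 = 94.3%
```

Note that this is raw agreement only. It does not correct for agreement that could arise by chance, and it says nothing about the discussion step after which a code was accepted or rejected.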
We also give an example of a code assignment that was dismissed in relation to
the same diagram.
To illustrate relationships between theory, method, and results, this section pres-
ents the analysis of students’ reasoning during one educational activity which was
carried out in the fourth lesson. Its goal was to stimulate students to reason about
larger and larger samples. We summarize the HLT of that lesson: the learning
goal, the activity of growing a sample and the assumptions about students’ poten-
tial learning processes and about how the teacher could support these processes.
We then present the retrospective analysis of three successive phases in growing a
sample.
The overall goal of the growing samples activity as formulated in the hypothetical
learning trajectory for this fourth lesson was to stimulate students’ diagrammatic
reasoning about shape in relation to sampling and distribution aspects in the context
of weight. This implied that students should first make diagrams, then experiment
with them and reflect on them. The idea was to start with ideas invented by the
students and guide them toward more conventional notions and representations.
This process of guiding students toward these culturally accepted concepts and
graphs while building on their own inventions is called guided reinvention. We had
noted in previous teaching experiments that students were inclined to choose very
small samples initially. It proved necessary to stimulate reflection on the disadvan-
tages of such small samples and have them predict what larger samples would look
like. Such insights from the analyses of previous teaching experiments helped to
better formulate the HLT of a new teaching experiment. More particularly, Bakker
assumed that starting with students’ initial ideas about small samples and asking for
predictions about larger samples would make students aware of various features of
distributions.
The activity of growing a sample consisted of three phases of making sketches of
a hypothetical situation and comparing those sketches with graphs displaying real
data sets. In the first phase students had to make a graph of their own choice of a
predicted weight data set with sample size 10. The results were discussed by the
teacher to challenge this small sample size, and in the subsequent phases students
had to predict larger data sets, one class and three classes in the second phase, and
all students in the city in the third phase. Thus, three such phases took place as
described and analyzed below. Aiming for guided reinvention, the teacher and
researcher tried to strike a balance between engaging students in statistical reasoning
and allowing their own terminology on the one hand, and guiding them in using
conventional and more precise notions and graphical representations on the other.
Fig. 16.3 (a) Minitool 1 showing a value-bar graph of battery life spans in hours of two brands.
(b) Minitool 1, but with bars hidden. (c) Minitool 2 showing a dot plot of the same data sets
Figure 16.3b is the result of focusing only on the endpoints of the value bars in
Fig. 16.3a. Figure 16.3c is the result of these endpoints falling down vertically on
the x-axis. In this way, students can learn to understand the relationship between
value-bar graphs and dot plots, and what distribution features in different represen-
tations look like (Bakker and Hoffmann 2005).
The text of the student activity sheet for the fourth lesson contained a number of
tasks that we cite in the following subsections. The sheet started as follows:
Last week you made graphs of predicted data for a balloon pilot. During this lesson you will
get to see real weight data of students from another school. We are going to investigate the
influence of the sample size on the shape of the graph.
Task a. Predict a graph of ten data values, for example with the dots of minitool 2.
The sample size of ten was chosen because the students had found that size rea-
sonable after the first lesson in the context of testing the life span of batteries.
Figure 16.4 shows examples of three different types of diagrams the students made
to show their predictions: there were three value-bar graphs (such as in minitool 1,
e.g., Ruud’s diagram), eight with only the endpoints (such as with the option of
minitool 1 to “hide bars,” e.g., Chris’s diagram) and the remaining nineteen plots
were dot plots (such as in minitool 2, e.g., Sandra’s diagram). For the remainder of
this section, the figures and written explanations of these three students are
presented, because their work gives an impression of the variety of the whole class.
Those three students were chosen because their diagrams represent all types of
diagrams made in this class, also for other phases of growing a sample.
To stimulate the reflection on the graphs, the teacher showed three samples of ten
data points on the blackboard and students had to compare their own graphs
(Fig. 16.4) with the graphs of the real data sets (Fig. 16.5).
Task b. You get to see three different samples of size 10. Are they different from your own
prediction? Describe the differences.
The reason for showing three small samples was to show the variation among these
samples. There were no clear indications, though, that students conceived this varia-
tion as a sign that the sample size was too small for drawing conclusions, but they
generally agreed that larger samples were more reliable. The point relevant to the
analysis is that students started using predicates to describe aggregate features of the
graphs. The written answers of the three students were the following:
Ruud Mine looks very much like what is on the blackboard.
Chris The middle-most [diagram on the blackboard] best resembles mine
because the weights are close together and that is also the case in my
graph. It lies between 35 and 75 [kg].
Sandra The other [real data] are more weights together and mine are further
apart.
Ruud’s answer is not very specific, like most of the written answers in the first
phase of growing samples. Chris used the predicate “close together” and added
numbers to indicate the range, probably as an indication of spread. Sandra used such
terms as “together” and “further apart,” which address spread. The students in the
class used common predicates such as “together,” “spread out” and “further apart”
to describe features of the data set or the graph. For the analysis it is important to
note that the students used predicates (together, apart) and no nouns (spread,
average) in this first phase of growing samples. Spread can only become an object-like
concept, something that can be talked about and reasoned with, if it is a noun.
In the semiotic theory of Peirce, such transitions from the predicate “the dots are
spread out” to “the spread is large” are important steps in the formation of concepts
(see Bakker and Derry 2011, for our view on concept formation).
Fig. 16.4 Student predictions (Ruud, Chris, and Sandra) for ten data points (weight in kg) (Bakker
2004a, p. 219)
The students generally understood that larger samples would be more reliable. With
the feedback students had received after discussing the samples of ten data points in
dot plots, students had to predict the weight graph of a whole class of 27 students
and of three classes with 67 students (27 and 67 were the sample sizes of the real
data sets of eighth graders of another school).
Task c. We will now have a look how the graph changes with larger samples. Predict a
sample of 27 students (one class) and of 67 students (three classes).
Task d. You now get to see real samples of those sizes. Describe the differences. You can use
words such as majority, outliers, spread, average.
During this second phase, all of the students made dot plots, probably because
the teacher had shown dot plots on the blackboard, and because dot plots are less
laborious to draw than value bars (only one student started with a value-bar graph
for the sample of 27, but switched to a dot plot for the sample of 67). The hint on
statistical terms was added to make sure that students’ answers would not be too
superficial (as often happened before) and to stimulate them to use such notions in
their reasoning. It was also important for the research to know what these terms
meant to them. When the teacher showed the two graphs with real data, once again
there was a short class discussion in which the teacher capitalized on the question of
why most student predictions now looked pretty much like what was on the black-
board, whereas with the earlier predictions there was much more variation. No stu-
dent had a reasonable explanation, which indicates that this was an advanced
question. The figures of the same three students are presented in Figs. 16.6 and 16.7
and their written explanations were:
Ruud My spread is different.
Chris Mine resembles the sample, but I have more people around a certain
weight and I do not really have outliers, because I have 10 about the 70
and 80 and the real sample has only 6 around the 70 and 80.
Sandra With the 27 there are outliers and there is spread; with the 67 there are
more together and more around the average.
Here, Ruud addressed the issue of spread (“my spread is different”). Chris was
more explicit about a particular area in her graph, the category of high values. She
also correctly used the term “sample,” which was newly introduced in the second
lesson. Sandra used the term “outliers” at this stage, by which students meant
“extreme values,” which did not necessarily mean exceptional or suspect values.
She also seemed to locate the average somewhere and to understand that many
students are about average. These examples illustrate that students used statistical
notions for describing properties of the data and diagrams.
Fig. 16.6 Predicted graphs for one class (n = 27, top plot) and three classes (n = 67, bottom plot)
by Ruud, Chris, and Sandra (Bakker 2004a, p. 222)
Fig. 16.7 Real data sets of size 27 and 67 of students from another school (Bakker 2004a, p. 222)
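The question raised in the class discussion, why predictions for the larger samples resembled the real data more closely than the predictions for samples of ten, can be illustrated with a small simulation. The sketch below is not part of the teaching experiment. It assumes a hypothetical population of weights (normally distributed with a mean of 55 kg and a standard deviation of 10 kg, values chosen only for illustration) and compares how much the mean varies across repeated samples of the three sizes used in the activity.

```python
import random
import statistics


def sample_mean_spread(population, size, repeats, rng):
    """Standard deviation of the means of `repeats` random samples of `size`."""
    means = [statistics.mean(rng.sample(population, size)) for _ in range(repeats)]
    return statistics.stdev(means)


rng = random.Random(0)  # fixed seed so that the sketch is reproducible

# Hypothetical population of 5,000 weights in kg; mean and SD are assumptions.
population = [rng.gauss(55, 10) for _ in range(5000)]

# Sample sizes from the growing samples activity: 10, 27 (one class), 67 (three classes).
spreads = {n: sample_mean_spread(population, n, 500, rng) for n in (10, 27, 67)}

for n, spread in spreads.items():
    print(f"n = {n:2d}: sample means vary with an SD of about {spread:.1f} kg")
```

Under these assumptions the means of samples of ten vary considerably more from sample to sample than the means of samples of 27 or 67, which is one way of expressing why larger samples look more alike.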
In contrast to the first phase of growing a sample, students used nouns instead of just
predicates for comparing the diagrams. Like others, Ruud used the noun “spread” (“my
spread is different”) whereas students earlier used only predicates such as “spread out”
or “further apart” (e.g., Sandra). Of course, the use of these nouns does not always
imply that students are thinking of the right concept. Statistically, however, it
makes a difference whether we say, “the dots are spread out” or “the spread is large.”
In the latter case, spread is an object-like entity that can have particular aggregate
characteristics that can be measured, for instance by the range, the interquartile range, or the
standard deviation. Other notions such as outliers, sample, and average are now used
as nouns, that is, as conceptual objects that can be talked about and reasoned with.
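To make concrete what it means for spread to be measurable, the following Python sketch computes the three measures just mentioned for a small data set. The ten weights are hypothetical values invented for this illustration, not data from the study, and the function name is ours. The quartiles are computed with the "inclusive" method of Python's statistics module, which is only one of several common conventions.

```python
import statistics


def spread_measures(values):
    """Return the range, interquartile range, and sample standard deviation."""
    ordered = sorted(values)
    q1, _median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "sd": statistics.stdev(ordered),
    }


# Hypothetical sample of ten weights in kg.
weights = [38, 45, 48, 50, 52, 53, 55, 57, 60, 72]

measures = spread_measures(weights)
print(f"range = {measures['range']} kg")  # range = 34 kg
print(f"IQR   = {measures['iqr']} kg")    # IQR   = 8.0 kg
print(f"SD    = {measures['sd']:.1f} kg")
```

The range reacts strongly to the two extreme values (38 and 72), whereas the interquartile range describes only the middle half of the data, which is close to what students meant when they spoke of the majority being together.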
The aim of the hypothetical learning trajectory was that students would come to
draw continuous shapes and reason about them using statistical terms. During the
seventh-grade teaching experiments (Bakker and Gravemeijer 2004),
reasoning with continuous shapes turned out to be difficult to accomplish, even if it
was asked for. It often seemed impossible to nudge students toward drawing the
general, continuous shape of data sets represented in dot plots. At best, students
drew spiky lines just above the dots. This underlines that students have to construct
something new (a notion of signal, shape, or distribution) with which they can look
differently at the data or the variable phenomenon.
In this last phase of growing the sample, the task was to make a graph showing
data of all students in the city, not necessarily with dots. The intention of asking this
was to stimulate students to use continuous shapes and dynamically relate samples
to populations, without making this distinction between sample and population
explicit yet. The conjecture was that this transition from a discrete plurality of data
values to a continuous entity of a distribution is important to foster a notion of dis-
tribution as an object-like entity with which students could model data and describe
aggregate properties of data sets. The task proceeded as follows:
Task e. Make a weight graph of a sample of all eighth graders in the city. You need not draw
dots. It is the shape of the graph that is important.
Task f. Describe the shape of your graph and explain why you have drawn that shape.
The figures of the same three students are presented in Fig. 16.8 and their written
explanations were:
Ruud Because the average [values are] roughly between 50 and 60 kg.
Chris I think it is a pyramid shape. I have drawn my graph like that because I
found it easy to make and easy to read.
Sandra Because most are around the average and there are outliers at 30 and
80 [kg].
Fig. 16.8 Predicted graphs for all students in the city by Ruud, Chris, and Sandra (Bakker 2004a,
p. 224)
Ruud’s answer focused on the average group. During an interview after the
fourth lesson, Ruud, like three other students, literally called his graph a “bell shape,”
though he had probably not encountered that term in a school situation before. This
is probably a case of reinvention. Chris’s graph was probably inspired by line graphs
that the students made during mathematics lessons. She introduced the vertical axis
with frequency, though such graphs had not been used before in the statistics course.
Sandra may have started with the dots and then drawn the continuous shape.
In this third phase of growing a sample, 23 students drew a bump shape. The
words they used for the shapes were pyramid (three students), semicircle (one),
and bell shape (four). Many students drew continuous shapes but these were all
symmetrical. Since weight distributions are not symmetrical and because skewness
is an important concept, a subsequent lesson addressed asymmetrical shapes in rela-
tion to the weight data (see Bakker 2004b).
The research question we addressed in the example is: How can coherent reasoning
about distribution be promoted in relation to data, variability, and sampling in a way
that is meaningful for students with little statistical background? We now discuss
the key elements of the educational activity and speculate about what can be
learned from the analysis presented here.
The activity of growing a sample involved short phases of constructing diagrams
of new hypothetical situations, and comparing these with other diagrams of a real
sample of the same size. The activity has a broader empirical basis than just the
teaching experiment reported in this chapter, because it emerged from a previous
teaching experiment (Bakker and Gravemeijer 2004) as a way to address shape as a
pattern in variability.
To theoretically generalize the results, Bakker analyzed students’ reasoning as an
instance of diagrammatic reasoning, which typically involves constructing dia-
grams, experimenting with them, and reflecting on the results of the previous two
steps. In this growing samples activity, the quick alternation between prediction and
reflection during diagrammatic reasoning appears to create ample opportunities for
concept formation, for instance of spread.
In the first phase involving the prediction of a small data set, students noted that
the data were more spread out, but in subsequent phases, students wrote or said that
the spread was large. From the terms used in this fourth lesson, we conclude that
many statistical concepts such as center (average, majority), spread (range and range
of subsets of data), and shape had become topics of discussion (object-like entities)
during the growing samples activity. Some of these words were used in a rather
unconventional way, which implies that students needed more guidance at this point.
Shape became a topic of discussion as students predicted that the shape of the graph
would be a semicircle, a pyramid, or a bell shape, and this was exactly what the HLT
targeted. Given the students’ minimal background in statistics and the fact that this
was only the fourth lesson of the sequence, the results were promising. Note, how-
ever, that such activities cannot simply be repeated in other contexts; they need to be
adjusted to local circumstances if they are to be applied in other situations.
The instructional activity of growing samples later became a connecting thread
in Ben-Zvi’s research in Israel, where it also worked to help students develop statis-
tical concepts in relation to each other (Ben-Zvi et al. 2012). This implies that this
instructional idea was transferable to other contexts. The transferability of instruc-
tional ideas from the USA to the Netherlands to Israel, even to higher levels of
education, illustrates that generalization in DBR can take place across contexts,
cultures, and age groups.
The example presented in Sect. 16.2 was intended to substantiate the issues dis-
cussed in Sect. 16.1, and we hope that readers will have a sense of what DBR could
look like and feel invited to read more about it. It should be noted that there are
many variants of DBR. Some are more focused on theory, some more on empiri-
cally grounded products. Some start with predetermined learning outcomes, others
have more open-ended goals (cf. Engeström 2011). DBR may be a challenging
research approach but it is in our experience also a very rewarding one given the
products and insights that can be gained.
Acknowledgments The research was funded by the Netherlands Organization for Scientific
Research under grant number 575-36-003B. The writing of this chapter was made possible with a
grant from the Educational and Learning Sciences Utrecht awarded to Arthur Bakker. Section 2.6
is based on Bakker (2004b). We thank our Master students in our Research Methodology courses
for their feedback on earlier versions of this manuscript. Angelika Bikner-Ahsbahs’s and review-
ers’ careful reading has also helped us tremendously. We also acknowledge PhD students Adri
Dierdorp, Al Jupri, and Victor Antwi, and our colleague Frans van Galen for their helpful com-
ments, and Nathalie Kuijpers and Norma Presmeg for correcting this manuscript.
In line with Oost and Markenhof (2010), we formulate the following general criteria
for any research project:
1. The research should be anchored in the literature.
2. The research aim should be relevant, both in theoretical and practical terms.
3. The formulation of aim and questions should be precise, i.e. using concepts and
definitions in the correct way.
4. The method used should be functional in answering the research question(s).
5. The overall structure of the research project should be consistent, i.e. title, aim,
theory, question, method and results should form a coherent chain of reasoning.
In this appendix we present a structure of general points of attention during DBR
and specifications for our statistics education example, including references to rel-
evant sections in the chapter. In this structure these criteria are bolded. This struc-
ture could function as the blueprint of a book or article on a DBR project.
References
Akkerman, S. F., Admiraal, W., Brekelmans, M., & Oost, H. (2008). Auditing quality of research
in social sciences. Quality & Quantity, 42, 257–274.
Anderson, T., & Shattuck, J. (2012). Design-based research: A decade of progress in education
research? Educational Researcher, 41, 16–25.
Artigue, M. (1988). Ingénierie didactique [Didactical engineering]. In M. Artigue, G. Brousseau,
J. Brun, Y. Chevallard, F. Conne, & G. Vergnaud (Eds.), Didactique des mathematiques
[Didactics of mathematics]. Paris: Delachaux et Niestlé.
Bakkenes, I., Vermunt, J. D., & Wubbels, T. (2010). Teachers learning in the context of educational
innovation: Learning activities and learning outcomes of experienced teachers. Learning and
Instruction, 20(6), 533–548.
Bakker, A. (2004a). Design research in statistics education: On symbolizing and computer tools.
Utrecht: CD-Bèta Press.
Bakker, A. (2004b). Reasoning about shape as a pattern in variability. Statistics Education Research
Journal, 3(2), 64–83. Online http://www.stat.auckland.ac.nz/~iase/serj/SERJ3(2)_Bakker.pdf
Bakker, A. (2007). Diagrammatic reasoning and hypostatic abstraction in statistics education.
Semiotica, 164, 9–29.
Bakker, A., & Derry, J. (2011). Lessons from inferentialism for statistics education. Mathematical
Thinking and Learning, 13, 5–26.
Bakker, A., & Gravemeijer, K. P. E. (2004). Learning to reason about distribution. In D. Ben-Zvi
& J. Garfield (Eds.), The challenge of developing statistical literacy, reasoning, and thinking
(pp. 147–168). Dordrecht: Kluwer.
Bakker, A., & Gravemeijer, K. P. E. (2006). An historical phenomenology of mean and median.
Educational Studies in Mathematics, 62(2), 149–168.
Bakker, A., & Hoffmann, M. (2005). Diagrammatic reasoning as the basis for developing con-
cepts: A semiotic analysis of students’ learning about statistical distribution. Educational
Studies in Mathematics, 60, 333–358.
Ben-Zvi, D., Aridor, K., Makar, K., & Bakker, A. (2012). Students’ emergent articulations of
uncertainty while making informal statistical inferences. ZDM The International Journal on
Mathematics Education, 44, 913–925.
Biehler, R., Ben-Zvi, D., Bakker, A., & Makar, K. (2013). Technology for enhancing statistical
reasoning at the school level. In M. A. Clement, A. J. Bishop, C. Keitel, J. Kilpatrick, & A. Y.
L. Leung (Eds.), Third international handbook on mathematics education (pp. 643–689). New
York: Springer. doi:10.1007/978-1-4614-4684-2_21.
Brown, A. (1992). Design experiments: Theoretical and methodological challenges in creating
complex interventions in classroom settings. Journal of the Learning Sciences, 2, 141–178.
Cicchetti, D. V. (1976). Assessing inter-rater reliability for rating scales: Resolving some basic
issues. British Journal of Psychiatry, 129, 452–456.
Cobb, P., & Whitenack, J. W. (1996). A method for conducting longitudinal analyses of classroom
videorecordings and transcripts. Educational Studies in Mathematics, 30(3), 213–228.
Cobb, P., Yackel, E., & Wood, T. (1992). A constructivist alternative to the representational view of
mind in mathematics education. Journal for Research in Mathematics Education, 23, 2–33.
Cobb, P., Gravemeijer, K.P.E., Bowers, J., & McClain, K. (1997). Statistical Minitools. Designed
for Vanderbilt University, TN, USA. Programmed and revised (2001) at the Freudenthal
Institute, Utrecht University, the Netherlands.
Cobb, P., Confrey, J., diSessa, A., Lehrer, R., & Schauble, L. (2003a). Design experiments in edu-
cational research. Educational Researcher, 32(1), 9–13.
Cobb, P., McClain, K., & Gravemeijer, K. P. E. (2003b). Learning about statistical covariation.
Cognition and Instruction, 21, 1–78.
Collins, A. (1992). Toward a design science of education. In E. Scanlon & T. O'Shea (Eds.), New
directions in educational technology (pp. 15–22). New York: Springer.
Cook, T. (2002). Randomized experiments in education: A critical examination of the reasons the
educational evaluation community has offered for not doing them. Educational Evaluation and
Policy Analysis, 24(3), 175–199.
Creswell, J. W. (2005). Educational research: Planning, conducting, and evaluating quantitative
and qualitative research (2nd ed.). Upper Saddle River: Pearson Education.
Creswell, J. W. (2007). Qualitative inquiry and research design. Choosing among five traditions
(2nd ed.). Thousand Oaks: Sage.
De Jong, R., & Wijers, M. (1993). Ontwikkelingsonderzoek: Theorie en praktijk [Developmental
research: Theory and practice]. Utrecht: NVORWO.
Denscombe, M. (2007). The good research guide (3rd ed.). Maidenhead: Open University Press.
Dierdorp, A., Bakker, A., Eijkelhof, H. M. C., & Van Maanen, J. A. (2011). Authentic practices as
contexts for learning to draw inferences beyond correlated data. Mathematical Thinking and
Learning, 13, 132–151.
diSessa, A. A., & Cobb, P. (2004). Ontological innovation and the role of theory in design
experiments. Journal of the Learning Sciences, 13(1), 77–103.
Drijvers, P. H. M. (2003). Learning algebra in a computer algebra environment: Design research
on the understanding of the concept of parameter. Utrecht: CD-Beta Press.
Edelson, D. C. (2002). Design research: What we learn when we engage in design. Journal of the
Learning Sciences, 11, 105–121.
Educational Researcher. (2003). Special issue on design-based research collective, 32(1–2).
Educational Psychologist. (2004). Special issue design-based research methods for studying
learning in context, 39(4).
Engeström, Y. (2011). From design experiments to formative interventions. Theory and Psychology,
21(5), 598–628.
Fosnot, C. T., & Dolk, M. (2001). Young mathematicians at work. Constructing number sense,
addition, and subtraction. Portsmouth: Heinemann.
Freudenthal, H. (1978). Weeding and sowing: Preface to a science of mathematical education.
Dordrecht: Reidel.
Freudenthal, H. (1988). Ontwikkelingsonderzoek [Developmental research]. In K. Gravemeijer &
K. Koster (Eds.), Onderzoek, ontwikkeling en ontwikkelingsonderzoek [Research, development
and developmental research]. Universiteit Utrecht, the Netherlands: OW&OC.
Freudenthal, H. (1991). Revisiting mathematics education: China lectures. Dordrecht: Kluwer.
Frick, R. W. (1998). Interpreting statistical testing: Process and propensity, not population and
random sampling. Behavior Research Methods, Instruments, & Computers, 30(3), 527–535.
Friel, S. N., Curcio, F. R., & Bright, G. W. (2001). Making sense of graphs: Critical factors
influencing comprehension and instructional implications. Journal for Research in Mathematics
Education, 32(2), 124–158.
Geertz, C. (1973). Thick description: Toward an interpretive theory of culture. In C. Geertz (Ed.),
The interpretation of cultures: Selected essays (pp. 3–30). New York: Basic Books.
Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative
research. Chicago: Aldine.
Goffree, F. (1979). Leren onderwijzen met Wiskobas. Onderwijsontwikkelingsonderzoek ‘Wiskunde
en Didaktiek’ op de pedagogische akademie [Learning to teach Wiskobas. Educational devel-
opment research]. Rijksuniversiteit Utrecht, The Netherlands.
Gravemeijer, K. P. E. (1994). Educational development and developmental research in mathemat-
ics education. Journal for Research in Mathematics Education, 25(5), 443–471.
Gravemeijer, K. P. E., & Cobb, P. (2006). Design research from a learning design perspective. In
J. Van den Akker, K. P. E. Gravemeijer, S. McKenney, & N. Nieveen (Eds.), Educational
design research (pp. 17–51). London: Routledge.
Gravemeijer, K. P. E., & Koster, K. (Eds.). (1988). Onderzoek, ontwikkeling en ontwikkeling-
sonderzoek [Research, development, and developmental research]. Utrecht: OW&OC.
Guba, E. G. (1981). Criteria for assessing trustworthiness of naturalistic inquiries. Educational
Communication and Technology Journal, 29(2), 75–91.
Hoffmann, M. H. G. (2002). Peirce’s “diagrammatic reasoning” as a solution of the learning para-
dox. In G. Debrock (Ed.), Process pragmatism: Essays on a quiet philosophical revolution
(pp. 147–174). Amsterdam: Rodopi Press.
Hoyles, C., Noss, R., Kent, P., & Bakker, A. (2010). Improving mathematics at work: The need for
techno-mathematical literacies. Abingdon: Routledge.
Journal of the Learning Sciences (2004). Special issue on design-based research, 13(1), guest-
edited by S. Barab and K. Squire.
Kanselaar, G. (1993). Ontwikkelingsonderzoek bezien vanuit de rol van de advocaat van de duivel
[Design research: Taking the position of the devil’s advocate]. In R. de Jong & M. Wijers
(Red.) (Eds.), Ontwikkelingsonderzoek, theorie en praktijk. Utrecht: NVORWO.
Konold, C., & Higgins, T. L. (2003). Reasoning about data. In J. Kilpatrick, W. G. Martin, &
D. Schifter (Eds.), A research companion to principles and standards for school mathematics
(pp. 193–215). Reston: National Council of Teachers of Mathematics.
Mathematical Thinking and Learning (2004). Special issue on learning trajectories in mathemat-
ics education, guest-edited by D. H. Clements and J. Sarama, 6(2).
Lehrer, R., & Schauble, L. (2001). Accounting for contingency in design experiments. Paper pre-
sented at the annual meeting of the American Education Research Association, Seattle.
Lewin, K. (1951). Problems of research in social psychology. In D. Cartwright (Ed.), Field theory
in social science; selected theoretical papers. New York: Harper & Row.
Lijnse, P. L. (1995). “Developmental Research” as a way to an empirically based “didactical
structure” of science. Science Education, 79(2), 189–199.
Lijnse, P. L., & Klaassen, K. (2004). Didactical structures as an outcome of research on teaching-
learning sequences? International Journal of Science Education, 26(5), 537–554.
Maso, I., & Smaling, A. (1998). Kwalitatief onderzoek: praktijk en theorie [Qualitative research:
Practice and theory]. Amsterdam: Boom.
Maxwell, J. A. (2004). Causal explanation, qualitative research and scientific inquiry in education.
Educational Researcher, 33(2), 3–11.
McClain, K., & Cobb, P. (2001). Supporting students’ ability to reason about data. Educational
Studies in Mathematics, 45, 103–129.
McKenney, S., & Reeves, T. (2012). Conducting educational design research. London:
Routledge.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: A sourcebook of new meth-
ods. Beverly Hills: Sage.
Nathan, M. J., & Kim, S. (2009). Regulation of teacher elicitations in the mathematics classroom.
Cognition and Instruction, 27(2), 91–120.
Olsen, D. R. (2004). The triumph of hope over experience in the search for “what works”: A
response to Slavin. Educational Researcher, 33(1), 24–26.
Oost, H., & Markenhof, A. (2010). Een onderzoek voorbereiden [Preparing research]. Amersfoort:
Thieme Meulenhoff.
Opie, C. (2004). Doing educational research. London: Sage.
Paas, F. (2005). Design experiments: Neither a design nor an experiment. In C. P. Constantinou,
D. Demetriou, A. Evagorou, M. Evagorou, A. Kofteros, M. Michael, C. Nicolaou,
D. Papademetriou, & N. Papadouris (Eds.), Integrating multiple perspectives on effective
learning environments. Proceedings of 11th biennial meeting of the European Association for
Research on Learning and Instruction (pp. 901–902). Nicosia: University of Cyprus.
Peirce, C. S. (1976). The new elements of mathematics (C. Eisele, Ed.). The Hague: Mouton.
Peirce, C. S. (CP). Collected papers of Charles Sanders Peirce 1931–1958. In C. Hartshorne &
P. Weiss (Eds.), Cambridge, MA: Harvard University Press.
Plomp, T. (2007). Educational design research: An introduction. In N. Nieveen & T. Plomp (Eds.),
An introduction to educational design research (pp. 9–35). Enschede: SLO.
Plomp, T., & Nieveen, N. (Eds.). (2007). An introduction to educational design research. Enschede:
SLO.
Romberg, T. A. (1973). Development research. Overview of how development-based research
works in practice. Wisconsin Research and Development Center for Cognitive Learning,
University of Wisconsin-Madison, Madison.
Sandoval, W. A., & Bell, P. (2004). Design-based research methods for studying learning in
context: Introduction. Educational Psychologist, 39(4), 199–201.
Sfard, A., & Linchevski, L. (1992). The gains and the pitfalls of reification — The case of algebra.
Educational Studies in Mathematics, 26(2–3), 191–228.
Simon, M. (1995). Reconstructing mathematics pedagogy from a constructivistic perspective.
Journal for Research in Mathematics Education, 26(2), 114–145.
Slavin, R. E. (2002). Evidence-based educational policies: Transforming educational practice and
research. Educational Researcher, 31, 15–21.
Smit, J., & Van Eerde, H. A. A. (2011). A teacher’s learning process in dual design research:
Learning to scaffold language in a multilingual mathematics classroom. ZDM The International
Journal on Mathematics Education, 43(6–7), 889–900.
Smit, J., van Eerde, H. A. A., & Bakker, A. (2013). A conceptualisation of whole-class scaffolding.
British Educational Research Journal, 39(5), 817–834.
Steffe, L. P., & Thompson, P. W. (2000). Teaching experiments methodology: Underlying princi-
ples and essential elements. In R. Lesh & A. E. Kelly (Eds.), Research design in mathematics
and science education (pp. 267–307). Hillsdale: Erlbaum.
Strauss, A., & Corbin, J. (1998). Basics of qualitative research techniques and procedures for
developing grounded theory (2nd ed.). London: Sage.
Tall, D., Thomas, M., Davis, G., Gray, E., & Simpson, A. (2000). What is the object of the encap-
sulation of a process? Journal of Mathematical Behavior, 18, 223–241.
Treffers, A. (1987). Three dimensions. A model of goal and theory description in mathematics
instruction. The Wiskobas project. Dordrecht: Kluwer.
Van den Akker, J. (1999). Principles and methods of development research. In J. van den Akker,
R. M. Branch, K. Gustafson, N. Nieveen, & T. Plomp (Eds.), Design approaches and tools in
education and training (pp. 1–14). Boston: Kluwer.
Van den Akker, J., Gravemeijer, K., McKenney, S., & Nieveen, N. (Eds.). (2006). Educational
design research. London: Routledge.
Van den Heuvel-Panhuizen, M. (1996). Assessment and realistic mathematics education. Utrecht:
CD-Bèta Press.
Van Nes, F., & Doorman, L. M. (2010). The interaction between multimedia data analysis and
theory development in design research. Mathematics Education Research Journal, 22(1), 6–30.
Wittmann, E. C. (1992). Didaktik der Mathematik als Ingenieurwissenschaft. [Didactics of math-
ematics as an engineering science.]. Zentralblatt für Didaktik der Mathematik, 3, 119–121.
Yin, R. K. (2009). Case study research: Design and methods. Thousand Oaks: Sage.
Chapter 17
Perspectives on Design Research:
The Case of Didactical Engineering
Michèle Artigue
Abstract In what is often called the “French didactical culture,” design has always
played an essential role in research. This is attested by the introduction and institu-
tionalization of a specific concept, that of didactical engineering, already in the
early 1980s and by the way didactical engineering has accompanied the development
of didactical research, both in its fundamental and applied dimensions. In this chapter,
I present this vision of design and its characteristics as a research methodology,
coming back to its historical origin in close connection with the development of the
theory of didactical situations, tracing its evolution along the last three decades, and
illustrating this methodology by some particular examples. I also consider current
developments within this design culture, especially those linked to the integration
of a design dimension into the anthropological theory of didactics and also to the
idea of didactical engineering of second generation introduced for addressing more
efficiently the development dimension of didactical engineering.
17.1 Introduction
M. Artigue (*)
Laboratoire de Didactique, LDAR, Université Paris Diderot – Paris 7,
Case 7018, 75205, Paris Cedex 13, France
Engineering (DE in the following) already in the early eighties. Since that time DE,
which developed in close connection with the theory of didactical situations
initiated by Brousseau (cf. (Warfield 2006) for an introduction and (Brousseau
1997) for a more detailed vision), has accompanied the development of didactical
research, both in its fundamental and applied dimensions. This chapter is structured
into four main sections. In the first section I briefly review the development of DE
from its emergence in the early eighties until now, and clarify its links with the
theory of didactical situations (see also (Bessot 2011)). In the second section I pres-
ent its characteristics as a research methodology. In the third section I illustrate this
methodology with examples taken at different levels of schooling. In the fourth sec-
tion I consider two recent evolutions of DE. The first one is conveyed by the anthro-
pological theory of didactics in terms of course of study and research that considers
very open forms of design; the second one is “didactical engineering of second
generation” introduced by Perrin-Glorian for addressing dissemination and up-
scaling issues (Perrin-Glorian 2011). Beyond the many examples of realizations,
the writing of this chapter has been especially inspired by some foundational texts
such as (Chevallard 1982; Artigue 1990, 2002, 2009), and by the extensive reflection
on didactical engineering carried out at the XVe Summer School of Didactics of
Mathematics in 2009 (Margolinas et al. 2011).
¹ In the theory, the milieu of a situation is defined as the system with which the student interacts,
and which provides objective feedback to her. The milieu may comprise material and symbolic
elements: artifacts, informative texts, data, results already obtained…, and also other students who
collaborate or compete with the learner.
The teacher’s role, for its part, had been mainly approached in terms of the dual
processes of devolution and institutionalization, coherently with the vision of learn-
ing as a combination of adaptation and acculturation processes underlying the
theory. Through the devolution process, the teacher tries to make her students accept
the mathematical responsibility of solving the problem at stake. She thus tries to
make possible the adidactic interaction with the milieu that learning through
adaptation requires. If the devolution process is successful, the students agree to forget
for a while the didactical intention of the teacher and to concentrate on the search for
mathematical solutions instead of trying to decipher the teacher’s expectations.
Through the process of institutionalization, the teacher connects the knowledge
built by students through adidactic interaction with the milieu to the scholarly
and decontextualized forms of knowledge aimed at by the institution, making the
acculturation dimension of learning possible.
In 1989, even though the DEs produced by researchers had in many cases come
close to this ideal type, making them function outside the control of research seemed
difficult. Moreover, much attention was paid to the innovative situations designed for
introducing new mathematical ideas or overcoming epistemological obstacles,² and
much less to the more standard situations used for consolidating mathematical
knowledge and techniques. This imbalance created a distorted vision of DE products
that certainly had a negative impact on the quality of their dissemination.
In 2009, 20 years later, DE was once again a theme for the summer school, in fact
its sole theme. Since 1989, the didactic field had substantially evolved. The
anthropological theory of didactics that was just emerging in the late 1980s had
matured and gained in influence. Moreover, in the last decade, it had created its own
design approach in terms of activities of research and study and then programs of
study and research (Chevallard 2006, in press). A new theoretical framework had
also emerged from the theory of didactical situations and the anthropological theory
of didactics: the theory of joint action between teachers and students, proposing a
renewed vision of the role of the teacher and of student-teacher interactions
(Sensevy 2011, 2012). More generally, teachers’ practices and professional development
had become a focus of research, and this research had developed its own methodolo-
gies involving naturalistic and participative observations of classrooms. DE was
still an important research methodology, especially each time the didactical systems
one wanted to study could not be observed in natural conditions (as is for instance
often the case in research about technology), but was no longer the privileged
methodology (Artigue 2002, 2009). Didactical engineering had also migrated outside
its original habitat. It had been extended to teacher education and to the study
² The notion of epistemological obstacle, introduced by the philosopher Gaston Bachelard, was
imported into the educational field by Guy Brousseau (1983) to express the fact that the development
of mathematical knowledge necessarily faces obstacles, due to prior forms of knowledge that
were relevant and successful in specific contexts. Epistemological obstacles are those attested in
the historical development of knowledge, and having played a constitutive role in this develop-
ment. Their identification may help understand students’ resistant errors and difficulties. Schneider
(2014) provides a synthetic presentation and discussion of the notion, its development and use in
mathematics education research.
Preliminary analyses set the background for the conception phase of the process.
They cover different dimensions, and especially the three following:
• An epistemological analysis of the content at stake, often including an historical
part. This analysis helps researchers to fix the precise goals of the DE and to
identify possible epistemological obstacles to be faced. It also supports the
search for mathematical situations representative of the knowledge aimed at,
what the theory of didactical situations calls fundamental situations. These are
problematic situations for the solving of which this knowledge is necessary or in
some sense optimal. The epistemological analysis helps the researchers to take
the necessary reflective position and distance with respect to the educational
world they are embedded in, and to build a reference point.
• An institutional analysis whose aim is to identify the characteristics of the
context in which the DE takes place, the conditions and constraints it faces.
These conditions and constraints may be situated at different levels of what is
called the hierarchy of levels of co-determination (Chevallard 2002) in the
anthropological theory of didactics. They may be attached to curricular choices
regarding the content at stake and associated teaching practices, to more general
curricular characteristics regarding the teaching of the discipline, the (technological)
resources accessible, the evaluation practices and the school organization.
They can also be linked to the characteristics of the students and teachers
involved, to the way the school is connected with its environment… Depending
on the precise goals and context of the research, the importance attached to these
different levels may of course vary.
• A didactical analysis whose aim is to survey what research has to offer regarding
the teaching and learning of the content at stake, and is likely to guide the design.
The three dimensions organizing the phase of preliminary analyses reflect the
systemic perspective underlying DE as a research methodology. Each of them has its
methodological specificities and needs. The epistemological analysis often involves
the use of historical sources and not just secondary sources; the institutional analysis
also generally includes an historical dimension. As made clear by the theory of
didactical transposition (Bosch and Gascón 2006), curricular organizations and
choices are the result of a long-term historical process; they cannot be understood
just by analyzing current curricula, official documents and textbooks. Such under-
standing is needed for making clear the strength of the constraints faced and the way
some of these can be moved in the design. The didactical analysis generally has a
substantial cognitive dimension, but this cognitive dimension is only one part of the
global picture even if what is aimed at is the development of a didactical strategy
allowing students to learn better some part of mathematics.
It must also be pointed out that, according to the precise goals of the research,
what exactly is investigated in these dimensions, and the respective importance
attached to each of them may vary substantially.
³ Among the many variables influencing the possible dynamics of a situation and its learning
outcomes, didactical variables are those under the control of the teacher. In a situation of
enlargement such as the well-known “Puzzle situation” by Brousseau, the number of pieces of the
puzzle, their shapes and dimensions, and the ratio of enlargement are didactical variables; the fact
that students work in groups, each student being asked to enlarge one piece of the puzzle, is also a
didactical variable.
During the realization phase, data are collected for the analysis a posteriori.
The nature of the collected data depends on the precise goals of the DE, on the
hypotheses put to the test in it and on the conjectures made in the a priori analysis.
However, particular attention is paid to the collection of data allowing the researcher
to understand students’ interaction with the milieu, and up to what point this interac-
tion supports their autonomous move from initial strategies to the strategies aimed
at, and to analyze devolution and institutionalization processes. The data collected
generally include the students’ productions (including computer files when technology is
used), field notes from observers, audio recordings and, more and more, videos covering group
work and collective phases. The data collected during the realization are generally
complemented by additional data (questionnaires, interviews with students and
teacher, tests) allowing a better evaluation of the outcomes of the DE. During the
realization, researchers are in the position of observers. It is important to point out
that the realization often leads to some adaptation of the design, either in the course
of the realization itself, especially when the DE is of substantial size, or from one
realization to the next when several realizations are planned in the research project. In that
case, adaptations are of course documented together with the rationale for them and
taken into account when the a posteriori analysis is carried out.
by the global evolution of the field and also by the technological evolution. In
general, researchers combine and triangulate different scales of analyses. They
more and more include microscopic analyses taking into account the multimodality
of the semiotic resources used by students and teachers that technology makes
accessible today. To this should be added that, as mentioned above, the validation of
the research hypotheses generally combines the analysis of data collected during the
classroom sessions themselves and of complementary data.
It must be stressed that the results obtained through this methodology are mainly
local, contextualized, and generally in the form of existence theorems in their positive
forms. For instance, in the research I developed about the teaching of differential
equations in the mid-1980s (Artigue 1992, 1993), I used DE methodology to inves-
tigate the possibility of combining qualitative, algebraic and numerical approaches
to the solving of ordinary differential equations in a university mathematics course
for first year students. The research showed the possibility of organizing such a
course in the French context, at that time, with the support of technological tools; it
made clear what could be expected from such a course in terms of learning
outcomes in this particular context and why. Beyond that, one important result was
that a condition for the viability of the course was the acceptance by the didactical
system of proofs based on specific graphical arguments, which violated the usual
didactical contract⁴ regarding proofs in Analysis at university. The difficulty of
ensuring this acceptance outside experimental contexts and research control at that
time hindered a large-scale dissemination of the developed didactical strategy,
despite the fact that its robustness had been attested by realizations carried out with
different categories of students. These results were certainly interesting but could
not be generalized without precaution to another educational context. However, it
would be unreasonable to consider that the results of this engineering work were
limited to what we have summarized above.
As evidenced by the further use of this work by different researchers, the pre-
liminary analyses carried out had a more general value, as well as the understanding
gained on:
• the students’ cognitive development in this area;
• the role played in it by the interaction between the quantitative and the qualitative,
between algebraic and graphical representations;
• the affordances of technological tools for approaching the qualitative study of
differential equations;
⁴ The notion of didactical contract is a fundamental notion in the theory of didactical situations
(Brousseau 1997). It expresses the mutual expectations, partly explicit but mainly implicit, of
students and teacher regarding the mathematical knowledge at stake in a given situation. The rules
of the didactic contract often become visible when they are transgressed by one actor or another.
I will finish this section by situating didactical engineering with respect to design-
based research, using the definition of it provided in the Encyclopedia of mathematics
education (Swan 2014, p. 148):
Design-based research is a formative approach to research, in which a product or process
(or ‘tool’) is envisaged, designed, developed and refined through cycles of enactment,
observation, analysis and redesign, with systematic feedback from end users. In education,
such tools might, for example, include innovative teaching methods, materials, professional
development programs, and/or assessment tasks. Educational theory is used to inform the
design and refinement of the tools, and is itself refined during the research process. Its goals
are to create innovative tools for others to use, describe and explain how these tools
function, account for the range of implementations that occur, and develop principles and
theories that may guide future designs. Ultimately, the goal is transformative; we seek to
create new teaching and learning possibilities and study their impact on teachers, children
and other endusers.
⁵ The didactical and ludic contract is defined as the set of rules that, implicitly or explicitly, fixes
the respective expectations and regulates the behaviour of one educator and one or several participants,
in a project combining ludic and learning aims.
This definition makes clear that design-based research and DE have some common
methodological characteristics. Both methodologies are organized around the
design of some educational tool; this design is informed by educational theory, but
also contributes to its development. Moreover, both methodologies reject standard-
ized validation processes based on the comparison of experimental and control
groups through a pre-test/post-test system. However, differences are visible. The
global vision underlying design-based research is that of mathematics education as
a design science whose aim is the controlled production of educational tools
(Wittmann 1998; Collins 1992); the global vision underlying DE is of didactics of
mathematics as a fundamental science, whose aim is the understanding of didactical
systems and didactical phenomena, and which has also of course an applied
dimension. This fundamental difference is reflected in methodological characteristics.
Design-based research is interventionist and iterative in nature, and the cyclic nature
of its process is essential. Over the successive cycles, the design is refined but
also tested in wider contexts to study how it functions with different
categories of users not involved in the design process, and what adaptations may be
necessary for its large-scale use. Didactical engineering as a research methodology
does not obey the same pattern. It is more a “phénoménotechnique” with the meaning
given to this term by Bachelard (1937), a tool for answering didactical questions,
for identifying, analyzing and producing didactical phenomena through the con-
trolled organization of teaching experiments. This is the reason why the preliminary
analyses with their different dimensions and the a priori analysis are a central
part of the research work, and are given so much importance in the articles referring
to this methodology. Of course, this does not mean that a DE used in research is
built from scratch: previous constructions, when they exist, are used to inform the
a priori analysis. The process, however, is not theorized as a cyclic process. Moreover,
questions of robustness and up-scaling are considered a matter of development.
I come back to this point in the last section of this chapter, but first illustrate the
ideas developed up to now with two examples.
The first example I will consider is the paradigmatic example of the didactical
engineering developed by N. and G. Brousseau, more than three decades ago, for
extending the field of whole numbers towards rationals and decimals (Brousseau and
Brousseau 1987, English version: Brousseau et al. 2014). This engineering, which
ranges over 65 classroom sessions, is a very big object when compared with usual
constructions, whose size is much more limited. I cannot go into its details
but would like to show how this construction is characteristic of a DE piloted by the
theory of didactical situations.
This construction shows first the importance attached to the preliminary analyses,
and especially to their epistemological and didactical dimensions, the initial realizations
having taken place in the COREM (see footnote 6), where the institutional pressure was
reduced. These analyses led Brousseau to question the usual educational strategy
for extending the field of whole numbers. Usually, the first step was the
introduction of decimal numbers in connection with changes in units in the metric
system, and fractions played a more marginal role. Emphasis was put on the continuity
between the two systems of numbers (whole numbers and decimals), especially
regarding the techniques for arithmetic operations, and the persistent cognitive
difficulties that these strategies generated or reinforced were increasingly documented
by research. Brousseau made the hypothesis that, in their last years at elementary
school, students were able to learn much more about rational and decimal numbers,
for instance to differentiate the dense order of rational and decimal numbers from
the discrete order of whole numbers, to appreciate the computational interest of
decimal numbers and the possibility that this system offers for approaching rational
numbers with arbitrary levels of precision. The didactical engineering developed
aimed at testing the validity of this hypothesis with ordinary students.
The epistemological analysis carried out inspired the first macro-choice, in a clear
break with established practices: to extend first the field of numbers towards
rational numbers, and then to single out decimal numbers among these for the
ease they offer in terms of comparison, estimation and calculation. Regarding
the introduction of rational numbers, another macro-choice was made linked to the
identification of two different conceptions for rational numbers: a conception in
terms of partition of the unit (1/n is then associated with the partition of one unit
into n equal parts and the rational m/n represents m such pieces of the unit) and
a conception in terms of commensurability, which corresponds to the search for a
common multiple of two different magnitudes, for instance two lengths (the ratio of
two magnitudes is expressed by the rational m/n if m times the second one equals n
times the first one). Generally, didactical strategies favor the first conception in
the context of pizza parts or other equivalent contexts. This constitutes an easy
entry into the world of fractions, but Brousseau hypothesized that it could contribute
to the observed cognitive difficulties. This led him to explore the potential offered
6 COREM was the Center for observation and research in mathematics education created by
Brousseau in Bordeaux in 1973. An experimental elementary school was attached to this center,
with very advanced means for systematic data collection and storage. The data collected there over
more than 20 years are still studied by researchers, for instance, in the frame of the national project
VISA (http://visa.ens.lyon.fr). Detailed information is accessible at the following url:
http://guy-brousseau.com/le-corem/acces-aux-documents-issus-des-observations-du-corem-1973-1999/
of the interaction with the milieu, it is thus necessary that some knowledge about
proportional reasoning be part of the mathematical knowledge shared by students.
In the a priori analysis, this knowledge is assumed of the generic student. For
instance, if the task is to compare the types of paper corresponding to the two couples
mentioned above, one can develop the following reasoning: for the first paper, 2 mm
should correspond to 54 sheets, and 54 is more than 40, thus the second paper is
thicker. For close thicknesses, comparison may be more delicate for the reasons
mentioned above, and several exchanges of messages might be needed.
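The comparison reasoning above can be sketched in a few lines of Python. This is only an illustration: the two couples used here, (27 sheets, 1 mm) and (40 sheets, 2 mm), are an assumption chosen to be consistent with the figures quoted in the reasoning (54 sheets for 2 mm against 40 sheets), and the function name is hypothetical.

```python
from fractions import Fraction


def thicker_paper(first, second):
    """Compare two papers given as (number_of_sheets, pile_thickness_mm).

    Returns "first", "second" or "equal" depending on which paper has
    the thicker sheet. Fractions keep the comparison exact.
    """
    sheets_1, mm_1 = first
    sheets_2, mm_2 = second
    # Thickness of one sheet = pile thickness / number of sheets.
    per_sheet_1 = Fraction(mm_1, sheets_1)
    per_sheet_2 = Fraction(mm_2, sheets_2)
    if per_sheet_1 > per_sheet_2:
        return "first"
    if per_sheet_1 < per_sheet_2:
        return "second"
    return "equal"


# Hypothetical couples, consistent with the reasoning in the text:
# 27 sheets for 1 mm scale to 54 sheets for 2 mm, and 54 > 40.
paper_a = (27, 1)
paper_b = (40, 2)
print(thicker_paper(paper_a, paper_b))  # second
```

Comparing the fractions directly is equivalent to the students' strategy of bringing both piles to the same thickness and comparing the numbers of sheets.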
What is mathematically at stake in the solving of this problem is the ordered
structure of rational numbers seen as couples of whole numbers or more appropriately
families of such couples, and the conception attached is clearly the commensurability
conception. As shown by the many realizations carried out, substantial work can be
developed in this context about equality and order of rational numbers: students can
progressively discover a good number of properties in a-didactic interaction with
the successive milieus organized for them, validate them pragmatically using piles
of paper, and then use piles of paper more metaphorically for supporting computations
and reasoning. However, the mathematical knowledge built still remains attached to
this specific context. There is no reason for the notations introduced by students
and progressively refined for reasons of economy and efficiency to be the conventional
notations. It is the responsibility of the teacher to decide when to connect these
classroom notations to the usual ones expected by the institution, and also to organize
the decontextualization of knowledge through appropriate situations. Of course, in
the DE, these steps are also carefully designed.
In this DE, the same context is then used for extending addition to these new
numbers. However, it does not allow multiplication to be extended to rational numbers in
a similar way. For this extension, the choice is made to favor a conception of
multiplication as an external operation in terms of linear mapping, for which the
well-known situation of the puzzle is the associated fundamental situation. With this
new situation, it is also expected to make students face the epistemological obstacle
of the additive model.
I cannot enter into more detail about this DE, which is structured in four main phases, and invite
the interested reader to consult the references mentioned above or the retrospective
analysis provided by Brousseau and Brousseau (2007). In the description above,
I have focused on the essential phases of design and a priori analysis of the methodology,
trying to show how they were informed by the preliminary analyses and
guided by the theory of didactical situations. The experimentations took place in
the experimental school attached to the COREM; the sessions were observed by
researchers according to specific guidelines and systematically video-recorded. The
comparison of the a priori and a posteriori analyses, together with the complementary tests taken
by the students, validated the hypotheses underlying the DE.
This DE was used year after year in the experimental school attached to the
COREM. More than 750 students were exposed to it and its robustness was
confirmed. However, as often stressed by Brousseau himself, it was never considered
that it could be easily implemented in ordinary schools and become a standard
teaching strategy. Moreover, the comparison of the successive dynamics attracted
Brousseau’s attention to the fact that the reproduction of the same situations, year
after year, by a teacher generated what he called a phenomenon of obsolescence
affecting the internal reproducibility of the DE. This phenomenon more globally
raised the issue of the reproducibility of didactical situations that was theorized in
further work (Artigue 1986).
It must also be stressed that this DE was in fact used to address a variety
of research questions, for instance to investigate dependences between
conceptions (Ratsimba-Rajohn 1982). In his doctoral thesis, indeed, Ratsimba-Rajohn,
starting from the two strategies for associating a rational measure to a
magnitude mentioned above (commensurability and partition of the unit), precisely
differentiated these in terms of situations of effectiveness and mathematical
knowledge engaged. This analysis led to the identification of a set of nine variables
conditioning the effectiveness and cost of each strategy, depending on the type of
task (game in the terminology used by the author, in line with the use of game theory
in the theory of didactical situations). The author used this tool for investigating
how students introduced to rational measures through the commensurability strategy,
as was the case in the DE, could enrich their strategies by incorporating the
partitioning strategy, a priori more intuitive and socially used. For that purpose, a
sequence of three situations was designed as part of the DE. In the first situation, the
commensurability strategy was extended to other magnitudes (length, weight,
capacity); in the second situation, the tasks proposed were out of the domain of
effectiveness of the commensuration strategy but could be solved using the partition
strategy (see footnote 7). The goal of the third situation was to initiate the validation of equivalence
of the two models when both strategies are effective. The corresponding lessons
were implemented in two consecutive years. Students’ strategies and their evolution
were carefully documented. Different dynamics were identified. The most striking
result was the difficulty that these students had in moving from commensuration
strategies to partition strategies, even when commensuration was ineffective. These
difficulties were confirmed by the evolution of students’ answers on a test taken by
the students before and after the teaching sequence in the first year of experimentation.
All students significantly progressed in their answers to questions that favored the
commensuration strategy or were neutral; only one student progressed on questions
blocking the commensuration strategy. Difficulties met in using commensuration
and efforts made for overcoming these difficulties in fact tended to reinforce this
strategy and the associated conception of rational numbers; more was needed for
7 This is the case for instance when pupils are asked to find a rational measure for a stick, a unit
stick being provided, but the limitation of the physical space and material provided does not allow
them to implement the strategy of commensuration.
In this DE, we still observe the same attention paid to preliminary analyses.
Maschietto developed a detailed analysis of the different perspectives that can be
attached to a function (punctual, local, global), of the idea of local straightness, and
of thinking modes in Analysis. Her epistemological analyses also aimed at understanding
how, before the official introduction of the concept of limit, the language
of infinitesimals could support the transition from Algebra towards Calculus, fostering
the identification of rules for computations taking into consideration the
The conception phase of the DE relied on these preliminary analyses. In the first situation,
students were asked to consider six different functions and, after entering them
in the calculator and getting their graphical representation in the standard window, to
make successive zooms around particular points and to explore what happened.
They were also asked to sketch the initial representation and those obtained after
two zooms and at the end of the exploration (when they had the feeling that the
graphical representation was more or less stable), before moving to another function.
8 For instance, taking into account the fact that, in the neighbourhood of 0, the order of magnitude
of x² + x is the order of magnitude of x.
The number and characteristics of the proposed functions and the selected points are
evidently micro-didactical variables for this task. In the DE, the values of these were
chosen so that students first met differentiable functions, then faced a function not
differentiable at a point but having left and right derivatives (the function defined by
f(x) = −x³ − 2|x| + 4), a linear function and a function with a more complex behavior
(the function defined by f(x) = 4 + x·sin(1/x) for x ≠ 0 and f(0) = 4). It was hypothesized
that the first examples would lead students to perceptively identify the local
straightness phenomenon and to expect its emergence for further examples. The
examples of non-differentiable functions would then oblige them to realize that there
exist exceptions to this apparently common behavior and that these exceptions might
present different characteristics. It was also expected that the dynamic process of
zooming would give rise to discourses and metaphors able to support the further
mathematization of the perceptive phenomenon of local straightness. The drawings
asked of the students were expected to be a useful support for this emergence, and for
the substantial collective discussion at the end of the session. These drawings were
also data to be used for the a posteriori analysis. Moreover, for each function two
different points were selected to emphasize the local nature of the observed
phenomenon. Students worked in pairs with one calculator for each pair and one
common graphical production to deliver. This is a classical organization in DE for
fostering verbal exchanges and making these accessible to researchers.
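The effect of successive zooms on these functions can be imitated numerically. The sketch below is not part of the DE: it replaces the calculator screen by the slopes of the chords seen to the left and to the right of the chosen point in a shrinking window. The corner and oscillating functions are those given in the text; the differentiable function, the zoom factor and all names are assumptions made for illustration.

```python
import math


def corner(x):
    # Function from the text: not differentiable at 0,
    # but with left and right derivatives there.
    return -x**3 - 2 * abs(x) + 4


def oscillating(x):
    # Function from the text: f(x) = 4 + x*sin(1/x) for x != 0, f(0) = 4.
    return 4 + x * math.sin(1 / x) if x != 0 else 4.0


def smooth(x):
    # Hypothetical differentiable example, not taken from the DE.
    return x**2 + x


def side_slopes(f, x0, half_width):
    """Slopes of the chords seen left and right of x0 in a window."""
    left = (f(x0) - f(x0 - half_width)) / half_width
    right = (f(x0 + half_width) - f(x0)) / half_width
    return left, right


def zoom_report(f, x0, zooms=6, factor=4.0, half_width=1.0):
    """Return the (left, right) slopes after each successive zoom."""
    report = []
    for _ in range(zooms):
        half_width /= factor
        report.append(side_slopes(f, x0, half_width))
    return report


for name, f in [("smooth", smooth), ("corner", corner),
                ("oscillating", oscillating)]:
    left, right = zoom_report(f, 0.0)[-1]
    print(f"{name:12s} left slope {left:+.4f}  right slope {right:+.4f}")
```

For the differentiable function the two slopes converge to the same value (one line), for the corner they converge to two different values (two half-lines), and for the oscillating function they do not stabilize at all, which matches the three kinds of behavior the task was designed to expose.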
The aim of the second situation was the mathematization of this perceptive phenomenon.
A differentiable function was selected, different from those already considered,
and a particular point of its graphical representation. Students were asked to
check its local behavior around this point and to find the equation of the line they
had obtained on the screen. It was hypothesized that the different groups would manage
the zooming process in different ways and stop it at different times, obtaining thus
close but different lines. Using the Trace command or numerical values from the
Table application of the calculator for getting coordinates of a second point of their
line, they would thus get different equations. At this stage, it was planned that the
teacher would collect and write on the blackboard all these equations and would
launch a collective discussion. It was hypothesized that the sight of these
close but different equations would lead students to consider all these lines as approximations
of one ideal object: the tangent to the curve, whose equation they could conjecture
from the equations written on the blackboard. The validation of this conjecture was
not supposed to result from mere a-didactical interaction with the milieu. In the
scenario for this session, it was planned that the teacher would ask students to find
a common way of expressing the different computations and that, if this was not
spontaneously proposed by them, she would introduce the idea of accounting for
the commonalities between these different calculations through the use of a letter h
representing the different small increments chosen by the students. From this point
a collective computation was expected to lead to an equation for the line depending
on h, but becoming the ideal equation when h was made equal to 0 (in some sense
when infinite zooming was performed). This should allow the teacher to
institutionalize both the definition of the tangent to a curve at a given point in terms of
linear approximation, and the specific type of computation that allowed finding its
equation. For this second situation, the characteristics of the function and of the
point were the main micro-didactical variables of the task. In the DE, two different
choices were successively made: a polynomial function of degree 2 and then one of
degree 3, with simple coefficients, and a point whose coordinates were such that
the ideal equation could be easily conjectured. Choosing a polynomial function and
using the letter h in the symbolic computation resulted in the equation of the line
being described by a polynomial in h (after simplification by h), which made the reasoning
easier. Choosing a polynomial of degree 3 meant that the algebraic strategy these
students knew for finding tangents to conics no longer worked. Once again
students worked in pairs. In the third situation, it was planned to begin to consolidate
the form of computation that had been introduced and also to connect this conception
of the tangent in terms of approximation with the geometric and
algebraic conceptions mentioned above, reinforced in grade 10 through the work with conics.
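The symbolic computation with the letter h described for the second situation can be sketched as follows. The polynomial and the point used here are hypothetical (the source does not state which ones were chosen in the DE), and the function names are illustrative. The code expands f(a + h), removes f(a) and simplifies by h, which yields the slope of the students' line as a polynomial in h.

```python
from math import comb


def secant_slope_in_h(coeffs, a):
    """Coefficients (in powers of h) of (f(a + h) - f(a)) / h.

    coeffs[k] is the coefficient of x**k in the polynomial f.
    Entry j of the result is the coefficient of h**j.
    """
    degree = len(coeffs) - 1
    # Expand f(a + h) with the binomial theorem, collecting powers of h.
    expanded = [0] * (degree + 1)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            expanded[j] += c * comb(k, j) * a ** (k - j)
    # expanded[0] equals f(a), so it cancels in f(a + h) - f(a);
    # dividing by h then shifts every remaining power down by one.
    return expanded[1:]


def evaluate(poly_in_h, h):
    """Value of the slope polynomial for a given increment h."""
    return sum(c * h**j for j, c in enumerate(poly_in_h))


# Hypothetical choice: f(x) = x**3 - x at the point a = 1.
slope = secant_slope_in_h([0, -1, 0, 1], 1)
print(slope)               # [2, 3, 1], i.e. 2 + 3h + h**2
print(evaluate(slope, 0))  # 2, the slope of the tangent at a = 1
```

Evaluating the same polynomial for small nonzero values of h gives the close but different slopes that different groups would obtain by stopping the zooming process at different times, while h = 0 gives the slope of the ideal line.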
As mentioned above, it was hypothesized that during the three sessions, the students
would combine gestures with the use of language and different semiotic representations
for making sense of the situations and exchanging with other students and the teacher.
However, the exact forms these combinations would take, and the language that
students were likely to introduce for qualifying local straightness, were not anticipated.
From that point of view, the DE had a more exploratory purpose.
Each session lasted 90 min and combined a phase of autonomous work by the
students and a phase of collective discussion. The a priori analysis of each session was structured in
the thesis around the following dimensions:
• the preparation of students’ worksheets and analysis of them in terms of
mathematical content, pre-requisites, didactical variables;
• the analysis of the role to be played by graphic and symbolic calculators in each
phase of the session;
• the analysis of the work expected from the students, the anticipation of possible
strategies and difficulties;
• the analysis of the work expected from the teacher in each phase of the session,
and of the distribution of responsibility expected between students and teacher.
The collected data consisted of students’ worksheets and productions, videos of one
particular group and of collective phases, observation notes for different groups
(two or three depending on the experimentation) according to guidelines defined in
the a priori analysis. A test taken by students 2 weeks after the teaching experiment
and a questionnaire filled in by them regarding their participation in this experience
were added. The semiotic perspective impacted the collection of data (those in
charge of video-recording for instance tried to capture students’ and teacher’s gestures
as much as possible) and the a posteriori analysis of the sessions.
The a posteriori analysis of each session combined two levels. The first level
presented a global analysis of the session in its relation to the a priori analysis
(regarding the scenario of the session, the distribution between group work and
collective discussions, the strategies developed by the students and the main characteristics
of their work, the difficulties observed, the teacher’s role…). The second
level was a fine-grained analysis of the data collected during the session elucidating
the conceptualization processes at stake and their characteristics, through the role
of the calculator, of metaphors, of discourse and gestures, of interactions between
students during group work and between students and teacher.
We illustrate this methodological work by a few examples taken from the
a posteriori analysis of the first session. For this session, the global analysis was
structured around four dimensions: the scenario, the localization of the perspective,
the emergence of the invariant and the role of the teacher. Regarding the localization
of the perspective for instance, the main elements taken into account in this global
approach were the characteristics of the graphical representations drawn by the
different groups. A specific list of codes had been developed in the a posteriori
analysis of the first experimentation, starting from students’ productions. It was
used again in the a posteriori analysis of the second and third experimentations.
These codes showed the expected evolution of representations along the zoom
process, but they also made evident the strength of the usual didactic contract
regarding graphical representations of functions and the difficulty most students
thus faced when the zooming process made the axes disappear.
The analysis of data for the observed groups and for the collective discussion then
combined different semiotic elements for clarifying the conceptualization processes
at stake and the characteristics of the situation that fostered these conceptualizations
(characteristics of the task, of the milieu and of social interactions). In particular,
discourse, inscriptions and gestures were tightly connected in the analysis.
In the a posteriori analysis, the different levels of analysis for one particular session
were then combined for testing the conjectures made in the a priori analysis
regarding this particular session. The same type of a posteriori analysis was made
for the three sessions, then the different results were synthesized and triangulated
with those resulting from the analysis of the final test and questionnaire.
The following two quotations by Maschietto (2008), in which the author gives a
synthetic vision of her research work, illustrate the form that these analyses
have taken. The first quotation (pp. 215–216) regards the emergence of the linear
invariant and an interesting phenomenon accompanying this emergence. This
phenomenon was not anticipated in the a priori analysis but it had a positive effect
on the dynamics of the situation.
DF “I want the other piece of function. It’s still a line! Draw at least one axis”
(addressed to MA. DF carries out the 3rd ZoomIn)
DF “We’ll stop here because it stays the same”.
In the pencil-and-paper environment (Fig. 17.1a), the linearity is emphasised by the use
of a ruler to draw the graphical representation that appears on the calculator display on the
third sheet (end of the exploration).
In other protocols (Exp_B and Exp_C), the students try to explain the end-point of their exploration,
for example: “REASON WHY WE STOPPED CARRYING OUT THE ZOOMS → The
more we used the ZoomIn, the more the curve sector considered tended to become a line”. We
observe here a dynamic language, that draws on the infinite approximation process.
In the protocols, there are two distinct phenomena, linked to the local point of view. The
first regards the strength of the “straight” nature at a perceptive level. The second regards
the interference of the global point of view with the local one. As far as the first phenomenon
is concerned, the comments (for example, Excerpt 2) on the exploration of the corner
(function y3 [see footnote 9]) highlight that at this stage the students have, in general, clearly identified the
graphic phenomenon “it becomes straight using the zoom”.
Excerpt 2: DAL-DF-MA group (Exp_A)
In all these cases the functions, even with the second zoom, are similar to a line with a
gradient ≥0 but:
– y4 [see footnote 10] is similar to a line only after the 4th zoom [Note: at x = 1/pi]
– y3 is similar to two lines (one with m > 0 and the other with m < 0)
However, this recognition does not allow them to distinguish the situation of the function
that is differentiable at the given point and that of the function having two different half
derivatives and leading to a corner. In fact, these situations, mathematically different, are
unified by their common “straightness” recognized at a perceptive level (Excerpt 2). The
second function does not therefore represent a counter-example, unlike what is hypothesized
in the a-priori analysis. Their distinction will only occur during the mathematization
9 y3(x) = −x³ − 2|x| + 4 at x = 0.
10 y4(x) = 4 + sin(1/x) at x ≠ 0, = 4 at x = 0
process of the linear invariant. The real counter-example is provided by the y4 function,
the graphical representation of which, after subsequent zooms, is perceptively different. In
this case there is no move from the “curve” category to the “straight” category, as happens
for all the other functions.
The second quotation (pp. 217–218) shows the importance attached to gestures
in the a posteriori analysis:
In accordance with the a-priori analysis, the activity presented to the students
shows its potential for the production of gestures and metaphors. These
appeared both during the communication inside the groups and during the
collective discussions. The analysis of the students’ protocols and the discussions
show that the conceptualisation of the zoom-controls, that supports the localisation
of the view, appears through gestures that accompany the explanation
of the exploration strategies and linguistic expressions that can be analysed in
terms of metaphors.
A particularly representative example is the analysis of the gestures of one
student, PM (Exp_A), while he is explaining the exploration of a graphical
representation. The ZoomIn control is used in order to see some of the characteristics
of the curve in a detailed way and is associated with a downward
movement meaning an “entrance into the curve,” that corresponds with moving
into the curve (ZoomIn gesture, Fig. 17.2a). The ZoomOut control, which is
used to obtain a bigger curve and to study its characteristics better, is associated
with an upward movement meaning an “exit from the curve” (ZoomOut
gesture, Fig. 17.2b), which also corresponds with moving away from the curve.
PM’s gestures lead the details of the curve to be interpreted as downwards and the
overall curve as upwards. PM also creates a space in front of him for controlling
these processes (the standard window of the calculator becomes a little rectangle
that is constructed by his fingers, Fig. 17.2c).
The reference to the ZoomOut control identifies the space under his eyes, while the palm of
one hand is associated with the flat part that is obtained from the ZoomIn. In this way, PM
has created his own space, which is suggested by the activity with the calculator, where the
two different transformations of the curve can co-exist and be controlled.
The realization took place in three different classes as mentioned above, with
some minor adjustments, and clear regularities were observed. Globally, the
hypotheses mentioned above were confirmed despite the fact that it was not possible
to cover all that had been planned and that, due to their previous experience
with conics, some groups conjectured very early that the line was the tangent and
favored an algebraic strategy for finding its equation, persisting in that strategy
with the polynomial of degree 3 in the second and third experiments. Some interesting
and non-anticipated phenomena also occurred but they did not necessarily invalidate
the a priori analysis. For instance, as shown in the first quotation above, it appeared
that most students considered that straight lines and curves were objects belonging to
different categories. This conception in fact helped them to consider that the linear
representations they obtained by zooming were not exactly linear but just very
close to a linear object, and that linearity could only be reached through an infinite
succession of zooms. This helped them to make sense of the notion of tangent as
an ideal object and of the computations carried out for finding its equation. This
conception nevertheless also led them to think that the function admitting only left
and right derivatives at a given point was not very different from the regular ones.
This question was considered again later on once the derivative was properly
defined. As expected also, gestures accompanied students’ verbalizations and work,
and the language and metaphors used by students showed evident embodiment.
They introduced their own expressions for qualifying the phenomenon of local
straightness, saying for instance that the functions were “zoomata lineare” at a particular
point and these were accepted and used by the teacher. Validation of the DE did not
rely only on the comparison of the a priori and a posteriori analysis of the sessions, but
also on the data from the questionnaire and interviews taken by the students after the
completion of the process as mentioned above.
I cannot enter into more details here. The interested reader can find these in the
references mentioned above. But I would like to stress a few points. According to
the author, this methodological construction is a DE and I fully agree with this position,
recognizing in it the fundamental features of DE presented above. This is nevertheless
a construction appreciably different from that described in the first example.
For instance, it is difficult to model the first situation as a game that students enter
with basic strategies that they must make evolve towards winning strategies.
Students are asked to stop their exploration when they have the feeling that the
graphical representations will no longer substantially evolve, which is a rather fuzzy
condition. Moreover, if the situations are designed in order to ensure productive
a-didactical interaction with the milieu, in the construction of the situations an
important role is given to collective discussions piloted by the teacher and to her
After considering these two examples, in the last part of the paper, we enter into
some recent developments of didactical engineering, referring more precisely to the
work carried out at the 2009 summer school.
As mentioned earlier, the anthropological theory of didactics has developed in
the last decade a design perspective based on the idea of Programme of Study and
Research (PSR in the following). At the 2009 summer school, Chevallard proposed
to re-found didactical engineering around this idea (Chevallard 2011). I will not follow
him that far but would like to situate Chevallard’s perspective with respect
to the vision of DE that has been presented in the first sections of this chapter, and
briefly explore some possible complementarities between these.
Through PSR, Chevallard wants to build a new epistemology opposing what he
calls the “monumentalistic” doctrine pervading contemporary school epistemology
(Chevallard 2006, in press). As explained by Chevallard (2006):
For every praxeology [see footnote 11] or praxeological ingredient chosen to be taught, the new epistemology
should in the first place make clear that this ingredient is in no way given, or a pure
echo of something out there, but a purposeful human construct. And it should consequently
bring to the fore what its raisons d’être are, that is, what its reasons are to be here, in front
of us, waiting to be studied, mastered, and rightly utilised for the purpose it was created to
serve. (p. 26)
11 The notion of praxeology is central in the anthropological theory of didactics that considers that
knowledge emerges from human practices and is shaped by the institutions where these practices
develop. Praxeologies, which model human practices, at the most elemental level (punctual praxeologies),
are defined as 4-tuples made of a type of task, a technique for solving this type of task,
a discourse explaining and justifying the technique (technology), and a theory legitimating the
technology itself.
In coherence with this vision, a PSR starts from the will to bring an answer to some
generating question. In fact, at the 2009 summer school, Chevallard distinguished
between different forms of PSR, and especially between finalized and open PSR. In
finalized PSR, the main praxeologies aimed at are known. They correspond for
instance to praxeologies aimed at by a given curriculum. The designer must find a
question or a succession of questions which are able to generate the encounter of the
corresponding types of tasks and the development of techniques and technological
discourse constituting these praxeologies. This is done by a combination of study of
existing works and inquiry processes. In open PSR, the situation is quite different.
There is a generating question but the praxeological equipment needed for answering
it is not known a priori; nor is it necessarily limited to mathematical praxeologies.
This is for instance often the case in project work, and modeling activities.
Even in the case of finalized PSR, the proposed vision however is at some distance
from the forms of DE mentioned above, especially in what concerns the milieu and its
evolution. This is notably due to the place given to cultural answers to the question at
stake in PSR. In the didactical schema that Chevallard proposes (Chevallard in press),
a role is given to cultural answers or pieces of information accessible to the learners in
the media and especially on the Internet. It is supposed that such cultural answers or
pieces of information can enter the milieu on the initiative of the teacher or students and
that, duly studied and criticized, they should contribute to the elaboration of the
expected answer to the question at stake. In the anthropological theory of didactics,
this is encapsulated in the idea of media-milieu dialectics.
Differences with the classical vision of DE also concern more globally what the
researcher aims to optimize and control in the design phase and consequently
they affect the a priori analysis. This is especially the case for open PSR. For that
case Chevallard denies the possibility of an a priori analysis. He thus introduces
the idea of analysis in vivo, fully integrated into the inquiry work. This position can
be questioned all the more as the publications of researchers working within this
perspective show that they develop some form of a priori analysis to select questions
having a strong generating power under the institutional conditions and constraints
at stake. What is clear, however, is that, for such open PSR, in the a priori analysis
researchers are more interested in investigating the didactical potential of the
selected question, trying to make clear how its study can develop and generate
new and interesting questions, motivate the study and progressive structuring of
important praxeologies, than in the optimization of students’ learning trajectories.
In fact, the a priori analysis becomes an on-going process that develops and adjusts
along the implementation phase of the DE. The doctoral thesis by Barquero (2009;
see also Barquero et al. 2008), which analyzes the design and implementation of a PSR
devoted to the modeling of population dynamics with undergraduate students,
provides a good example of such functioning.
There is no doubt that, from a DE perspective, the notion of open PSR makes it
possible to address research issues attached to the functioning and viability of
didactical forms more open than those usually addressed by existing DE such as
project work and modeling activities. These didactical forms still have a marginal
position in educational systems but they are also more and more encouraged as
492 M. Artigue
12 See the portal www.scientix.eu for information about these projects.
At the second level, the goal is the study of the adaptability of such validated
situations to ordinary classrooms and teachers through the negotiation of the DE with
teachers who have not been involved in the first phase. These negotiations and the
transformations introduced by the teachers involved in this second phase are taken
as objects of study together with their impact on the DE itself and its outcomes. It is
expected that the results allow researchers to determine what concessions can be
made in such negotiations, what should be preserved and why, and to identify what
forms of control can be maintained.
As Perrin-Glorian points out, envisaging this second level in fact modifies the
first level, because it obliges researchers to move from a top-down conception of
the transmission of research results to a much more dialectical idea of adaptation.
As she adds:
The problem is no longer to control and disseminate engineering products coming from
research but to determine the key variables, in terms of knowledge involved, piloting the
didactical engineering that one wants to make a resource for ordinary teaching, and to study
the conditions of their dissemination. (p. 69, our translation)
She then illustrates this vision by an example regarding the teaching of axial
symmetry at the transition between elementary school and junior high school.
This reflection in fact points out that the transition from research to development
needs specific forms of research, extending our view of the ways didactical engineering
and educational research can be connected.
17.6 Conclusion
In this chapter, I have tried to present didactical engineering, focusing on its dimension
of research methodology. To help readers make sense of this methodology, I have
reviewed its history from its emergence in the early 1980s until now. I have tried to
clarify its main characteristics and to show that this methodology, even if it has
been shaped by the values and constructs of the theory of didactical situations, is a
methodology that can be productively used beyond the frontiers of this theory, and
is enriched by the different uses made of it. I have also tried to show that, as for
many other constructs in educational research, didactical engineering is a living and
dynamic concept which adapts to the evolution of the field, to the advances of educational
knowledge, and to the evolution of the social and cultural contexts of mathematics
education. I also hope to have made clear that this methodology, although flexible,
imposes a systemic view of the field, a view of the classroom as a social organization,
of learning as a combination of adaptation and acculturation processes and a particular
sensitivity to the discipline and its epistemology.
References
Brousseau, G., Brousseau, N., & Warfield, V. (2014). Teaching fractions through situations: A
fundamental experiment. New York: Springer. doi:10.1007/978-94-007-2715-1.
Burkhardt, H., & Schoenfeld, A. H. (2003). Improving educational research: Toward a more
useful, more influential, and better-funded enterprise. Educational Researcher, 32(9), 3–14.
Cantoral, R., & Farfán, R. (2003). Mathematics education: A vision of its evolution. Educational
Studies in Mathematics, 53(3), 255–270.
Castela, C. (1995). Apprendre avec et contre ses connaissances antérieures. Recherches en
Didactique des Mathématiques, 15(1), 7–47.
Chevallard, Y. (1982). Sur l’ingénierie didactique. Preprint. Marseille: IREM d’Aix Marseille.
http://yves.chevallard.free.fr/spip/spip/article.php3?id_article=195. Accessed 28 Apr 2013.
Chevallard, Y. (2002). Organiser l’étude. In J. L. Dorier, M. Artaud, M. Artigue, R. Berthelot, &
R. Floris (Eds.), Actes de la Xème Ecole d’été de didactique des mathématiques (pp. 3–22,
41–56). Grenoble: La Pensée Sauvage.
Chevallard, Y. (2006). Steps towards a new epistemology in mathematics education. In M. Bosch
(Ed.), Proceedings of the IVth congress of the European society for research in mathematics
education (CERME 4) (pp. 22–30). Barcelona: Universitat Ramon Llull Editions.
Chevallard, Y. (2011). La notion d’ingénierie didactique, un concept à refonder. Questionnement
et éléments de réponse à partir de la TAD. In C. Margolinas, M. Abboud-Blanchard, L. Bueno-
Ravel, N. Douek, A. Fluckiger, P. Gibel, F. Vandebrouck, & F. Wozniak (Eds.), En amont et en
aval des ingénieries didactiques (XVe école d’été de didactique des mathématiques,
pp. 81–108). Grenoble: La Pensée Sauvage Editions.
Chevallard, Y. (in press). Teaching mathematics in tomorrow’s society: A case for an oncoming
counter paradigm. Regular lecture at ICME-12 (Seoul, 8–15 July 2012).
http://www.icme12.org/upload/submission/1985_F.pdf. Accessed 28 Apr 2013.
Cobb, P. (2007). Putting philosophy to work: Coping with multiple theoretical perspectives. In
F. Lester (Ed.), Second handbook of research on mathematics teaching and learning (pp. 3–38).
Greenwich: Information Age.
Collins, A. (1992). Towards a design science in education. In E. Scanlon & T. O’Shea (Eds.),
New directions in educational technology (pp. 15–22). New York: Springer.
Defouad, B. (2000). Etude de genèses instrumentales liées à l’utilisation d’une calculatrice
symbolique en classe de première S [Study of instrumental genesis in the use of a symbolic
calculator in grade 11]. Doctoral thesis, Université Paris 7.
Design-Based Research Collaborative. (2003). Design-based research: An emerging paradigm for
educational enquiry. Educational Researcher, 32(1), 5–8.
Douady, R. (1986). Jeux de cadres et dialectique outil-objet. Recherches en Didactique des
Mathématiques, 7(2), 5–32.
Falcade, R. (2006). Théorie des situations, médiation sémiotique et discussions collectives dans des
séquences d’enseignement qui utilisent Cabri-géomètre et qui visent à l’apprentissage des
notions de fonction et graphe de fonction [Theory of situations, semiotic mediation and collective
discussions in teaching sequences using Cabri-geometer and aiming to the learning of the ideas
of function and function graph]. Doctoral thesis, Université de Grenoble 1.
http://www-diam.imag.fr/ThesesIAM/RossanaThese.pdf. Accessed 28 Apr 2013.
Falcade, R., Laborde, C., & Mariotti, M. A. (2007). Approaching functions: Cabri tools as instru-
ments of semiotic mediation. Educational Studies in Mathematics, 66(3), 317–334.
Farfán, R. (1997). Ingeniería didactica y matemática educativa. Un estudio de la variación y el
cambio [Didactical engineering and mathematics education. A study of variation and change].
México: Grupo Editorial Iberoamérica.
Lakoff, G., & Nuñez, R. (2000). Where mathematics comes from: How the embodied mind creates
mathematics. New York: Basic Books.
Margolinas, C., Abboud-Blanchard, M., Bueno-Ravel, L., Douek, N., Fluckiger, A., Gibel, P.,
Vandebrouck, F., & Wozniak, F. (Eds.). (2011). En amont et en aval des ingénieries didactiques.
XVe école d’été de didactique des mathématiques. Grenoble: La Pensée Sauvage Editions.
The United States educational system is decentralized, and there is a long history of
the local control of schooling. Each U.S. state is divided into a number of independent
school districts. In rural areas, many districts serve fewer than 1,000 students
whereas a number of urban districts serve more than 100,000 students. In the context
of the U.S. educational system, urban districts are the largest jurisdictions in which it
is feasible to design for improvement in the quality of instruction (Supovitz 2006).
The federal government’s role in the educational system in the U.S. has increased
significantly in recent years following the passing of the No Child Left Behind Act
(NCLB) in 2001. States receive incentives to set standards for students’ mathematics
achievement, develop standardized assessments aligned with the standards, and imple-
ment accountability measures to promote increases in achievement for all students and
for specific sub-groups (e.g., racial and ethnic categories, socio-economic status,
students who receive special education services). Districts and schools are sanctioned
if they fail to meet goals for “adequate yearly progress” (AYP) on state assessments.
As a result, school districts are under great pressure to improve student achieve-
ment in mathematics. In addition to responding to accountability pressures, urban
school districts in the United States face a number of other challenges that impact
improvement initiatives. These challenges include limited financial resources,
under-prepared teachers, and high teacher turnover (Darling-Hammond 2007).
Unfortunately, most U.S. school districts do not have the capacity to respond to
these accountability demands in a productive manner (Elmore 2006). Many districts
are implementing short-term interventions aimed at “teaching to the test,” and some are
attempting to game the assessment system (Heilig and Darling-Hammond 2008). In
addition, districts frequently expend considerable resources on different (and even
conflicting) improvement policies, abandoning each for the next when student achieve-
ment does not improve quickly, without understanding the challenges of implementing
particular policies. This policy churn (Hess 1999) can cause frustration for teachers and
does not help the larger educational community understand how improvement in
student achievement can be supported at the scale of a large school district.
A minority of districts is responding to accountability demands by attempting to
improve the quality of classroom instruction. These districts are attempting to support
teachers’ development of high quality instructional practices that will ultimately
lead to improvement in student achievement (Elmore 2004). Concurrently, the role
of the principal is shifting from school manager to instructional leader, with
an increased responsibility to support instructional reforms in each content area
(Nelson and Sassi 2005; Fink and Resnick 2001). To date, efforts to support fundamental
improvements in teachers’ instructional practices on a large scale have rarely
been successful, and there are no proven models regarding how this can be accomplished
(Elmore 2004; Gamoran et al. 2003). Furthermore, although research on
mathematics teaching and learning has made significant advances in recent years,
500 E. Henrick et al.
these advances have had limited impact on the quality of instruction in most U.S.
classrooms. In addition, research in both mathematics education and in educational
policy and leadership can provide only limited guidance to districts attempting
to respond to high stakes accountability pressures by improving the quality of
mathematics instruction.
The four urban school systems, or districts, that we recruited for the MIST study
were all pursuing similar agendas for instructional improvement in mathematics.
These agendas were oriented by goals for students’ mathematics learning that are
relatively ambitious in the U.S. context. These system-level goals emphasized
students’ development of conceptual understanding as well as procedural fluency
in a range of mathematical domains, students’ use of multiple representations,
students’ engagement in mathematical argumentation to communicate mathematical
ideas effectively, and students’ development of productive dispositions towards
mathematics (U.S. Department of Education 2008; Kilpatrick et al. 2001; National
Council of Teachers of Mathematics 2000). These student learning goals in turn
oriented leaders of the four collaborating districts as they specified high-quality
mathematics instructional practices that could be justified in terms of student learning
opportunities (Kazemi et al. 2009). The resulting view of high-quality instruction
has been referred to in the U.S. as ambitious teaching (Lampert and Graziani 2009;
Lampert et al. 2010).
Ambitious teaching requires teachers to build on students’ solutions to challenging
tasks while holding students accountable to learning goals (Kazemi et al. 2009). Recent
research in mathematics education has begun to delineate a set of high-leverage
instructional practices that support students’ achievement of ambitious learning
goals (Franke et al. 2007; NCTM 2000). These practices include launching chal-
lenging tasks so that all students can engage substantially without reducing the
cognitive demand of tasks (Jackson et al. 2013), monitoring the range of solutions
that students are producing as they work on tasks individually or in small groups
(Horn 2012), and building on these solutions during a concluding whole-class
discussion by pressing students to justify their reasoning and to make connections
between their own and others’ solutions (Staples 2007; Stein et al. 2008). These
practices differ significantly from the current practices of most U.S. teachers, and
their development involves reorganizing rather than merely adjusting and elaborating
current practices. The learning demands for teachers include developing a deep
understanding both of the mathematics on which instruction focuses and of students’
learning in particular mathematical domains. In addition, they include developing
the new high-leverage instructional practices outlined above (e.g., launching cognitively-
demanding tasks effectively; orchestrating whole class discussions of students’
solutions that focus on central mathematical ideas).
18 Design Research for System-Wide Improvement 501
The agenda for instructional improvement that the four collaborating school
systems were pursuing is specific to the U.S. context and was influenced by the
recommendations of several professional organizations including the National
Council of Teachers of Mathematics (1989, 2000), and it is compatible with the
more recent Common Core State Standards Initiative (2010). Improvement efforts
in other countries might be oriented by a different vision of high-quality mathe-
matics instruction. The methodology that we describe will nonetheless be relevant
to all cases where instructional improvement goals involve significant teacher learning
and require teachers to reorganize rather than merely elaborate their current
classroom practices.
In the remainder of this chapter, we describe the key aspects of design studies
conducted to investigate and support system-wide improvement in mathematics
instruction. Although we draw on the MIST study to clarify the rationale for certain
tools and processes, our intent is to describe the methodology in broad terms.
The overall goal of design research at the level of an education system is to investigate
what it takes to support instructional improvement at scale (Bryk and Gomez 2008;
Coburn and Stein 2010; Roderick et al. 2009) by testing and revising conjectures
about school- and system-level supports and accountability relations. Design
studies of this type aim to both support and investigate the process of instructional
improvement at scale by documenting (1) the trajectories of (interrelated) changes
in the school- and system-level settings in which mathematics teachers work, their
instructional practices, and their students’ learning, and (2) the specific means by
which these changes are supported and organized across the system (Cobb and
Smith 2008).
Design studies of this type have two primary objectives. The first objective is
pragmatic, and is to provide leaders of the collaborating educational systems with
timely feedback about how their improvement strategies or policies are actually
playing out that can inform the ongoing revision of instructional improvement
efforts. The second objective is theoretical, and is to contribute to the development
of a generalizable theory of action (Argyris and Schön 1974) for system-wide
instructional improvement in mathematics by synthesizing findings across multiple
educational systems.
Design studies conducted at any level involve iterative cycles of designing to
support learning and of conducting analyses that inform the revision of the current
design. In contrast to studies conducted to investigate students’ learning, design
studies at the system level necessarily entail a partnership with system leaders. As a
consequence, cycles at this level also include a feedback phase in which researchers
share findings with system leaders who have the ultimate authority for making
decisions about improvement strategies. The cycles are much longer
than in other types of design research. For example, in the MIST study, each cycle
spanned an entire school year.
In the sections below, we describe the following aspects of the methodology:
1. developing an initial set of conjectures that comprise an initial theory of action
about school- and system-level supports and accountability relations;
2. recruiting collaborating educational systems;
3. employing an interpretative framework for assessing an educational system’s
designed and implemented instructional improvement strategies;
4. conducting successive design, analysis and feedback cycles by: (a) documenting
each collaborating system’s current improvement strategies, (b) collecting and
analyzing data on how those strategies are actually playing out, (c) sharing findings
and recommendations with system leaders in time to inform their revision of
improvement plans, and (d) assessing the influence of recommendations on the
collaborating system’s instructional improvement strategies;
5. testing and revising conjectures that comprise a theory of action for system-wide
instructional improvement based on ongoing feedback analyses, the current research
literature, and retrospective analyses of data collected in successive cycles.
The basic goal of a design study conducted at any level is to improve an initial design
for supporting learning by testing and revising conjectures inherent in the design
about the course of participants’ learning and the means of supporting their learning
(Cobb et al. 2003a). A key concern when preparing for a system-level design study
is therefore to develop an initial set of conjectures for what it would take to support
improvement in the quality of mathematics teaching across an entire system.
In the MIST study, we found it valuable to follow the basic tenets of design as
articulated by Wiggins and McTighe (1998) and develop initial conjectures by
mapping out from the classroom (cf. Elmore 1979–80). The first step in the process
is to specify explicit goals for students’ mathematical learning and an associated
research-based vision of high-quality mathematics instruction. The learning
demands for teachers can then be identified by comparing the vision of high-quality
mathematics instruction that constitutes the goal for teachers’ learning with their
current instructional practices.
The second step is to develop an initial, tentative, and eminently revisable theory
of action by formulating conjectures about both supports for teachers’ learning and
accountability relations that press them to improve their practices. These conjectures
should clearly attend to teacher professional development and to instructional mate-
rials and associated tools designed for teachers to use. However, it also proved
important in the MIST study to broaden our purview by considering other types of
possible support such as mathematics teacher collaborative meetings scheduled
during the school day, the colleagues to whom teachers turned for instructional
advice during the school day, and mathematics teacher leaders or coaches who were
charged with supporting teachers in their classrooms and during collaborative
meetings. In addition, research on school instructional leadership oriented us to
consider the role of principals and other school leaders in pressing and holding
teachers accountable for improving the quality of instruction.
It is important to note that conjectures about supports and accountability relations
for teachers’ learning typically have implications for the practices of members of
other role groups. For example, conjectures about the role of coaches in supporting
teachers’ learning have implications for the practices of system leaders responsible
for hiring coaches and for supporting their development of effective coaching
practices. Similarly, conjectures about school leaders’ role in communicating appro-
priate instructional expectations to teachers have implications for the practices of
others in the system who are charged with supporting them in deepening their
understanding of high-quality mathematics instruction.
In following this process of mapping out from the classroom in the MIST study,
it proved critical to balance the ideal with the feasible by taking account of each
collaborating system’s current capacity to support members of different role groups
in improving their practices. As we worked through this process of formulating
initial conjectures, we also found that the challenge of improving classroom instruc-
tion had implications for the practices of personnel at the highest levels of the four
collaborating systems. As a consequence, it proved essential to formulate testable
conjectures about the means of supporting the learning of mathematics teachers,
mathematics coaches, school leaders, and system leaders in a coordinated manner.
It also became apparent as we worked through this process that issues of mathemati-
cal content really matter. The mathematical learning goals for students have direct
implications for the vision of high-quality instruction and thus for the learning
demands on the teachers. These learning demands in turn have implications for
conjectures about supports and accountability relations for teachers’ learning, and
thus for the practices of personnel at all levels of the system.
Research on instructional improvement at the level of an educational system is
thin, and gets thinner the further one moves away from the classroom. In order to
formulate MIST conjectures about potentially productive school- and system-level
supports, we drew on the limited number of relevant empirical studies and conceptual
analyses available in the mathematics education literature on mathematics teaching,
professional development, and teacher collaboration (Kilpatrick et al. 2003; Cobb
and McClain 2001; Franke and Kazemi 2001; Gamoran et al. 2000; Kazemi
and Franke 2004; Little 2002; Stein et al. 1998; Coburn and Russell 2008) and the
literature on education policy and leadership that viewed policy implementation as
involving learning (Blumenfeld et al. 2000; Coburn 2003; McLaughlin and Mitra
2004; Stein 2004; Tyack and Tobin 1995).
The resulting conjectures specified school and district structures, social relation-
ships, and material resources that we anticipated might support mathematics teachers’
and instructional leaders’ ongoing learning. These conjectures assumed that the
district had adopted research-based, inquiry-oriented mathematics textbooks and
would provide sustained teacher professional development.
Based on our experience in the MIST study, two types of conceptual tools are
important when conducting investigations of this type. The first tool is a theory of
action for large-scale instructional improvement in mathematics that consists of
A distinction that proved useful in the MIST study when analyzing the strengths and
weaknesses of improvement strategies is that between intentional learning events
that are ongoing and those that are discrete. The two key characteristics of ongoing
intentional learning events are that they are designed as a series of meetings that
build on one another, and that they involve a relatively small number of participants.
As an example, a mathematics specialist might work regularly with middle-school
principals as a group in order to support them in recognizing high-quality mathematics
instruction when they make classroom observations. Because a small number of
participants is involved, the group might evolve into a genuine community of practice
that works together for the explicit purpose of improving their practices.
It is important to note that although communities of practice can be produc-
tive contexts for professional learning (Horn 2005; Kazemi and Hubbard 2008),
the emergence of a community of practice does not guarantee the occurrence of
learning opportunities that further policy goals (Bryk 2009). Recent research in
both teacher education and educational leadership indicates the importance of
interactions among community members that focus consistently on issues central
to practice (Marks and Louis 1997) and that penetrate beneath surface aspects of
practice to address core suppositions, assumptions, and principles (Coburn and
Russell 2008). This in turn suggests the value of one or more members of the
community having already developed relatively accomplished practices so that
they can both push interactions to greater depth (Coburn and Russell 2008) and
provide concrete illustrations that ground exchanges (Penuel et al. 2006). The
critical role of expertise in a community of practice whose mission is to support
participants’ learning is consistent with the importance attributed to “more
knowledgeable others” in sociocultural accounts of learning (Bruner 1987; Cole
1996; Forman 2003).
The key aspects of ongoing intentional learning events that we have highlighted
are consistent with the qualities of effective teacher professional development iden-
tified in both qualitative and quantitative studies. These qualities include extended
duration, collective participation, active learning opportunities, a focus on problems
and issues that are close to practice, and attention to the use of tools that are integral
to practice (Borko 2004; Cohen and Hill 2000; Desimone et al. 2002; Garet et al.
2001). We view ongoing intentional learning events that have these qualities as a
primary means of supporting consequential professional learning that involves the
reorganization of practice.
Discrete intentional learning events include one-off professional development
sessions as well as a series of meetings that are not designed to build on each other.
For example, system leaders might organize monthly meetings for principals. These
meetings would be discrete rather than ongoing intentional learning events if
principals engage in activities that focus on instructional leadership in mathematics
only occasionally, and these activities do not build on each other. Discrete
intentional learning events can be valuable in supporting the development of
specific capabilities that elaborate or extend current practices (e.g., introducing a
classroom observation tool that fits with principals’ current practices and is designed
to make their observations more systematic). However, they are by themselves
unlikely to be sufficient in supporting the significant reorganization of practice
called for in systems that are pursuing ambitious instructional agendas.
Learning opportunities are not limited to those that are intentionally designed, but
can also arise incidentally for targets of policy as they collaborate with others to
carry out functions of the school or educational system. For example, if principals
meet regularly with mathematics coaches to discuss the quality of mathematics
teaching in the school, these meetings could provide learning opportunities for the
principals even though these meetings were not designed to support the principals’
learning. In general, the extent to which regularly scheduled meetings with a more
knowledgeable other involve significant learning opportunities depends on both the
focus of interactions (e.g., the nature of teachers’ classroom practices and student
learning opportunities) and on whether the expert has in fact developed relatively
accomplished practices and the novice recognizes and defers to that expertise
(Elmore 2006; Mangin 2007). However, the strategy of relying primarily on inciden-
tal learning events to support professional learning appears to be extremely risky.
of supporting learning (Borko 2004; Cobb et al. 2009; Lehrer and Lesh 2003; Meira
1998). In the context of large-scale instructional improvement efforts, designed
tools can also play a second important role by supporting members of a particular
role group in developing compatible practices, and by supporting the alignment
of the practices developed by members of different role groups (e.g., teachers, prin-
cipals, coaches). Examples include textbooks, curriculum guides, state mathematics
objectives, classroom observation protocols, reports of test scores, student written
work, and written statements of school and educational system policies.
Large-scale instructional improvement efforts almost invariably involve the
introduction of a range of new tools designed to be used in practice, including newly
adopted instructional materials and revised curriculum frameworks for teachers,
and new classroom observation protocols and data management systems for principals.
The findings of a number of studies conducted in the learning sciences substantiate
Pea’s (1993) claim that the incorporation of a new tool into current practices can
support the reorganization of those practices (Lehrer and Schauble 2004; Meira
1998; Stephan et al. 2003). However, it is also apparent that people frequently use
new tools in ways that fit with current practices rather than reorganizing those practices
as the designers of the tool intended (Wenger 1998). For example, the findings of a
number of studies of policy implementation and of teaching indicate that teachers
often assimilate new instructional materials to their current instructional practices
rather than reorganize how they teach as envisioned by the developers of the materials
(Cohen and Hill 2000; Remillard 2005; Spillane 1999). These findings suggest
that the design of tools for professional learning should be coordinated with the
development of supports for their increasingly accomplished use.
As a first design heuristic, it is important that users see a need for the tool when
it is introduced (Cobb 2002; Lehrer et al. 2000). This implies that either the tool
should be designed to address a problem of current practice or it should be feasible
to cultivate the need for the tool during intentional learning events. As an illustration,
consider a classroom observation protocol that has been designed to support principals
in focusing not merely on whether students are engaged but also on whether significant
learning opportunities arise for them. Most principals are unlikely to see a need
for the new observation form unless it is introduced during a series of intentional
learning events that might, for example, focus on the relation between classroom
learning opportunities and student achievement.
Second, it is also important that the tool be designed so that intended users can
begin to use it shortly after it has been introduced, in relatively elementary ways that
are nonetheless compatible with the designers’ intentions and do not involve what
A. Brown (1992) termed lethal mutations. In the case of our example, it would seem
advisable to minimize the complexity of the observation protocol given the significant
reorganization of practice that most principals would have to make to use it in a way
compatible with the designers’ intentions (Nelson and Sassi 2005).
Third, in using the tool in rudimentary but intended ways, users begin to reorganize
their practices as they incorporate the tool. The challenge is then to support their
continued reorganization of practice by scaffolding their increasingly proficient
use of the tool either during intentional learning events or as they co-participate in
organizational routines with an accomplished user (J. S. Brown and Duguid 1991;
Lave 1993; Rogoff 1990). In the case of the observation protocol, for example,
mathematics coaches might support principals’ use of the tool as they conduct
Learning Walks™ together. Just as the failure to provide sustained teacher professional
development around a new curriculum can lead to difficulties (Crockett
2007), failure to scaffold principals’, coaches’, and others’ use of new tools is also
likely to be problematic.
18.11 Summary
Our analysis of the four types of support for learning indicates that improvement
strategies that are likely to be effective in supporting consequential professional
learning involve some combination of new positions that provide expert guidance,
ongoing intentional learning events in which tools are used to bridge to practice,
carefully designed organizational routines carried out with a more knowledgeable
other, and the use of new tools whose incorporation into practice is supported. We
do not discount the support that discrete intentional learning events and incidental
learning events might provide and recommend taking them into account when
assessing systems’ improvement strategies. However, research on professional
learning and on students’ learning in particular content domains indicates that they
are, by themselves, rarely sufficient to support significant reorganizations of
practice (Garet et al. 2001). The analysis we conducted during the MIST study of
the four districts’ instructional improvement efforts over a 4-year period is consistent
with this conclusion.
Thus far, we have discussed the key issues that need to be addressed when preparing
for a system-level design study. We now focus on the process of conducting a study
by enacting successive design, analysis, and feedback cycles. Each of the four
cycles we conducted in the MIST study spanned an entire school year, which is
much longer than in other types of design experiments (a day in the case of a classroom
design study and a few weeks or less for a professional development study).
In planning cycles, it is important to take account of patterns in system leaders’
work across the school year. In U.S. educational systems, the school year runs
from August until May or the beginning of June. In the MIST study, we delayed
interviewing district leaders to learn about their current instructional improvement
plans until October of each year after they had finalized their plans for that school
year. We then determined that January-March would be the best time to collect data
because it would give us enough time to conduct the feedback analyses, and would
not interfere with standardized testing, which typically occurs near the end of the
school year. We shared our feedback and recommendations with district leaders
in May of each year so they could take account of our findings when they revised
district instructional improvement strategies over the summer.
The first phase of a cycle involves documenting the vision of high-quality mathematics
instruction that orients each collaborating system’s instructional improvement
initiative and the strategies that each system is implementing in an attempt to
achieve its vision. In the MIST study, it proved feasible to document the four
collaborating systems’ improvement strategies by interviewing six to ten key system
leaders in each system and by collecting system-level planning and implementation
documents in October of each year. The leaders were from a number of system units
that had a stake in mathematics teaching and learning. They included Curriculum
and Instruction, which is responsible for selecting instructional materials and for
providing professional development for teachers and coaches; Leadership, which is
responsible for providing professional development for school leaders and for holding
school leaders accountable; ELL, which is responsible for supporting the learning of
English Language Learners; Special Education, which is responsible for supporting the
learning of students who receive special education services; and Research and
Evaluation, which is responsible for generating and analyzing data on students, teachers,
schools, and the district.
In addition to asking about current initiatives in middle-grades mathematics, it
proved useful to include interview questions that focused on student demographics,
the impact of regional and national policies, and the historical context of the system
including prior reform initiatives and previous mathematics instructional materials
and assessments. (Interview protocols are downloadable at http://vanderbi.lt/mist.)
The transcribed interviews and the artifacts can be analyzed through an inductive
coding process in order to discern broad consistencies across participants in each
system. The goal in conducting these analyses is to clarify the intended or envisioned
practices of members of particular role groups (e.g., teachers, coaches, principals),
the intended means of supporting the learning of members of those groups, and
system leaders’ rationales for why the supports might enable members of each role
group to develop the envisioned forms of practice.
In the MIST study, we reported our findings for each collaborating system in a
five-page document. This System Design Document named each district strategy
and described the intended supports and accountability relations for members of
each role group. We shared this document with system leaders to determine whether
it accurately represented their plan for instructional improvement. We made
revisions until the district leaders agreed that the document accurately represented
their intended plan.
System Design Documents serve four useful purposes. First, they are useful in
preparing for the next phase of a cycle that involves collecting data to learn how
each system’s intended strategies are being implemented in schools. Second, the
major strategies identified in each document provide a framework for organizing
the feedback given to the system leaders about how their improvement strategies are
playing out. Third, system leaders who participated in the MIST study reported that
they found these documents useful in clarifying their improvement strategies
with others across the system. Finally, the System Design Documents produced in
successive cycles provide a record of changes in a system’s improvement policies
over time, thus enabling the system leaders to monitor progress and researchers to
document the influence of their recommendations on the improvement strategies
that system leaders attempted to implement in the next cycle.
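The fourth purpose, using successive System Design Documents as a record of change, can be illustrated with a minimal Python sketch. It assumes each document is reduced to a mapping from strategy names to sets of intended supports. The class `DesignDocument`, the function `compare_documents`, and every second-cycle entry below are hypothetical and are not taken from the MIST study; the first-cycle entries paraphrase Table 18.1.

```python
from dataclasses import dataclass, field


@dataclass
class DesignDocument:
    """Hypothetical stand-in for one System Design Document (one district, one cycle)."""
    district: str
    school_year: str
    # Strategy name -> set of intended supports named for that strategy.
    strategies: dict = field(default_factory=dict)


def compare_documents(prior, revised):
    """Return strategies added, dropped, and changed between two cycles."""
    prior_names, revised_names = set(prior.strategies), set(revised.strategies)
    changed = {}
    for name in sorted(prior_names & revised_names):
        gained = revised.strategies[name] - prior.strategies[name]
        lost = prior.strategies[name] - revised.strategies[name]
        if gained or lost:
            changed[name] = {
                "supports_added": sorted(gained),
                "supports_dropped": sorted(lost),
            }
    return {
        "added": sorted(revised_names - prior_names),
        "dropped": sorted(prior_names - revised_names),
        "changed": changed,
    }


# First cycle: paraphrased from Table 18.1.
year1 = DesignDocument("District B", "2007-2008", {
    "Develop principals and coaches as instructional leaders": {
        "PD for principals on observing instruction",
        "Weekly principal-coach meetings",
        "PD for math coaches",
    },
    "Support teachers in teaching the curriculum": {
        "Teacher PD on the inquiry-oriented curriculum",
        "Curriculum framework",
    },
})

# Second cycle: invented entries, for illustration only.
year2 = DesignDocument("District B", "2008-2009", {
    "Develop principals and coaches as instructional leaders": {
        "PD for principals on observing instruction",
        "Weekly principal-coach meetings",
        "PD for math coaches",
        "Mathematics-specific observation protocol",
    },
    "Support teachers in teaching the curriculum": {
        "Teacher PD on the inquiry-oriented curriculum",
        "Curriculum framework",
    },
    "Hypothetical new strategy": {"Hypothetical support"},
})

changes = compare_documents(year1, year2)
```

A comparison of this kind only flags what changed between cycles. Deciding whether a change reflects the researchers' recommendations still requires the interviews with system leaders.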
To illustrate, we refer to the System Design Document we created for District B,
one of the four participating districts, during our first year of working with the
district. (Table 18.1 provides a summary of District B’s System Design Document,
2007–2008.) The overall goal of the instructional improvement effort in District B
was to ensure that all students had opportunities to learn through engagement with
a rigorous mathematics curriculum, that teachers and school leaders had high
expectations for students’ learning, and that achievement disparities between White
students and traditionally underserved groups of students were eliminated. District
B was in its first year of implementing an inquiry-oriented mathematics curriculum.
To support this implementation, the district had assigned a mathematics coach to
each middle school the previous year and had provided them with a significant
amount of professional development that focused on both teaching the new curriculum
effectively and coaching other mathematics teachers at their schools. Each coach
taught for half of the school day and served as a coach for the remainder of the day.
Table 18.1 Summary of a System Design Document for District B, 2007–2008 school year
District B instructional improvement goal: Ensure that all students have opportunities to learn through engagement with a rigorous curriculum, that teachers and school leaders have high expectations for students’ learning, and that achievement gaps between White students and traditionally underserved groups of students are eliminated
Improvement strategies | Supports for role groups to develop the intended forms of practice
1. Develop principals and coaches who work together to improve classroom instruction | Professional development for principals on observing and providing feedback to teachers
 | Principal and the math coach are required to meet weekly to discuss classroom instruction and supports for teachers
 | Professional development for math coaches
2. Support teachers in teaching a rigorous mathematics curriculum effectively | Professional development for teachers on the inquiry-oriented curriculum
 | A comprehensive curriculum framework to support the implementation of the rigorous curriculum
The first improvement strategy that we identified was to support principals’ and
mathematics coaches’ development as instructional leaders who worked together to
improve the quality of mathematics instruction. Principals were expected to observe
classroom instruction regularly to assess the quality of teachers’ instructional
practices and determine their needs based on these observations. Principals received
professional development on observing and assessing the quality of mathematics
instruction, and were expected to meet with the mathematics coach at their school
every week to discuss the quality of classroom instruction and assess teachers’ needs.
The second strategy was to support teachers in teaching the inquiry-oriented
curriculum effectively. Supports for teachers’ learning included teacher professional
development provided by the mathematics coaches and a district Curriculum
Framework that aligned the curriculum with the state standards and provided
guidance on differentiating instruction for particular groups of students, especially
English Language Learners and special education students.
We used the Interpretive Framework described above to assess the strengths
and limitations of these two improvement strategies. District leaders clearly and
consistently articulated the forms of practice they intended teachers, coaches, and
principals would develop (e.g., principals were to observe classrooms and provide
feedback to improve instructional practices). In addition, these intended forms
of practice were compatible with the district’s overall goal of supporting teachers’
development of ambitious instructional practices. However, we considered it
unlikely that the supports for various role groups’ learning would be adequate.
With regard to the first strategy, principals would have to distinguish between
weak and strong enactments of ambitious instructional practices if they were to give
teachers effective feedback. The supports for principals’ learning included professional
development on observing classroom instruction. We questioned whether
these ongoing intentional learning events would be effective because they focused
on characteristics of high quality instruction that were independent of subject matter
area, and because these characteristics were relatively global. Principals were also
expected to meet regularly with the mathematics coach to discuss the quality of
classroom instruction. Although these discussions might focus on content-specific
instructional practices, we doubted whether the resulting incidental learning
opportunities would be adequate. In addition, the coaches were new to the role and
it was not clear that they had developed sufficient expertise to support principals in
assessing the quality of instruction.
With regard to the second strategy, the effective implementation of the inquiry-
oriented curriculum that the district had adopted required that most teachers
significantly reorganize their instructional practices. Teachers participated in ongoing
intentional learning events: 4 days of district professional development led
by the math coaches. However, it was not clear that mathematics coaches had
developed the expertise to lead this professional development effectively given that
they were also teaching the new curriculum for the first time.
The next phase of the design cycle involves collecting data to document how each
system’s strategies are playing out in schools and classrooms. In the MIST study,
we collected multiple types of data to document the four systems’ instructional
improvement efforts: audio-recorded interviews conducted with the 200 participants;
on-line surveys for teachers, coaches, and school leaders; video-recordings of two
consecutive lessons in the 120 participating teachers’ classrooms, coded using the
Instructional Quality Assessment (IQA) (Boston 2012; Matsumura et al. 2008);
teachers’ and coaches’ scores on the Mathematics Knowledge for Teaching (MKT)
instrument (Hill et al. 2004); video-recordings of select district professional development;
audio-recordings of teacher collaborative planning meetings; and an on-line
assessment of teacher networks completed by all mathematics teachers in the
participating schools. In addition, the districts provided us with access to mathematics
achievement data for students in the participating 120 teachers’ classrooms. The
interviews and online surveys focused on the school and district settings in which
the participating teachers and school leaders worked and gave particular attention to
the formal and informal supports on which they could draw to improve their practices,
as well as to whom they were accountable and for what they were accountable.
As we had only 3 months to analyze data before district leaders began planning
strategies for the following school year, we based our feedback about how districts’
strategies were being implemented solely on the audio-recorded
interviews conducted with the 50 participants in each district. (As our collaboration
with each district continued over 4 years, we were able to share additional findings
from other data sources, for example video-recordings of classroom instruction, in
subsequent reports as they became available.)
One of the challenges when conducting a system-level design study is to analyze
a large amount of data in a relatively short period of time while ensuring that
the findings shared with system leaders are reliable. In this context, an important
criterion for reliability is that claims about how improvement strategies are being
implemented can be justified by backtracking through successive steps of the analysis
to the raw data. This method involves using a series of structured tools to first
summarize transcriptions of each participant interview, and then to triangulate and
synthesize the responses both across participants in each school and across teachers,
coaches, and school leaders in each collaborating system.
In MIST, a team member completed an Interview Summary Form (ISF) for each
interview (teacher, coach, school leader, system leader). The ISF summarized each
participant’s response to interview questions that were central to understanding how
improvement strategies were playing out in schools. This information was then
synthesized across all participants in a school using the School Summary Form
(SSF). This required the triangulation of participant responses at each school, citing
evidence from the ISFs. Additional forms included a Principal Summary Form
(PSF), a Coach Summary Form (CSF) and a Teacher Summary Form (TSF) that
Table 18.2 Example of an analysis for the District B year 4 System Feedback and Recommendations Report
Column 1. Example District B coach interview transcripts (from January interview; edited slightly for readability)
I: So what are you being held accountable for as a math coach?
C: Well I guess my principal definitely has a lot of expectations. He knows that no matter what he asks me to do I’m going to do it well, so even outside of what I should be doing. When we started the school year he asked me to help with scheduling, and we had a major issue. Maybe it is my job and maybe it’s not my job, but it’s important to me that every student started in the right math class.
I have to do data analysis, so when students take assessments, I review and compile that data and then have discussions with the teachers regarding that data.
I have to hold our weekly grade level planning meetings and make sure that those are as productive as possible, that teachers are you know sharing information and are working together. Those are the things I would say I’m most accountable for, you know and then of course there’s going out and coaching, but there’s not much accountability with that.
You know obviously if a teacher’s not doing a good job there’s a discussion, but for the most part I really have a pretty solid department. Everybody can be coached and everybody can get better, so it’s not like oh, you guys are solid, you don’t need any coaching, but there’s not a lot of accountability that goes with it.
Column 2. Coach descriptions about district and principal expectations about the role of the math coach: Excerpt from Coach Summary Form (CSF) (includes 7 coaches)
Four out of the seven coaches interviewed were a part of a state grant and received additional support from the state department of education. These coaches reported that the state expected them to model instruction, co-teach, provide training to develop teachers’ instructional practices, conference, observe teachers, conduct a coaching cycle, monitor student learning, provide instructional materials for teachers, assist teachers with the curriculum, and help write lesson plans. In addition, one coach reported the expectation to work with teachers both one-on-one and groups. The coaches supported by this grant indicated they received a letter from the state that listed these expectations.
The other math coaches did not describe as many concrete practices when asked about expectations for their role; three reported being expected to support teachers who are not strong with questioning and pacing and help struggling teachers with certain concepts.
While most of the coaches say their principals’ expectations are similar to the district’s expectations, five coaches describe expectations above and beyond the district’s expectations (e.g., tutoring, data analysis). Four coaches mentioned that in addition to working with teachers, they are expected to analyze data and run tutoring programs.
Column 3. Principal expectations about the role of the math coach: Excerpt from Principal Summary Form (PSF) (includes 11 school administrators)
Almost all of the administrators talked about the role of the mathematics coach involving working with teachers and not being an evaluative presence. Almost all principals expect the math coaches to work with teachers individually and in groups. Commonly cited specific activities included modeling, co-teaching, providing PD, and observing/intervening in classrooms. Multiple administrators said that the math coach was responsible for helping teachers with the curriculum.
Some expectations emerged from only a few administrators. In at least two schools, the principal expected the math coach to lead collaborative time, and in one school, the math coach was expected to maintain a model classroom. Administrators in two schools also mentioned that the math coach should analyze data.
Some themes emerged that were more operational in nature such as ordering supplies and obtaining resources. Some administrators also said that the math coach should be working with students individually.
Column 4. Excerpt from System Feedback and Recommendations Report
Our recommendation is that the district needs to clarify the role of the coach with both school leaders and the coaches by making it explicit that coaches should spend the majority of their time assisting either groups of teachers or individual teachers with instruction in their classrooms.
Note that the table does not represent all of the data that were used to reach the findings and recommendations listed here. The data that were analyzed included the coach, teacher, and principal interviews (which were triangulated), previous years’ video-recordings of classroom instruction and assessments of teachers’ and coaches’ mathematical knowledge for teaching, and previous years’ feedback and recommendation reports. As an illustration, this table includes one coach’s interview response related to expectations around their role as a math coach, along with the subsequent related syntheses from the CSF and the PSF
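The reliability criterion described in this section, that each claim can be traced back through the summary forms to the raw interview data, can be sketched as a small data structure. The Python below is a hedged illustration rather than the MIST team's tooling. The names `SummaryEntry`, `InterviewSummary`, `Claim`, and `backtrack`, the participant identifiers, and the transcript line ranges are all invented. The summary texts loosely paraphrase Table 18.2.

```python
from dataclasses import dataclass


@dataclass
class SummaryEntry:
    summary: str             # analyst's summary of the response
    transcript_lines: tuple  # (first_line, last_line) in the raw transcript


@dataclass
class InterviewSummary:
    """Hypothetical stand-in for one Interview Summary Form (ISF)."""
    participant_id: str
    role: str
    entries: dict  # topic -> SummaryEntry


@dataclass
class Claim:
    """A synthesized statement that must cite the interviews it rests on."""
    text: str
    topic: str
    cited_participants: list


def backtrack(claim, summaries):
    """Trace a claim to the summaries and transcript lines that support it.

    Raises ValueError if a cited participant has no entry for the topic,
    i.e. if the claim cannot be justified from the recorded evidence.
    """
    by_id = {s.participant_id: s for s in summaries}
    trail = []
    for pid in claim.cited_participants:
        summary = by_id.get(pid)
        if summary is None or claim.topic not in summary.entries:
            raise ValueError(f"No evidence for {pid!r} on topic {claim.topic!r}")
        entry = summary.entries[claim.topic]
        trail.append((pid, entry.summary, entry.transcript_lines))
    return trail


# Invented identifiers and line ranges, for illustration only.
summaries = [
    InterviewSummary("coach_01", "coach", {
        "role_expectations": SummaryEntry(
            "Principal asks for help with scheduling and data analysis", (112, 140)),
    }),
    InterviewSummary("coach_02", "coach", {
        "role_expectations": SummaryEntry(
            "Expected to run tutoring programs", (98, 121)),
    }),
]

claim = Claim(
    text="Coaches describe expectations beyond working with teachers",
    topic="role_expectations",
    cited_participants=["coach_01", "coach_02"],
)

trail = backtrack(claim, summaries)
```

Requiring every claim to carry its list of cited participants is the design choice that makes backtracking possible: a claim with a missing or mismatched citation raises an error and cannot pass silently into a report.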
The first phase of the next data collection, analysis, and feedback cycle involves
interviewing system leaders again to document their revised instructional improvement
strategies. The influence of recommendations made to system leaders can be
assessed by comparing their revised and prior improvement strategies. As we have
noted, assessing the influence of the recommendations is important both because
To this point, we have focused on the pragmatic objective of providing leaders of the
collaborating systems with timely feedback about how their improvement strategies
are actually playing out that can inform the revision of their instructional improvement
efforts. We now consider the theoretical objective of contributing to a generalizable
theory of action for system-wide instructional improvement in mathematics. In
doing so, we draw on our experience in the MIST study by discussing three types of
evidence that can inform the revision of conjectures that comprise the theory of
action: findings from feedback analyses about how the collaborating systems’
instructional improvement strategies are being implemented, the current research
literature, and the findings of retrospective analyses conducted by drawing on the
multiple sources of data collected in each cycle.
As we have noted, relevant research that can inform the design of instructional
improvement strategies becomes increasingly thin the further one moves away from
the classroom (Cobb et al. 2013; Honig 2012). Nonetheless, findings reported in
the literature can, on occasion, provide evidence for the revision of current conjectures.
The retrospective analysis of data collected during successive design and analysis
cycles is a key aspect of design studies conducted at any level. In the case of system-
level design studies, a primary goal of retrospective analyses is to investigate key
conjectures of the theory of action for instructional improvement. Based on our
work in the MIST study, we recommend that mutually informing lines of retrospective
analyses be established that focus on the major types of supports conjectured to be
important for instructional improvement (e.g., teacher collaborative time, teacher
networks, mathematics coaching, school instructional leadership).
As we have indicated, the types of data that can be analyzed to give collaborating
systems feedback about how their improvement strategies are playing out are constrained
by the need to ensure that the feedback is timely and can inform system leaders’
revision of their strategies. Retrospective analyses that can inform the revision of
the theory of action draw on a range of additional types of data that are collected
during each data collection, analysis, and feedback cycle. The primary concern
when making decisions about the types of data to collect is that the key constructs
of each conjecture are assessed, including the relevant aspects of teachers’ knowledge
and instructional practices. For example, if the vision of high-quality mathematics
instruction that constitutes the goal for teachers’ learning requires that teachers
deepen their mathematical knowledge, then it is important to include an appropriate
measure of this knowledge. Similarly, if teachers’ informal professional networks
are conjectured to be an important support for their learning, then it is important to
develop instruments for assessing the relevant aspects of their networks (e.g., who
teachers turn to for instructional advice, frequency of their interactions with those
people, and content of their interactions).
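To make the network example concrete, the following Python sketch shows one way responses to such an instrument might be recorded and summarized. It is an assumption-laden illustration, not the MIST instrument: the tuple layout, the function names `advice_in_degree` and `monthly_contact`, and all of the data are hypothetical.

```python
from collections import Counter, defaultdict

# Each tuple: (advice seeker, person asked, interactions per month, content).
# All entries are hypothetical.
ties = [
    ("teacher_A", "coach", 4, "lesson planning"),
    ("teacher_B", "coach", 2, "pacing"),
    ("teacher_C", "teacher_A", 1, "student work"),
    ("teacher_A", "teacher_C", 2, "lesson planning"),
]


def advice_in_degree(ties):
    """Count how many distinct colleagues name each person as a source of advice."""
    seekers = defaultdict(set)
    for seeker, advisor, _freq, _content in ties:
        seekers[advisor].add(seeker)
    return {advisor: len(names) for advisor, names in seekers.items()}


def monthly_contact(ties):
    """Sum the reported monthly interactions directed at each advisor."""
    total = Counter()
    for _seeker, advisor, freq, _content in ties:
        total[advisor] += freq
    return dict(total)


in_degree = advice_in_degree(ties)
contact = monthly_contact(ties)
```

The two summaries correspond to the first two aspects named above: whom teachers turn to, and how often. The content field is carried along so that interactions can later be coded for their focus.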
The MIST team is currently conducting five interrelated lines of analysis that
focus on district-level and school-level teacher professional development (including
mathematics teacher collaborative meetings), teacher networks, mathematics coaching,
school instructional leadership, and district instructional leadership. We discuss
the current version of our theory of action for instructional improvement in mathematics
in the next section of this chapter.
Presenting the current iteration of our theory of action in any detail is beyond the scope
of this chapter, and we refer the reader to Cobb and Jackson (2011). To illustrate our
current conjectures, we focus on one component of the theory of action, school
instructional leadership.
their observations about the quality of teachers’ instructional practices, discuss how
the coach’s work with teachers is progressing, jointly select teachers with whom the
coach should work, and plan for future work with groups of teachers.
The ongoing analyses we have conducted while developing feedback for the
collaborating districts indicate that it is challenging for school leaders, most of
whom are not mathematics specialists, to develop the three instructional leadership
practices that we have described. As a consequence, we have also developed conjectures
about the nature of professional development that might support their development
of these practices.
First, we conjecture that if school leaders are to effectively and realistically press
teachers to improve the quality of instruction, professional development for school
leaders should enable them to recognize the instructional practices that are the focus
of teacher professional development, and to distinguish between low- and high-
quality enactments of those practices. We also conjecture that a consistent emphasis
on the same instructional practices across teacher, coach, and school leader professional
development will contribute to the development of compatible visions of
high-quality instruction and to the alignment of supports for teachers’ learning.
Second, we conjecture that professional development should attend explicitly to
how to provide feedback to teachers that communicates expectations for ambitious
instruction. This might involve school leaders and district mathematics specialists
observing instruction or watching video-recordings of specific phases of lessons
and discussing the feedback they would provide.
Third, we conjecture that professional development should clarify the role of
coaches and mathematics teacher collaborative meetings in supporting teachers’
development of ambitious instructional practices. We have documented several
cases in which a school leader has taken over the agenda of mathematics teacher
meetings to the detriment of the participating teachers’ learning. We therefore
conjecture that it is important to give particular attention to how the distribution of
instructional leadership between coaches and school leaders should reflect their
complementary areas of expertise (Elmore 2006).
The contrast between our initial and current conjectures for school leadership is
representative of the changes we have made as we have revised and elaborated our
initial conjectures. The level of specificity of our current conjectures is essential if
we are to provide district leaders with actionable guidance on how they might
support instructional improvement in mathematics on a large scale. We regard the
current iteration of our theory of action as a work in progress and are further testing
and revising our conjectures as we continue to collaborate with two of the four
districts for a further 4 years.
18.18 Conclusion
Our purpose in this chapter has been to describe a design research approach for studying
and supporting improvements in the quality of mathematics teaching on a large scale.
The aim of this methodology is to both provide the leaders of educational systems,
such as urban school districts in the U.S., with feedback that can inform their instructional
Acknowledgments The analysis reported in this article was supported by the National Science
Foundation under grant Nos. ESI 0554535 and DRL 1119122. The opinions expressed do not
necessarily reflect the views of the Foundation. The empirical cases that we present in this article
are based on research conducted in collaboration with Thomas Smith (co-PI), Dan Berebitsky,
Glenn Colby, Annie Garrison, Lynsey Gibbons, Karin Katterfeld, Adrian Larbi-Cherif, Christine
Larson, Chuck Munter, Brooks Rosenquist, Rebecca Schmidt, and Jonee Wilson.
References
Argyris, C., & Schön, D. (1974). Theory in practice: Increasing professional effectiveness. San Francisco: Jossey-Bass.
Blumenfeld, P., Fishman, B. J., Krajcik, J. S., Marx, R., & Soloway, E. (2000). Creating usable
innovations in systemic reform: Scaling up technology-embedded project-based science in
urban schools. Educational Psychologist, 35, 149–164.
Borko, H. (2004). Professional development and teacher learning: Mapping the terrain. Educational
Researcher, 33(8), 3–15.
Boston, M. D. (2012). Assessing the quality of mathematics instruction. Elementary School
Journal, 113(1), 76–104.
Brown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating
complex interventions in classroom settings. Journal of the Learning Sciences, 2, 141–178.
Brown, J. S., & Duguid, P. (1991). Organizational learning and communities-of-practice: Toward
a unified view of working, learning, and innovation. Organization Science, 2(1), 40–57.
Brown, J. S., Collins, A., & Duguid, P. (1989). Situated cognition and the culture of learning.
Educational Researcher, 18, 32–42.
Bruner, J. (1987). Actual minds, possible worlds. Cambridge: Harvard University Press.
Bryk, A. S. (2009). Support a science of performance improvement. Phi Delta Kappan, 90(8),
597–600.
Bryk, A. S., & Gomez, L. M. (2008). Reinventing a research and development capacity. In F. Hess
(Ed.), The future of educational entrepreneurship: Possibilities for school reform (pp. 181–187).
Cambridge: Harvard Education Press.
Cobb, P. (2002). Reasoning with tools and inscriptions. The Journal of the Learning Sciences,
11(2&3), 187–216.
Cobb, P., & Jackson, K. (2011). Towards an empirically grounded theory of action for improving
the quality of mathematics teaching at scale. Mathematics Teacher Education and Development,
13(1), 6–33.
Cobb, P., & Jackson, K. (2012). Analyzing educational policies: A learning design perspective. The
Journal of the Learning Sciences, 21(4), 487–521.
Cobb, P., & McClain, K. (2001). An approach for supporting teachers’ learning in social context.
In F.-L. Lin & T. Cooney (Eds.), Making sense of mathematics teacher education
(pp. 207–232). Dordrecht: Kluwer.
Cobb, P., & Smith, T. (2008). District development as a means of improving mathematics teaching
and learning at scale. In K. Krainer & T. Wood (Eds.), International handbook of mathematics
teacher education: Vol. 3. Participants in mathematics teacher education: Individuals, teams,
communities and networks (Vol. 3, pp. 231–254). Rotterdam: Sense.
Cobb, P., & Steffe, L. P. (1983). The constructivist researcher as teacher and model builder. Journal
for Research in Mathematics Education, 14, 83–94.
Cobb, P., Confrey, J., diSessa, A., Lehrer, R., & Schauble, L. (2003a). Design experiments in
educational research. Educational Researcher, 32(1), 9–13.
Cobb, P., McClain, K., Lamberg, T., & Dean, C. (2003b). Situating teachers’ instructional
practices in the institutional setting of the school and school district. Educational Researcher,
32(6), 13–24.
Cobb, P., Zhao, Q., & Dean, C. (2009). Conducting design experiments to support teachers’ learning:
A reflection from the field. Journal of the Learning Sciences, 18, 165–199.
Cobb, P., Jackson, K., Smith, T., Sorum, M., & Henrick, E. (2013). Design research with
educational systems: Investigating and supporting improvements in the quality of mathematics
teaching and learning at scale. In B. J. Fishman, W. R. Penuel, A.-R. Allen, & B. H. Cheng
(Eds.), Design based implementation research: Theories, methods, and exemplars (National
Society for the Study of Education Yearbook, Vol. 112, Issue 2, pp. 320–349). New York:
Teachers College.
Coburn, C. E. (2003). Rethinking scale: Moving beyond numbers to deep and lasting change.
Educational Researcher, 32(6), 3–12.
Coburn, C. E., & Russell, J. L. (2008). District policy and teachers’ social networks. Educational
Evaluation and Policy Analysis, 30(3), 203–235.
Coburn, C. E., & Stein, M. K. (Eds.). (2010). Research and practice in education: Building
alliances, bridging the divide. New York: Rowman & Littlefield Publishing Group.
Cohen, D. K., & Hill, H. C. (2000). Instructional policy and classroom performance: The
mathematics reform in California. Teachers College Record, 102, 294–343.
Cole, M. (1996). Cultural psychology. Cambridge: Belknap Press of Harvard University Press.
Common Core State Standards Initiative. (2010). Common core state standards for mathematics.
Retrieved from http://www.corestandards.org
Crockett, M. D. (2007). Teacher professional development as a critical resource in school reform.
Journal of Curriculum Studies, 39, 253–263.
Darling-Hammond, L. (2007). The flat earth and education: How America’s commitment to equity
will determine our future. Educational Researcher, 36(6), 318–334.
Design-Based Research Collective. (2003). Design-based research: An emerging paradigm for
educational inquiry. Educational Researcher, 32(1), 5–8.
Desimone, L., Porter, A. C., Garet, M., Suk Yoon, K., & Birman, B. (2002). Effects of professional
development on teachers’ instruction: Results from a three-year study. Educational Evaluation
and Policy Analysis, 24, 81–112.
Elmore, R. F. (1979–80). Backward mapping: Implementation research and policy decisions.
Political Science Quarterly, 94, 601–616.
Elmore, R. F. (2004). School reform from the inside out. Cambridge: Harvard Education Press.
Elmore, R. F. (2006). Leadership as the practice of improvement. OECD International Conference
on Perspectives on Leadership for Systemic Improvement. London.
Feldman, M. S. (2000). Organizational routines as a source of continuous change. Organization
Science, 11, 611–629.
Feldman, M. S. (2004). Resources in emerging structures and processes of change. Organization
Science, 15, 295–309.
Kazemi, E., & Hubbard, A. (2008). New directions for the design and study of professional
development: Attending to the coevolution of teachers’ participation across contexts. Journal
of Teacher Education, 59, 428–441.
Kazemi, E., Franke, M. L., & Lampert, M. (2009). Developing pedagogies in teacher education to
support novice teachers’ ability to enact ambitious instruction. In R. Hunter, B. Bicknell, &
T. Burgess (Eds.), Crossing divides: Proceedings of the 32nd annual conference of the
Mathematics Education Research Group of Australasia (Vol. 1, pp. 12–30). Palmerston North:
MERGA.
Kilpatrick, J., Swafford, J., & Findell, B. (Eds.). (2001). Adding it up: Helping children learn
mathematics. Washington, DC: National Academy Press.
Kilpatrick, J., Martin, W. G., & Schifter, D. (Eds.). (2003). A research companion to principles and
standards for school mathematics. Reston: National Council of Teachers of Mathematics.
Kozulin, A. (1990). Vygotsky’s psychology: A biography of ideas. Cambridge: Harvard University
Press.
Lampert, M., & Graziani, F. (2009). Instructional activities as a tool for teachers’ and teacher
educators’ learning. The Elementary School Journal, 109(5), 491–509.
Lampert, M., Beasley, H., Ghousseini, H., Kazemi, E., & Franke, M. L. (2010). Using designed
instructional activities to enable novices to manage ambitious mathematics teaching. In M. K.
Stein & L. Kucan (Eds.), Instructional explanations in the disciplines (pp. 129–141). New York:
Springer.
Lave, J. (1993). The practice of learning. In S. Chaiklin & J. Lave (Eds.), Understanding practice:
Perspectives on activity and context (pp. 3–32). Cambridge: Cambridge University Press.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. London:
Cambridge University Press.
Lehrer, R., & Lesh, R. (2003). Mathematical learning. In W. Reynolds & G. Miller (Eds.),
Comprehensive handbook of psychology (Vol. 7, pp. 357–391). New York: John Wiley.
Lehrer, R., & Schauble, L. (2004). Modeling natural variation through distribution. American
Educational Research Journal, 41, 635–679.
Lehrer, R., Schauble, L., & Penner, D. (2000). The inter-related development of inscriptions and
conceptual understanding. In P. Cobb, E. Yackel, & K. McClain (Eds.), Symbolizing,
mathematizing, and communicating: Perspectives on discourse, tools, and instructional design
(pp. 325–360). Mahwah, NJ: Erlbaum.
Little, J. W. (2002). Locating learning in teachers’ communities of practice: Opening up problems
of analysis in records of everyday work. Teaching and Teacher Education, 18, 917–946.
Lobato, J. (2003). How design experiments can inform a rethinking of transfer and vice versa.
Educational Researcher, 32(1), 17–20.
Mangin, M. M. (2007). Facilitating elementary principals’ support for instructional teacher
leadership. Educational Administration Quarterly, 43, 319–357.
Marks, H. M., & Louis, K. S. (1997). Does teacher empowerment affect the classroom? The
implications of teacher empowerment for instructional practice and student academic
performance. Educational Evaluation and Policy Analysis, 19, 245–275.
Matsumura, L. C., Garnier, H., Slater, S. C., & Boston, M. D. (2008). Toward measuring instructional
interactions “at-scale”. Educational Assessment, 13(4), 267–300.
McLaughlin, M. W., & Mitra, D. (2004, April). The cycle of inquiry as the engine of school reform:
Lessons from the Bay Area School Reform Collaborative. Paper presented at the annual meeting
of the American Educational Research Association, San Diego.
Meira, L. (1998). Making sense of instructional devices: The emergence of transparency in
mathematical activity. Journal for Research in Mathematics Education, 29, 121–142.
National Council of Teachers of Mathematics. (1989). The curriculum and evaluation standards
for school mathematics. Reston: Author.
National Council of Teachers of Mathematics. (2000). Principles and standards for school
mathematics. Reston: Author.
Nelson, B. S., & Sassi, A. (2005). The effective principal: Instructional leadership for high-quality
learning. New York: Teachers College Press.
Pea, R. D. (1993). Practices of distributed intelligence and designs for education. In G. Salomon
(Ed.), Distributed cognitions (pp. 47–87). New York: Cambridge University Press.
Penuel, W. R., Frank, K. A., & Krause, A. (2006). The distribution of resources and expertise and
the implementation of schoolwide reform initiatives. Paper presented at the Seventh International
Conference of the Learning Sciences, Bloomington.
Penuel, W. R., Fishman, B. J., Cheng, B. H., & Sabelli, N. (2011). Organizing research and
development at the intersection of learning, implementation, and design. Educational
Researcher, 40(7), 331–337.
Remillard, J. (2005). Examining key concepts in research on teachers’ use of mathematics
curricula. Review of Educational Research, 75, 211–246.
Robinson, V. M. J., Lloyd, C. A., & Rowe, K. J. (2008). The impact of leadership on student
outcomes: An analysis of the differential effects of leadership types. Educational Administration
Quarterly, 44, 635–674.
Roderick, M., Easton, J. Q., & Sebring, P. B. (2009). The Consortium on Chicago School Research:
A new model for the role of research in supporting urban school reform. Chicago: The
Consortium on Chicago School Research at the University of Chicago Urban Education Institute.
Rogoff, B. (1990). Apprenticeship in thinking: Cognitive development in social context. Oxford:
Oxford University Press.
Rogoff, B. (1997). Evaluating development in the process of participation: Theory, methods, and
practice building on each other. In E. Amsel & A. Renninger (Eds.), Change and development:
Issues of theory, application, and method (pp. 265–285). Hillsdale: Erlbaum.
Schoenfeld, A. H. (2006). Design experiments. In J. L. Green, G. Camilli, P. B. Elmore, &
A. Skukauskaite (Eds.), Handbook of complementary methods in education research
(pp. 193–206). Washington, DC: American Educational Research Association.
Sfard, A. (2008). Thinking as communicating: Human development, the growth of discourses, and
mathematizing. Cambridge: Cambridge University Press.
Sherer, J. Z., & Spillane, J. P. (2011). Constancy and change in work practice in schools: The role
of organizational routines. Teachers College Record, 113(3), 611–657.
Spillane, J. P. (1999). External reform initiatives and teachers’ efforts to reconstruct their practice:
The mediating role of teachers’ zones of enactment. Journal of Curriculum Studies, 31, 143–175.
Spillane, J. P., & Thompson, C. L. (1997). Reconstructing conceptions of local capacity: The local
education agency’s capacity for ambitious instructional reform. Educational Evaluation and
Policy Analysis, 19, 185–203.
Spillane, J. P., Halverson, R., & Diamond, J. B. (2004). Distributed leadership: Towards a theory
of school leadership practice. Journal of Curriculum Studies, 36, 3–34.
Spillane, J. P., Mesler, L., Croegaert, C., & Sherer, J. Z. (2007). Coupling administrative practice
with the technical core and external regulation: The role of organizational routines. Paper
presented at the annual meeting of the European Association for Research on Learning and
Instruction, Budapest.
Staples, M. (2007). Supporting whole-class collaborative inquiry in a secondary mathematics
classroom. Cognition and Instruction, 25(2), 161–217.
Steffe, L. P., & Thompson, P. W. (2000). Teaching experiment methodology: Underlying principles
and essential elements. In A. Kelly & R. Lesh (Eds.), Handbook of research design in mathematics
and science education (pp. 267–307). Mahwah, NJ: Erlbaum.
Stein, M. K. (2004). Studying the influence and impact of standards: The role of districts in teacher
capacity. In J. Ferrini-Mundy & F. K. Lester Jr. (Eds.), Proceedings of the National Council of
Teachers of Mathematics Research Catalyst Conference (pp. 83–98). Reston: National Council
of Teachers of Mathematics.
Stein, M. K., Silver, E. A., & Smith, M. S. (1998). Mathematics reform and teacher development:
A community of practice perspective. In J. G. Greeno & S. V. Goldman (Eds.), Thinking
practices in mathematics and science learning (pp. 17–52). Mahwah, NJ: Lawrence Erlbaum.
Stein, M. K., Engle, R. A., Smith, M. S., & Hughes, E. K. (2008). Orchestrating productive
mathematical discussions: Five practices for helping teachers move beyond show and tell.
Mathematical Thinking and Learning, 10(4), 313–340.
Stephan, M., Bowers, J., & Cobb, P. (2003). Supporting students’ development of measuring
conceptions: Analyzing students’ learning in social context (Journal for research in mathematics
education monograph, Vol. 12). Reston: National Council of Teachers of Mathematics.
Supovitz, J. A. (2006). The case for district-based reform. Cambridge: Harvard University Press.
Tyack, D., & Tobin, W. (1995). The “Grammar” of schooling: Why has it been so hard to change?
American Educational Research Journal, 31, 453–479.
U.S. Department of Education. (2008). The final report of the National Mathematics Advisory
Panel. Washington, DC: Author.
van der Veer, R., & Valsiner, J. (1991). Understanding Vygotsky: A quest for synthesis. Cambridge,
MA: Blackwell.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes.
Cambridge, MA: Harvard University Press.
Wenger, E. (1998). Communities of practice. New York: Cambridge University Press.
Wiggins, G., & McTighe, J. (1998). Understanding by design. Washington, DC: Association for
Curriculum and Supervision.
Part XII
Final Considerations
Chapter 19
Looking Back
Initially, every part of this book was supposed to consist of two separate chapters,
which would allow the reader to use the book as a practical guide for selecting
an appropriate methodology, based on both theoretical depth and practical implications.
However, as the book took shape we realized that not all
methodologies could be described in two such separate chapters, i.e., one describing
the methodology in general terms, including basic considerations, and the
other illustrating this general description with a specific research example. Some
methodologies seemed to be much more tightly linked to research practice than we