Variation in Language: System- and Usage-based Approaches
Edited by
Peter Auer, Gesa von Essen, Werner Frick
Editorial Board
Michel Espagne (Paris), Marino Freschi (Rom), Ekkehard König (Berlin),
Michael Lackner (Erlangen-Nürnberg), Per Linell (Linköping),
Angelika Linke (Zürich), Christine Maillard (Strasbourg),
Lorenza Mondada (Basel), Pieter Muysken (Nijmegen),
Wolfgang Raible (Freiburg), Monika Schmitz-Emans (Bochum)
Volume 50
Variation in Language: System- and Usage-based Approaches
Edited by
Aria Adli, Marco García García and Göz Kaufmann
ISBN 978-3-11-034355-7
e-ISBN (PDF) 978-3-11-034685-5
e-ISBN (EPUB) 978-3-11-038457-4
ISSN 1869-7054
www.degruyter.com
Contents
Aria Adli, Marco García García, Göz Kaufmann
System and usage: (Never) mind the gap
Frederick J. Newmeyer
Language variation and the autonomy of grammar
Gregory R. Guy
The grammar of use and the use of grammar
Richard Cameron
Looking for structure-dependence, category-sensitive processes,
and long-distance dependencies in usage
Mary A. Kato
Variation in syntax: Two case studies on Brazilian Portuguese
Göz Kaufmann
Rare phenomena revealing basic syntactic mechanisms: The case of
unexpected verb-object sequences in Mennonite Low German
Leonie Cornips
The no man’s land between syntax and variationist sociolinguistics:
The case of idiolectal variability
Aria Adli
What you like is not what you do:
Acceptability and frequency in syntactic variation
Hubert Haider
“Intelligent design” of grammars – a result of cognitive evolution
Guido Seiler
Syntactization, analogy and the distinction between
proximate and evolutionary causations
Malte Rosemeyer
How usage rescues the system: Persistence as conservation
Aria Adli, University of Cologne
Marco García García, University of Cologne
Göz Kaufmann, University of Freiburg
System and usage: (Never) mind the gap1
At least since the Saussurian distinction between langue and parole, the relation
between grammar and language use has been a central topic of linguistic thought.
The present volume deals with this relation by focusing on language variation.
The improved possibilities of working with large corpora and the increased refine-
ment of experimental designs make this – once again – a worthwhile undertaking.
Quite unsurprisingly, different linguistic subfields make different uses of these
new possibilities, uses which reflect their respective theoretical frames. Many
sociolinguists apply usage-based approaches while most, though not all, syn-
tacticians adhere to system-based approaches. However, both usage and system
are heterogeneous and even somewhat fuzzy concepts. Therefore, the question
arises whether this distinction is at all meaningful. It may be more appropriate
to conceive of a continuum between system- and usage-based approaches. Such a
continuum includes intermediate positions, several of which can be found in this
volume. On the system-side, the endpoint of such a continuum may be seen in
generative grammar. An important common denominator of generative (system-
based) approaches is the assumption that grammar is independent from usage
and that language use obeys the rules of a grammatical system. On the usage-
side, the endpoint may be represented by the model of emergent grammar, which
refers to the idea that linguistic structures and regularities are no more than an
epiphenomenon, i. e. “not the source of understanding a communication but
a by-product of it” (Hopper 1998: 156). An important common denominator of
1 The editors would like to thank the authors for their contributions and for their willingness to
participate in the process of internal reviewing. Our thanks also go to two external reviewers, to
Peter Auer for his continuous help and to Elin Arbin for checking the English of authors whose
native language is not English. Obviously, all remaining shortcomings are our responsibility.
Finally, we would like to thank the Freiburg Institute for Advanced Studies (FRIAS) for financing
both the workshop System, Usage, and Society, which took place in Freiburg in November 2011,
and the publication of this volume.
The primary reason for viewing language as a complex adaptive system, that is, as being
more like sand dunes than like a planned structure, such as a building, is that language
exhibits a great deal of variation and gradience. Gradience refers to the fact that many cat-
egories of language or grammar are difficult to distinguish, usually because change occurs
over time in a gradual way, moving an element along a continuum from one category to
another.
set to specific values, a process which is seen as fairly robust (Meisel 2011). Light-
foot (2006: 6) points out that “a person’s system, his/her grammar, grows in the
first few years of life and varies at the edges depending on a number of factors”
[highlighting added by us]. By way of illustration, Lightfoot (2006: 4–5) uses the
mold, not the sand dune, as metaphor:
The biology of life is similar in all species, from yeasts to humans. Small differences in
factors like the timing of cell mechanisms can produce large differences in the resulting
organism, the difference, say, between a shark and a butterfly. Similarly the languages of
the world are cast from the same mold, their essential properties being determined by fixed,
universal principles. The differences are not due to biological properties but to environ-
mental factors.
Within this approach, variation and gradience require other explanations than in
usage-based theories. Many generative syntacticians see their locus at the com-
munity level in the sense that during a period of change, multiple competing
grammars coexist. Introducing the constant-rate hypothesis, Kroch (1989: 200)
presents a quantitative corpus study of the rise of periphrastic do in English ques-
tions and negations. He claims that “when one grammatical option replaces
another with which it is in competition within the community across a set of lin-
guistic contexts, the rate of replacement, properly measured, is the same in all of
them”.2
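The constant-rate claim can be made concrete with a small numerical sketch. It assumes the logistic model commonly used for such replacement curves, under which "rate" is read as the slope on the logit scale: each context gets its own intercept (its own starting level), while all contexts share one slope. The context names, the slope, and the intercepts below are hypothetical values chosen for illustration only; they are not Kroch's estimates.

```python
import math

def logistic(t, slope, intercept):
    """Share of the incoming variant at time t under a logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * t)))

def logit(p):
    """Log-odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

# Hypothetical parameters, for illustration only (not taken from Kroch 1989).
SHARED_SLOPE = 0.04  # one rate of change for every context
CONTEXT_INTERCEPTS = {"context_a": -62.0, "context_b": -60.0, "context_c": -58.5}

def usage_by_context(year):
    """Predicted share of the new variant in each context in a given year."""
    return {name: logistic(year, SHARED_SLOPE, k)
            for name, k in CONTEXT_INTERCEPTS.items()}

def logit_change(context, year_from, year_to):
    """Change in log-odds for one context between two years."""
    k = CONTEXT_INTERCEPTS[context]
    return (logit(logistic(year_to, SHARED_SLOPE, k))
            - logit(logistic(year_from, SHARED_SLOPE, k)))

if __name__ == "__main__":
    # Raw shares differ across contexts at every point in time ...
    for year in (1450, 1500, 1550, 1600):
        print(year, {n: round(p, 3) for n, p in usage_by_context(year).items()})
    # ... but the change measured in log-odds is identical in all contexts.
    for name in CONTEXT_INTERCEPTS:
        print(name, round(logit_change(name, 1450, 1550), 6))
```

In this sketch the contexts differ in how far the change has progressed at any given date, yet the log-odds change over a fixed interval is the same for all of them, which is one way of reading "properly measured" in the quotation above.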
However, for usage-based linguists, the locus of change (and of gradience)
is not exclusively the community, where a generational change between
caretakers and children typically takes place, but also the mature individual, whose
linguistic knowledge undergoes changes over the lifetime. Usage-based theorists also point
out that frequency should not be seen in isolation, but rather in “interaction and
competition with various other factors of language use, such as recency, salience
and context” (Behrens et al. to appear: section 9).
Taking a usage-based stance, Torres Cacoullos (this volume) discusses
both change over the lifetime and recency or priming effects. She studies the
historical development of complex verbal constructions in Spanish (locative estar +
gerund) into a single periphrastic unit of progressive aspect. In doing so, she shows
that progressive estar-constructions are primed by preceding non-progressive
estar-constructions. Torres Cacoullos argues that the priming effect (including
its changing intensity over time) is the result of the analyzability of the progressive
estar-construction. The capacity of this type of analyzability and the gradual
change over time connected to it is assumed to exist within an individual’s
grammar, given that priming is a psycholinguistic phenomenon based on individual
cognitive processes. Furthermore, Torres Cacoullos argues that Kroch’s (1989)
above-mentioned constant-rate hypothesis does not hold for the probability of
selecting the progressive variant.
2 Yang (2000: 248) extends Kroch’s (1989) interpretation, assuming that multiple grammars do
not only exist within a community but also within an individual’s mind. He claims that “there
is evidence of multiple grammars in mature speakers during the course of language change.”
Rosemeyer (this volume) is another contribution from a usage-based per-
spective. On the basis of a quantitative historical corpus, he studies Spanish split
auxiliary selection, i. e. the question whether writers chose BE or HAVE in analytic
perfect constructions. Like Torres Cacoullos, he analyzes priming effects, namely
effects of persistence (linked to temporally close activation) and entrenchment
(linked to repeated activation) (in the sense of Langacker 1987: 59; Bybee 2002;
Szmrecsanyi 2005). Rosemeyer points out that both persistence and entrench-
ment have conserving effects on diachronic grammatical development, thereby
creating systematicity in the patterns of change.
The generative view on frequency effects is quite different. It is crucial to dis-
tinguish, as Meisel (2011: 3) has pointed out, between grammatical change that
involves parameter resetting in the sense of Universal Grammar (see e. g. Light-
foot 2006) and change that is not attributable to new parameter values:
As Sankoff (2005) and Sankoff and Blondeau (2007) have demonstrated, individuals may,
in fact, adapt their language use during adulthood to innovative patterns resulting from
generational change. Such lifespan changes may have profound consequences, but they
do not involve reanalysis of grammars, i. e. we do not find evidence suggesting that mental
representations of parameterized grammatical knowledge are subject to modifications after
childhood. Even attrition of syntactic knowledge only seems to affect a person’s ability to
use the knowledge developed early on in life.
For example, the frequency of subject pronoun realization can vary substantially
from one null subject language (NSL) to another (Otheguy, Zentella and Livert
2007). However, from a generative point of view a critical threshold must be
reached which leads to a parameter resetting from [+NSL] to [–NSL] or vice versa
(or possibly also to [+partial NSL] in the sense of Holmberg, Nayudu and Sheehan
2009). Beyond that critical threshold, change is not gradual but abrupt, because
in this case I-grammar changes. This has been clearly expressed by Lightfoot
(2006: 158):
I submit that work on abrupt creolization, the acquisition of signed languages, and on cata-
strophic historical change shows us that children do not necessarily converge on grammars
that match input. This work invites us to think of children as cue-based learners: they do
not rate the generative capacity of grammars against the sets of expressions they encounter
but rather they scan the environment for necessary elements of I-language in unembedded
domains, and build their grammars cue by cue. The cues are not in the input directly, but
they are derived from the input, in the mental representations yielded as children under-
stand and “parse” the E-language to which they are exposed [...]. We may seek to quantify
the degree to which cues are expressed by the PLD [Primary Linguistic Data], showing
that abrupt, catastrophic change takes place when those cues are expressed below some
threshold of robustness and are eliminated.
Thus, one crucial question is whether there are different types of change, gradual
non-parametric ones that are better accounted for with usage-based models and
abrupt parametric ones that can be better explained within the generative model.3
Newmeyer (2003: 693–694) highlights one aspect that contradicts the usage-
based view: He shows that grammars are not always useful in the sense of opti-
mally responding to users’ pragmatic and social needs, i. e., languages often lack
lexical and/or grammatical properties that may arguably be useful and possess
properties that are not useful at all. As an example of the lack of distinctions that
would be useful in everyday communication, Newmeyer (2003: 693) mentions the
conspicuous rarity of the inclusive/exclusive pronoun distinction in the world’s
languages. If grammar were as adaptive and fluid as suggested by the sand dune
metaphor (most clearly expressed by the idea of emergent grammar in Bybee and
Hopper 2001), one would expect, as he states, such useful features to occur in
the great majority of languages. Conversely, characteristics of dubious usefulness
such as the homonymy between English you (2sg) and you (2pl) should not occur. Likewise,
we would not expect abrupt phenomena of change under the assumption of
a fully adaptive and fluid grammar.
The debate in the journal Language in the years 2003 to 2007 brought together diametrically
opposed opinions about the (ir)relevance of frequency and probabilities
in grammar. In his paper, Newmeyer (2003) questioned the representativeness of
empirical data in many usage-based corpus studies on syntax (cf. also the discussion
on the relation between theory and data in section 2.1). Guy (2005, 2007)
replied that this is no more (but also no less) than a challenge that can be overcome.
Indeed, quantitative sociolinguistic research has already established, as
Guy points out, high methodological standards with regard to corpus data. Furthermore,
the explanatory power of linguistic theory would be unnecessarily
diminished if quantitative correlations with other language-internal and
social factors were not taken into account.
3 We know from research in dynamic modeling that phenomena of change in very different
fields do incorporate both gradual and abrupt changes (cf. e. g. Thom 1980). Future research
needs to show whether grammatical change follows a similar pattern and whether gradual and
abrupt language change can be integrated into one theory.
At this point, it is important to highlight that variationist sociolinguistics is
not “inherently usage-based”. However, the importance of frequency or probabil-
ity represents an important zone of overlap between variationist and usage-based
models. It does not come as a surprise that one sociolinguistic subfield, namely
cognitive sociolinguistics (Kristiansen and Dirven 2008; Geeraerts, Kristiansen
and Peirsman 2010), builds on the premises of usage-based linguistics.
Likewise, relying on corpus data from actual language use in formal-syntactic
research does not by itself lead to the incorporation of usage-based positions into
generative thinking. The difference between theorists of both persuasions with
regard to the role of experience in the cognitive organization of the (child and
adult) speaker remains. Yet, taking corpus data seriously leads to a more serious
consideration of frequency and related phenomena such as gradience, recency,
and variation in usage. To put it in Barbiers’ (2013: 3) words, we could then “shift
away from the methodology of idealization of the data in search of the universal
syntactic properties of natural language, towards a methodology that takes into
account the full range of syntactic variation that can be found in colloquial lan-
guage”. However, most generative linguists still do not analyze social variation
in their research and those who do have abandoned classic generative premises.4
Seeing the social perspective as irrelevant goes back to Chomsky’s (1965: 3–5)
notion of the ideal speaker-listener. The following quotation, which is embedded
in Chomsky’s (2000: 31) critique of externalist philosophy, highlights this:
Suppose, for example, that “following a rule” is analyzed in terms of communities: Jones
follows a rule if he conforms to the practice or norms of the community. If the “community”
is homogeneous, reference to it contributes nothing (the notions norm, practice, con-
vention, etc. raise further questions). If the “community” is heterogeneous – apart from the
even greater unclarity of the notion of norms (practice, etc.) for this case – several problems
arise. One is that the proposed analysis is descriptively inaccurate. Typically, we attribute
rule-following in the case of notable lack of conformity to prescriptive practice or alleged
norms. […] The more serious objection is that the notion of “community” or “common
language” makes as much sense as the notion “nearby city” or “look alike”, without further
specification of interests, leaving the analysis vacuous.
4 However, not all generative syntacticians follow this approach. Wilson and Henry (1998: 8)
point out that social variation and change is constrained by universal grammar, which defines
the set of possible grammars. A corresponding observation is that grammatical introspection –
the most important empirical source in generative syntax – is subject to systematic social variation
(Adli 2013: 508; cf. also Bender 2007; Eckert 2000: 45).
In the classic generative position, particularly widespread during the early years of
generative grammar, frequency effects and correlations between the use of
a construction and other language-internal and language-external factors were
(and regrettably sometimes still are) considered to be epiphenomenal, a position
most clearly expressed in Chomsky’s (1965: 3) famous stance:
Assuming that language has general properties of other biological systems, we should be
seeking three factors that enter into its growth in the individual: (1) genetic factors, the topic
of UG; (2) experience, which permits variation within a fairly narrow range; (3) principles
not specific to language. The third factor includes principles of efficient computation,
which would be expected to be of particular significance for systems such as language. UG
is the residue when third-factor effects are abstracted.
5 The notion of a user’s manual goes back to Culy (1996: 112), who sees its roots in pragmatics.
His initial idea was to explain systematic grammatical differences between registers/styles,
such as the distinctive use of zero objects in English recipes or the frequency of use of particular
grammatical forms or constructions. The user’s manual in Culy’s (1996: 114) terms is roughly
described as specifying “the characteristics of registers and, within each register, […] character-
istics of different styles”. The user’s manual essentially carries information on frequency of use
of items and constructions and default interpretations of variables in valency relations. This
notion was later taken up by Zwicky (1999), yet without proposing a notably refined definition.
The criteria that most syntacticians apply to
optionality are much stricter than those usually applied in variationist sociolinguistics.
While many syntacticians reject optionality if two variants are not
fully identical in meaning and distribution, it is good enough for sociolinguists
if two variants are optional in most contexts. This phenomenon is called
“neutralization in discourse” (Sankoff 1988: 153). In his debate with Newmeyer, Guy
(2007: 3) points out that “the prevailing consensus is that, while certain struc-
tures may have different meanings in some of the contexts they occur in, there are
often other contexts in which they function as alternants. Therefore, productive
variationist analyses can be conducted, given careful attention to contexts and
meaning”.
Syntacticians who try to incorporate optionality into the grammar system
represent a minority. Kato (this volume) is one example. She presents a study
within the generative paradigm which nevertheless takes a critical stance
towards standard minimalist assumptions. Kato attempts to account for variation
and optionality in Brazilian Portuguese syntax. First, she discusses the variation
between null and overt subject pronouns. It is noteworthy that she sees the locus
of this variation inside a person’s I-language without resorting to the idea of mul-
tiple internal grammars (such as Yang 2000; see also Roeper’s 1999 more radical
approach of “universal bilingualism”). In doing so, she refers to the distinction,
introduced by Chomsky (1981: 8), between the core and the extended periphery of
grammar, both constitutive of a person’s I-language. Kato builds on Kato (2011),
where core grammar is linked to early childhood acquisition and the extended
periphery in syntax to late childhood acquisition. It turns out that overt subject
pronouns, typical of current Brazilian Portuguese, are acquired before schooling
and null subjects during schooling. This means that the null subject is used by
older children and adults, yet its late acquisition “does not affect grammar as
a system”. The second phenomenon Kato discusses concerns optional surface
orders of Brazilian Portuguese wh-questions, namely the variation between
“fronted” and “in-situ” wh-constituents. Her study suggests that the positioning
of the wh-constituent is acquired in early childhood. Thus, Kato argues that the
variants belong to the child’s core grammar.
Barbiers (2005) is another example of a generative syntactician who incor-
porates the notion of optionality into grammar.6 He assumes that “variation and
6 Other examples are Fukui (1993), Saito and Fukui (1998), Henry (2002), Haider and Rosengren
(2003), Biberauer and Richards (2006), and Adli (2013). Barbiers (2005) builds his conclusions
on the data from SAND (Syntactische Atlas van de Nederlandse Dialecten, cf.
www.meertens.knaw.nl/sand and Barbiers 2013: 2–3 on the European Dialect Syntax Project). Other projects
which combine quantitative analyses of microdialectal variation with modern syntactic theory
are the Dialect Syntax of Swiss German (University of Zurich), and Kaufmann’s (2007) study on
the verbal syntax of Mennonite Low German.
7 For an interesting formal proposal that accounts for (genuine or apparent) optionality within
the minimalist framework, see the algorithm dubbed “combinatorial variability” presented by
Adger (2006) and Adger and Smith (2010).
At the present stage of knowledge, it is hard to see how the dispute over
a clear distinction between grammar and usage can be empirically settled. One way to
proceed may be to take a step back in order to engage in an epistemological discussion.
Cameron (this volume) takes a critical stance against a binary view of
system and usage. He emphasizes that this distinction is better described as a
“fundamental assumption that contributes to theory building” and that “as such,
the distinction itself may not actually be falsifiable in a broad sense”. Essentially,
he takes up three phenomena cited by Newmeyer (2003) in favor of a binary dis-
tinction (long-distance dependencies, category-sensitive processes and structure
dependency) and shows that these phenomena have analogues or parallels in
usage. Cameron concludes: “I guess what I am arguing for is a new set of terms,
something other than grammar and usage or competence and performance,
something not binary, something n-nary”.
The contributions of Seiler and Haider go one step further by integrating
central thoughts from other fields of science. Seiler (this volume) engages in
a conceptual discussion on the foundation and the implications of the system-usage
debate. He states that usage-based linguistics has a certain kinship with
functionalist approaches to grammar, while formalist theories
often embody the idea of an autonomous syntactic system (“in a formalist view on syntax,
syntactic structure is to some degree immune against usage”). Yet he points out
that formalist and functionalist approaches to language should not be seen as
antagonistic since they may explain different aspects of language. For Seiler (this
volume), a strict system-usage dichotomy would be ill-fated, just as ill-fated as
the unproductive formalist-functionalist dichotomy in biology:
The fundamental structure of the debates in biology and linguistics is astonishingly similar.
In both disciplines, two schools defended their way of explaining aspects of nature as the
only possible one at their time: proximate vs. evolutionary in biology, formal vs. functional
in linguistics. The main difference between biology and linguistics lies in the fact that the
complementarity (and compatibility) of the two kinds of explanation has been widely
accepted by biologists since the modern evolutionary synthesis some seventy years ago. A
modern linguistic synthesis is still yet to come. For linguists, this is not exactly a reason to
be proud of.
On the surface, Haider’s (this volume) opinion with regard to functionalist and
structuralist (formalist) schools in linguistics seems comparable to Seiler’s, but
Haider is more radical, considering both approaches to be wrong: “The dispute
[between structuralism and functionalism in biology] turned out to be completely
irrelevant after Darwin’s theory of evolution gained ground”. Unlike Seiler,
Haider does not expect improvement from the “complementarity (and com-
patibility) of the two kinds of explanation”, i. e., functionalist and structuralist
approaches, thus thwarting a central goal of this volume to a certain degree. He
sees in grammar a cognitive organism which undergoes cognitive evolution and
claims that “the descent of species and the descent of languages encompass the
same abstract mechanism (self-replication, variation, selection) in two different
domains”, a comparison suggesting that linguistics will need a figure on a par
with Darwin in order to advance.
So far, we have dealt with the first question mentioned at the end of section
1.1, namely the question of how usage- and system-based approaches tackle vari-
ation in language. Bringing contributions from disparate theoretical perspectives
together in one volume, however, is also a good opportunity to raise fundamental
methodological issues. The rest of this introduction will, therefore, be dedicated
to the second question mentioned above, namely the relationship between theory
and data.
a more recent one by Featherston (2007a) (cf. also Schütze 1996). By focusing
on the empirical base of generative grammar and the critique it spawned, the
positions of both system and usage-based (variationist) approaches will become
clear. The fact that Featherston (2007a) criticizes some of the same things Labov
(1975) criticized 32 years earlier shows that the latter was right in concluding that
there was little room for optimism with regard to a possible approximation of the
different positions: “Ideological positions are too well established, and habits of
work are too firmly set to believe that there will be an immediate convergence of
thinking on these issues” (Labov 1975: 54).
With regard to the question of what kind of empirical data linguists should
use, Featherston (2007a: 271 and 308) considers the frequently applied practice of
using judgments of a single person to be “inadequate”, adding that “[t]here can
be little satisfaction in producing or reading work which so clearly fails to satisfy
scientific and academic standards”. Besides many critical reactions, Featherston
also receives support. Haider (2007: 389) writes:
Considering that both Featherston and Haider work in the generative frame and
nevertheless take such a critical attitude and considering that some generative
linguists use historical corpora or elicited data for their analyses (Lightfoot 1999;
Kroch 1989; Barbiers 2005, etc.) may raise our hopes, but in general the empir-
ical base of much generative work continues to be of a rather dubious nature.
Newmeyer (2007: 395) still describes the use of a “single person’s judgments”
as “standard practice in the field” not just in generative linguistics, but in cognitive
and functional linguistics as well. The problem with a “single person’s
judgments” is that generative linguists work with the judgments of conscious
and – even more importantly – self-conscious human beings and not with unconscious
matter as classical natural sciences like physics or chemistry do. Due to
this, the attempt to separate acceptability from grammaticality, i. e., to filter out
the “noise” of acceptability from the supposedly pure essence of grammaticality,
may not be feasible in principle (cf. Schütze 1996: 25–27 and 48–52; Featherston
2007b: 401–403; Newmeyer 2007: 396–398), regardless of whether one aggregates
judgments of hundreds of disinterested informants or whether one uses the introspection
of a definitely not disinterested linguist.
8 Pullum (2007: 36) discusses the same point: “Looking back at the syntax published a couple of
decades ago makes it rather clear that much of it is going to have to be redone from the ground
up just to reach minimal levels of empirical accuracy. Faced with data flaws of these proportions,
biology journals issue retractions, and researchers are disciplined or dismissed”.
Besides the acceptability-grammaticality issue, there is a whole array of
further problems; for example, the still unclear relationship between speakers’
judgments and speakers’ language production:9 Kempen and Harbusch (2005:
342) analyze sequences of (pronominal) arguments in the midfield of German
clauses and write that “[a]rgument orderings that embody mild violations of the
[linearization] rule, receive medium-range grammaticality scores […] but are vir-
tually absent from the corpora because the grammatical encoding mechanism
in speakers/writers does not (or hardly ever) produce them”. Unfortunately, the
theoretical relevance of such grammaticality-production mismatches is rarely the
focus of research (but cf. Adli this volume). Why do sequences that are rated as
medium-range grammatical not occur more often, and what exactly does it mean
if a sentence is of medium-range grammaticality? Be this as it may, the mismatch
between medium-range grammaticality (judgment) and lack of occurrence (pro-
duction) does not constitute a fundamental problem for generative grammar. A
more threatening issue, however, is the mismatch of supposed ungrammaticality
(judgment) and occurrence (production). A case in point is the “prohibition
against the deletion of the relative pronouns which are subjects” (Labov 1975:
41–42). In spite of this judgment-based prohibition, this kind of deletion occurred
in fourteen out of 336 possible tokens in a corpus from Philadelphia (4.2 %).
Granted, 4.2 % is not a very high share, but fourteen occurrences is a robust
enough number for linguists to wonder how one could account for the existence
of these tokens. As many rare phenomena will not form part of the idiolects of
linguists, these linguists will simply not (be able to) submit them to their own
grammaticality judgments. Thus, the linguist who refuses to work with perform-
ance data (E-language) is bound to overlook possibly crucial linguistic facts since
many of these rare phenomena are only detected in large corpora (cf. section 2.2
for a more thorough discussion of rare phenomena). Without analyzing such phenomena,
we will not be able to produce a grammar which generates all possible
sentences of a language.
9 Schütze (1996: 48) comments: “Over the history of generative grammar, much has made [sic!]
of its heavy reliance on introspective judgments and their nonequivalence to production and
comprehension”. Featherston (2007a: 271) also mentions production data as a possible source
for studies in the generative framework: “This focus [on judgment data] is in no way intended to
belittle the value of corpus data or make out that this data type is any less relevant”.
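The share of fourteen out of 336 tokens cited above can be checked, and its sampling uncertainty gauged, with a short calculation. The sketch below is an illustration added here, not part of Labov's analysis: it assumes that the 336 tokens can be treated as independent trials, and the choice of the Wilson score interval (with z = 1.96 for roughly 95 % coverage) is ours.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95 % coverage)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width

if __name__ == "__main__":
    deleted, possible = 14, 336  # counts as reported in the text
    share = deleted / possible
    low, high = wilson_interval(deleted, possible)
    print(f"share: {100 * share:.1f} %")
    print(f"interval: {100 * low:.1f} % to {100 * high:.1f} %")
```

Under these assumptions the lower bound of the interval stays clearly above zero, which is consistent with treating the fourteen tokens as a phenomenon to be explained rather than as noise.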
Another problem with regard to the use of grammaticality judgments can be
illustrated by means of the reactions of Grewendorf (2007) and Den Dikken et al.
(2007) to Featherston’s (2007a) article. Grewendorf (2007: 376) notes:
Even more pointedly, Den Dikken et al. (2007: 343 – Footnote 4) state:
As a side point, we do not see how the mean value of the judgments of a group of speakers
can confirm or disconfirm an individual’s judgments: one’s judgments are one’s judgments,
no matter what other speakers of ‘the same language’ might think.
With regard to this “‘my idiolect’ gambit” (Featherston 2007a: 279; Schütze 1996:
4–5), one can only hope that Haider (2007: 382 – Footnote 1) is not right when he
claims: “More often, the problem [the risk of having to give up a “dearly fostered
hypothesis”] is solved pragmatically. Conflicting evidence is simply ignored or
repressed”.10 In any case, accepting Grewendorf’s and Den Dikken et al.’s con-
victions would constitute the end of science as we know it. This was already seen
by Labov (1975: 14, 26, and 30), who writes that “[t]he study of introspective judg-
ments is thus effectively isolated from any contradiction from competing data.”
He adds that “[u]ntil more solid evidence is provided by those who have no theo-
retical stake in the matter, the most reasonable position is to assume that such
dialects [idiosyncratic dialects, i. e. idiolects] do not exist” and that “the uncon-
trolled intuitions of linguists must be looked on with grave suspicion”. What is at
stake here is not Labov’s question of whether idiolects exist or not – they proba-
bly do; what is at stake is the lack of control in the ‘my idiolect’ gambit and the
conflict of interest of researchers who base their theories on their evaluation of
sentences constructed by them. The fact that Grewendorf and Den Dikken et al.
still mention positions which were convincingly rejected thirty years ago is telling
proof of the existence of what Labov calls an “ideological position”.
10 Pullum (2007: 38) adds another possible technique for saving a “dearly fostered hypothesis”:
“In syntax, if you want some sequence of words to be grammatical (because it would back up
your hypothesis), the temptation is to just cite it as good, and probably you won’t be challenged.
If you are challenged, just say it’s good for you, but other dialects may differ”.
In spite of these problems, one must not forget the unprecedented progress
our understanding of language has achieved thanks to generative grammar. Fan-
selow (2007: 353) rightly emphasizes that “it [generative syntax] has broadened
the data base for syntactic research in a very profound way”. But while one’s own
intuitions may have been sufficient at the beginning of generative linguistics and
may still be sufficient for basic syntactic phenomena (cf. Labov’s 1975: 14 and 27
discussion of Chomsky’s clear cases), the ‘my idiolect’ gambit cannot be applied
to rare or controversial phenomena. One’s own intuitions simply do not fit a field
aiming to overcome a “pre-scientific phase of orientation” (cf. Haider’s 2007:
389 quote above). This does not mean that Grewendorf’s (2007: 373) grammar is
deviant “because [his] judgments do not correspond to the judgments of Feather-
ston’s group”, it just means that nobody should devise a theory based exclusively
on his/her own judgments; i. e., like Pullum (2007: 38) we should stop thinking
that the “how-does-it-sound-to-you-today method can continue to be regarded as
a respectable data-gathering technique”.
Due to generativists’ still widespread lack of interest in the empirical side of their
research, generative grammar has not yet made the methodological and analyti-
cal progress sociolinguistics and variation linguistics have achieved. With regard
to quantification and especially with regard to categorization, Labov’s early
methodology must be regarded as naïve (e. g., reading word lists and recounting
near-death experiences representing formal and informal styles, respectively).
But this objection has to be qualified, since these methods were used to establish
a new field and have been improved dramatically ever since. In contrast, many
generativists still use the same empirical base their colleagues used fifty years
ago, despite the fact that the field has gone through at least four major theoretical
phases (from Standard Theory to Minimalism).
Leaving aside the question of what kind of empirical data linguists should use, we
will briefly focus on the second point mentioned at the beginning of this section:
the question of which role empirical data should play in theory-testing and theory-
building. It does not come as a surprise that many generativists do not foster a
balanced view of theory-oriented and data-driven approaches, i. e., we are still far
away from Featherston’s (2007b: 408) conviction that “data and theory are indeed
in a mutually dependent relationship, both affecting the credibility of the other”.
Even for Barbiers (2005: 258), a generative linguist with much experience in the
analysis of elicited judgment/production data, sociolinguistics seems secondary
to generative linguistics: “Finally, it was argued that there are certain patterns
in individual and geographic variation about which generative linguistics has
nothing to say. That is where sociolinguistics comes in”. Surprisingly, Featherston
(2007a: 310 and 314) himself also shows a rather biased opinion with regard to
this interplay, putting data first: The most criticized stances in his paper are that
“[l]inguists need to look at the data first and develop their models afterwards […]”
and that “[d]ata is a pre-condition for theory, and the quality of a theory can never
exceed the quality of the data set which it is based on”. One can be sure that few
linguists, let alone generative linguists, would subscribe to this division of labor
(cf. especially the comments of Grewendorf 2007: 377–379).
In any case, as long as data-driven approaches only come in when formal
approaches fail and as long as elicited data are only seen as a source for checking
hypotheses at best, we cannot take full advantage of their potential. Therefore,
the question is not only whether to use new types of empirical data in system-
based approaches, but how to use them and how to correctly evaluate their use.
The present volume contains analyses of different types of elicited language data,
some of which are analyzed within the framework of system-based approaches:
Cornips (this volume) uses elicited language data, but also judgment tests in the
case of clusters with three verbal elements. Adli (this volume) combines judg-
ment and production data and offers comparisons between these data types for
wh-questions in French. With this, he tries to tackle the grammaticality-produc-
tion mismatch mentioned above. Kato (this volume) analyzes data from child
language acquisition and Kaufmann (this volume) uses translation data. His
informants were asked to translate Spanish, Portuguese and English stimulus
sentences into Mennonite Low German. As with all research methods, there exist
advantages (amount and comparability of the data, cf. Schütze 1996: 2) and dis-
advantages to such an approach (no natural speech, possible priming effects,
cf. for the latter Kaufmann 2005). Torres Cacoullos (this volume) and Rosemeyer
concentrate on structural changes in the verbal domain of Spanish by analyzing
historical texts. As mentioned above, both of them work within a usage-based
framework (unlike Cornips, Adli, Kato and Kaufmann). In this case, too, the
methodological problems are manifold, but Torres Cacoullos and Rosemeyer can
hardly be held responsible for this. Historical data reflect oral speech only to a
certain degree and one can only use the data one has, i. e., one cannot go back
and ask speakers/writers how they rate constructions for which one does not find
evidence in the written record.
One especially interesting case with regard to linguistic data is that of rare
phenomena, i. e., generally uncommon but nevertheless robust linguistic facts. Rare
phenomena raise some essential empirical and theoretical questions for the study
a trait (of any conceivable sort: a form, a relationship between forms, a matching of form
and meaning, a category, a construction, a rule, a constraint, a relationship between rules
or constraints, ...) which is so uncommon across languages as not even to occur in all
members of a single […] family or diffusion area (for short: sprachbund), although it may
occur in a few languages from a few different families or sprachbünde.
For several reasons, the study of rare phenomena remains an important linguistic
task (cf. also Cysouw and Wohlgemuth 2010: 3–4). First of all, it seems obvious
that the consideration of rara will provide an empirically much more detailed
picture of what is (im)possible in the languages of the world. Given that rare phe-
nomena may contradict or even falsify cross-linguistic assumptions and linguis-
tic universals, they may help us to formulate more adequate generalizations. As
a consequence, we may get better linguistic descriptions, which in turn may offer
more adequate explanations.
This also holds true for the other type of rare phenomena, namely those that
are anti-frequent in a given language (or language family). Under this label, we
refer to all kinds of linguistic traits that are very infrequently attested with respect
to other paradigmatic alternatives in the language(s) under consideration. The
rareness of these linguistic traits is, in principle, irrespective of the distribu-
tion and frequency of these traits in other languages, i. e., a phenomenon that
hardly occurs in a given language may or may not be rare in other languages.
Some examples of this kind of rare phenomena are differential object marking
with inanimate objects in Spanish (cf. García García 2014), wh-clefts and other
wh-variants in French (cf. Adli this volume), verb-second order violations in Ger-
manic languages (cf. Cornips this volume) or non-verb-final dependent clauses in
Mennonite Low German (cf. Kaufmann this volume).
in dependent clauses with two verbal elements. The problem in analyzing the
phenomenon in question is therefore not the lack of formal explanations, but the
necessity of finding the grammatical mechanisms whose interaction causes the
rare phenomenon.
The third case involving rare phenomena is presented in Adli (this volume).
It deals with the variation of wh-constructions attested in Modern French, where
nine different types of wh-variants can be distinguished, among them the wh-in-
situ construction (e. g. Tu fais le dessin quand ? ‘When do you do the drawing?’),
the whVS construction (e. g. Quand fais-tu le dessin ?) or the wh-cleft construction
(e. g. Quand c’est que tu fais le dessin ?). As already mentioned, Adli’s study draws
on production data as well as gradient acceptability judgments (both types of lin-
guistic evidence were provided by the same set of individuals). The results show
that some variants, such as the wh-in-situ construction, are very frequent, while
others, such as the whVS construction or the wh-cleft construction, are only rarely
attested or do not occur at all. Yet all of these variants were rated as acceptable.
What is more, the variants belonging to a rather formal register were evaluated as
being more acceptable than those pertaining to a colloquial register. For example,
the formal whVS construction received the highest acceptability scores. However,
this preference in acceptability is not reflected in the production data. Thus, Adli’s
study reveals an interesting mismatch between usage and speaker judgments,
showing that some variants hardly occur although they are rated as acceptable.
Adli suggests that this frequency-acceptability mismatch is at least partly due
to register: “While frequency data from spontaneous speech […] provide insight
into colloquial language, acceptability data reflect the entire range of registers
available to a speaker”. Moreover, he concludes that acceptability judgments are
influenced by normative pressure, especially in the case of French.
Comparing the findings on rare phenomena presented in Cornips (this
volume), i. e., V2 violations in Germanic languages, and the rare wh-variants
studied in Adli (this volume), one sees that both are socially dependent. However,
there exists an obvious difference. While the V2 violation is the result of a collo-
quial innovation process that is at present confined to peer conversations, some
of the scarcely produced wh-variants (e. g. the whVS construction) seem to be the
result of a socially determined conservation process.
The volume is divided into three parts. The first part, entitled “System, usage, and
variation”, opens with two central and complementary points of view: Frederick
Newmeyer’s “Language variation and the autonomy of grammar” and Gregory
Guy’s “The grammar of use and the use of grammar”. Newmeyer discusses the
question of whether language variation calls into question the hypothesis of the
autonomy of grammar. On the basis of a modular approach to variation, he argues
that variation and probabilities do not pertain to linguistic knowledge proper.
Rather, they should be viewed as the result of the interaction between gram-
matical competence and extragrammatical factors such as processing pressure
or social factors. As mentioned above, Newmeyer proposes that this
interaction is modulated by the user’s manual, which is conceived of as an interface
between grammatical competence and extragrammatical factors.
Guy takes a different stance on the relation between grammar and variation
and argues that language is “a uniquely social phenomenon”. For him, linguistic
knowledge is exclusively derived from usage and interaction with other users.
Accordingly, the core linguistic knowledge is assumed to include knowledge
about variation, probabilities and social factors. This does not mean that it is
devoid of abstract operations and mental representations. However, abstract
operations and mental representations are inferred from usage and are thus
inherently probabilistic and variable in nature.
The following two contributions are the papers of Richard Cameron and
Mary Kato. Cameron’s article “Looking for structure-dependence, category-sen-
sitive processes, and long-distance dependencies in usage” deals with specific
problems of the system-usage dichotomy, while Mary Kato’s paper “Variation in
syntax: Two case studies on Brazilian Portuguese” introduces the dimension of
variation as the central topic.
The dimension of variation is most strongly concentrated in the second part
of this volume, entitled “Rare phenomena and variation”. All contributions in
this part have a strong empirical orientation and take into account rare phenom-
ena. Göz Kaufmann writes about “Rare phenomena revealing basic syntactic
mechanisms: The case of unexpected verb-object sequences in Mennonite Low
German.” He stresses the importance of a thorough analysis of the possible inter-
play of seemingly unrelated syntactic mechanisms. Leonie Cornips’ paper “The
no man’s land between syntax and variationist sociolinguistics: The case of idio-
lectal variability” deals with the central question of intra-speaker variation which
she exemplifies by means of four case studies. Finally, Aria Adli’s contribution
“What you like is not what you do: Acceptability and frequency in syntactic vari-
ation” is especially important in view of the topics dealt with in section 2.1, where
advantages and disadvantages of different types of empirical data are discussed.
All papers in the third part of this volume, entitled “Grammar, evolution, and
diachrony”, deal with the dimension of time. The papers “Gradual loss of analyz-
ability: Diachronic priming effects” by Rena Torres Cacoullos and “How usage
rescues the system: Persistence as conservation” by Malte Rosemeyer analyze
the causes of historical change in the Spanish progressive and Spanish auxiliary
selection, respectively. Longer time periods and more abstract questions are the
focus of Hubert Haider and Guido Seiler. Both their contributions, “‘Intelli-
gent design’ of grammars – a result of cognitive evolution” and “Syntactization,
analogy and the distinction between proximate and evolutionary causations”,
apply the biological concept of evolution to language and present hints as to how
the gap between functional and formal approaches may be narrowed.
References
Adger, David (2006): Combinatorial variability. Journal of Linguistics 42: 503–530.
Adger, David and Jennifer Smith (2010): Variation in agreement: A lexical feature-based
approach. Lingua 120(5): 1109–1134.
Adli, Aria (2013): Syntactic variation in French wh-questions: a quantitative study from the
angle of Bourdieu’s sociocultural theory. Linguistics 51(3): 473–515.
Barbiers, Sjef (2005): Word order variation in three-verb clusters and the division of labour
between generative linguistics and sociolinguistics. In: Leonie Cornips and Karen
P. Corrigan (eds.), Syntax and Variation: Reconciling the Biological and the Social,
233–264. Amsterdam: John Benjamins.
Barbiers, Sjef (2013): Where is syntactic variation? In: Peter Auer, Javier Caro Reina and Göz
Kaufmann (eds.), Language Variation – European Perspective IV: Selected Papers from
the Sixth International Conference on Language Variation in Europe (ICLaVE 6), 1–26.
Amsterdam/Philadelphia: John Benjamins.
Behrens, Heike, Stefan Pfänder, Peter Auer, Daniel Jacob, Rolf Kailuweit, Lars Konieczny, Bernd
Kortmann, Christian Mair and Gerhard Strube (to appear): Introduction. In: Heike Behrens
and Stefan Pfänder (eds.), Experience counts: Frequency effects in language, Berlin/
Boston: Mouton de Gruyter.
Belletti, Adriana and Luigi Rizzi (2002): Editors’ introduction: some concepts and issues in
linguistic theory. In: Adriana Belletti and Luigi Rizzi (eds.), Noam Chomsky: On Nature and
Language, 1–44. Cambridge: Cambridge University Press.
Bender, Emily M. (2007): Socially meaningful syntactic variation in sign-based grammar.
English Language and Linguistics 11(2): Special Issue on Variation in English Dialect
Syntax: Theoretical Perspectives, 347–381.
Biberauer, Theresa and Marc Richards (2006): True optionality: When the grammar doesn’t
mind. In: Cedric Boeckx (ed.), Minimalist Essays, 35–67. Amsterdam: John Benjamins.
Bybee, Joan L. (2002): Sequentiality as the basis of constituent structure. In: Talmy Givón
and Bertram F. Malle (eds.), The Evolution of Language out of Pre-language, 109–132.
Amsterdam & Philadelphia: John Benjamins.
Bybee, Joan L. (2006): From Usage to Grammar: The Mind’s Response to Repetition. Language
82(4): 711–733.
Bybee, Joan L. (2010): Language, Usage and Cognition. Cambridge: Cambridge University
Press.
Bybee, Joan L. and Paul J. Hopper (2001): Introduction to frequency and the emergence
of linguistic structure. In: Joan L. Bybee and Paul J. Hopper (eds.), Frequency and the
Emergence of Linguistic Structure, 1–24. Amsterdam: John Benjamins.
Chomsky, Noam (1965): Aspects of the Theory of Syntax. Cambridge: MIT Press.
Chomsky, Noam (1981): Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam (2000): New Horizons in the Study of Language and Mind. Cambridge:
Cambridge University Press.
Chomsky, Noam (2009): Opening remarks. In: Massimo Piattelli-Palmarini, Juan Uriagereka
and Pello Salaburu (eds.), Of Minds and Language: A Dialogue with Noam Chomsky in the
Basque Country, 13–43. Oxford: Oxford University Press.
Culy, Christopher (1996): Null objects in English recipes. Language Variation and Change 8(1):
91–124.
Cysouw, Michael and Jan Wohlgemuth (2010): The other end of universals: theory and typology
of rara. In: Jan Wohlgemuth and Michael Cysouw (eds.), Rethinking Universals, 1–10.
Berlin/New York: Mouton de Gruyter.
Den Dikken, Marcel, Judy B. Bernstein, Christina Tortora and Raffaella Zanuttini (2007): Data
and grammar: Means and individuals. Theoretical Linguistics 33(3): 335–352.
Eckert, Penelope (2000): Linguistic Variation as Social Practice. Malden, MA/Oxford: Blackwell.
Fanselow, Gisbert (2007): Carrots perfect as vegetables, but please not as a main dish.
Theoretical Linguistics 33(3): 353–367.
Featherston, Sam (2007a): Data in generative grammar: the stick and the carrot. Theoretical
Linguistics 33(3): 269–318.
Featherston, Sam (2007b): Reply. Theoretical Linguistics 33(3): 401–413.
Fukui, Naoki (1993): Parameters and optionality. Linguistic Inquiry 24: 399–420.
García García, Marco (2014): Differentielle Objektmarkierung bei unbelebten Objekten im
Spanischen. Berlin/Boston: de Gruyter.
Geeraerts, Dirk, Gitte Kristiansen and Yves Peirsman (eds.) (2010): Advances in cognitive
sociolinguistics. Berlin/New York: Mouton de Gruyter.
Grewendorf, Günther (2007): Empirical evidence and theoretical reasoning in generative
grammar. Theoretical Linguistics 33(3): 369–380.
Guy, Gregory R. (1991): Explanation in variable phonology: an exponential model of morpho-
logical constraints. Language Variation and Change 3(1): 1–22.
Guy, Gregory R. (2005): Grammar and usage: A variationist response (Letters to Language).
Language 81(3): 561–563.
Guy, Gregory R. (2007): Grammar and usage: The discussion continues (Letters to Language).
Language 83(1): 2–4.
Haider, Hubert (2007): As a matter of facts – comments on Featherston’s sticks and carrots.
Theoretical Linguistics 33(3): 381–394.
Haider, Hubert and Inger Rosengren (2003): Scrambling: nontriggered chain formation in OV
languages. Journal of Germanic Linguistics 15(3): 203–267.
Henry, Alison (2002): Variation and syntactic theory. In: Jack K. Chambers, Peter Trudgill and
Natalie Schilling-Estes (eds.), The Handbook of Language Variation and Change, 267–282.
Oxford: Blackwell.
Holmberg, Anders, Aarti Nayudu and Michelle Sheehan (2009): Three partial null-subject
languages: a comparison of Brazilian Portuguese, Finnish and Marathi. Studia Linguistica
(Special Issue: Partial Pro-drop) 63(1): 59–97.
Hopper, Paul (1998): Emergent grammar. In: Michael Tomasello (ed.), The New Psychology of
Language: Cognitive and Functional Approaches to Language Structure, 155–175. Mahwah:
Lawrence Erlbaum.
Kato, Mary A. (2011): Acquisition in the context of language change: the case of Brazilian
Portuguese null subjects. In: Esther Rinke and Tanja Kupisch (eds.), The Development of
Grammar: Language Acquisition and Diachronic Change – In Honour of Jürgen M. Meisel,
309–330. Amsterdam/New York: John Benjamins.
Kaufmann, Göz (2005): Der eigensinnige Informant: Ärgernis bei der Datenerhebung oder
Chance zum analytischen Mehrwert? In: Friedrich Lenz and Stefan Schierholz (eds.),
Corpuslinguistik in Lexik und Grammatik, 61–95. Tübingen: Stauffenburg.
Kaufmann, Göz (2007): The verb cluster in Mennonite Low German: A new approach to an old
topic. Linguistische Berichte 210: 147–207.
Kempen, Gerard and Karin Harbusch (2005): The relationship between grammaticality ratings
and corpus frequencies: a case study into word order variability in the midfield of German
clauses. In: Stephan Kepser and Marga Reis (eds.), Linguistic Evidence: Empirical,
Theoretical and Computational Perspectives, 329–349. Berlin/New York: Mouton de
Gruyter.
Kristiansen, Gitte and René Dirven (eds.) (2008): Cognitive Sociolinguistics. Language
Variation, Cultural Models, Social Systems. Berlin/New York: Mouton de Gruyter.
Kroch, Anthony (1989): Reflexes of grammar in patterns of language change. Language
Variation and Change 1: 199–244.
Labov, William (1969): Contraction, deletion, and inherent variability of the English copula.
Language 45(4): 716–762.
Labov, William (1972): Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.
Labov, William (1975): What is a Linguistic Fact? Lisse: The Peter de Ridder Press.
Langacker, Ronald W. (1987): Foundations of Cognitive Grammar, Vol. 1: Theoretical prereq-
uisites. Palo Alto: Stanford University Press.
Lightfoot, David (1999): The Development of Language: Acquisition, Change, and Evolution.
Oxford/Malden, MA: Blackwell.
Lightfoot, David (2006): How new languages emerge. Cambridge: Cambridge University Press.
Meisel, Jürgen M. (2011): Bilingual language acquisition and theories of diachronic change:
Bilingualism as cause and effect of grammatical change. Bilingualism Language and
Cognition 14(2): 121–145.
Newmeyer, Frederick J. (2003): Grammar is grammar and usage is usage. Language 79(4):
682–707.
Newmeyer, Frederick J. (2006): Grammar and usage: A response to Gregory R. Guy (Letters to
Language). Language 82(4): 705–708.
Newmeyer, Frederick J. (2007): Commentary on Sam Featherston, Data in generative grammar:
The stick and the carrot. Theoretical Linguistics 33(3): 395–399.
Otheguy, Ricardo, Ana Celia Zentella and David Livert (2007): Language and dialect contact in
Spanish in New York. Language 83: 770–802.
Pullum, Geoffrey K. (2007): Ungrammaticality, rarity, and corpus use. Corpus Linguistics &
Linguistic Theory 3(1): 33–47.
Roeper, Thomas (1999): Universal bilingualism. Bilingualism: Language and Cognition 2(3):
169–186.
Saito, Mamoru and Naoki Fukui (1998): Order in phrase structure and movement. Linguistic
Inquiry 29(3): 439–474.
Sankoff, David (1988): Sociolinguistics and syntactic variation. In: Frederick J. Newmeyer (ed.),
Linguistics: the Cambridge Survey. Vol IV: The Socio-cultural Context, 140–161. Cambridge:
Cambridge University Press.
Sankoff, Gillian (2005): Cross-sectional and longitudinal studies. In: Ulrich Ammon, Norbert
Dittmar, Klaus J. Mattheier and Peter Trudgill (eds.), An International Handbook of the
Science of Language and Society, Volume 2, 2, 1003–1013. Berlin/New York: Mouton de
Gruyter.
Sankoff, Gillian and Hélène Blondeau (2007): Language change across the lifespan: /r/ in
Montreal French. Language 83(3): 560–588.
Schütze, Carson T. (1996): The Empirical Base of Linguistics: Grammaticality Judgments and
Linguistic Methodology. Chicago: University of Chicago Press.
Szmrecsanyi, Benedikt (2005): Language users as creatures of habit: A corpus-based analysis
of persistence in spoken English. Corpus Linguistics and Linguistic Theory 1(1): 113–150.
Thom, René (1980): Modèles mathématiques de la morphogenèse. Paris: C. Bourgois.
Tortora, Christina and Marcel den Dikken (2010): Subject agreement variation: Support for the
configurational approach. Lingua 120(5): 1089–1108.
Weinreich, Uriel, William Labov and Marvin I. Herzog (1968): Empirical foundations for a theory
of language change. In: W. P. Lehmann and Yakov Malkiel (eds.), Directions for Historical
Linguistics: A Symposium, 95–195. Austin, TX: University of Texas Press.
Wilson, John and Alison Henry (1998): Parameter setting within a socially realistic linguistics.
Language in Society 27: 1–21.
Yang, Charles D. (2000): Internal and external forces in language change. Language Variation
and Change 12(3): 231–250.
Zwicky, Arnold M. (1999): The grammar and the user’s manual. Paper presented at ‘LSA’s
Linguistic Institute (Forum Lecture)’, University of Illinois-Champaign & Urbana.
Part 1: System, usage, and variation
Frederick J. Newmeyer, University of Washington, University of
British Columbia and Simon Fraser University Canada
Language variation and the autonomy of grammar
Abstract: This paper takes on the question of whether the facts of language vari-
ation call into question the hypothesis of the autonomy of grammar. A significant
number of sociolinguists and advocates of stochastic approaches to grammar
feel that such is the case. However, it will be argued that there is no incompati-
bility between grammatical autonomy and observed generalizations concerning
variation.
1 Introduction
The point of departure of this paper is a set of propositions which, while not
universally accepted among linguists, have at least a wide and ever-increasing
currency. They are, first, that a comprehensive theory of language has to account
for variation (Weinreich, Labov and Herzog 1968 and much subsequent work);
second, that much of everyday variability in speech is systematic, showing both
social and linguistic regularities (Labov 1969 and much subsequent work); third,
that language users are highly sensitive to frequencies, a fact that has left its mark
on the design of grammars (Hooper 1976 and much subsequent work); and fourth,
that an overreliance on introspective data is fraught with dangers (Derwing 1973
and much subsequent work). The question to be probed is whether, given these
propositions, one can reasonably hypothesize that grammar is autonomous with
respect to use. The paper is organized as follows. Section 2 introduces the concept
of the ‘autonomy of grammar’, along with some theoretical and methodological
considerations relevant to its understanding. The central section 3 examines and
attempts to refute recent claims that the facts surrounding language variation
show that autonomy is untenable. Section 4 is a brief conclusion.
1 I would like to thank Marco García García and Hubert Haider for their comments on the entire
pre-final manuscript. Thanks also to Ralph Fasold, David Odden, and Panayiotis Pappas for
fruitful discussion on the topic of this paper. They are not to be held responsible for any errors.
I begin with some fairly obvious and somewhat trivial examples of these points
and then turn to more complex cases. As far as (2a) is concerned, nobody would
suggest that speakers with a serious head cold should be endowed with a separate
grammar, even though their vowels are consistently more nasalized than those of
the healthier members of their speech community. Appeal to the partial blockage
of the passages involved in speech production suffices to explain the phenome-
non. Many different types of generalizations fall under (2b). For example, speakers
might know that adjectives like asleep, awake, and ajar are different from most
other adjectives in that they do not occur prenominally. But they can hardly know
that the reason for their aberrant behavior derives from the fact that these adjec-
tives were historically grammaticalizations of prepositional phrases (awake was
originally at wake). A child in acquiring his or her language does not learn the
history of that language. Along the same lines, children acquiring German learn
the principles involved in V2 order and those acquiring English learn to produce
the retroflex ‘r’ sound. But neither group learns that these elements of their languages
are typologically quite rare. Likewise, speakers cannot be assumed to know epi-
phenomenal facts, that is, properties of their language that are the byproduct of
other properties (which may or may not be part of knowledge). Speakers know
Not every regularity in the use of language is a matter of grammar. (Zwicky and Pullum 1987:
330, cited in Yang 2008: 219)
Finally, to exemplify (2c), I know the meaning of the definite article the, its privi-
leges of occurrence, and its pronunciation. I also happen to know that I am more
likely to use that word than any other word of English. These, however, are
different ‘kinds of knowledge’. I learned the former as an automatic consequence
of acquiring competence in English. The latter is a metalinguistic fact that arose
from conscious observation and speculation about my language.
So given these considerations, how can we know what to include in models
of grammatical competence and what to exclude from them? In particular, given
the theme of this volume, how can we know to what extent (if at all) variabil-
ity is encoded in the grammar itself? As it turns out, classical formal grammar
has nothing to say about probabilistic aspects of grammatical processes, except
to hypothesize that where we find variability we have ‘optional’ grammatical
rules. For example, in Chomsky (1957) active and passive pairs were related by
an optional transformational rule. No attempt was made to capture as part of the
rule the fact that actives are used more frequently than passives and are used in
different discourse circumstances. In fact, an approach to grammar excluding the
direct representation of probabilities might be the best one to take if it could be
shown, in line with (2a–c), that the probabilities in question are a different sort
of knowledge from grammatical knowledge or are not in any reasonable sense
‘knowledge’ at all.
So the crucial question is to what extent speakers actually ‘know’ the prob-
abilities associated with points of variation and if they do know them, then what
kind of knowledge that is. One alternative to their knowing probabilities might
be that quantitative aspects of speaker behavior are no more than a reflection of
principles that, in their interaction, lead them to act in a certain way a certain per-
centage of the time. Let me give an example of an epiphenomenal consequence
of interacting principles that is drawn from everyday life. My place of work is four
blocks north of where I live and four blocks west. I could construct a ‘grammar
of my walk to work’ to characterize my procedure for proceeding from my home
to my office. Each intersection that I cross has a traffic light. If the first light that
I come to is green, I continue straight on to the north. If the light is red, I turn
to the left (to the west). I continue this procedure at each intersection, as long as
doing so does not take me past my mark to the west or to the north. This leads
to the possibility of more than a dozen different routes for getting to work. But
in practice they are not equally frequent, because traffic lights differ from each
other in the percentage of time that they are green or red. It would not be hard,
if I wanted to do it, to calculate the percentage of time that I take each route. But
I do not ‘know’ these percentages, in any reasonable sense of the word ‘know’. I
certainly have a vague feeling that I take some routes more than others. But the
percentages are not encoded in my ‘grammar of walking to work’. They fall out as
an epiphenomenal by-product of the interaction of my grammar of walking and
the timing of traffic lights.
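The point can be made concrete with a small sketch. The code below assumes, purely for illustration, that each light is independently green for a fixed share of the time. The function `green_probability` and its values are invented and are not part of the original story. The 'grammar' contains only the green/red rule and the ban on overshooting; the route percentages are computed from outside it.

```python
BLOCKS_NORTH = 4  # blocks to cover northward (from the text)
BLOCKS_WEST = 4   # blocks to cover westward (from the text)


def green_probability(north_done: int, west_done: int) -> float:
    """Hypothetical share of the time the light at this intersection is green."""
    # Invented values for illustration only: lights further west are green more often.
    return 0.4 + 0.1 * west_done


def route_probabilities() -> dict[str, float]:
    """Map every complete route (a string of 'N'/'W' steps) to its probability."""
    routes: dict[str, float] = {}

    def walk(north: int, west: int, path: str, prob: float) -> None:
        if north == BLOCKS_NORTH and west == BLOCKS_WEST:
            routes[path] = prob  # arrived at the office
            return
        if north == BLOCKS_NORTH:
            # Going further north would overshoot: the step west is forced.
            walk(north, west + 1, path + "W", prob)
        elif west == BLOCKS_WEST:
            # Going further west would overshoot: the step north is forced.
            walk(north + 1, west, path + "N", prob)
        else:
            green = green_probability(north, west)
            walk(north + 1, west, path + "N", prob * green)        # green: straight on
            walk(north, west + 1, path + "W", prob * (1 - green))  # red: turn left

    walk(0, 0, "", 1.0)
    return routes


if __name__ == "__main__":
    routes = route_probabilities()
    print(len(routes), "possible routes")
    # Show the three most frequent routes under the assumed light timings.
    for path, prob in sorted(routes.items(), key=lambda item: -item[1])[:3]:
        print(path, round(prob, 4))
```

Under these assumptions the procedure allows 70 distinct routes, and they differ in frequency although no frequency is stated anywhere in the walking rule itself.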
I think that at this point the reader can see where I am heading, as far as
probabilistic generalizations in linguistics are concerned. To the extent that var-
iability is predictable externally, it does not need to be encoded in the grammar.
But before turning to linguistic examples, the reader might well ask about my
walking to work story: ‘Is the percentage of the time that I take any particular
route to work completely predicted by reference only to my grammar of walking
and to the timing of traffic lights?’ The answer is ‘no’, and the reason for that neg-
ative answer is quite relevant to how we should handle linguistic data that are not
fully predictable. We return to this problem below.
Turning to language, a huge number of facts that one might be tempted to put
in the grammar no more belong there than probabilities belong in my grammar of
walking to work. So consider a pair of sentences from Manning (2002):
(3) a. It is unlikely that the company will be able to meet this year’s revenue forecasts.
b. That the company will be able to meet this year’s revenue forecasts is unlikely.
Manning points out that we are far more likely to say (3a) than (3b) and suggests
that this likelihood forms part of our knowledge of grammar. No, it does not. It is
part of our use of language that, for both processing and stylistic reasons, speakers
tend to avoid sentences with heavy subjects (see Hawkins 1994; 2004). As a con-
sequence, one is more likely to say things like (3a) than (3b). It would be super-
fluous to repeat in the grammar what is adequately accounted for outside of it.
The probability of using some grammatical element might arise as much from
real-world knowledge and behavior as from parsing ease. For example, Wasow
(2002) notes that we are much more likely to use the verb walk intransitively than
transitively, as in (4a–b):
In this view, the observed probability of a particular variant results from the inter-
action of the formal grammar and extragrammatical faculties, as modulated by
the user’s manual.
What then is the nature and function of the user’s manual? This construct
presents one face to grammatical competence and one face to the external factors
that shape grammar. In a nutshell, it tells us what to do with our grammars and
how often to do it. For example, it might tell an English speaker to avoid stranding
prepositions in formal writing, to extrapose heavy subjects, and to avoid certain
vulgar expressions in polite conversation. These usage conventions are not totally
arbitrary, of course. They are shaped by stylistic level, processing pressure, and
social factors respectively. But they are not totally predictable either. Clearly
language varieties differ in the degree to which one factor predominates in a
particular situation. So there is a certain degree of arbitrariness in the user’s manual.
2 A referee poses the question of where information structure fits into this picture. While space
does not permit a comprehensive reply, in my view, generalizations pertaining to the
grammar-discourse interface are partly subsumed under ‘grammatical competence’, partly handled in the
user’s manual, and are partly shaped by external factors such as the exigencies of constructing
a coherent discourse.
Let me give a concrete example of the functioning of the user’s manual. For
several decades there has been a debate on the nature of Subjacency and other
constraints on extraction.3 Two counterposed views have been put forward:
In support of (5a), it is typically pointed out that the effects of Subjacency differ
somewhat from language to language and that there is no one-to-one correspond-
ence, in any particular language, between parsing and other pressure and what is
ruled out by the constraint (see Fodor 1984; Newmeyer 2005). In support of (5b)
it is typically pointed out that, to a very great extent, the effects of Subjacency
do follow from external pressure (Deane 1992; Kluender 1992). Furthermore, the
effects of Subjacency are variable, in that some island effects are stronger than
others. In English, for example, it is more difficult to extract from finite clauses
than from non-finite clauses (see Szabolcsi and den Dikken 1999).
In my view, both the formalists and the functionalists are partly right and
partly wrong. Subjacency is just the sort of phenomenon we would expect to find
localized in the user’s manual. The interaction between external pressure, in this
case largely parsing pressure, and the language-particular grammar, generates
Subjacency effects. But I use the term ‘generates’ advisedly. The interaction of
grammar and performance invites, so to speak, the existence of Subjacency
effects, but it does not fully predict them. The user’s manual, for reasons of his-
torical accident, the vagaries of use, and so on, necessarily accommodates a
certain degree of arbitrariness.
3 Subjacency is a constraint that rules out extractions of elements in particular syntactic con-
figurations. For example, the deviant English sentences *What did you wonder where Bill put?
and *The woman who I believe the claim that Mary talked to are Subjacency violations.
Let us now turn to some key studies of language variation that might seem to
call into question the correctness of the hypothesis of grammatical autonomy.
Throughout this section, I elaborate on the point, discussed in the previous
section, that not all generalizations about grammatical patterning are necessarily
handled the same way. Some are encoded directly in the grammar and some are
not. Take /-t, -d/ deletion in English, that is, the deletion of a coronal stop in final
clusters:
The probability of deletion is tied to the nature of the preceding and following
segment, as is partly illustrated in Table 1:
Table 1: Following segment effect on English /-t, -d/ deletion, by following context (Guy 1980: 14)
Does that mean we need to complicate the rule of deletion by including this infor-
mation? In other words, do we need a variable rule? Not necessarily, if the knowl-
edge of how often we delete is a different kind of knowledge from whether we are
allowed to delete at all. And it is a different kind of knowledge. Let’s look at that
point in more detail.
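For concreteness, here is a minimal sketch of what a variable rule adds to an optional rule. It uses the logistic formulation commonly associated with variable rule analysis (see Sankoff 1988): an input probability is combined with one weight per conditioning factor on the log-odds scale. All numbers and factor labels below are invented for illustration and are not taken from Guy's data.

```python
from math import exp, log


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return log(p / (1.0 - p))


def inverse_logit(x: float) -> float:
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + exp(-x))


def application_probability(input_probability: float, factor_weights: list[float]) -> float:
    """Probability that the rule applies, given an input value and factor weights.

    Weights above 0.5 favor application; weights below 0.5 disfavor it.
    """
    log_odds = logit(input_probability) + sum(logit(w) for w in factor_weights)
    return inverse_logit(log_odds)


# Hypothetical values, for illustration only.
HYPOTHETICAL_INPUT = 0.5
HYPOTHETICAL_FOLLOWING_SEGMENT = {"consonant": 0.7, "vowel": 0.3}

if __name__ == "__main__":
    for context, weight in HYPOTHETICAL_FOLLOWING_SEGMENT.items():
        print(context, round(application_probability(HYPOTHETICAL_INPUT, [weight]), 3))
```

On the view defended here, the grammar would contain only the optional rule; a function like `application_probability` would belong to a separate component rather than to the statement of the rule.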
Guy (1997) has constructed several arguments with the goal of demonstrating
that the variable weights need to be stated in the rule itself.
For example, he has argued that if the regularities of variability were stated
in some separate performance component, then we would need to state the same
constraint twice, once in the grammar and once in performance. So consider the
fact that final /t/ and /d/ are never pronounced after another /t/ or /d/:
(8) The more shared features between the /t/ and /d/ and what precede them, the less likely the
sequence will be realized in actual speech (Guy and Boberg 1994: n. p.).
As Guy notes, these appear to be Obligatory Contour Principle (OCP) effects, and
he remarks:
Now, if the variable data arise not because of the competence OCP constraint, but stem from
a separate performance OCP constraint, it should come as a theoretical surprise, a random
coincidence, that the two are so similar in nature and direction of effect. (Guy 1997: 134)
But there is no ‘competence OCP constraint’, in the sense of there being a principle
of universal grammar called the OCP. As Odden (1986) and others have pointed
out, what are called OCP effects differ wildly from language to language and are
not even present in some languages. What we have is universal articulatory- and
acoustic-based pressure to avoid sequences of segments that are ‘too close’ to
each other. English grammaticalizes this pressure to a certain extent more than
some languages and less than others. This pressure is responsible for the impos-
sibility of forms like (7a). But that has nothing to do with the rule of /t, d/-deletion
per se, much less a variable condition that needs to be imposed on it. In fact, we
never find geminate /t/’s and /d/’s in English. What we have then is an interface
principle of the user’s manual that looks at English phonology in one direction
and looks at external pressure in the other direction and generates the statistical
generalization.
Is this interface principle an automatic exceptionless consequence of the
interaction of English phonology and phonetic pressure? Certainly not, but the
fact that variable data cannot be derived in their entirety from universal principles
does not mean that they need to be stated ad hoc in the competence grammar. Let
me draw another analogy with the grammar of my walking to work. There are,
in fact, more factors than the timing of traffic lights that affect the probability
of my taking one route more often than another. Some are ‘global’, in that they
would affect anybody following the same strategy. For example, one intersection
might be blocked by construction and therefore likely to be avoided. Some con-
straints are what one might call ‘local’ or ‘personal’. For example, I might opt
more often for a particular route because the view appeals to me. But the fact that
the probabilities are not fully predictable does not mean that one needs to revert
to encoding them in the grammar of walking per se. One derives what one can
with the understanding that there will always be a residue of contributing factors
that are unexplained and perhaps inexplicable.
Another argument from Guy (1997) involves the fact that deletion is much
more frequent before /l/ than before /r/. Without deletion, resyllabification
would result in /tl-/ /dl-/ onsets, which, as Guy points out, are lexically impos-
sible in English. But they are not impossible universally, so, he argues, there is
no hope of deriving the facts from articulatory universals. But it is not necessary
to derive them from articulatory universals. They can be derived in part from the
fact, already noted, that English bans /tl-/ and /dl-/ onsets. In other words, the
probability-computing function of the user’s manual has access to the mental
grammar. The user’s manual is also aware that the rule of /-t, -d/ deletion, which
can prevent such onsets, is optional. Taking into account these two facts, it
instructs the speaker to delete /t/ and /d/ before the relevant /l/’s with a high
frequency rate. Of course there is a lot more to be said than that. For example, the
user’s manual also has information from the other direction. For phonetically-
based reasons, /tl-/ and /dl-/ onsets are relatively rare. That fact surely influences
the probability of /-t, -d/ deletion in this case, though I am not in a position to
specify precisely how.
When we turn to syntax, it is even easier to pinpoint the problems with purely
grammatical approaches to variability. At least in phonology, we can usually
say with confidence that two variants are just different ways of saying the same
thing. That is much less true in syntax. In an important paper, Beatriz Lavandera
(1978) pointed out that the choice of syntactic variants is determined in part by
the meaning that they convey. Viewed from that angle, assigning probabilities to
rules, structures, or constraints seems especially problematic. The probabilities
may be more a function of one’s intended meaning than of some inherent property
of the linguistic unit itself. No, it is not the case that all syntactic variants differ
in meaning. But the great majority do, if our definition of ‘meaning’ includes the
full range of discourse-pragmatic aspects of interpretation. Let’s take the various
possibilities of post-verbal orderings of elements in English as an example:
(11) Verb-Particle
a. Sandy picked the freshly baked apple pie up.
b. Sandy picked up the freshly baked apple pie.
Arnold, Wasow, Losongco and Ginstrom (2000) calculated the probabilities for
speaker choice of one ordering variant or the other and found a complex inter-
action of meaning factors, in particular whether the constituent is new to the dis-
course or not, and processing factors, such as the ‘heaviness’ (that is, the processing
complexity) of the constituent. In other words, the (a) and (b) variants in
(9–11) can be used to convey different meanings. Hence we have a good example
here of why we would not want to tie variability to particular rules or grammatical
elements. Since the heavy NP shift alternants, the dative alternants, and the verb-
particle alternants do not mean the same thing, the alternant that is chosen in
discourse is a function in part of the meaning that the speaker wishes to convey.
That fact would be obscured by a probabilistic rule relating the variants in ques-
tion. So we see how incorporating variability into a particular rule can mask an
explanation of the underlying generalization.
Another example can be drawn from variable subject-verb agreement in
Brazilian Portuguese (BP; see Guy 2005 for a summary). In that language, sub-
jects can occur both preverbally and postverbally. But interestingly, subject-verb
agreement is disfavored, but not categorically impossible, with postposed sub-
jects. Guy, following a popular, but not universally accepted, analysis, suggests
that subjects of unaccusative verbs are originally VP-internal and have to raise
across the verb to trigger the feature checking that accomplishes agreement. In
his analysis, the variability in agreement is a property of the feature checking
process (and hence purely grammar-internal). I offer as an alternative the idea
that agreement in post-verbal position is completely optional, as far as the competence
grammars of BP speakers are concerned. Why is that a better alternative?
As a first point, it needs to be stressed that preverbal and postverbal subjects
do not have the same meaning. In BP, as in all Romance languages, preverbal
and postverbal subjects differ in their discourse properties (Naro and Votre 1999).
These meaning differences would be obscured by a variable rule relating the two
subject positions. But there is more to be said than that. If postverbal subjects
do in fact originate in object position, then we have an independent explanation
for the agreement facts. Verb-object agreement is crosslinguistically significantly
less common than subject-verb agreement (Siewierska and Bakker 1996), a fact
which is rooted ultimately in the greater topicality of subjects vis-à-vis objects
(Corbett 2005). In the approach advocated here, all of the relevant generalizations
can be accommodated. The grammar of BP allows agreement with arguments in
both positions. The user’s manual interfaces discourse-based and functional
factors on the one hand and the grammar on the other hand to derive the statis-
tical generalizations.
As far as syntactic rules and meaning are concerned, there certainly are
processes that have little or no effect on meaning. So it is sometimes claimed
that there is no meaning difference between sentences in English where the sub-
ordinate clause is marked by the complementizer that and those where it is not.
(12a–b) are examples:
(13) The presence or absence of that is affected by (Bolinger 1972; Quirk, Greenbaum, Leech and
Svartvik 1985; Thompson and Mulac 1991; Biber, Johansson, Leech, Conrad and Finegan
1999; Hawkins 2001; Dor 2005; Kaltenböck 2006; Kearns 2007; Dehé and Wichmann 2010):
a. the type and frequency of the matrix verb
b. the type of the main clause subject (pronominal vs. full noun phrase)
c. the choice of matrix clause pronoun
d. the length, type, and reference of the embedded subject
e. the position and function of the embedded clause
f. the voice of the main clause (active vs. passive)
g. ambiguity avoidance
h. the linear adjacency or not of the matrix verb and that
i. the speech register
j. the ‘truth claim’ (Dor 2005) to the proposition of the embedded clause
k. the rhythmic pattern of the utterance
No doubt with sufficient ingenuity one could write a variable rule of that-deletion
sensitive to all of the conditioning factors in (13a–k). But that would be a mistake,
since each of the conditioning factors conditions other processes in English. For
example, consider (13a). The matrix verbs that inhibit that-deletion, for example,
factive verbs like regret, are the same ones that resist infinitival complements and
resist extraction from the complement:
Even when variants have the same meaning, it is clear that they can differ
stylistically. That fact poses more than a small problem for handling variation
grammar-internally. Put simply, it would lead to a different set of probabilities
for each genre, carrying the idea of handling variation grammar-internally to an
unacceptable conclusion. It is sometimes claimed that stylistic variation poses no
problems, since it is said to be quantitatively simple, involving raising or lower-
ing the selection frequency of socially sensitive variables without altering other
grammatical constraints on variant selection (Boersma and Hayes 2001; Guy
2005). In fact, Guy (2005: 562) has written that “it is commonly assumed in VR
analyses that the grammar is unchanged in stylistic variation.” The research on
register does not support such an idea. Biber has shown that there are at least six
‘dimensions’ in which genres interact:
Different genres, and the grammatical variability that they manifest, map differ-
ently onto each dimension. Along Dimension (15e), for example, we find differ-
ences within press reportage genres. Passives and other past participial construc-
tions are much more probable in spot news broadcasts than in financial reporting.
We find similar statistical differences between scientific and humanistic writing.
As far as spoken language is concerned, there are systematic differences along
Dimensions (15a), (15c), and (15f) with respect to different types of telephone con-
versations. What all of this shows, and Biber gives many more examples, is that
each speaker of English would need to be endowed with a multitude of different
variable rule-containing grammars if one were serious about handling variation
grammar-internally.
The question that has to be raised is: ‘If variable rules are so well motivated
and have been so successful, then why have people all but stopped formulating
them?’ As long as twenty years ago, Ralph Fasold was writing about ‘The quiet
demise of variable rules’ (Fasold 1991). It is true that there are a lot of people doing
probabilistic approaches to grammar these days. But by and large they have engi-
neering tasks as their ultimate goal. They are not building models of grammatical
knowledge. The models mix speech forms from different speech communities and
styles willy-nilly. As one well-known practitioner of this approach has remarked:
‘As far as I’m concerned, if I can Google it, it’s English’ (attributed, perhaps apoc-
(16) Possible occupants of the complementizer position in English relative clauses (Romaine
1982):
a. She’s the person who I saw (Wh-phrase)
b. She’s the person that I saw (that-complementizer)
c. She’s the person ___ I saw (φ)
She toyed with the idea of writing a variable rule associating the three options,
but soon realized that formal grammatical analysis does not relate the three
options by means of the same rule. Who is generally regarded as belonging to
the system of fronted wh-elements, while that is a complementizer. So a variable
rule relating the three options would not be a simple matter of adding a set of
probabilities to an existing motivated grammatical rule. Rather, it would involve
adopting a grammatical analysis accepted by few if any grammarians.
One could make the same point about the relationship between sentences
like (3a–b) above, which I repeat here as (17a–b):
(17) a. It is unlikely that the company will be able to meet this year’s revenue forecasts.
b. ?That the company will be able to meet this year’s revenue forecasts is unlikely.
Whenever a choice among two (or more) discrete alternatives can be perceived as having
been made in the course of linguistic performance, and where this choice may have been
I can hardly pretend to have mastered the entire body of sociolinguistic literature,
but I am not aware of a paper in which gender, class, identity, and so on have
been incorporated into the statement of a variable rule. In other words, advocates
of variable rules themselves have adopted, to an extent, a modular approach to
linguistic variation. So I am not suggesting anything radical to variationists – just
that they follow through and make their approach a consistently modular one.4
4 Hubert Haider has observed (personal communication) that “[f]rom a European perspective,
a sociolinguistic concept of variable rules for covering language variation appears to be
amusingly naïve. Only in a context like that of the US, without historically grown, easily identifiable,
regional dialects, could such a position be at all tenable.”
4 Conclusion
References
Anttila, Arto (1997): Deriving variation from grammar. In: Frans Hinskens, Roeland van Hout and
W. Leo Wetzels (eds.), Variation, change, and phonological theory, 35–68. Amsterdam:
John Benjamins.
Anttila, Arto (2002): Variation and phonological theory. In: Jack K. Chambers, Peter Trudgill and
Natalie Schilling-Estes (eds.), Handbook of language variation and change, 206–243.
Oxford: Blackwell.
Arnold, Jennifer E., Thomas Wasow, Anthony Losongco and Ryan Ginstrom (2000): Heaviness
vs. newness: The effects of structural complexity and discourse status on constituent
ordering. Language 76: 28–55.
Biber, Douglas (1988): Variation across speech and writing. Cambridge: Cambridge University
Press.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad and Edward Finegan (1999):
Longman grammar of spoken and written English. London: Longman.
Boersma, Paul and Bruce Hayes (2001): Empirical tests of the Gradual Learning Algorithm.
Linguistic Inquiry 32: 45–86.
Bolinger, Dwight (1972): That’s that. The Hague: Mouton.
Chomsky, Noam (1957): Syntactic structures. The Hague: Mouton.
Chomsky, Noam (1973): Conditions on transformations. In: Stephen R. Anderson and Paul
Kiparsky (eds.), A festschrift for Morris Halle, 232–286. New York: Holt Rinehart and
Winston.
Corbett, Greville G. (2005): Number of genders. In: Martin Haspelmath, Matthew S. Dryer, David
Gil and Bernard Comrie (eds.), The world atlas of language structures, 126–129. Oxford:
Oxford University Press.
Deane, Paul D. (1992): Grammar in mind and brain: Explorations in cognitive syntax. Berlin/New
York: Mouton de Gruyter.
Dehé, Nicole and Anne Wichmann (2010): Sentence-initial I think (that) and I believe (that):
Prosodic evidence for use as main clause, comment clause and discourse marker. Studies
in Language 34: 36–74.
Derwing, Bruce L. (1973): Transformational grammar as a theory of language acquisition: A
study in the empirical, conceptual, and methodological foundations of contemporary
linguistic theory. Cambridge: Cambridge University Press.
Dor, Daniel (2005): Toward a semantic account of that-deletion in English. Linguistics 43:
345–382.
Fasold, Ralph (1991): The quiet demise of variable rules. American Speech 66: 3–21.
Fodor, Janet D. (1984): Constraints on gaps: Is the parser a significant influence? In: Brian
Butterworth, Bernard Comrie and Östen Dahl (eds.), Explanations for language universals,
9–34. Berlin/New York: Mouton.
Givón, Talmy (1980): The binding hierarchy and the typology of complements. Studies in
Language 4: 333–377.
Guy, Gregory R. (1980): Variation in the group and the individual: The case of final stop
deletion. In: William Labov (ed.), Locating language in time and space, 1–36. New York:
Academic Press.
Guy, Gregory R. (1997): Competence, performance, and the generative grammar of variation.
In: Frans Hinskens, Roeland van Hout and W. Leo Wetzels (eds.), Variation, change, and
phonological theory, 125–143. Amsterdam: John Benjamins.
Guy, Gregory R. (2005): Grammar and usage: A variationist response. Language 81: 561–563.
Guy, Gregory R. and Charles Boberg (1994): The obligatory contour principle and sociolinguistic
variation. Toronto Working Papers in Linguistics: Proceedings of the Canadian Linguistics
Association 1994 Annual Meeting.
Hawkins, John A. (1994): A performance theory of order and constituency. Cambridge:
Cambridge University Press.
Hawkins, John A. (2001): Why are categories adjacent? Journal of Linguistics 37: 1–34.
Hawkins, John A. (2004): Efficiency and complexity in grammars. Oxford: Oxford University
Press.
Hooper, Joan B. (1976): Word frequency in lexical diffusion and the source of morphophonemic
change. In: William M. Christie (ed.), Current progress in historical linguistics, 95–106.
Amsterdam: North-Holland.
Kaltenböck, Gunther (2006): ‘…That is the question’: Complementizer omission in extraposed
that-clauses. English Language and Linguistics 10: 371–396.
Kearns, Katherine S. (2007): Epistemic verbs and zero complementizer. English Language and
Linguistics 11: 475–505.
Kluender, Robert (1992): Deriving island constraints from principles of predication. In:
H. Goodluck and M. Rochemont (eds.), Island constraints: Theory, acquisition, and
processing, 223–258. Dordrecht: Kluwer.
Kuno, Susumu (1973): Constraints on internal clauses and sentential subjects. Linguistic
Inquiry 4: 363–385.
Labov, William (1969): Contraction, deletion, and inherent variability of the English copula.
Language 45: 716–762.
Lavandera, Beatriz R. (1978): Where does the sociolinguistic variable stop? Language in Society
7: 171–182.
Manning, Christopher D. (2002): Probabilistic syntax. In: Rens Bod, Jennifer Hay and Stefanie
Jannedy (eds.), Probabilistic linguistics, 289–341. Cambridge, MA: MIT Press.
Naro, Anthony J. and Sebastião J. Votre (1999): Discourse motivations for linguistic regularities:
Verb/subject order in spoken Brazilian Portuguese. Probus 11: 75–100.
Newmeyer, Frederick J. (1986): Linguistic theory in America: Second edition. New York:
Academic Press.
Newmeyer, Frederick J. (1996): Benchmarks: 35 years of linguistics. The Sciences 36: 13.
Newmeyer, Frederick J. (1998): Language form and language function. Cambridge, MA: MIT
Press.
Newmeyer, Frederick J. (2002): Optimality and functionality: A critique of functionally-based
optimality-theoretic syntax. Natural Language and Linguistic Theory 20: 43–80.
Newmeyer, Frederick J. (2005): Possible and probable languages: A generative perspective on
linguistic typology. Oxford: Oxford University Press.
Odden, David (1986): On the role of the Obligatory Contour Principle in phonological theory.
Language 62: 353–383.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech and Jan Svartvik (1985): A comprehensive
grammar of the English language. Harlow: Longman.
Romaine, Suzanne (ed.) (1982): Sociolinguistic variation in speech communities. London:
Edward Arnold.
Sankoff, David (1988): Variable rules. In: Ulrich Ammon, Norbert Dittmar and Klaus J. Mattheier
(eds.), Sociolinguistics: An international handbook of the science of language and society,
984–997. Berlin/New York: Walter de Gruyter.
Saussure, Ferdinand de (1916/1966): Course in general linguistics. New York: McGraw-Hill.
[Translation of Cours de linguistique générale. Paris: Payot, 1916].
Siewierska, Anna and Dik Bakker (1996): The distribution of subject and object agreement and
word order type. Studies in Language 20: 115–161.
Szabolcsi, Anna and Marcel den Dikken (1999): Islands. Glot International 4–6: 3–8.
Thompson, Sandra A. and Anthony Mulac (1991): The discourse conditions for the use of the
complementizer that in conversational English. Journal of Pragmatics 15: 237–251.
to be a speaker except what we hear and perceive from those around us.1 This
is true of the child language learner, and it is also true of the linguist; even a
linguist’s intuition about grammaticality is a product of the system. The idea
that intuition and introspection give us a usage-free pathway to inspect the inner
workings of language directly is misguided. But given that our information comes
only from usage, the next questions for the linguist are, what do we do with that
information, and what does the mental capacity to be a speaker consist of?
These are the questions that point us to the ‘system’ part of the title of this
volume. But we must be cautious in how we use the terminological distinction
between ‘system’ and ‘usage’, lest we fall into the essentialist trap of believing
that two different labels must refer to two essentially distinct things – that the
‘system’ is something separate from all the evidence obtained from usage. I take
this pair of terms to be a contemporary update of the familiar dichotomy that has
beguiled and bedeviled linguistic theory since Saussure famously distinguished
langue from parole. In my view, the grand Swiss scholar, widely considered to be
the father of modern linguistics, did the discipline a disservice with his love of
dichotomies, particularly those opposing synchronic and diachronic linguistics,
and langue and parole. Linguistics has been mesmerized ever since by the seduc-
tive metaphorical opposition between a system and its products, subsequently
restated as competence vs. performance, I-language vs. E-language, grammar vs.
usage.
Before analyzing the substance of this dichotomy, let us consider how these
concepts have worked in terms of the sociology of the field – what have they
meant for the people and practices in linguistics? From this perspective, I believe
the dichotomy has flourished for two reasons. First, it’s a simplifying assumption:
it designates all variability, heterogeneity, idiosyncrasy, and other messy stuff as
belonging to something other than ‘real’ language, and allows us to set it aside
while we figure out the general patterns. This is a typical step in the early stages of
a scientific field: it is easier to work out generalizations and models if we can start
by ignoring some of the complexity of reality. Thus a simple model of mechani-
cal motion might start by ignoring friction, and a simple model of gravity might
start by ignoring relativity. These were in fact the ways those theories developed
in physics, so we can surely forgive our predecessors in linguistics for doing the
same thing when they postulated a categorical, homogeneous, abstract mental
grammar, ignoring diversity in society and variability in the individual. But when
1 I neglect here the still-debated issue of innateness; even if individuals possess an innate
mental faculty that aids in language acquisition, it does not aid them in acquiring any particular
language in the absence of linguistic interaction with other speakers.
50 Gregory R. Guy
field than to facilitate it. It is time to abandon this conceptualization of our prob-
lems, and move to dealing with language as it is, not as we would imagine it to be.
Figure 1: Coda /r/ production in New York City English: Stratification by socioeconomic
class and speech style (A: casual speech, B: careful speech, C: reading style, D: word
lists, D’: minimal pairs), from Labov (2006: 152)
[Y-axis: (r) index; x-axis: style A–D’; one line per class group: 0, 1, 2–3, 4–5, 6–8, 9]
cally encounter a curvilinear distribution, with a peak in the lower middle class
or upper working class, as illustrated in Figure 2 from Labov’s (1980: 261) study of
two ongoing vowel changes in Philadelphia English, the fronting of the nucleus
of (aw) (e. g., ounce, house) and the raising of the nucleus of (ey) in closed sylla-
bles (e. g., made, take).
Figure 2: Curvilinear social stratification of two vowel changes in Philadelphia English (from
Labov 1980: 261)
[Panels: (aw) F2 and (eyC) F2; y-axis: regression coefficients (Hz); significance markers: p<.10, p<.05, p<.01]
Figure 3: Gender and class distribution of an intonational change in progress in Australian
English (data from Guy et al. 1986: 37)
[Y-axis: probability; x-axis: social class (LWC, UWC, MC); series: Female, Male]
The grammar of use and the use of grammar 53
[Figure: % [w] by age group, from over 80 to 14–19; series: M, GH, OV, QC]
Figure 5: Replacement of sneaked by snuck in two dialects of Canadian English (from …)
[Y-axis: % snuck; x-axis: age group, from over 80 to 14–19; series: GH, QC]
Strikingly, the s-shaped curve of linguistic change reproduces across real time, as
seen in Figure 6, from Kroch’s (1989) study of the syntactic change in English that
introduced do as an auxiliary verb in questions and negations. The pervasiveness
of such patterns implies that during language change, speakers systematically
encounter in usage information about the direction of the change, and engage
with older forms and newer forms at the same time. Grandparents, parents, and
children are all speaking differently, and since these generational differences are
an intimate part of everyone’s linguistic experience, speakers effectively know
which way the change is heading – they can hear what is new and what is old
in the voices of their own speech community. More generally, they are regularly
exposed to systematic differences in frequency of occurrence of linguistic vari-
ables, and through the social distribution of these frequency patterns, are aware
of the social significance of quantitative information. This must have implications
for their construction and operation of their linguistic systems.
Figure 6: The rise of periphrastic do in English (Kroch 1989: 223, based on data from Ellegård
1953)
[Y-axis: % do; x-axis: 1400–1700; series: affirmative transitive adverbial & yes/no question,
negative question, affirmative intransitive adverbial & yes/no question, affirmative object
question, negative declarative]
All the above examples illustrate the point that the diversity of language in use is
not chaotic; rather it is orderly, along social dimensions. But it is also orderly in
the linguistic sense, that is, the linguistic constraints on variability systematically
reflect the organizing principles of language. Note that Kroch’s data in Figure 6
show systematic conditioning by syntactic construction that persists across three
centuries. A further example comes from my research on Brazilian Portuguese
(Guy 1981). Plural marking in popular Brazilian Portuguese is highly variable.
Unlike other dialects of Portuguese, and standard varieties of Spanish, Italian,
etc., where number agreement is obligatory across the NP, as in (1), Popular Bra-
zilian Portuguese (PBP) shows optional or variable plural marking, as in exam-
ples like (2).
first 95 5247
second 28 3947
third 21 552
fourth and fifth 11 42
The cases presented above are but a small sample of an overwhelming set of
empirical observations demonstrating that variability is a pervasive but orderly
feature of language use. On the basis of such evidence, Weinreich, Labov and
Herzog (1968) further argue that variability is ‘inherent’ in language – i. e., it per-
meates the linguistic system. The classic dichotomous approach, which distin-
guishes the linguistic system from usage, postulates that the object of study of lin-
guistics is a homogeneous monostylistic idiolect, which is internally invariant (cf.
Chomsky 1965). Alas this imaginary object does not exist in the world; indeed, no
observation of language even approximates it. Every speech community includes
diverse speakers, every individual commands different styles, every utterance
includes variable elements. Speakers clearly perceive, process, produce, compre-
hend, and manipulate variability in all aspects of language. Thus variability is, in
a word, an inherent feature of the linguistic system, and no adequate account
of the linguistic system can fail to accommodate variability. Consequently, any
grammar of a language, or theory of grammar, that fails to account for variability
is inadequate on its face – it does not even reach Chomsky’s most elementary level
of ‘observational’ adequacy. Furthermore, if variability is an intrinsic part of the
grammar, then we lose one of the motivations for the grammar/usage distinction.
Now, elsewhere in this volume Newmeyer has objected that quantitative
properties like those illustrated above are not ‘in’ the grammar, but lie outside
it, in a ‘user’s manual’, or derive from interactions between the grammar and
various grammar-external factors. Nevertheless, the evidence shows that this
restatement of the system/usage dichotomy continues to be inadequate, as well
as theoretically profligate rather than parsimonious. A cogent case in point is the
deletion of final coronal stops in English, which shows an exponential relation
among retention rates in three morphological categories.
The relevant categories reflect three derivational levels of words in English:
underived or monomorphemic forms like best, old, which have the full consonant
cluster in their dictionary entries, have the highest rates of deletion and lowest
rates of retention. Irregular ‘semiweak’ verbs like left, told are derived early in
the phonology, and have intermediate rates of retention. And finally, the highest
rates of retention are found in regular past tense forms like missed, rolled, which
are derived late in the phonology. Strikingly, the retention rates in the three cat-
egories are exponentially related. I have argued in previous work (Guy 1991, 1992)
that this shows an iterated application of a single variable rule, with a constant
‘base’ probability of applying, which operates throughout the several stages of
a derivation in a multilevel phonology. The highest retention rates are found in
past tense forms that are exposed once to the deletion process at the final level of
the phonology; forms that are exposed at two levels have the square of the basic
retention rate, and the underived forms that are exposed at three levels show the
cube of the base rate. This relationship was demonstrated in my original work,
and it has subsequently been confirmed in a large number of other studies. The
data from three such studies appear in Table 2, which shows in the first three
columns, the number of tokens analyzed, the percentage of those forms which
had a retained /t/ or /d/, and the probability of retention predicted by my model,
using the ‘best fit’ for the base rate of retention shown in the fourth column. Thus
Santa Ana’s speakers appear to have an underlying base probability of retention
of .75, predicting retention rates of 75 % in regular pasts, the square of this value
– 56.3 % – in irregular pasts, and the cube of this value – 42.2 % – in underived
forms. These numbers fit very closely to the observed percentages of retention
(74.3 %, 59.3 %, and 42.1 %). A chi-square test shows that the differences between
the predictions of the model and the observations are not significant (p = .57,
where the criterion for significance is typically taken to be p < .05).
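The arithmetic of the exponential model can be sketched in a few lines of Python. This is an illustration only, not the fitting procedure used in the studies cited: the function and dictionary names are hypothetical, and the only values taken from the text are the base retention rate of .75 and the number of times each category is exposed to the deletion rule.

```python
# Illustrative sketch of the exponential model of coronal stop retention.
# All names here are hypothetical; only the base rate (.75) and the
# exposure counts (1, 2, 3) are taken from the surrounding text.

# Number of derivational levels at which each class is exposed to deletion.
EXPOSURES = {
    "regular past": 1,      # e.g. missed, rolled
    "irregular past": 2,    # e.g. left, told
    "underived": 3,         # e.g. best, old
}


def predicted_retention(base_rate: float, exposures: int) -> float:
    """Probability that a final stop survives `exposures` independent
    applications of a rule that retains it with probability `base_rate`."""
    return base_rate ** exposures


def predictions(base_rate: float) -> dict:
    """Predicted retention probability for each morphological class."""
    return {cls: predicted_retention(base_rate, k) for cls, k in EXPOSURES.items()}


if __name__ == "__main__":
    for cls, p in predictions(0.75).items():
        print(f"{cls}: {p:.4f}")
```

With a base rate of .75 the three values are .75, .5625, and .421875, which match the predicted percentages quoted above up to rounding.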
Table 2: The exponential relationship: coronal stop retention in three data sets
Five further tests of the model by Labov and his students appear in Table 3. In
no case is the exponential model statistically rejected (i. e., the predictions of the
model are never significantly different, with p < .05, from the observed data), and
in the studies that have compared it to alternative models, it fits as well or better
than the alternatives.
This is thus a robust and systematic quantitative feature of English phonol-
ogy. It is eminently rule-governed; it is not sporadic or random; and it shows a
highly specific mathematical relationship. English speakers do not simply delete
final stops more in underived words and less when the stop represents an affix;
rather, they delete these categories in a specific ratio. It is hard to imagine any
other process than iterated application of a single probabilistic operation that
can generate these numbers. For instance, it can’t be adequately modeled by just
assigning separate probabilities to the three categories. Functional and usage-
Table 3: Five tests of the exponential relationship in coronal stop deletion, 1991–1997
[W. Labov, p. c.]
Year   Regular Past (missed, rolled)   Irregular Past (lost, told)   Underived Words (best, old)   Best fit pr   Chi-square, sig
based accounts that differentiate these classes by their functional load (e. g., avoid
deleting tense markers because of their communicative content) fail to predict
any specific quantitative relation among them. The only model that explains the
exponential relationship is one in which a single operation (stochastic deletion)
is recursively applied in the derivation of forms, with the mathematical result
that the associated probability is raised to the first, second, or third power.
Recursion and derivation in language are ordinarily understood as grammatical
operations. In this case, those operations are quantified.
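The claim that repeated application of one probabilistic rule yields this pattern can be checked by simulation. The sketch below is only an illustration under simple assumptions: each token passes through the rule one, two, or three times, the passes are independent, and a stop is retained only if it survives every pass. The function name, token count, and random seed are hypothetical choices, not part of any cited study.

```python
import random


def simulate_retention(base_rate: float, exposures: int,
                       n_tokens: int = 100_000, seed: int = 0) -> float:
    """Estimate the retention rate when a stochastic deletion rule with
    per-pass retention probability `base_rate` applies `exposures` times."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    retained = 0
    for _ in range(n_tokens):
        # The stop survives only if it is retained on every single pass.
        if all(rng.random() < base_rate for _ in range(exposures)):
            retained += 1
    return retained / n_tokens


if __name__ == "__main__":
    for k in (1, 2, 3):
        simulated = simulate_retention(0.75, k)
        print(f"{k} exposure(s): simulated {simulated:.3f}, "
              f"base rate to the power {k} = {0.75 ** k:.3f}")
```

The simulated rates approach the base rate raised to the number of exposures, which is the relation the exponential model predicts. Assigning three unrelated probabilities to the three classes would reproduce the same numbers only by stipulation.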
Strong confirmation of this model is found in the way the process interacts
with other constraints. Those that are external to the word, such as the favoring
effect of a following consonant, are not multiplied during the derivation; rather,
they apply only once in the postlexical phonology, after words are inserted into a
phrase marker. Consequently, they are constant in magnitude across the different
derivational classes. However, internal constraints such as the effect of a preced-
ing consonant are indeed present throughout a derivation; consequently they
appear magnified in underived words, which experience them repeatedly, com-
pared to regular past tense forms, which experience their effects only once. These
predictions are quantitatively confirmed, as shown in Table 4, which expresses
the contextual effects on the process as partial probabilities of deletion occur-
ring in a context (a context with a value of 1 implies categorical deletion, while
a value of 0 implies categorical retention; intermediate values above .5 indicate
deletion-favoring contexts, and those below .5 indicate deletion-inhibiting
contexts). The word-internal constraint shows a much larger range between favoring
and disfavoring contexts in the underived words than in the regular past tense
Table 4: Internal vs. external constraints on coronal stop deletion: interaction with derivational
class (Guy 1992: 233 and 235) (partial probabilities from separate analyses of morphological
classes)
What this implies is that variability and quantitative properties are found in the
system, inside the grammar. And as we saw in the previous section, systematic,
regular ‘grammatical’ properties are also found within the use of language. So the
dichotomy that opposes system and usage, assigning invariant and categorical
properties to the system/grammar, and variable and probabilistic properties to
usage, is turning into an obstacle to explanation, rather than a facilitator.
4 Towards an integrated theory: Grammar emerges from experience
What then are the elements of a more coherent vision that eschews the facile
system vs. usage dichotomy in pursuit of a model of the fundamental unity of
grammar and use? We can start where everyone starts, as a child encountering
the language in use in the community around us. Usage constitutes our entire
input. We have an intelligent mind, perhaps even endowed with specialized
neural networks that facilitate language processing. But whether or not language
is cerebrally special, we face the general problem of identifying units, colloca-
tions, and productive principles that will allow us not simply to reproduce the
specific utterances that we have heard, but to form our own novel utterances that
will be correctly interpreted by others. We must make our output well-formed, so
we have to figure out what ‘well-formedness’ consists of.
Basically, we have to find patterns. The patterns are the grammar, the
system. Where do they come from? The usage-based perspective on this issue,
associated with linguists like Bybee (2001, 2002) and Pierrehumbert (2001, 2006),
argues that system is emergent, consisting of generalizations across observed
usage. Let me present an example from my research with Daniel Erker on Spanish
pro-drop (Erker and Guy 2012).
Spanish has optional use of subject personal pronouns (SPPs). They can
occur overtly or be omitted, as in (3), where both full and omitted forms com-
municate the same meaning.
So how does a speaker know or decide where to use one or the other? Much
previous research on this topic has turned up some systematic, widely general
patterns of use that are governed by morphosyntactic and discursive structures
of Spanish. For example, SPP occurrence is regularly constrained by verbal mor-
phology, verb semantics, and discourse reference (cf., inter alia, Otheguy and
Zentella 2012). The morphological constraint contrasts different tense/mood/
aspect forms, with the result that TMA categories with more distinctive verbal
inflection (e. g., the preterit, where every person/number category has a distinc-
tive inflected form) are associated with lower probabilities of pronoun occurrence
than those with less distinctive inflection (e. g., the imperfect, where first and
third person singular forms are systematically identical). The verbal semantics
constraint favors SPP use for verbs of mental activity, while those of external
activity show less SPP occurrence. And the discourse level constraint considers
the flow of reference in a text: a subject which makes reference to a different
person than the subject of a preceding sentence (switch reference) is more likely
to be expressed by an overt pronoun. These patterns are confirmed in our data,
as shown in Table 5.
Table 5: Three constraints on Spanish SPP occurrence (from Erker and Guy 2012: 540–541)
                                              N       % overt SPP
Switch Reference
  Switch in reference from previous clause    2653    40 %
  No switch in reference                      2233    29 %
  T = 8.1, p < .001
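The table does not state which test produced T = 8.1. As a rough plausibility check, and assuming a pooled two-proportion z-test (an assumption, not something the source confirms), the statistic can be recomputed from the rounded percentages and token counts shown for switch reference. The function name is hypothetical.

```python
from math import sqrt


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """Pooled two-proportion z-statistic for the difference p1 - p2.

    Assumed test: the source reports only 'T = 8.1' without naming it.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


if __name__ == "__main__":
    # Rounded values from Table 5: 40 % of 2653 vs. 29 % of 2233.
    z = two_proportion_z(0.40, 2653, 0.29, 2233)
    print(f"z = {z:.2f}")
```

Because the percentages in the table are rounded, the result (a value of about 8) can only be expected to approximate the reported statistic.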
These effects are quite regular and systematic, recurring in many studies. They
constitute valid generalizations about Spanish syntax. But when Erker and I
looked at their distribution with respect to the lexical frequency of the verb, we
found that these patterns are primarily associated with high frequency words.
First consider TMA. Dividing the verbs into low frequency and high frequency
forms, we find that the main effect of TMA is primarily a phenomenon of the
frequent words. The differences among the TMA categories are modest (although
significant) in infrequent verbs, but dramatically greater among high frequency
words. This is graphically illustrated in Figure 7; in this and subsequent figures
the diverging lines indicate stronger constraint effects in the high frequency
words, and significant differences between contexts are indicated by dark stars.
Similar results emerge for the other constraints on Spanish pro-drop. In the
case of semantic content, the picture is even clearer: no significant differences
among semantic categories in the infrequent forms, but among the frequent
forms, the contrast is forcefully evident (Figure 8).
And for switch reference (Figure 9), clause sequences involving a switch in
reference favor more pronoun expression than those with no switch for both low
and high frequency verbs, but again the effect is significantly greater for high
frequency words.
The robust generalization that emerges from these results is that the various
constraints on Spanish pro-drop that have been well documented in many pre-
Figure 7: TMA and frequency in Spanish pro-drop (from Erker and Guy 2012: 544)
[Y-axis: % pronouns present; series: imperfect, present, and preterit indicative.
Infrequent forms: significant difference, F=9.9, p<.001 (n = pres 1861, pret 823, imp 524).
Frequent forms: significant difference, F=5.3, p<.001 (n = pres 834, pret 54, imp 184).]
Figure 8: Semantic content and frequency in Spanish pro-drop (Erker and Guy 2012: 543)
[Y-axis: % pronouns present; series: mental activity, stative, external activity.
Infrequent forms: no difference, F=.94, p=.38 (n = 494, 924, 2393).
Frequent forms: significant difference, F=51.2, p<.001 (n = 346, 514, 208).]
[Figure: switch reference and frequency; y-axis: % pronouns present; series: switch in referent, not a switch.
Infrequent forms: significant difference, F=38.8, p<.001 (n = not 1771, switch 2037).
Frequent forms: significant difference, F=27.3, p<.001 (n = not 462, switch 608).]
vious studies are regularly much stronger for high frequency verbs. The standard
interpretation of these constraints has been to suppose that a speaker’s gram-
matical representation of verbal properties and discourse structure governs their
probabilistic choice between using a null or overt pronoun. But these results indi-
cate that the grammatical properties such as tense-mood form or semantic cat-
egory are actually activated among the words that speakers most often encounter
Figure 10: Frequency and semantic content effects on pro-drop in Dominican and Mexican
Spanish (Erker and Guy 2012: 547)
Erker and Guy argue that these results show that the grammatical analysis, in
terms of TMA classes, semantic classes, and even the discourse patterning, is
emergent, rather than primary. Speakers presumably need some minimal level of
experience with words and structures to begin to discern patterns and formulate
hypotheses. Consequently, they display robust patterns in high frequency forms,
but weak to nonexistent patterns in the low frequency forms. In fact, the low
frequency forms basically default to the dialect-specific overall rate of pronoun
occurrence. But for verbs that are frequently encountered in usage, distinct gen-
eralizations begin to emerge.
The evidence just presented suggests that speakers infer grammatical properties
and ‘rules’ from experience, as shown by the fact that they do a ‘better’ – or at
least more robust – job of inferring them about words that they hear and use more
often. Grammar is thus emergent from and derivative of experience, rather than
a priori or primary. But we should not leap from such evidence to the conclusion
that there is no mental grammar at all, that speakers simply replicate the quanti-
tative data that they encounter in their linguistic input, without constructing any
mental apparatus of abstract representations, patterns, and operations. On this
point I think I part company with positions taken by Bybee (2001) and some other
usage-based theorists: the evidence suggests that speakers do in fact construct
mental grammars – abstract analyses, categories, and operations – to enable and
govern their own productions.
An illustration of this point comes from my work with Sally Boyd (Guy and
Boyd 1990) on the acquisition of the morphological constraint on English coronal
stop deletion that was discussed above. There is a distinctive developmental
pattern in the treatment of the irregular, semiweak past tense forms with respect
to stop deletion: deletion rates in this category decline with age, as shown in
Figure 11.
Figure 11: Age grading of English final stop deletion in irregular semiweak past tense
verbs (from Guy and Boyd 1990: 8)
[Y-axis: factor weight; x-axis: speaker’s age, 0–80]
Guy and Boyd interpret this pattern in terms of the mental representation of these
categories. The youngest children (aged 4–5 in this study) evidently have just
two form classes for English past tenses: strong and weak. The semiweak forms,
having salient root vowel changes, are perforce assigned to the strong class, with
mental representations like tell~toll, keep~kep, so they have essentially categor-
ical absence of the final stop. In our study this appears as very high rates of t,d
deletion for the youngest children. In the next developmental stage, speakers set
up distinct representations for these words that incorporate the final coronal stop,
but treat these as suppletive allomorphs, not derived forms. At this stage, reached
in adolescence for our speakers, the final stops of these words are deleted at approximately
the same rate as in underived words. The lowest rate of deletion is not reached for most
people until adult life, and represents the development of a derived representa-
tion in which the final -t,d is identified as an affix, and attached at the first level
of the lexicon. This then generates the lowered deletion rate in these forms.
Consider the cognitive task confronted by the language learner who seeks to
discern an optimal representation for this fragment of English morphophonol-
ogy. They encounter massive evidence that English distinguishes present and past
tense verb forms, and that it has two form classes – strong verbs with root vowel
changes, and weak verbs with the coronal stop affix. But there are only about 14
lexical items that have both of these alternations, like tell-told, leave-left, keep-
kept, etc. Hence in order to set up a distinct representation for such forms, the child
must first pick the relevant words out of the crowd, recognize that they have vari-
ably occurring final stops, and then further recognize that the rate of stop deletion
is subtly different in the holistic and derived representations. This takes appre-
ciable amounts of time and data, so in childhood English speakers are unable
to replicate the adult treatment of the semiweak verbs. Roberts’ (1993) study of
Philadelphia children demonstrates this with large Ns, as shown in Figure 13.
[Figure: probability of deletion; series: Children (N=1841), Parents (N=604)]
6 Conclusion
The evidence drawn from the study of actual human language is incompatible
with an idealized model that seeks to characterize linguistic systems in isolation
from use; it is also incompatible with a model that seeks to characterize usage in
isolation from any kind of abstract system. Hence I conclude that puristic models
at both extremes of the theoretical spectrum on these issues are destined to fail:
a puristic generative model, which keeps a strict separation between grammar
and usage, fails to give an adequate account of the interpenetration of structure,
variability, and probabilistic properties in both grammar and usage, while a puris-
tic usage-based model, which denies abstraction, fails to account for grammat-
ically governed divergences between experience and production, like the results
in Figures 12 and 13. An adequate model requires an integrated approach: usage
supplies language acquirers with all of their data, including a vastly enriched
fountain of information about social diversity, directions of change, and the
orderly linguistic structure of inherent variability. From this input, they construct
the set of inferences, representations, and operations that we call the grammar.
Crucially, the grammar incorporates and encompasses variability and quantifica-
tion, enabling speakers to do the fine quantitative tuning of their productions that
is so fundamental to situating one’s speech appropriately in the social universe,
and conveying appropriate messages about interactive stance, speech style, and
References
Bayley, Robert (1994): Consonant cluster reduction in Tejano English. Language Variation and
Change 6: 303–326.
Bybee, Joan (2001): Phonology and language use. Cambridge: Cambridge University Press.
Bybee, Joan (2002): Word frequency and context of use in the lexical diffusion of phonetically
conditioned sound change. Language Variation and Change 14: 261–290.
Chambers, Jack K. (2002): Patterns of variation including change. In: Jack K. Chambers, Peter
Trudgill and Natalie Schilling-Estes (eds.), The handbook of language variation and
change, 349–372. Malden, MA: Blackwell.
Chambers, Jack K. and Troy Heisler (1999): Dialect topography of Quebec City English. Canadian
Journal of Linguistics 44: 23–48.
Chomsky, Noam (1965): Aspects of the Theory of Syntax. Cambridge: MIT Press.
Ellegård, Alvar (1953): The Auxiliary Do: The Establishment and Regulation of its Use in English.
Gothenburg: Acta Universitatis Gothoburgensis.
Erker, Daniel and Gregory R. Guy (2012): The role of lexical frequency in syntactic variability:
Variable subject personal pronoun expression in Spanish. Language 88: 526–557.
Fromkin, Victoria (1974): The linguistic development of Genie. Language 50: 528–554.
Guy, Gregory R. (1981): Linguistic Variation in Brazilian Portuguese: Aspects of phonology,
syntax and language history. Ph.D. dissertation, University of Pennsylvania. Ann Arbor:
University Microfilms.
Guy, Gregory R. (1991): Explanation in variable phonology: An exponential model of morpho-
logical constraints. Language Variation and Change 3: 1–22.
Guy, Gregory R. (1992): Contextual conditioning in variable lexical phonology. Language
Variation and Change 3: 223–239.
Guy, Gregory R. and Sally Boyd (1990): The development of a morphological class. Language
Variation and Change 2–1: 1–18.
Guy, Gregory R., Barbara Horvath, Julia Vonwiller, Elaine Daisley and Inge Rogers (1986): An
intonational change in progress in Australian English. Language in Society 15: 23–52.
Kroch, Anthony (1989): Reflexes of grammar in patterns of language change. Language
Variation and Change 1–3: 199–244.
Labov, William (1980): The social origins of sound change. In: William Labov (ed.), Locating
language in time and space, 251–265. New York: Academic Press.
Labov, William (2006): The social stratification of English in New York City (2nd edition).
Cambridge: Cambridge University Press.
Mufwene, Salikoko (2011): Language evolution: an ecological perspective. Perspectives 4.
Réseau français des instituts d’études avancées (www.rfiea.fr).
Otheguy, Ricardo and Ana Celia Zentella (2012): Spanish in New York. New York: Oxford
University Press.
Pierrehumbert, Janet (2001): Exemplar dynamics: Word frequency, lenition, and contrast. In:
Joan Bybee and Paul Hopper (eds.), Frequency effects and the emergence of linguistic
structure, 137–157. Amsterdam: John Benjamins.
Pierrehumbert, Janet (2006): The next toolkit. Journal of Phonetics 34: 516–530.
Roberts, Julie (1993): The acquisition of variable rules: t,d deletion and -ing production in
pre-school children. Ph.D. dissertation, University of Pennsylvania.
Santa Ana, Otto (1992): Chicano English evidence for the exponential hypothesis: A variable
rule pervades lexical phonology. Language Variation and Change 4: 275–289.
Schütze, Carson (1996): The empirical base of linguistics: Grammaticality judgments and
linguistic methodology. Chicago: University of Chicago Press.
Weinreich, Uriel, William Labov and Marvin Herzog (1968): Empirical foundations for a theory
of language change. In: Winfred Philipp Lehmann and Yakov Malkiel (eds.), Directions for
historical linguistics, 95–195. Austin: University of Texas Press.
Richard Cameron, University of Illinois at Chicago
Looking for structure-dependence,
category-sensitive processes, and
long-distance dependencies in usage1
1 Introduction
1 I send special thanks to Luis López and Kay González-Vilbazo for their help in articulating
ideas on structure-dependence and long-distance dependencies. Also a very warm thank you to
Aria Adli, Marco García García and Göz Kaufmann for that most surprising of invitations.
However, I would also call this ancient art of talking the engine for language
change. Because language change emerges from usage within communities, it
follows that usage contributes to the shaping of grammar. Precisely how and how
much, when, where, why, and by whom, I take as empirical questions in line with
the agenda of Weinreich, Labov, and Herzog (1968).
Now, an issue entailed by the very title of Newmeyer’s work, as well as by my
exposition here on grammar and usage, is what I would call a binary approach to
the organization of argument. If we claim that “Grammar is grammar and usage
is usage,” we have set up an either/or binary relationship between the two with
two attendant implications. First, the boundary between grammar and usage is
actually and always recognizable. Second, any subsequent response, be it debate
or research, will also assume binarism as a ground rule. As such, binarism serves
as a framing ideology, be it scientifically motivated or not. Consider a few of
Eagleton’s (1991: 1–2) multiple definitions of ideology: “a body of ideas characteristic
of a particular social group or class” or “the medium in which conscious social
actors make sense of their world” or “action-oriented sets of beliefs.” In other
words, we might say that binarism serves as a belief that enables some linguists
to make sense of language as they pursue the actions of research. One might
also draw on some of Eagleton’s pejorative definitions of ideology such as “false
ideas which help to legitimate a dominant political power” or “socially necessary
illusion.” I prefer the nonjudgmental first set of definitions as more useful. My
preference is also the stuff of ideology.
Of course, binarism has long organized research and argument in linguistics
at least since Saussure’s distinction of langue and parole. Other contemporary
examples are core vs. periphery or I-language vs. E-language or competence vs.
performance. In all of these, a binary method of organization sets the research
agenda from the beginning.
A more recent example is Hauser, Chomsky and Fitch (2002) on the nature
of the language faculty as being narrow (FLN) vs. broad (FLB) with the key claim
that the FLN is exclusively characterized by recursion. In response, Jackendoff
(2011) asked if recursion could be found in other domains of cognition such as
vision. If so, because recursion would not be unique to language per se, the key
proposal about the nature of the FLN is falsified, though the possible existence of
a FLN and FLB is not.
I will pursue a strategy analogous to that of Jackendoff, but considerably
less ambitious. My point of departure emerges from Newmeyer’s (2003: 687) cri-
tique of early connectionist models of grammar where he asserted that they are
“hopeless at capturing even the most basic aspects of grammar, such as long-
distance dependencies, category-sensitive processes, structure-dependence, and
so on.” This provides the basis for a prediction. If “there is a world of difference
between what a grammar is and what we do,” these basic aspects of grammar
should not have parallels in usage. However, if patterns of usage provide struc-
tural parallels, this will give us cause to rethink the “world of difference” claim as
observationally inadequate, at least with respect to Long-Distance Dependencies,
Category-Sensitive Processes, and Structure-Dependence.
As Jackendoff (2011) focused on the issue of recursion because Hauser,
Chomsky and Fitch (2002) highlighted recursion in their claim about the nature
of FLN relative to FLB, so will I focus on these three aspects because Newmeyer
identified them as distinctive of grammar and grammar as different from usage. I
explore the prediction first with (1) Structure Dependence, (2) move to Category-
Sensitive Processes, and then (3) Long-Distance Dependencies. When discussing
Structure Dependence, I explore Adjacency Pairs and the concept of “signifi-
cant silence” from Conversation Analysis. For discussion of Category-Sensitive
Processes and Long-Distance Dependencies, I draw on Variationist treatments
of internal constraints on English (ing) and Spanish null/pronominal subject
alternations.
In my pursuit of these basic aspects of grammar in usage, I operate on the
basis of an assumption or two. First, I assume a simple model of grammar. Within
most conceptions of generative grammar that I am aware of, binarism falls away.
Linguistic grammars are typically comprised of at least three interfaced compo-
nents: phonology, syntax, and semantics, with debate remaining on the status
of phonetics, pragmatics and perhaps morphology. As far as I can tell, it is also
the case that the structural properties and outputs of these different modules of
the grammar are understood to be similar yet different. The similarities would
involve some sort of hierarchical organization and the basic idea that a grammar
consists of recognizably discrete units and methods for combining them into
larger units. Second, if phonology and semantics may differ from and yet share
certain features with syntax and thereby be partners in grammar, then if usage
differs from yet shares certain features of grammar, these shared features will
give us cause to rethink the binary distinction between them as problematic or,
as I noted before, as observationally inadequate. Key to this is the discovery of
structures in usage that parallel structures in grammar.
that some of their findings do display hierarchical organization. Yet, yes in the
following fashion. Recall the simple model of a grammar as consisting of discrete
units and methods for combining them. In Conversation Analysis, those units are
called “turn constructional units” or speaking turns. These units are combined
by what Sacks, Schegloff and Jefferson (1974: 696) identify as a “prominent type
of social organization” called turn-taking. Central to the workings of turn taking
and to all sequential organization in conversation is the Adjacency Pair.
An Adjacency Pair consists of two utterances which recognizably stand in an
action sequence such that the first pair part, spoken by one person, sets up and
requires the second pair part spoken by another person engaged in interaction
with the first (Schegloff 2007: 12–27). In other words, the first pair part projects the
necessity of the second pair part. As far as I know, both the Adjacency Pair and
Turn-Taking are universals of human spoken interaction.
Does the Adjacency Pair format constrain and enable both what we produce
and how we process, parse and interpret what is said as well? Consider the
concept of “significant silence.” Within the format of the Adjacency Pair, the rel-
evance of the second pair part as set up by the first pair part is referred to as “con-
ditional relevance” (Levinson 1983: 293; Schegloff 2007: 20). A case of “significant
silence” occurs when speech in the second pair part is relevantly absent. Con-
sider example (1) as cited by Levinson (1983: 320). T1, T2, etc. refer to turns. In line
T2, the (2.0) refers to seconds of silence.
(1) T1 C: So I was wondering would you be in your office on Monday (.) by any chance?
T2 (2.0)
T3 C: Probably not.
T4 R: Hmm yes =
T5 C: =You would?
T6 R: Ya
T7 C: So if we came by could you give us ten minutes of your time?
In T2, the two-second silence occurs sequentially after C’s request in T1. Sequentially,
the silence is in the slot where the second pair part is expected and which
would have been produced by R in keeping with the format of the Adjacency Pair.
Moreover, using the format, we are able to assign ownership of the silence to one
of the two participants. It is the silence of R because it is R’s turn to speak. The
absence of R’s response is, then, interpreted by C as a negative response to the
request of T1 as evidenced by C’s utterance in T3. Hence, the silence is signifi-
cant. The Adjacency Pair format, and associated expectations, in turn constrains
how this silence is to be interpreted. In other words, interpretation of silence is
structure-dependent.
Notice that centrally embedded within the Adjacency Pair parts of Q1 and A1 are
Q2 and A2, another Adjacency Pair, which go towards establishing conditions
necessary for providing A1. Thus, completion of A1 depends on completion of Q2
and A2. Schegloff (2007: 111–114) provides a much longer example with multiple
embeddings.
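The embedding relation just described can be made concrete with a small sketch. The Python fragment below is only an illustration and is not drawn from the Conversation Analytic literature: it treats each first pair part as opening a slot that its second pair part must close, and it uses a stack to check that an inserted pair is completed before the pair that contains it. The turn labels (Q1, A1, and so on) and the function name are hypothetical conventions adopted for this example.

```python
def is_well_nested(turns):
    """Return True if every first pair part (Q<n>) is closed by its
    second pair part (A<n>) in last-opened, first-closed order."""
    open_pairs = []  # stack of indices of pending first pair parts
    for turn in turns:
        kind, index = turn[0], turn[1:]
        if kind == "Q":
            open_pairs.append(index)
        elif kind == "A":
            # An answer is only relevant to the most recently opened question.
            if not open_pairs or open_pairs.pop() != index:
                return False
        else:
            raise ValueError(f"unknown turn label: {turn}")
    # No first pair part may remain unanswered at the end.
    return not open_pairs


embedded = ["Q1", "Q2", "A2", "A1"]  # Q2/A2 inserted inside Q1/A1
crossed = ["Q1", "Q2", "A1", "A2"]   # A1 offered before Q2 is resolved
print(is_well_nested(embedded))
print(is_well_nested(crossed))
```

On this toy coding, the center-embedded sequence is accepted and the crossed one is rejected, which is one way of stating that the completion of A1 depends on the completion of Q2 and A2.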
Given that Adjacency Pairs, and more broadly, Turn Taking are putative uni-
versals which can be shown to constrain and enable both what we produce and
how we process, parse and interpret what is said, and given that Adjacency Pairs
are subject to center-embedding, I conclude that this type of usage displays the
basic grammar feature of Structure-Dependence.
Kroch 1989) has also been investigated both synchronically and diachronically.
Indeed, in the work of Torres Cacoullos and Schwenter (2008) the authors take
on a meaningful variable, variable se-marking on Spanish verbs, and use quan-
titative methods to sort out what meanings may be associated with the variable’s
variants. Second, we identify those contexts in which variation does not occur
so as to get at contexts where variation can occur. At the same time, we generate
initial hunches or hypotheses about what types of constraints or correlations
may provide the bases for statistical patterning. Here is where category-sensitive
processes may come into play as possible constraints. Third, we analyze the data
which may push us to go back through the first two steps. Fourth, we generate
results. Fifth, we interpret the results either on the basis of a theoretically moti-
vated prediction or via ex-post facto interpretation.
What are category-sensitive processes? I will assume that these are linguistic
processes which are sensitive or responsive to certain types of categories and not
others. Examples that come to mind are case assignment and subcategorization.
Also consider Napoli’s (1993: 50–53) textbook discussion of the suffix [-er] in
English. The nominalizing agentive suffix of [-er] in English selects as input a
verb in order to generate a noun as in ‘organize’ to ‘organizer.’ Thus, this partic-
ular suffix is sensitive or responsive to a specific category as input. Without that
input, the nominalization fails. Consider the unhappy sequence of ‘organization’
to ‘organizationer.’ Likewise, the homophonous comparative suffix of [-er] must
select either an adjective or adverb which are either monosyllabic or disyllabic
with a light syllable. Thus, ‘fast’ to ‘faster’ and ‘pretty’ to ‘prettier’ but not ‘beau-
tiful’ to ‘beautifuller.’ Note that we are discussing lexical categories like nouns
or verbs or adjectives or adverbs. Indeed, in Carnie’s (2007: 483) more recent
textbook, if we look for ‘category’ in the index, we find the entry of “see parts
of speech.” In these examples, it seems clear that a type of co-occurrence must
occur between type of suffix and type of input for the process to proceed. Can
sociolinguistic variation be analyzed in a similar fashion?
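The co-occurrence requirement between suffix and input category can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not an analysis: the lexicon, its category labels, and the function name are hypothetical, and the spelling rule is deliberately minimal.

```python
# Hypothetical toy lexicon mapping words to lexical categories.
TOY_LEXICON = {
    "organize": "V",
    "teach": "V",
    "organization": "N",
    "fast": "A",
}


def agentive_er(word):
    """Attach agentive -er only if the input is a verb; otherwise fail."""
    if TOY_LEXICON.get(word) != "V":
        return None  # the process does not apply to this category
    # Minimal spelling adjustment: avoid doubling a final 'e'.
    return word + "r" if word.endswith("e") else word + "er"


print(agentive_er("organize"))
print(agentive_er("organization"))
```

The point of the sketch is only that the process inspects the category of its input before it applies; a verb yields a noun, while a noun input yields nothing.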
When studying variation, we seek to establish statistical correlations between
variants of the variable and something else. A statistical correlation, or in variationist
terms a constraint, is a type of co-occurrence. Constraints are factors, external
to the set of variants of a variable, which influence selections within the set. The
factors that we discover may be of three types: linguistic, stylistic, or social such
as class, gender, or ethnicity. To say that factors influence selections within the
set of variants is equivalent to saying that the variants of a variable are more or
less sensitive or responsive to certain types of factors and not others. Or, more
accurately, as speakers use variants they are sensitive or responsive to certain
types of factors and not others.
1) Adjective:
Abby: Sometimes my sister’s annoying.
2) Gerund: Where the nominal form may be clearly associated with a verbal
form.
Beyonce: And we bring it to every book club meeting, every month.
3) Gerund-Participle:
Kenny: Yea, Koby got accused of...cheating on his wife.
4) Progressives/Quasi-Progressives:
Kevin: I’m savin’ my money.
Delia: And I started laughing very very hard.
6) Noun: Where the nominal form is not clearly associated with a verbal form
(from the child’s perspective) or where the form is monomorphemic, as in
“morning” or “evening”.
Beyonce: I only eat the um cookie part. I don’t eat the frosting.
8) Prepositions:
Elizabeth: It was during the summer so we went on vacation a lot.
Turning to Table 1, I provide an overview of the rates of the two variants, the velar
[ŋ] and the alveolar [n] in the children’s speech drawn from interviews I did with
them in pairs over pizza and cups of juice at lunch. I rank the constraints from
highest to lowest rates of the velar variant.
Notice that the categories which form the basis for statistical correlations are
lexical categories such as nouns, adjectives, and verbs. Moreover, they can be
ranked in terms of most to least favoring of the velar variant. Thus, we may say
that the statistical expression of the velar variant is Category-Sensitive. Velars like
nouns and dislike verbs and alveolars like verbs and dislike nouns. I conclude,
thereby, that Category-Sensitivity, another basic feature of grammar, is also a
structural characteristic of usage.
Table 1: Overall for (ing) by Grammatical Category: Arranged most to least for [ŋ]

Category                  [ŋ]    [n]   Total
Preposition          N      5      0       5
                     %    100      0     100
Noun                 N     28      3      31
                     %     90     10     100
Adjective            N     84     13      97
                     %     87     13     100
Gerund               N     68     12      80
                     %     85     15     100
Gerund Participle    N     57     23      80
                     %     71     29     100
Or something         N     43     23      66
                     %     65     35     100
Nothing/Something    N     28     18      46
                     %     61     39     100
Progressives         N    337    276     613
                     %     55     45     100
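The percentages in Table 1 follow directly from the token counts, and the ranking can be recomputed from them. The Python sketch below does this using the counts as reported in the table; the variable and function names are my own, and the assignment of the first count in each row to the velar variant rests on the table's stated ordering by rate of [ŋ].

```python
# Token counts from Table 1: (velar [ŋ] tokens, alveolar [n] tokens).
COUNTS = {
    "Preposition": (5, 0),
    "Noun": (28, 3),
    "Adjective": (84, 13),
    "Gerund": (68, 12),
    "Gerund Participle": (57, 23),
    "Or something": (43, 23),
    "Nothing/Something": (28, 18),
    "Progressives": (337, 276),
}


def velar_rate(velar, alveolar):
    """Percentage of tokens realized with the velar variant."""
    return 100 * velar / (velar + alveolar)


rates = {cat: velar_rate(v, a) for cat, (v, a) in COUNTS.items()}

# Rank categories from most to least favoring of the velar variant.
ranking = sorted(rates, key=rates.get, reverse=True)
for cat in ranking:
    print(f"{cat:<18} {rates[cat]:5.1f}%")
```

Sorting by rate reproduces the order in which the categories are listed in the table, with the nominal categories at the top and the progressives at the bottom.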
Though such movements can be very long, they are subject to various kinds
of island constraints. In current terms, they may also be motivated by the need to
check features. Checking features and, in particular, island constraints, seem to
me to be quite grammar internal. As such, here I think Newmeyer is right to claim
a clear difference between what a grammar is and what we do. I cannot think of
any clear analogue or parallel to island constraints in usage, though the potential
relationship of processing and island constraints does come to mind. Yet, even
here, it is unclear that island constraints can be reduced to processing limitations
(Sprouse, Wagers and Phillips 2012).
Nonetheless, notice that island constraints apply to wh-movement or topi-
calizations within the frame of a given sentence. Long-distance agreement may
differ in at least one key fashion. Whereas the long-distance dependencies of wh-
questions or topicalizations occur, ultimately, within the boundaries of a given
sentence, long-distance agreement may occur across multiple sentence bound-
aries, not simply clauses within a sentence. Consider this invented example in
Spanish from my colleague Kay González-Vilbazo.
By +Pro I mean that a subject pronoun was expressed, not the null subject option.
Given that this particular finding has emerged in numerous studies, we may ask
why it occurs. By way of an answer, I will detour into a very brief discussion of
what I have termed Switch Reference following on the earlier work of Silva-Cor-
valán (1982). Switch Reference is a central constraint on the variable alternation
of pronominal and null subject expression. In turn, I will identify shortcomings
of this constraint and show that by pursuing the shortcomings, we end up with a
basis for answering why the pattern occurs. Note that the finding and the answer
both come from an analysis of usage.
Switch Reference refers to two related reference relations that may hold
between two subject noun phrases. When these two noun phrases have different
referents, they are ‘switch’ in reference. When these two noun phrases share the
same referent, they are ‘same’ in reference.
Because my concern is with subject pronouns, I limit my focus to the subjects
of tensed verbs. Therefore, the relationship of switch or same reference holds
between two noun phrases in any stretch of discourse wherein the second noun
phrase is the human subject of the first tensed verb to occur after the preceding
subject noun phrase of a tensed verb. To clarify this, we may visualize the rela-
tionship in the following manner:
I call NP(1) the Trigger and NP(2) the Target. The Target is the subject NP which we
may identify as either switch or same in reference relative to the preceding Trigger
NP. When quantifying variation of pronominal and null subjects, the focus is on
the Target.
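The coding procedure can be sketched as follows. This Python fragment is a simplified illustration, assuming that each subject of a tensed verb has already been reduced to a label for its referent; the function name and the sample referents are hypothetical, and the sketch ignores the restriction to human subjects.

```python
def code_switch_reference(subject_referents):
    """Pair each subject (Target) with the preceding subject (Trigger)
    and code the Target as 'same' or 'switch' in reference."""
    coded = []
    for trigger, target in zip(subject_referents, subject_referents[1:]):
        label = "same" if target == trigger else "switch"
        coded.append((trigger, target, label))
    return coded


# Hypothetical referents of the subjects of successive tensed verbs.
referents = ["María", "María", "Juan", "María"]
for trigger, target, label in code_switch_reference(referents):
    print(trigger, "->", target, ":", label)
```

Each subject after the first serves as a Target relative to the one before it, and then as the Trigger for the one after it, so a stretch of discourse with n subjects yields n - 1 coded Targets.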
The following exchange from a radio talk show2 illustrates this:
2 Broadcasting date: Oct. 12, 1989: 6:00 – 6:30; Notiuno 1320 AM: San Juan; 5s = Fifth Caller,
A = Alcalde or Mayor of San Juan.
(1) 5s: And it’s the only house that has no electricity
(2) A: Yes yes las- last night the number that you gave me?
(3) 5s: uhhuh
(4) A: From that house we called it.
Table 3: Switch reference in San Juan and Madrid (frequency and rates)
San Juan
Madrid
Table 4: Varbrul weights for pronoun expression switch reference: San Juan vs. Madrid3
In line 1, the speaker and her boyfriend are referenced by the null first person
plural subject of (we). In other words, the first person plural subject of line 1 is
understandable as a set constituted by the set-members of the speaker and her
boyfriend. The individual members of this set are expressed in lines 2 and 3. In
line 4, Carina reintroduces the subject set that was initially expressed in line
1. The subject set of line 4 is partially co-referential or same in reference to the
subject of line 3 because the subject set of line 4 properly includes the subject of
line 3. The subject of line 3 él (‘he’) is a set member of the set in line 4. Yet, the
two subjects are not identical in reference and are, thereby, switch in reference.
However, if we combine the noun phrases of line 2 and line 3, then the subject
noun phrase of line 4 is same in reference to the set elements expressed in lines 2
and 3 as well as to the initial first person plural subject of line 1.
Adhering to a definition of switch reference in which two noun phrases are
considered the same in reference if and only if the preceding Target and Trigger
are referentially identical cannot capture, statistically, how speakers respond to
the explicit or inferable presence of set members of plural subject sets in the pre-
ceding discourse beyond the preceding Trigger noun phrase. Therefore, turning to
plural subjects only, I ask the following questions about the preceding discourse.
Have the set members or the set itself been explicitly or inferably mentioned in the preced-
ing discourse:
(a) within 5 preceding clauses?
(b) within 6 to 10 preceding clauses but not 5?
(c) more than 10 preceding clauses back, but not within 10?
(d) not mentioned either explicitly or inferably?
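These four questions amount to a coding scheme based on clause distance. A minimal Python sketch of that scheme follows; the function name, the category labels, and the use of None for an unmentioned set are assumptions made for illustration only.

```python
def saliency_category(clauses_back):
    """Code a plural subject by how many clauses back its set or set
    members were last mentioned (explicitly or inferably).
    None means no prior mention at all."""
    if clauses_back is None:
        return "not mentioned"
    if clauses_back < 1:
        raise ValueError("distance must be at least one clause")
    if clauses_back <= 5:
        return "within 5 clauses"
    if clauses_back <= 10:
        return "6 to 10 clauses"
    return "more than 10 clauses"


for distance in (2, 7, 14, None):
    print(distance, "->", saliency_category(distance))
```

The categories are mutually exclusive, so each plural subject receives exactly one code, which is what allows the rates of pronoun expression to be compared across distances.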
The gradation of 5, then 10, then more than 10 clauses back in the discourse is an
attempt to operationally capture the effects of the saliency of given information
which the set members represent. I hypothesize that plural subject noun phrases,
the elements of which have been entered into the discourse within 5 preceding
clauses will more frequently be null than those plural subject noun phrases
whose elements were mentioned more than 10 clauses before but which have not
since been mentioned. This strategy of analysis follows on the use of sets and set-
elements in the work of Prince (1984) on left-dislocations and topicalizations, two
types of long-distance dependencies. I call this Set-to-Elements Saliency. Also rel-
evant here is the work of Givón (1983) and Gundel, Hedberg, and Zacharski (1993).
As with Switch Reference, despite differences in their rates, the speakers from
San Juan and Madrid show parallel behavior and patterning for Set-to-Element
Saliency. See Table 5.
San Juan
Distance back of set-elements % of Subjects that are +Pro Total N and % of Total N
Madrid
Distance back of set-elements % of Subjects that are +Pro Total N and % of Total N
Therefore, despite the rate difference of subject pronoun expression between the
two dialects, both dialects respond in kind to the preceding presence of either the
set members or the set itself for plural subjects. This level of similarity is made
even more explicit when the data is submitted to varbrul analysis because the
varbrul weights associated with the different factors for San Juan and Madrid are
very close in value.
Table 6: Varbrul weights for set-to-elements saliency: San Juan vs. Madrid
(Nosotros(as) & Ellos(as) only)4
Table 7: Switch reference on singular subjects only: San Juan vs. Madrid (Percentages = rates of
pronoun expression)
Notice that again we find a pattern for singular subjects that is similar to the one
we found for all subjects combined. Also observe how the singular subjects get
distributed across the categories of switch or sameness of reference. For the San
Juan speakers, of the total of 1,755 subjects analyzed, 50 % fall in the category of
switch and 50 % in the category of same. A similar pattern holds in Madrid. Of
the total of 1,509 subjects, 49 % fall in the category of switch and 51 % in that of
same. Thus, the category which favors null subjects receives a proportion of data
basically equivalent to the category favoring pronoun expression.
This is not the case for plural subjects. Returning to Table 5, notice that the
first distance of “Within 5 clauses” which highly favors null subject expression is
the category with the highest raw incidence of plural subjects for both San Juan
and Madrid. In San Juan, this accounts for 89 % of the data and 93 % in Madrid.
This indicates that usage or discourse is typically structured such that plural sub-
jects more frequently occur in a context that favors null subject expression than
do singular subjects. I conclude, thereby, that usage has recurrent organization
that can and does influence speaker choice of options within the syntax and
which can, at times, also provide a basis for an account of that structure as it is
spoken by everyday people in interaction.
5 Conclusion
I did not set out to falsify the distinction between grammar and usage. Indeed,
as already mentioned, I don’t see this distinction as a theory so much as a funda-
mental assumption that contributes to theory building and research design. As
noted, I also find many of Newmeyer’s arguments for distinguishing the two quite
compelling. However, I do find the binarism inherent in the distinction curious,
at times amounting to a type of ideology. Ideologies can change. Therefore, I have
sought to problematize that binarism. If there are certain structural character-
istics of grammar which have analogues or parallels, not exact replicas, in usage,
then the binarism which organizes the ground rules for the argument is problem-
atic and “the world of difference between what a grammar is and what we do,”
I guess what I am arguing for is a new set of terms, something other than grammar
and usage or competence and performance, something not binary, something
n-nary.
References
Bernal, Byron and Alfredo Ardila (2009): The role of the arcuate fasciculus in conduction
aphasia. Brain 132: 2309–2316.
Boeckx, Cedric (2009): On long-distance agree. Iberia 1: 1–31.
Cameron, Richard and Scott Schwenter (2013): Pragmatics and variationist sociolinguistics. In:
Bayley, Robert, Richard Cameron and Ceil Lucas (eds.), The Oxford Handbook of Sociolin-
guistics, 464–483. Oxford: Oxford University Press.
Cameron, Richard (2010): Growing up and apart: Gender divergences in a Chicagoland
elementary school. Language Variation and Change 22: 279–319.
Cameron, Richard (1996): A community-based test of a linguistic hypothesis. Language in
Society 25(1): 61–111.
Cameron, Richard (1995): The scope and limits of switch reference as a constraint on
pronominal subject expression. Hispanic Linguistics 6/7: 1–27.
Cameron, Richard (1993): Ambiguous agreement, functional compensation, and nonspecific
tú in the Spanish of San Juan, Puerto Rico and Madrid, Spain. Language Variation and
Change 5: 305–334.
Cameron, Richard (1992): Pronominal and Null Subject Variation in Spanish: Constraints,
Dialects, and Functional Compensation. Ph.D. dissertation, University of Pennsylvania,
Philadelphia (distributed as IRCS Report 92–22 by The Institute for Research in Cognitive
Science, University of Pennsylvania, Philadelphia).
Abstract: This article deals with two types of syntactic variation in Brazilian Por-
tuguese found in both the community and the individual. The first deals with the
possibility of null or overt third-person pronouns in embedded clauses, and the
second concerns the possibility of “fronted” or “in-situ” wh-constituents. In both
cases, variation is related to syntactic change, but the former case is not found
in the core grammar of children, while the latter case is. We propose that the
null subject – the old form – is the result of schooling and is used increasingly
frequently with age, while the position of the wh-constituent has to do with the
change in the position of focus being present in the core grammars of children
and reflects the frequency of the variants in the input.
1 Introduction
Diachronic studies on Brazilian Portuguese (BP) have shown some major changes
in its syntax since the beginning of the nineteenth century, a scenario where vari-
ation in the community is expected to occur: “The spread of a new parameter
setting through a speech community is typically manifested by categorically
different usage on the part of different authors rather than by variation within
the usage of individuals, although the data are sometimes not as clear as that ide-
alization would suggest, because a writer often commands more than one form of
a language”. (Lightfoot 1991: 162, highlighted by the author)
Here, however, I will discuss two cases where variation, or “optionality”1, is
found both in the community and in the individual. The first case concerns the
variation in BP between the null subject and the overt pronominal subject when
the antecedent is a c-commanding element. In the same context, European Por-
tuguese (EP) licenses only the null variant.
* This study had the support of CNPq grant 305515/2011–2017. I thank the audience at the Work-
shop on System, Usage, and Society, in Freiburg, November 2011 for discussions and sugges-
tions. I also thank the careful review and the innumerable comments of the volume organizers.
Needless to say, all remaining faults are mine.
1 I use variation and optionality as quasi-synonyms.
92 Mary A. Kato
(1) a. O [pai do João_k]_i disse que ele_i/k / Ø_i/*k estava cansado (BP)
       the father of-the John said that Ø was tired
    b. O [pai do João_k]_i disse que Ø_i/j / ele_*i/k estava cansado (EP)
       the father of-the John said that Ø was tired
       ‘John’s father said that he was tired.’
to assume that UG determines a set of core grammars and that what is actually
represented in the mind of an individual even under the idealization to a homogeneous
speech community would be a core grammar with a periphery of marked
elements and constructions”.
In section 3, we will see the variation not only between plain wh-questions
as in (2), but also among cleft wh-questions (see Section 2). Here the hypotheses
in Kato (2011) do not work since children exhibit different types of wh-questions
already in their core grammars. A refined formal analysis will show that variants
with the same meaning but different numeration, namely the initial vocabulary
in the derivation, may not appear during children’s core grammars, and that
forms yielded by erasure operations at PF count as phonological variants and as
a single syntactic variant. Section 4 provides a brief conclusion.
2 The variation between null subjects (NS) vs. overt pronominal subjects in Brazilian Portuguese
Duarte (1995) shows that the overt subject pronouns (OSP) have been replacing
the null referential subject since the nineteenth century in Brazilian Portuguese,
thereby disobeying Chomsky’s (1981) well-known Avoid Pronoun principle.
Figure 1: Null referential subjects in Brazilian Portuguese over seven time periods: 1845, 1882, 1918, 1937, 1955, 1975, 1992 (adapted from Duarte 1995: 17)
Kato (2000) shows that BP has been losing other properties of the NS parameter,
namely free inversion7 and long clitic climbing,8 a sign that the change has a parametric
nature. However, there are contexts of resistance that deserve our attention:
(a) in the subject of complement clauses, where variation was seen in (1), (b)
in non-referential subjects, where the NS expletive is still retained9 (cf. (3)), and
(c) in minimal answers where the NS is still categorical (cf. (4)).
b. Ø quero
Ø want
‘I do.’
Observe, however, that in the minimal answer it is not only the subject that is
null, but also the object and other possible complements or adjuncts that may
also be elliptical. In Kato (2009) the apparent null subject is not analyzed as an NS,
but as the result of focus extraction of the verb followed by IP ellipsis (for Finnish
cf. Holmberg 2001). Notice that having an NS or an OSP does not affect the result
of ellipsis, which is why both EP and BP exhibit the same type of minimal answer.
The most intriguing NS is the anaphoric one in complement clauses. Several anal-
yses have been proposed,11 but Kato (2009) interprets it as a logophoric pronoun12
which occurs as the subject of complements of dicendi verbs. The logophoric NS
(LNS) therefore occurs in a subset of contexts of prototypical pro identified by
inflection.
9 Kato and Negrão (2000) have subparametrized NS languages between those where both ref-
erential and non-referential subjects can be null and those where only expletives can be null.
But in BP change has affected such structures as it becomes more and more common to have an
alternative construction in which an adjunct is raised to satisfy the EPP instead of merging a null
pronoun (Kato and Duarte 2003).
(i) São Paulo chove, Rio faz sol.
São Paulo rains, Rio makes sun.
‘It rains in São Paulo, it is sunny in Rio.’
10 In order to avoid representing the NS as pro, we will represent it as Ø.
11 It has been proposed to be a variable (Modesto 2000) or an A-trace (Ferreira 2000).
12 In Kuno’s (1972) definition, logophoric pronouns are pronouns which are either the speaker
or the addressee in the direct discourse, and which appear as third person in indirect discourse.
Variation in syntax: Two case studies on Brazilian Portuguese 95
The intriguing question that I aim to answer here is: why does BP allow mor-
phological “doublets”?
13 See studies by Kato 2000, Figueiredo Silva 2000, Modesto 2000, Ferreira 2000 and Rodrigues
2004 and the consensual judgment of these data.
Figure 2: Types of subject produced by Ana (categories: NS, expletives, weak pronouns, demonstratives, nouns)
(a) In the early stages her acquisition is similar to that of children of other
languages, exhibiting root infinitives (Rizzi 1992) and imperatives, but also
minimal answers, which can also be analyzed as extractions of the focalized
element of the questions.
(b) As the child comes close to the target grammar, however, the OSP takes over,
and the NS becomes less frequent; the two do not constitute “doublets”.
14 A full version of the work where I develop these hypotheses can be read in Kato (2011).
15 MLU (Mean Length of Utterance), mean length: 1.5–2.7.
But the most interesting data come from Magalhães’ (2003) study on the acqui-
sition of the logophoric NS (LNS) through schooling.
Table 1: Pronominal and logophoric null subjects in complement clauses (adapted from Magalhães 2003)
In sum, children exhibit an insignificant rate of LNS when they start school, but
show real variation between OSP and LNS when they end secondary education.
Considering that the LNS is learned by instruction, its place is in the periphery of
the literate Brazilian I-language. The individuals “code-switch” between the OSP,
learned through selection during language acquisition, and the LNS, learned
through instruction.18
since Old Portuguese (OP).19 Here, we also include the judgments regarding the
grammar of EP.
Below are the quantitative data found in Lopes-Rossi’s (1996) study covering the
nineteenth century to today (Table 2).
Wh-questions in Brazilian Portuguese underwent two basic changes: a) a
change in word order, from Wh-VS to Wh-SV to wh-in-situ, and b) a shift from VS
to SV order, in which the copula introduces the cleft wh-question; this SV type
is also found in EP. The reduced type of the cleft construction (7d), however, is
a Brazilian phenomenon. The most impressive quantitative change, exclusive to
BP, can be seen in the wh-in-situ construction: From zero occurrence in the first
19 Old Portuguese also had the reverse pseudo-cleft, which disappeared with the appearance
of the reverse cleft type. Here we do not include its analysis (see examples and analysis in Kato
and Ribeiro 2009).
20 Que and o que are in variation, the former preferred by Europeans and the latter by Brazilians.
21 In BP, the order Wh-VS is only acceptable with formulaic expressions, unaccusative verbs and
with wh-constructions of the quanto type (cf. Kato 2000; Kato and Duarte 2002).
(i) Como vai o Pedro? ‘How is Peter?’
(ii) Quando chegam eles? ‘When do they arrive?’
(iii) Quanto ganha a Ana? ‘How much does Ana earn?’
22 Examples of this type were not found in the author’s corpus, but are common in children, as
will be seen below.
23 Like EP, OP licenses the order Wh-SV if the wh-expression is of the D-linked type:
(i) Que carro a Maria comprou? (ii) Em que cidade o João nasceu?
‘Which car has Maria bought?’ ‘In which city was John born?’
24 While BP has no structural restrictions for wh-in-situ structures, EP is not so unconstrained,
and its frequency is much lower.
Table 2: The evolution of wh-questions over time in BP (adapted from Lopes-Rossi 1996)
What is surprising is the fact that Lessa (2003) found examples like (9a-d) of
in-situ clefts with initial copula in Brazilian children’s language and also in the
syntax of their mothers’ input.28
28 Notice that while the wh-element is not in initial position the main stress falls on it, and with
the copula in initial position the prosodic pattern changes with the stress on the second syllable.
29 The author claims that in some dialects the in-situ type is not the most frequent one, and in
this case the most frequent variant would be acquired first. According to her, this would account
for the late acquisition of wh-in-situ in children born in São Paulo compared to the early acqui-
sition of children born in the Brazilian Northeast. This is very much in line with recent work by
Yang (2002), who claims that frequency in the input accounts for early acquisition.
Here, we adapt Lessa’s study, using not the frequencies in each mother’s speech but the
general frequencies for adults talking on TV reported in Lopes-Rossi’s (1996)
study. The conclusions are the same: the most frequent form in the input is the
first to be acquired by children, and the least frequent pattern in adult speech is
the last to be learned.
Table 3: Frequency of adult types of wh-questions and earliness of acquisition of each type by
the children L and E
Adult TV (adapted from Lopes-Rossi 1996)    Children30 (adapted from Lessa 2003): order of acquisition of the types (L, E)
Focus (F) is a syncretic head which checks both focus and wh-elements
in Portuguese.31 In OP and EP the verb has been analyzed as moving from T to
the head of FocP, or traditionally CP, producing the V2 effect.32 Here, however,
I follow Kato and Raposo’s (1996) analysis, in which the wh-operator moves to
Spec of FocP, but the verb stays in T, with F null (Ø), though with Focus features.
The subject stays in vP, contrary to Germanic languages. The only context where
the head F is not null is when the verb moves to it in OP and EP, and the Spec
of FocP is left empty. In the representation below, a wh-question has a silent Q
in the head of ForceP to denote that the sentence is interrogative, a wh-element
in the Spec of FocP, with its head null, with VP containing the copy/trace of the
wh-element.
(11) [ForcePQ [FocP O que Ø+F [TP comprou [vP a Ana tcomprou [VP ..... t o que ]]]]
In Kato and Raposo’s (1996) analysis, the verb moves to the Focus head in OP
and EP only when the whole sentence is focalized, resulting in enclisis (cf. (13a)).
When only the wh- or the focalized element moves to Spec of FocP, the verb stays
in T and the resulting pattern is proclisis (cf. (13b, c)). In BP, the verb always
stays in T, the subject moves to Spec of T, and we have a generalized proclisis (cf.
(13d)).
31 In Portuguese wherever you find a wh-element, you also find a marked focus element.
(i) Quem é que chegou? (ii) A ANA é que chegou.
‘Who is it that arrived?’ ‘ANA is it who arrived.’
32 See Ambar 1992 and Lopes-Rossi 1996.
33 BP has lost the third-person clitics, but proclisis is possible with first- and second-person
clitics.
(13) a. [FP amouV+T Ø+F [TP as tV+T [vP o Pedro [VP tV tas]]]]
b. [FP MUITAS MULHERESi Ø+F [TP o amaramV [vP ti [VP tV to]]]]
c. [ForceP Q [FP quemwh Ø+F [TP te amouV [vP tquem [VP tV tte]]]]
d. [ForceP Q Ø+F [TP o Pedroi me amou [vP ti tV [VP tV tme]]]]
(14) [ForcePQ [FocP o que Ø+F [TPé [vP té [CP [que [TP a Ana comprou to que]]]]]
The great innovation happened in the second half of the nineteenth century and
the first half of the twentieth century only in BP. Recall that the above patterns
(7c, d, e, f), repeated below as (15a, b, c, d) are exclusively Brazilian. We are con-
sidering that the wh-in-situ in BP has two different types: an echo-question type
with rising intonation, which BP shares with EP, and an ordinary question type
with falling intonation, which seems to be exclusive to BP.35
(15) a. É o que que a Ana comprou? In-situ cleft type *OP *EP BP
b. (O) que que a Ana comprou? Reduced cleft type *OP *EP BP
c. O que a Ana comprou? Wh-SV type *OP *EP BP
d. A Ana comprou o que? Wh-in-situ type *OP %EP BP
Starting in the nineteenth century, instead of using the FocP position in the
periphery of the sentence, BP starts using the low FocP position adjacent to the
copula à la Belletti (2004).
(16) [ForceP Q [TP foi [FocP o que Ø+F [vP tfoi [VP tfoi que [TP a Ana comprou to que]]]]]]
(17) a. é quem que chegou? (vs. foi quem que chegou?, the old form)
Is who that arrived?
‘Who is it that arrived?’
In a study unrelated to clefts, Kato (2007) shows that in BP, when the present
tense copula is initial, it is usually dropped.
This change in the focus position allowed V1 constructions, which in turn allowed
the copula in initial position, favoring its erasure.
Notice that with the erasure of the copula, we find expressions that lead to the
phenomenon of haplology, a sound change that involves the loss of a syllable
when it is next to a phonetically identical (or similar) one:
Recall that the resulting cases have been analyzed as a zero complementizer in
previous work by me and others.36
In (21) we can see the evolution of wh-questions over time:
(i) from a. to b. the raising of a thematic verb is replaced by the raising of the
copula;
(ii) from b. to c. the position of the FocusP projection changes, from the high
sentential periphery to the low vP adjacent position;
(iii) from c. to d. the copula grammaticalizes, becomes invariable and allows
erasure;
(iv) from d. to e. the resulting form can undergo haplology in PF, licensing the
erasure of the complementizer.
(21) a. [FocP o que Ø [IP comprou [vP a Maria [VP tcomprou to que]]]]
b. [FocP o que Ø [IP é [VP té [CP que [IP a Maria comprou to que]]]]
c. [IP é [FocP o que Ø [VP té [CP que [IP a Maria comprou to que]]]
d. [IP ( ) [FocP o que Ø [VP té [CP que [IP a Maria comprou to que]]]
e. o que (que) a Maria comprou
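Steps (iii) and (iv) can be pictured as two operations on the surface string. The following is only a toy sketch in Python, assuming a lower-cased, whitespace-tokenized string; the function names `erase_initial_copula` and `haplology` are hypothetical, and the code models neither the syntactic structures in (21) nor real phonology. It simply traces the chain from the in-situ cleft type (15a) through the reduced cleft type (15b) to the Wh-SV type (15c).

```python
def erase_initial_copula(words, copula="é"):
    """Drop a clause-initial present-tense copula, as in step (iii)."""
    if words and words[0] == copula:
        return words[1:]
    return words


def haplology(words, target="que"):
    """Collapse two adjacent occurrences of `target` into one, as in step (iv)."""
    result = []
    for word in words:
        if result and word == target and result[-1] == target:
            continue  # skip the second of two identical adjacent items
        result.append(word)
    return result


# Illustrative input: the in-situ cleft type (15a), lower-cased.
in_situ_cleft = "é o que que a ana comprou".split()
reduced_cleft = erase_initial_copula(in_situ_cleft)  # corresponds to (15b)
wh_sv = haplology(reduced_cleft)                     # corresponds to (15c)

print(" ".join(reduced_cleft))
print(" ".join(wh_sv))
```

Both operations leave the word order untouched, which is in keeping with the claim that these types are phonological rather than syntactic variants.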
It is important to note that the only structural change was the use of the lower
focus, adjacent to vP, while in the older periods the focus was always fronted to
the sentential periphery.37
The old form shown in (21a) is no longer in the child’s core grammar, but the
reverse cleft, still very much present in the child’s input, is part of the child’s core
grammar. The real innovative types derived from the in-situ cleft are all there, and
they do not count as doublets because they are phonological, and not syntactic
variants.
36 See, for instance, Kato and Duarte 2002 and Hornstein, Nunes and Grohmann 2005. For the
former, the loss of the VS order in wh-questions was triggered by the loss of the null subject,
which caused Spec of TP to be filled with a weak pronoun. We may assume that we had two
independent triggers and, for some speakers who lost the null subject earlier, the origin of the
order WH-SV may have had a structural motivation.
37 However, in declarative focalization, the focalized element makes use of this low FocP, as
seen in examples such as (i) and (ii):
(i) Foi A MARIA que comprou a casa.
(ii) [IP Foi [FocP A MARIA [vP ser [ que a Maria comprou a casa]]
The analysis of the echo wh-in-situ question will not be discussed in any more
detail here. My view is that the only real in-situ type of wh-question is the echo
one. Here I assume Kayne’s (1994) analysis of wh-in-situ in general, proposing
that what moves to the Spec of F is the entire sentence, as with yes/no questions.
This analysis shows that in BP echo questions have the same prosody as yes/no
questions. However, this does not apply to the authentic wh-questions, which
have falling intonation in BP.
With regard to the latter type of wh-questions, Kato (2013) analyzes BP as
always having an obligatory last-resort wh-movement of a short type. The idea
was inspired by Miyagawa (2001) for whom Japanese has wh-movement of a short
38 The first version of this work, where two types of wh-in-situ were identified, was presented
in Kato (2013).
39 French is also a language with a kind of optional wh-in-situ, but the prosodic description is
different as it has only a rising intonation.
b. [ForcePQ [TP vocêi conheceu [FocP quemq [Ø+wh [vP ti tv [VP tv tq]]]]]] (\) (wh-question)
you met who
‘Who have you met?’
Notice, moreover, that the position of the wh-element is the same as the one that
is exclusive to BP in the other cleft wh-constructions:
(25) a. [ForceP Q [TP é [FocP o que Ø+wh [VP té [CP que [TP a Maria comprou to que]]]
b. [ForceP Q [TP é [FocP o que Ø+wh [VP té [CP que [TP a Maria comprou to que]]]
c. [ForceP Q [[TP A Mariai comprouV [FocP o que Ø+wh [vP ti [VP tV tque ]]]]
There is no optionality between (25a) and (25b) on the one hand and the wh-in-
situ case on the other, as they have a different numeration: The in-situ case has
no copula or complementizer.
To summarize the structure of wh-questions, one can say that there are
indeed only two basic structures: one that results in the wh-in-situ structure and
one that results in all the others from a canonic cleft through successive phono-
logical erasure. But the two structures belong to the same grammar, as the two
types occupy the same low vP-adjacent FocP position.
4 Conclusion
This article has discussed two types of variation in Brazilian Portuguese: (a)
the variation of overt pronouns and null categories as subjects of complement
References
Ambar, Manuela (1992): Para uma Sintaxe da Inversão Sujeito-Verbo em Português. Lisboa: Ed.
Colibri.
Aronoff, Mark (1976): Word Formation in Generative Grammar. Cambridge: MIT Press.
Belletti, Adriana (2004): Aspects of the low IP area. In: Luigi Rizzi (ed.), The Structure of IP and
CP. The Cartography of Syntactic Structures, 16–51. (Oxford Studies in Comparative Syntax
2.) New York: Oxford University Press.
Berlinck, Rosane (2000): Brazilian Portuguese VS order: a diachronic analysis. In: Mary A. Kato
and Esmeralda V. Negrão (eds.), Brazilian Portuguese and the Null Subject Parameter,
175–194. Frankfurt a. M./Madrid: Vervuert/Iberoamericana.
Cheng, Lisa and John Rooryck (2000): Licensing wh-in-situ. Syntax 3/1: 1–19.
Chomsky, Noam (1981): Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam (1988): Language and Problems of Knowledge: the Managua Lectures. Cambridge:
The MIT Press.
Cyrino, Sonia M. L. (1993): Observações sobre a mudança diacrônica no português do Brasil:
objeto nulo e clíticos. In: Ian Roberts and Mary A. Kato (eds.), Português Brasileiro: Uma
viagem diacrônica, 163–184. Campinas: Editora da UNICAMP.
Duarte, M. Eugenia L. (1995): Perda do princípio “evite pronome” no Português Brasileiro.
Ph.D. dissertation, Instituto de Estudos da Linguagem, UNICAMP.
30th Regional Meeting of the Chicago Linguistic Society: Parasession on Variation and
Linguistic Theory, 180–201. Chicago, IL: Chicago Linguistic Society.
Kuno, Susumu (1972): Pronominalization, reflexivization, and direct discourse. Linguistic
Inquiry 3/2: 161–195.
Lessa de Oliveira, Adriana (2003): Aquisição de constituintes-Qu em dois dialetos do português
brasileiro. M. A. thesis, UNICAMP.
Lightfoot, David (1991): How to Set Parameters. Cambridge: MIT Press.
Lopes-Rossi, M. Aparecida (1996): A sintaxe diacrônica das Interrogativas-Q do Português.
Ph.D. dissertation, Instituto de Estudos da Linguagem, UNICAMP.
Magalhães, Telma V. (2003): Aprendendo o sujeito nulo na escola. Letras de Hoje 36/1:
189–202.
Magalhães, Telma V. (2006): O sistema pronominal sujeito e objeto na aquisição do Português
Europeu e Português Brasileiro. Ph.D. dissertation, Instituto de Estudos da Linguagem,
UNICAMP.
Miyagawa, Shigeru (2001): The EPP, Scrambling, and wh-in situ. In: Ken Hale and Michael
Kenstowicz (eds.), A Life in Language, 293–338. Cambridge, MA: MIT Press.
Modesto, Marcello (2000): Null subjects without rich agreement. In: Mary A. Kato and
Esmeralda V. Negrão (eds.), Brazilian Portuguese and the Null Subject Parameter, 147–174.
Frankfurt a. M./Madrid: Vervuert/Iberoamericana.
Noonan, Maire (1989): Operator licensing and the case of French interrogatives. In: E. Jane
Fee and Kathryn Hunt (eds.), Proceedings of the 8th West Coast Conference on Formal
Linguistics, 315–330. University of British Columbia: Stanford Linguistics Association.
Pagotto, Emilio (1993): Clíticos, mudança e seleção natural. In: Ian Roberts and Mary A. Kato
(eds.), Português Brasileiro: Uma viagem diacrônica, 185–206. Campinas: Editora da
UNICAMP.
Rizzi, Luigi (1994): Early null subjects and root null subjects. In: Teun Hoekstra and Bonnie
Schwartz (eds.), Language Acquisition Studies in Generative Grammar, 151–176.
(Language Acquisition and Language Disorders 8.) Amsterdam/Philadelphia: John
Benjamins.
Rizzi, Luigi (1997): The fine structure of the left periphery. In: Liliane M. Haegeman (ed.),
Elements of Grammar: Handbook of Generative Syntax, 281–337. Dordrecht: Kluwer.
Rodrigues, Cilene (2004): Morphology and null subjects in Brazilian Portuguese. In: David
Lightfoot (ed.), Syntactic Effects of Morphological Change, 160–178. Oxford/New York:
Oxford University Press.
Saito, Mamoru and Naoki Fukui (1998): Order in phrase structure and movement. Linguistic
Inquiry 29/3: 439–474.
Yang, Charles (2002): Knowledge and Learning in Natural Language. Oxford: Oxford University
Press.
Part 2: Rare phenomena and variation
Göz Kaufmann, University of Freiburg
Rare phenomena revealing basic syntactic
mechanisms: The case of unexpected
verb-object sequences in Mennonite Low
German1
Abstract: The main focus of this article is dependent clauses with one verbal
element in Mennonite Low German. In some of these clauses, the complement
surfaces after the verb, thereby defying the expected word order of German vari-
eties. This unexpected linearization pattern does not constitute a case of ungram-
maticality, but one of syntactic analogy, in line with the informants’ syntactic
behavior with regard to verb clusters in dependent clauses with two verbal ele-
ments. Due to this relationship, the analysis of dependent clauses with one verbal
element will also shed light on the structure of verb clusters.
1 Introduction
This article analyzes the distribution and structure of dependent clauses with one
verbal element in which the ObjNP/PP (noun/prepositional phrase functioning
as complement) unexpectedly surfaces after the finite verb as illustrated in (1a).
This example forms part of a data set of roughly 14,000 sentences translated
from English, Spanish, or Portuguese into Mennonite Low German (MLG). The
translations were elicited in six Mennonite colonies in North and South America
between 1999 and 2002. Example (2a) shows a dependent clause with two verbal
elements, again with the ObjNP surfacing in post-verbal position. Both (1a) and
(2a) contrast with the expected serializations in (1b) and (2b):
1 I would like to thank Leonie Cornips, Martin Pfeiffer, Peter Öhl, Peter Auer, and Aria Adli for
their helpful comments. The usual disclaimers apply.
(1) a. wann hei unterschrieft [0.4] diesen contrato [0.6] dann verliest der viel Geld2
(Mex-26; m/34/MLG3)
if he signs-VERB […] this contract […] then loses he much money
b. wann hei det Kontrakt [ehm] unterschrieft dann wird her viel Geld verlieren
(Mex-77; f/46/MLG)
if he the.NEUTER contract [ehm] signs-VERB then will he much money lose
(2) a. dü bruuks: [0.7] Brill wiels dü nich sehne kanns die Tofel (Bol-4; m/44/MLG)
you need […] glass because you not see-VERB2 can-VERB1 the blackboard
b. de bruukt ne Brill wegens her nich de [0.6] Tofel sehne kann (Bol-8; m/20/MLG)
he needs a glass because he not the.REDUCED […] blackboard see-VERB2 can-VERB1
The translations in (1a) and (2a) occur rarely in the stimulus sentences <11> and
<26>:4 (1a) appears once in eighty translations with one verbal element (1.3 %);
(2a) appears twice among 311 translations with two verbal elements (0.6 %). Their
rarity is probably caused by the unexpected post-verbal position of the complements,
i. e. diesen Contrato (‘this contract’) and die Tofel (‘the blackboard’). At
first sight, one may assume a priming effect in (1a) and (2a) causing the marked
sequence (but cf. the discussion of Table 4 below). This could be either a case
2 The representation of MLG does not claim phonetic accuracy. Filled pauses are indicated in
brackets ([eh] or [ehm]), unfilled pauses with the indication of their length if longer than 0.3 seconds.
Break-offs or repairs are marked with a hyphen; a colon represents a markedly prolonged
pronunciation of a phonetic segment. The parts of the translations relevant for the analysis are
underlined. In the glosses, only relevant grammatical information such as the hierarchy among
verbal elements, particles, and deviating gender or case of ObjNPs is given. Underlined elements
in the glosses represent semantic deviations from the stimulus sentence; a ∅ represents an
element which was not translated. Crossed out elements represent cases where the informant
included words not present in the stimulus sentence. Whenever the interview language was
Spanish or Portuguese, the stimulus version is given both in that language and in English.
3 All translations presented are coded according to the informants’ origin (Mex = Mexico;
USA = USA; Bra = Brazil; Men = Menno, Paraguay; Fern = Fernheim, Paraguay; Bol = Bolivia) and
their coding number. Also given are the sex of the informant (m(ale) or f(emale)), his or her age
in years, and his or her dominant language(s) (MLG (Mennonite Low German), SHG (Standard
High German), Engl(ish), Span(ish), or Port(uguese)).
4 The rare phenomenon dealt with in this article is not rare in the typological sense, i. e. the
sequence verb-object in dependent clauses with one verbal element is obviously not a rare phe-
nomenon in the languages of the world. It is, however, a rare phenomenon in MLG and in most
Continental West Germanic languages.
of short-term priming due to the translation task (all stimulus languages feature
the sequence verb1-(verb2-)ObjNP/PP) or it could be the long-term consequence of
contact-induced language change. In (2a), however, priming could only be part of
the explanation, since the two verbal elements do not appear in the sequence of
the Spanish stimulus sentence (sehne-VERB2 kanns-VERB1 vs. puede-VERB1 ver-
VERB2; both ‘can see’).
In the Mennonite data set, there are 27 tokens like (2a) in a data pool of 3,120
comparable clauses with two or more verbal elements (0.9 %). Due to space lim-
itations, we will not be able to analyze these tokens thoroughly. The structure
of (2a) and the necessary explanations for its occurrence are, however, com-
pletely different from those of (1a). The major structural difference is that the
ObjNP in (2a) appears after two or more verbal elements, while there is just one
verbal element in (1a). The necessity for different explanations is underlined by
the fact that the informants who produce tokens like (2a) are significantly older
than the ones who produce tokens like (1a) (36.3 years old vs. 27.5 years old; F
(1,84) = 10.2, p = 0.002**), that they come from different colonies and that their
syntactic behavior with regard to verb clusters with two verbal elements is not
comparable. We will, therefore, focus on tokens like (1a): There are 59 tokens like
(1a) in 1,837 translations with one verbal element (3.2 %; only stimulus sentences
where at least one token like (1a) is present are included in this calculation).
Their occurrence is unrelated to translation problems since almost all inform-
ants producing these tokens are fluent in both MLG and the respective majority
language. In Section 4.2.3, we will see that the occurrence of these translations
is best explained by the informants’ syntactic preferences with regard to clusters
with two verbal elements (for an earlier analysis of this phenomenon, cf. Kauf-
mann 2007: 193–198).
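The token rates quoted so far can be recomputed from the raw counts given in the text. The following Python sketch is purely illustrative: the helper `share` and the dictionary labels are hypothetical names, and the rounding mode (round half up to one decimal) is an assumption about how the percentages were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP


def share(tokens, total):
    """Percentage of `tokens` in `total`, rounded half up to one decimal."""
    percentage = Decimal(100 * tokens) / Decimal(total)
    return percentage.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Counts as reported in the text: (marked tokens, comparable translations).
counts = {
    "(1a) within its stimulus sentence": (1, 80),
    "(2a) within its stimulus sentence": (2, 311),
    "tokens like (2a), two or more verbal elements": (27, 3120),
    "tokens like (1a), one verbal element": (59, 1837),
}

for label, (tokens, total) in counts.items():
    print(f"{label}: {tokens}/{total} = {share(tokens, total)} %")
```

Round-half-up is chosen deliberately: 1 in 80 is exactly 1.25 %, which Python's built-in `round` would turn into 1.2 rather than the 1.3 % given in the text.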
Section 2 gives some historical and linguistic information about the Men-
nonites in the Americas; Section 3 explains the elicitation procedure. The central
Section 4 is dedicated to the (socio)linguistic analysis of the variant represented
by (1a). In this section, we will also develop the structure of this variant. In Sec-
tions 4.4.1 and 4.4.2, the influence of the morphological shape of the complement,
i. e. whether it is marked by a preposition or not and whether it is definite or indef-
inite, will be dealt with. Section 5 concludes by showing how empirical studies of
rare phenomena can contribute to the advancement of linguistic theory.
The origins of the Mennonites can be found in East Holland, Frisia, Flanders and
what is today northwest Germany. In these regions, Anabaptist communities had
formed during the Reformation. Due to religious persecution many of these Anabaptists
emigrated to West and East Prussia during the sixteenth century. There,
a koiné was formed out of the varieties the Mennonites had used in their home-
lands and the local varieties of Low German. Moreover, the Mennonites began to
use Standard High German (SHG) instead of Dutch for official purposes such as
church service and schooling. When the Prussian government imposed stricter
rules on the Mennonites in the eighteenth century, some of them accepted an
invitation by Catherine II of Russia to settle in the Ukraine. At the end of the nine-
teenth century, however, Russian officials introduced laws to ensure a certain
degree of integration, causing the more tradition-bound Mennonites to emigrate
to Canada around 1870. During and after World War I, the situation of German-
speaking immigrants became difficult in Canada. Again, the more conservative
members did not yield to outside pressure. Some decided to move to Mexico,
where most of them settled in the northern state of Chihuahua (Ciudad Cuauhté-
moc; 40,000 people; colonies for which the number of inhabitants is given are
analyzed in this paper). Other Mennonites found a new home in Paraguay and set
up the colony of Menno (9,000 people). Mennonites from Mexico and from Menno
founded several daughter colonies, most importantly Santa Cruz de la Sierra in
Bolivia (50,000 people), various communities in Belize, and one in Seminole,
Texas (4,000 people). The Mennonites who stayed in Russia in 1870 accepted the
new situation and introduced a more modern school system with more emphasis
on SHG. Due to their economic success, these Mennonites faced severe persecu-
tion in the Soviet Union when Stalin gained absolute power in 1927. Because of
these unfavorable prospects, many Mennonites tried to leave the Soviet Union
and some succeeded in emigrating to Canada, Paraguay (colony Fernheim; 4,000
people) and Brazil (Colônia Nova; 1,000 people) in 1930.
These different migration paths led to different language repertoires. Besides
MLG and SHG, these include the majority language of each colony’s homeland and
possibly other languages such as, for example, Guaraní and local tribal languages
in Paraguay. MLG is still the unrivaled ingroup language in Mexico, Bolivia and
in the Paraguayan colony Menno. It is weakest in the United States and Brazil,
the two colonies where competence in the respective majority language is best. In
the Paraguayan colony Fernheim, there are already some families who use SHG
instead of MLG at home. With regard to SHG, the two Paraguayan colonies benefit
from their modern school system in which this language is both a subject of learn-
ing and a medium of instruction. Granted, this is also true for many schools in
the more conservative colonies in the USA, Mexico and Bolivia, but these schools
can hardly be called modern. SHG used to play an important role in the Brazilian
colony as well, but due to political intervention and the size of the colony it has
lost this position.
The data analyzed consist of the oral translation of 46 stimulus sentences from
English, Spanish, or Portuguese into MLG. The 313 informants5 did not have
access to a written version of the stimulus sentences. The stimuli were read one
at a time and then immediately translated one at a time. As the central interest of
the project is clause-final verb clusters, the stimulus sentences were created in a
way that allowed the analysis of three linguistic factors: (a) the type of finite verb,
(b) the number of verbal elements and, for dependent and introduced clauses, (c)
the type of clause. The different cluster types are distributed over six main clauses
and four types of dependent clauses: ten restrictive relative clauses, ten preposed
conditional clauses, ten extraposed causal clauses and ten extraposed complement
clauses. All main verbs in the stimulus sentences are transitive, i. e. they
govern an ObjNP/PP. Some sentences additionally contained an adverb.
190 of the 313 informants claimed MLG as the language they knew best
(60.7 %). Another 31 informants indicated a comparable knowledge of MLG and
one of the majority languages (9.9 %; 10 in Brazil; 9 in the USA), twelve of MLG
and SHG (3.8 %; 9 in Paraguay). 63 informants claimed that one of the major-
ity languages was their strongest language (20.1 %; 31 in the USA; 19 in Brazil);
seventeen allotted this status to SHG (5.4 %; 16 in Paraguay). Only six of the
313 informants can be classified as semi-speakers with regard to MLG. This means
that most speakers with a dominant language other than MLG still speak MLG well
or even very well. Analyzing the informants’ language dominance with regard
to the marked variant in (1a), we see that 29 of the 59 tokens are produced by
informants whose dominant language is MLG (49.2 %), six (10.2 %) by informants
equally competent in MLG and one of the majority languages, and 24 (40.7 %)
by informants dominant in a majority language, mostly English. Only one token
is produced by one of the six possible semi-speakers; none is produced by an
informant (co-)dominant in SHG. It is also important to realize that the marked
variant is not reducible to a “deviant” grammar of just a few informants. A total of
46 Mennonites (14.7 % of the 313 informants) produced the 59 tokens.
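Setting the two distributions side by side makes the comparison in this paragraph easier to follow. The sketch below, in Python, uses only the counts reported above; the group labels and variable names are hypothetical. It first checks that the dominance groups add up to the 313 informants and the 59 tokens, then prints each group's share of informants next to its share of tokens.

```python
INFORMANTS = 313
TOKENS = 59

# Number of informants per self-reported dominance group (from the text).
informants_by_group = {
    "MLG": 190,
    "MLG + majority language": 31,
    "MLG + SHG": 12,
    "majority language": 63,
    "SHG": 17,
}

# Tokens of the marked variant per group; groups (co-)dominant in SHG produce none.
tokens_by_group = {
    "MLG": 29,
    "MLG + majority language": 6,
    "majority language": 24,
}

# The groups should exhaust the informants and the tokens.
assert sum(informants_by_group.values()) == INFORMANTS
assert sum(tokens_by_group.values()) == TOKENS

for group, n_informants in informants_by_group.items():
    n_tokens = tokens_by_group.get(group, 0)
    informant_share = 100 * n_informants / INFORMANTS
    token_share = 100 * n_tokens / TOKENS
    print(f"{group}: {informant_share:.1f} % of informants, {token_share:.1f} % of tokens")
```

Read this way, informants dominant in a majority language make up about a fifth of the sample but contribute roughly two fifths of the tokens, while MLG-dominant informants are somewhat underrepresented among the producers.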
5 103 informants come from Mexico, 67 from the USA, 56 from Brazil, 42 from Menno (Paraguay),
37 from Fernheim (Paraguay) and eight from Bolivia.
In the translated complement, conditional and relative clauses with one verbal
element (with or without a verbal particle), there are 59 tokens (3.2 % of 1,837
tokens) in which the internal complement selected by the verb surfaces after this
verb. In 53 of the 59 cases the verb occupies – on the surface – the second position
of the clause. Examples (3) through (6) present two complement clauses, one con-
ditional and one relative clause. As in (1) and (2), the (a) examples illustrate the
rare sequence verb-ObjNP/PP, whereas the (b) examples represent the unmarked
sequence ObjNP/PP-verb.
stimulus <4> English: Can’t you see that I am wearing a new dress?
b. kos dü daut nich sehen daut ik en nüet Kleid anha (USA-29; f/19/MLG)
can you that not see that I a new dress on-PARTICLE-have-VERB
stimulus <5> Portuguese: O Enrique não sabe que ele pode sair do país
English: Henry doesn’t know that he can leave the country
(4) a. Hein weit daut nich daut hei darf [0.4] üt dem [0.3] Laund rüter (Bra-5; f/22/MLG+Port)
Henry knows that not that he may-VERB […] out the […] country out-PARTICLE
b. Hein weit nich daut hei üt dem Laund rüterdarf (Bra-52; m/30/MLG)
Henry knows not that he out the country out-PARTICLE-may-VERB
stimulus <12> English: If he does his homework, he can have some ice cream
(5) a. wann der dät den sine Arbeit dann kaun her etz some ice cream eten (USA-77; f/42/MLG)
if he does-VERB the his homework then can he now some ice cream eat
b. wann her sinen [1.1] homework dät dann kaun her ice cream han
(USA-64; f/41/Engl)
if he his.MASCULINE […] homework does-VERB then can he ∅ ice cream have
stimulus <32> Portuguese: As estorias que ele está contando para os homens são muito tristes
English: The stories that he is telling the men are very sad
(6) a. Die Geschichte waut hei vertahlt für de Manner is sehr trürig (Bra-37; m/34/Port)
the story that he tells-VERB for the men is very sad
b. die Geschichte waut hei to de Männer vertahlt sind sehr trürig (Bra-6; f/23/MLG)
the stories that he to the men tells-VERB are very sad
The examples given are structurally not uniform: (i) 21 of the relevant tokens
(35.6 %) feature a verb with a particle (cf. (3a) and (4a)), the rest are verbs with
(cf. (6a)) or without a non-separable prefix (cf. (5a)). This difference does not have
any measurable influence on the frequency of the rare phenomenon. (ii) In most
cases, the ObjNPs/PPs in the tokens with the sequence verb-ObjNP/PP are definite
(with a definite article as in (4a) or a possessive article as in (5a)), only seventeen
complements are indefinite (28.8 %; mostly with an indefinite article as in (3a); cf.
Section 4.4.2). (iii) 22 of the tokens (37.3 %; cf. (4a) and (6a)) feature a PP as com-
plement (cf. Section 4.4.1). As there is a certain tendency in Continental West Ger-
manic varieties such as Dutch to extrapose ObjPPs into the postfield and as such
a movement would undermine our line of argumentation, some tokens which
seem to belong to the variant represented by (1a) were excluded. Translations
of stimulus sentence <5> Henry doesn’t know that he can leave the country, for
example, were only accepted if the particle surfaced at the end of the clause after
the ObjPP (cf. (4a)). In such a case, extraposition of the ObjPP into the postfield
is not a possible analysis. Conversely, translations such as (7) were not included
in the analyses because the particle rüt (‘out’) surfaces in a non-final position,
strongly suggesting an extraposed ObjPP:
stimulus <5> Spanish: Enrique no sabe que puede salir del país
English: Henry doesn’t know that he can leave the country
(7) Heinrik weit daut hei nicht kann rüt [0.6] üt diese- [0.4] üt det Land (Mex-45; m/59/MLG)
Henry knows ∅ that he not can-VERB out-PARTICLE […] out this- […] out the country
stimulus <43> Spanish: Antes de irme de casa, siempre apago las luces
English: Before leaving the house, I always turn off the lights
(8) immer wann ik weggo von Hüs dann du ik immer daut Lich ütmeaken (Mex-82; m/52/MLG)
always when I away-PARTICLE-go-VERB from home then do I always the light out-make
We will come back to examples like (8) at the end of Section 4.4.1 in order to show
that the informants producing them have different syntactic preferences than
the informants producing the marked variant in the (a) examples in (1) and (3)
through (6). Obviously unlike in (4a), the structural position of the indirect ObjPP
in (6a) is not clear either; in principle, extraposition into the postfield might be
a possible derivation. The difference from the probably extraposed directive ObjPPs
in (7) and (8) is that für in für de Manner (‘to the men’) is not selected by the verb
but marks an indirect object. This means that für in (6a) is semantically vacuous
and, more importantly, optional – most Mennonite informants do not mark
indirect objects prepositionally. By contrast, the prepositions üt (‘out’) and von
(‘from’) in (7) and (8) add semantic value to the verbal proposition. As für de
Manner is syntactically closer to indirect ObjNPs than to directive ObjPPs like in
(7) and (8) and as indirect ObjNPs cannot be extraposed in MLG, extraposition
does not seem to be a possible explanation for (6a).
6 In these cases, we are dealing with a dependent main clause with the finite verb occupying
the head position of CP. The Mennonite data does also contain many verb-second causal clauses
with one verbal element, especially in the North American colonies (cf. Table 9 and Kaufmann
2003: 188–189). Due to this, these tokens are not included in the analyses. We will, however, deal
with causal clauses from the South American colonies in section 4.4.3.
Table 1: Distribution of the two variants in dependent non-causal clauses with one verbal
element in all colonies separated according to the informants’ origin (obj = ObjNPs/PPs;
part = particle)
The distribution in Table 1 is highly significant but, with a value of 0.14, only weakly
associated.⁷ We can distinguish three types of colonies with regard to the phenomenon
in question. The informants in the United States show by far the highest share of
the non-verb-final variant (8.4 % of their tokens; 24 instead of 9.2 expected tokens).
The other extreme is represented by the Paraguayan colonies which only produce
three instead of 18.2 expected tokens (0.5 % of the 567 Paraguayan tokens). The
other three colonies range from 3.1 % to 4.3 %. The difference between these three
groups of colonies seems to be connected to the different competence levels in
SHG. Much contact with SHG, as in the Paraguayan colonies, correlates with very
few non-verb-final tokens; hardly any contact with SHG, as in the US-American
colony, correlates with a much higher number of non-verb-final tokens.
7 Shading in tables is only used when the distribution in a line represents a significant
difference. For token frequencies, Pearson’s Chi-Square is used. As this test is sensitive to the
number of tokens, tests for the strength of association are also carried out (Cramer’s V or Phi).
The number of cells with less than five expected tokens in the distribution is always given (in
especially vulnerable distributions with one degree of freedom, the result of Fisher’s Exact is
added). For interval scale variables such as age or the indexes for verb projection raising and
scrambling, a One-Way ANOVA is used. The level of statistical significance is given with its pre-
cise value. One asterisk * means that SPSS calculates the probability for a Type I-error between
1 % and 5 % (0.01 ≤ p < 0.05), two asterisks ** that the probability is smaller than 1 % (0 < p < 0.01),
and three asterisks *** that it is virtually 0 % (p = 0). We are aware of the fact that this value can
never be reached, but follow the indication of SPSS. One asterisk in brackets (*) indicates a statis-
tical tendency where the error margin lies between 5 % and 10 % (0.05 ≤ p < 0.1).
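The conventions of footnote 7 can be spelled out in a short sketch. The following Python fragment is merely illustrative and not the SPSS procedure itself: the function names are hypothetical, and it is assumed that SPSS reports p-values rounded to three decimals, so that "p = 0" stands for any value that rounds to 0.000. Cramér's V is computed with its standard formula from a given Chi-Square value.

```python
import math


def significance_marker(p, decimals=3):
    """Map a p-value to the asterisk notation of footnote 7.

    Assumption: a p-value that rounds to zero at `decimals` places
    corresponds to what is reported as p = 0 (three asterisks).
    """
    if round(p, decimals) == 0:
        return "***"   # reported as p = 0
    if p < 0.01:
        return "**"    # 0 < p < 0.01
    if p < 0.05:
        return "*"     # 0.01 <= p < 0.05
    if p < 0.1:
        return "(*)"   # statistical tendency, 0.05 <= p < 0.1
    return ""          # not significant


def cramers_v(chi_square, n_tokens, n_rows, n_columns):
    """Strength of association for an n_rows x n_columns table."""
    smaller_dimension = min(n_rows, n_columns) - 1
    return math.sqrt(chi_square / (n_tokens * smaller_dimension))


if __name__ == "__main__":
    for p in (0.00001, 0.004, 0.03, 0.07, 0.2):
        print(p, repr(significance_marker(p)))
```

Because Chi-Square grows with the number of tokens while Cramér's V divides by it, a distribution can be highly significant and still only weakly associated, which is the constellation reported for Tables 1 and 2.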
Table 2: Distribution of the two variants in dependent non-causal clauses with one verbal
element in all colonies separated according to the type of clause (part=particle)
The distribution in Table 2 is highly significant, but again the association strength
is weak. The extraposed complement clauses show a much higher share of the
marked variant (5.8 %; 41 instead of expected 22.9 tokens) than conditional and
relative clauses (2 % and 1.1 %, respectively; 13 and 5 instead of expected 21.1
and 15 tokens). This does not change if we only consider clauses with definite
ObjNPs, i. e. the significant concentrations of indefinite ObjNPs/PPs in complement
clauses and of ObjPPs in relative clauses skew the result to some degree, but
they do not change the general picture. As already mentioned, almost all tokens
of the marked non-verb-final variant are superficially verb-second and thus share
a central characteristic with main clauses. In view of this, one could assume an
iconic relationship between this surface shape and a low degree of embedding
since the surface shape may remind the speakers (and the hearers) of indepen-
dent verb-second main clauses. If this were the case, one could use the share of
superficial verb-second clauses in MLG as an indicator of the degree of embedding
of dependent clauses. We do not have the space to pursue this line of reasoning
further, but the analyses of the variation in dependent clauses with two verbal
elements also point in this direction.
stimulus <15> Portuguese: Se ele tiver que vender a casa agora, ele vai ficar muito triste
English: If he has to sell the house now, he will be very sorry
All four variants can be derived by three movements assuming that MLG is head-
final with regard to both VP and IP. Disregarding the higher clausal positions and
not including adverbs9 in the structural description in (13a–c2ii), the proposed
derivation works like this (moved elements are underlined):
8 The labels for the different variants follow tradition (NR = non-raising; VPR = verb projection
raising; VR = verb raising).
9 With regard to the precise position of the ObjNP, we do not differentiate between the two pos-
sible sequences of ObjNP and adverb in the V(P)R-variants (11) and (12). With regard to the VPR-
(13) a. basic structure: [CP … [IP … [VP1 [VP2 daut Hus verköpe] mut] ∅(3PS)]]
b. gaining finiteness: [CP … [IP … [VP1 [VP2 daut Hus verköpe] t_a] mut_a-∅]]
(head movement from V1⁰ to I⁰ resulting in the NR-variant I)
c1. verb projection raising: [CP … [IP [IP … [VP1 t_b t_a] mut_a-∅] [VP2 daut Hus verköpe]_b]]
(raising and adjunction of VP2 to IP resulting in the VPR-variant)
c2i. scrambling¹⁰: [CP … [IP daut Hus_c [IP … [VP1 [VP2 t_c verköpe] t_a] mut_a-∅]]]
(scrambling of ObjNP out of VP2 to IP resulting in the NR-variant II)
c2ii. verb projection raising: [CP … [IP [IP daut Hus_c [IP … [VP1 t_b t_a] mut_a-∅]] [VP2 t_c verköpe]_b]]
(raising and adjunction of remnant VP2 to IP resulting in the VR-variant)
This approach to verb clusters can be found in the older literature (cf. Den Besten
and Broekhuis (1989) as quoted in Haegeman 1994: 512¹¹), but except for Kaufmann
(2007) it does not meet with a lot of support nowadays. Its central claim is
that although on the surface it looks as if the VR-variant implied the movement
of less material than the VPR-variant (just moving the verbal head instead of the
complete verb phrase), it actually implies more movement, namely scrambling
plus verb projection raising (moving the complete verb phrase, not just the verbal
head). The consequence of the derivational assumptions put forward in (13a–c2ii)
is that verb clusters are nothing more than an epiphenomenon of two indepen-
dent syntactic mechanisms. Table 3 illustrates the four possible crossings of verb
projection raising and scrambling exemplified by tokens (9) through (12):
variant in (11), the sequence ObjNP-adverb is merely a case of short movement of the ObjNP not
leaving VP2; in the case of the VR-variant in (12), the ObjNP is moved outside the VP2 regardless
of its landing site before or after the adverb. Unfortunately, we cannot be sure whether the scram-
bled ObjNP in the sequence ObjNP-adverb in (10) has really left VP2 or whether the supposedly
non-scrambled ObjNP in the sequence adverb-ObjNP in (9) is still in VP2, but the normalizing
technique used for grouping the informants minimizes this problem (cf. the explanation in this
section).
10 With regard to our hypothesis it is immaterial whether scrambling occurs before or after verb
projection raising. However, as many theoretical considerations speak against movement out of
a moved constituent (cf. Wexler & Culicover 1980), we opt for the application of scrambling be-
fore raising. This means that verb projection raising in the formation of the VR-variant is a case
of remnant movement (cf. (13c2ii)).
11 The English translation of Den Besten and Broekhuis’ central argument is: “[…] VR is inter-
preted as the limiting case of VPR, an instantiation of VPR where all nonverbal material has been
scrambled out of the adjoined VP”.
Table 3: Cross tabulation of the example clauses (9) through (12) by means of verb projection
raising and scrambling
+scrambling    (10) wann hei sin Hüs nu verköpe soll […]    (12) wann hei daut Hüs nu mut verköpe […]
The reader may have fewer problems in accepting (11) and (12) as raised than in
accepting (10) and (12) as scrambled since the term scrambling is used in a rather
loose way to cover non-prototypical cases of argument movement (the prototypical
case being the re-ordering of arguments in the midfield of a German clause; cf.
Haider 2010: 152 (property (vii)) and 184–185). Both the position of the ObjNP in
front of an adverb in a clause with the sequence verb2-verb1 (cf. (10), a case Zwart
(1996: 230–231) accepts as scrambling, but Haider (2010: 157–158) dismisses out-
right) and the non-adjacency of the ObjNP and its governing verb in a clause with
the sequence verb1-verb2 in (12) are supposed to be the consequence of scrambling.
The reason for this assumption is the fact that informants who prefer serialization
patterns represented by (11) to those represented by (12) also prefer serialization
patterns represented by (9) to those represented by (10). And informants preferring
(10) over (9) also prefer (12) over (11). These correlations obviously do not prove
that scrambling is the reason for (10) and (12), but they represent good evidence
for the claim that the two phenomena are of the same nature. Moreover, in Sections
4.4.1 and 4.4.2, we will see that the position of ObjNPs/PPs in our tokens is indeed
sensitive to typical characteristics facilitating or barring scrambling.
Assuming that different serialization patterns in verb clusters are the con-
sequence of the (non-)application of verb projection raising and scrambling, the
313 informants were characterized according to their syntactic behavior in dependent
clauses with two verbal elements. In order to do this, we did not simply count the
informants’ tokens representing each of the variants in (9) through (12) (as done
in Kaufmann 2007). Instead, a normalized measure for the probability of verb
projection raising and scrambling was calculated for each of the ten clauses for
which enough good-quality translations were available. The informants’ actual
behavior was then compared to the expected behavior for each clause.12 With this
12 For verb projection raising 1,905 translations of nine stimulus sentences could be used (an
average of 6.1 tokens per informant); for scrambling 1,167 translations of ten stimulus sentences
(an average of 3.7 tokens per informant) (cf. the description of this methodology in Kaufmann
2011: 209–210).
method, it no longer mattered whether informants translated all ten clauses (most
did not) and which clauses they actually translated. Metrical index values for
both raising and scrambling could be allotted to 282 of the 313 informants. These
indexes served for the grouping of these informants into four types of speakers by
means of a cluster analysis.
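How such a normalized index might be computed can be sketched as follows. The exact procedure is described in Kaufmann (2011: 209–210) and is not reproduced here; the sketch assumes, purely for illustration, that the expected behavior for a clause is the share of all translations of that clause showing the movement, and that an informant's index is the mean deviation from this share across the clauses he or she translated. The data and all names are hypothetical.

```python
from collections import defaultdict


def normalized_index(tokens):
    """Return one index value per informant.

    tokens: iterable of (informant, clause, applied), where applied is
    1 if the movement (e.g. verb projection raising) occurs in the
    translation and 0 otherwise.
    """
    tokens = list(tokens)

    # Expected behavior per clause: share of tokens with the movement.
    per_clause = defaultdict(list)
    for _, clause, applied in tokens:
        per_clause[clause].append(applied)
    expected = {c: sum(v) / len(v) for c, v in per_clause.items()}

    # Index per informant: mean deviation from the clause expectation.
    deviations = defaultdict(list)
    for informant, clause, applied in tokens:
        deviations[informant].append(applied - expected[clause])
    return {i: sum(d) / len(d) for i, d in deviations.items()}


if __name__ == "__main__":
    # Invented toy data: three informants, three clauses, incomplete coverage.
    toy_tokens = [
        ("A", "c1", 1), ("B", "c1", 0),
        ("A", "c2", 1), ("C", "c2", 1),
        ("B", "c3", 0), ("C", "c3", 1),
    ]
    print(normalized_index(toy_tokens))
```

Because each token is compared only with the expectation for its own clause, informants who translated different subsets of the clauses remain comparable.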
We can now focus on the decisive question, namely how the four types of
informants behave with regard to the phenomenon in question. Table 4 furnishes
this information; the distribution is highly significant and this time it shows a
somewhat higher strength of association:
Table 4: Distribution of the two variants in dependent non-causal clauses with one verbal
element in all colonies separated according to the informants’ behavior in dependent non-
causal clauses with two verbal elements (vpr = verb projection raising; part = particle)
informants preferring the    NR-variant I    NR-variant II    VPR-variant    VR-variant     total
characteristics              -vpr            -vpr             +vpr           +vpr
                             -scrambling     +scrambling      -scrambling    +scrambling
is correct, the other two clusters should show intermediate performances, since
both coincide with the most productive VPR-informants in one of the two dimen-
sions. Table 4 confirms this hypothesis, showing intermediate shares of 2.7 % and
3.9 %.
The high concentration of the marked variant in one particular group of
informants precludes the possibility of accounting for this variant by means
of priming since priming should have a comparable effect on all informants.
Another conclusion can be drawn from the fact that the group of informants who
prefer the VPR-variant has a share of the marked variant almost four times as big
as the group of informants who prefer the VR-variant. If we probe a little bit more
into this difference, we can see how important it is to characterize the informants
according to their general syntactic behavior. The group that prefers the VPR-vari-
ant has fewer US-American informants than the group that prefers the VR-vari-
ant. Half of the informants of the latter group come from the US, the colony with
the strongest concentration of the marked variant (8.4 %; cf. Table 1). In spite of
this, the group’s share of the marked variant is only 3.9 %. Among the informants
who prefer the VPR-variant, however, this share is 15.5 % although only one third
of these informants come from the United States.
In spite of the telling distribution in Table 4, the statistical analysis can
still be refined. As we calculated a value for the informants’ probability for verb
projection raising and scrambling, we can check the average values for these
two dimensions for the Mennonites that produce the unmarked variant with the
sequence ObjNP/PP-verb and the ones that produce the marked variant verb-
ObjNP/PP. Besides being able to use parametric tests (the values can be consid-
ered quasi-intervals), we can also include more tokens, since some informants
have a value for one dimension, but not for the other one. These informants had
to be excluded in Table 4. In Table 5, their value for either raising or scrambling
can be included.
Table 5: Average values for verb projection raising and scrambling for the informants
who produced the two variants in dependent non-causal clauses with one verbal element
(obj = ObjNPs/PPs; part = particle)
                           total     unmarked variant    marked variant
n                          1788      1730                58
verb projection raising    +0.027    +0.015              +0.374
n                          1689      1631                58
scrambling                 +0.007    +0.015              -0.202
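The figures in Table 5 can be cross-checked with simple arithmetic. The sketch below assumes that the three columns of the table give, in this order, all tokens, the tokens of the unmarked variant, and the tokens of the marked variant; it recomputes the pooled averages, the differences between the two groups, and the within-group degrees of freedom of a one-way ANOVA with two groups (n − 2). The variable and function names are chosen for illustration only.

```python
def pooled_mean(n_a, mean_a, n_b, mean_b):
    """Weighted average of two group means."""
    return (n_a * mean_a + n_b * mean_b) / (n_a + n_b)


# (n unmarked, mean unmarked, n marked, mean marked), read off Table 5
table5 = {
    "verb projection raising": (1730, 0.015, 58, 0.374),
    "scrambling": (1631, 0.015, 58, -0.202),
}

if __name__ == "__main__":
    for index, (n_u, mean_u, n_m, mean_m) in table5.items():
        pooled = pooled_mean(n_u, mean_u, n_m, mean_m)
        difference = abs(mean_m - mean_u)
        df_within = n_u + n_m - 2
        print(f"{index}: pooled mean {pooled:+.4f}, "
              f"difference {difference:.3f}, df (1, {df_within})")
```

The pooled scrambling value comes out slightly above +0.007; the small deviation is compatible with the rounding of the group averages in the table.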
The highly significant difference between the raising values of the informants
producing the two variants is 0.359 (F (1,1786) = 72.8, p = 0***). With regard to
scrambling, the difference is also highly significant, but it is somewhat smaller
at 0.217 (F (1,1687) = 40.6, p = 0***). We can therefore conclude – now with even
more confidence – that a tendency towards raising VP2 in dependent clauses
with two verbal elements and towards not scrambling the ObjNP/PP out of VP2
promotes the occurrence of the non-verb-final variant in dependent clauses with
one verbal element. Informants who produce clauses like
(14) a. basic structure: [CP … [IP … [VP diesen Contrato unterschriew] t(3PS)]]
c1. verb projection raising: [CP … [IP [IP … t_b unterschrief_a-t] [VP diesen Contrato t_a]_b]]
(raising and adjunction of VP to IP)
13 In SHG, we find a comparable phenomenon: Although verb projection raising does not exist
with two verbal elements, let alone with one, it does exist with three or more verbal elements
when certain morphological conditions are met (e. g., the so-called IPP-effect). The consequence
is that even SHG has a partially right-branching, supposedly more parsing-friendly verbal
sequence, namely (ObjNP/PP-)verb1-(ObjNP/PP-)verb3-verb2.
stimulus <2> Spanish: Juan no cree que conozcas bien a tus amigos
English: John doesn’t think that you know your friends well
(15) [eh] Johann gleuf nich daut dü: gut kenns sine Frend (Mex-26; m/34/MLG)
[eh] John believes not that you good-ADVERB know-VERB his friends
If verb-second really was the reason for the marked sequence verb-ObjNP/PP, we
would assume that these six non-verb-second tokens were not produced by the
informants who prefer the VPR-variant. This, however, is not the case: Three of
the six tokens are produced by these informants, i. e. they partake in tokens like
(15) to the same extent as in all other tokens with the marked sequence (the other
3 tokens are distributed among the other 3 groups of informants).
The last point to discuss here concerns the informants that prefer the VR-
variant in dependent clauses with two verbal elements. If we continue applying
the derivations as in (13b–c2ii), we obtain (16b–c2ii) for clauses with one verbal
element. We start from the second step, i. e. from head movement from V0 to I0:
(16) b. gaining finiteness: [CP … [IP … [VP diesen Contrato t_a] unterschrief_a-t]]
(head movement from V⁰ to I⁰)
c2i. scrambling: [CP … [IP diesen Contrato_c [IP … [VP t_c t_a] unterschrief_a-t]]]
(scrambling of ObjNP out of VP to IP)
c2ii. verb projection raising: [CP … [IP [IP diesen Contrato_c [IP … t_b unterschrief_a-t]] [VP t_c t_a]_b]]
(raising and adjunction of a completely emptied VP to IP)
(17) a. basic structure: [CP … [IP … [VP [SC en nüet Kleid an] ha] ∅(1PS)]]
b. gaining finiteness: [CP … [IP … [VP [SC en nüet Kleid an] t_a] ha_a-∅]]
(head movement from V⁰ to I⁰)
c1. verb projection raising: [CP … [IP [IP … t_b ha_a-∅] [VP [SC en nüet Kleid an] t_a]_b]]
(raising and adjunction of VP to IP)
c2i. scrambling: [CP … [IP en nüet Kleid_c [IP … [VP [SC t_c an] t_a] ha_a-∅]]]
(scrambling of ObjNP out of the small clause and of VP to IP)
c2ii. verb projection raising: [CP … [IP [IP en nüet Kleid_c [IP … t_b ha_a-∅]] [VP [SC t_c an] t_a]_b]]
(raising and adjunction of an almost completely emptied VP to IP)
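The interplay of the two movements in (13), (16) and (17) can be mimicked by a toy linearization. The sketch below is an illustration under simplifying assumptions, not a syntactic analysis: it ignores the higher clausal positions and adverbs (as the derivations do), treats scrambling and verb projection raising as two independent switches, and only returns the surface string. The function name and its parameters are hypothetical.

```python
def linearize(obj, verbs, particle=None, scramble=False, vpr=False):
    """Return the surface order of object, particle and verbs.

    verbs: list ordered from the most deeply embedded verb to the
    finite verb, e.g. ["verköpe", "mut"] or just ["ha"].
    scramble: the object leaves the verb phrase and adjoins to IP.
    vpr: the (possibly emptied) verb phrase is adjoined to the right
    of the finite verb.
    """
    *nonfinite, finite = verbs

    # Overt material that stays inside the lower verb phrase.
    lower_vp = [] if scramble else [obj]
    if particle:
        lower_vp.append(particle)
    lower_vp.extend(nonfinite)

    scrambled = [obj] if scramble else []
    if vpr:
        words = scrambled + [finite] + lower_vp
    else:
        words = scrambled + lower_vp + [finite]
    return " ".join(words)


if __name__ == "__main__":
    for scramble in (False, True):
        for vpr in (False, True):
            print(scramble, vpr,
                  linearize("daut Hus", ["verköpe", "mut"],
                            scramble=scramble, vpr=vpr))
```

With both switches on, a particle verb yields the order of (17c2ii), en nüet Kleid ha an, whereas raising without scrambling yields the marked order in which the finite verb precedes its object.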
If informants who prefer the VR-variant apply the same derivational steps in
dependent clauses with a particle verb, we should find clauses like (17c2ii) […]
daut ik en nüet Kleid ha an, which we do not. There is only one translation of
stimulus sentence <33> which may constitute a possible exception to this:
(18) det‘s die Reis wo ik mine [0.4] Mutter friar mit (USA-40; m/36/MLG)
this-is the journey where I my […] mother ?lead?-VERB with-PARTICLE
‘This is the journey on which I am taking my mother.’
14 This kind of metathesis occurs frequently. Think, for example, of German fragen
(‘ask’) and forschen (‘investigate’), both connected to Old High German forsca (‘question’). The
same story can be told for the Latin cognate percontari which developed into Spanish preguntar,
while Portuguese maintained the original sequence perguntar.
15 Particle verbs only appeared in the translations of stimulus sentences <4> and <5> – two com-
plement clauses which rendered many tokens with the marked sequence. The informants who
prefer the VPR-variant produced 28.9 % of the marked variant in these two clauses (11 out of 38
tokens). As the marked variant is the fitting variant for these informants and the sequence of (18)
would be the fitting variant for informants who prefer the VR-variant, we may – if we assume a
comparable share of 28.9 % – expect the structure of (17c2ii) to yield 20.2 out of seventy tokens
by the VR-friendly informants in clauses <4> and <5>.
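The projection in footnote 15 is simple arithmetic and can be retraced as follows; the variable names are chosen for illustration. The figure of 20.2 appears to result from applying the rounded share of 28.9 % to the seventy tokens.

```python
# Marked tokens of the VPR-friendly informants in clauses <4> and <5>
marked_tokens = 11
all_tokens = 38
share = marked_tokens / all_tokens

# Tokens of the VR-friendly informants in the same two clauses
vr_tokens = 70

projected_from_rounded_share = round(share, 3) * vr_tokens
projected_from_exact_share = share * vr_tokens

if __name__ == "__main__":
    print(f"share of the marked variant: {share:.1%}")
    print(f"projection with the rounded share: {projected_from_rounded_share:.2f}")
    print(f"projection with the exact share: {projected_from_exact_share:.2f}")
```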
clauses with just one verbal element seems to be only possible provided the com-
plement is not scrambled.
In this section, we will present three distributional facts supporting the assump-
tion that the lack of scrambling is crucial in the generation of the marked variant.
In the first two parts, we will take advantage of the fact that there is some mor-
phological and semantic variation with regard to the complements in the trans-
lations. Section 4.4.1 will deal with the difference between ObjNPs and ObjPPs;
Section 4.4.2 explores the difference between definite and indefinite ObjNPs/
PPs. The last section will then extend the analysis to causal clauses in the South
American colonies to determine whether the informants behave in a comparable
way with regard to this type of clause.
If the lack of scrambling is the decisive factor for the generation of the marked
variant with one verbal element, this has an effect on the analysis of verb clusters,
too. In Section 4.2.3, it was shown that there is a clear relationship between the
distributional facts for dependent clauses with one verbal element and the distri-
butional facts for dependent clauses with two verbal elements. We can therefore
extrapolate that if the lack of scrambling is responsible for the marked variant
with one verbal element and if this variant is predominantly produced by inform-
ants who prefer the VPR-variant, then the VPR-variant itself should be the con-
sequence of a lack of scrambling. We thus assume that the application of syn-
tactic preferences can differ quantitatively (cf. the first point in Section 4.3.2), but
not qualitatively: Once a scrambler, always a scrambler; once a non-scrambler,
always a non-scrambler. The converse argument also applies: Informants who
prefer the VR-variant produce the marked non-scrambled variant with one verbal
element far less frequently than informants who prefer the VPR-variant. This dif-
ference does not come as a surprise because we have shown that the VR-variant
is the consequence of scrambling of the complement out of the verb phrase (i. e.
VP2) (cf. Table 3).
than ObjNPs. One may see an indirect indication for less scrambling in the fact
that Haider (2010: 140) writes that “[p]repositional objects are the lowest ranking
objects”, i. e. they constitute the most deeply embedded argument in the verb
phrase. It is difficult, however, to say whether more deeply embedded objects
scramble less than less deeply embedded ones. Schmitz’ (2006: 44) comment is
more telling in this respect; she notes that the scrambling of prepositional objects
is possible in German, but that its acceptability is lower than the acceptability of
scrambled non-prepositional objects. This is also the situation in MLG, not only
with regard to acceptability but also with regard to production. In the MLG clauses
analyzed here, scrambling ObjPPs is much less frequent than scrambling ObjNPs
(a fact which we cannot elaborate on here due to lack of space). Therefore, we can
formulate the following expectation with regard to this variable: If the marked
variant with one verbal element is indeed the consequence of a lack of scram-
bling, we expect this variant to occur more often with ObjPPs than with ObjNPs
since ObjPPs do not scramble as easily as ObjNPs in MLG. There are two clauses
which can be used in order to check this expectation, since these clauses show
enough variation with regard to the marking of the indirect object and with regard
to its position. They are <32> The stories that he’s telling the men are very sad and
<37> I have found the book that I have given to the children. We have already seen
examples for <32> (cf. (6a–b)), so here we will give examples for <37>:
(19) a. ich ha det Bük gefunge waut ik de Kinder gev (Men-11; f/18/MLG)
I have the book found that I the children give-VERB
c. ik hat de Bük gefunge waut ik ge- [0.4] waut ik gov to de Kinder (Men-3; f/38/MLG)
I had the.REDUCED book found that I gi- […] that I gave-VERB to the children
We cannot add an example with an ObjNP for the marked variant since there is
not a single token for the sequence verb-ObjNP. Translation (19c) is especially
interesting because of the repair contained in it. One might have expected the
informant to restart in order to put the ObjPP in the expected preverbal position,
but she actually restarts in order to correct the tense, i. e. past tense instead of
present tense. Other than this, she sticks firmly to her serialization. Table 6 offers
the distributional facts for clauses <32> and <37> (only definite ObjNPs/PPs):
Table 6: Distribution of the two variants in two relative clauses with one verbal element
separated according to the morphological type of object (only definite ObjNPs/PPs)
stimulus <32> Spanish: Las historias que les está contando a los hombres son muy tristes
English: The stories that he is telling the men are very sad
(20) a. die: Geschichte waut hei nem Mann vertahlt is sehr trürig (Men-39; f/36/MLG)
the story that he a.REDUCED man tells-VERB is very sad
b. die Geschichten waut hei [0.5] vertahlt an em- an ne Männer16 sin sehr trürig
(Mex-93; f/39/MLG)
the stories that he […] tells-VERB to a- to such.REDUCED men are very sad
stimulus <38> Spanish: El hombre que provocó el accidente desapareció
English: The man who caused the accident has disappeared
(21) a. De Omtje waut da en accident hat [0.8] der is furtgekummen (Mex-51; m/22/MLG)
the man that there an.REDUCED accident has-VERB […] he is away-gone
b. der Omtje det hat einen accident [0.5] is wajch (USA-17; f/14/Engl)
the man that-RELATIVE PARTICLE has-VERB an accident […] is away
Table 7 presents the distributional facts. The tokens are split between ObjNPs and
ObjPPs (cf. Section 4.4.1). For the ObjNPs a further separation had to be intro-
duced since there is a highly significant difference in the share of complement
clauses promoting the marked variant (cf. Section 4.2.2). Such a difference does
not exist with regard to the ObjPPs.
16 Informant Mex-93 starts out with an em (‘to a’), which she then repairs into an ne Männer
(‘to such men’). As (20b) is the only token where an indefinite ObjPP surfaces after the verb, it
is important that this complement is indeed indefinite. For an em, the categorization of em as
a reduced form of the indefinite article is unproblematic since cliticization of definite articles
is not present in the Mennonite data set. The semantically singular ne in an ne Männer is more
problematic: Ne seems to be a reduced form of the complex plural determiner sone (‘such’, a
portmanteau of solch eine ‘such a’, cf. Duden 2006: 330–331), which does occur several times
as a full form in the data set. In spite of the partially “definite” quality of solch in sone, the
characterization of the entire ObjPP as indefinite is justified – firstly because the more important
first attempt an em contains a clearly indefinite article and secondly because it is precisely the
“definite” part solch which is missing in an ne Männer.
Table 7: Distribution of the two variants in five dependent non-causal clauses with one verbal
element in all colonies separated according to the definiteness of the ObjNP/PP and according
to the type of clause (obj=ObjNPs/PPs; part=particle)
ObjNPs                                                                ObjPPs
complement clauses <1> and <4>    relative clauses <32> and <38>     complement clause <5> and relative clause <32>
+definite    -definite            +definite    -definite             +definite    -definite
We cannot present an example of the marked variant for clause <3> because there
is not a single one. Conversely, clause <4> is among the sentences with the highest
share of the marked variant. Table 8 gives the distributional facts for the two
clauses. The distribution is highly significant; as expected, only the indefinite
ObjNPs are found in the marked variant:
Table 8: Distribution of the two variants in two dependent complement clauses with one verbal
element in all colonies separated according to the definiteness of the ObjNP (stimulus sentence
<3> with definite ObjNPs; stimulus sentence <4> with indefinite ObjNPs; obj = ObjNPs;
part = particle)
In Section 4.3.2, we have given a parsing-related explanation for the generally low
number of raised VPs in dependent non-causal clauses with one verbal element.
In this section and in Section 4.4.1, we have shown that morphological and
semantic characteristics of the complements in these clauses – characteristics
known for their suppressing effect on scrambling – clearly promote the occur-
rence of the marked and raised variant. We therefore conclude (i) that the lack
of scrambling causes the rare and highly marked sequence verb-ObjNP/PP and
(ii) that Den Besten and Broekhuis (1989) may be right after all when claiming
that the VR-variant (but not the VPR-variant) is the consequence of scrambling,
at least with regard to MLG (cf. their quote in footnote 11).
Importantly, the analysis of dependent clauses with two verbal elements con-
firms our hypothesis. Clauses with indefinite ObjNPs/PPs co-occur significantly
more frequently with the VPR-variant (no scrambling of the ObjNP/PP) than
clauses with definite ObjNPs/PPs. These latter clauses appear significantly more
frequently with the VR-variant (scrambling of ObjNP/PP). Checking the difference
between definite ObjNPs and definite ObjPPs in dependent clauses with two
verbal elements also shows the expected distribution, i. e. a significant affinity of
ObjPPs with the VPR-variant and of ObjNPs with the VR-variant. Obviously, one
question which has to be answered is what causes the movement we have labeled
scrambling. For verb projection raising, the avoidance of parsing-difficult left-
branching structures was seen as cause (cf. Section 4.3.2). For scrambling, which
is often seen as an optional, pragmatically motivated movement, the answer is
less clear. One must not forget that all analyses in this article are based on the
translations of context-free sentences where pragmatic considerations are of secondary
importance at best. Besides this, the co-occurrence of either two scrambled
or two unscrambled variants in clauses with one and two verbal elements
does not fit well with the nature of an optional movement, but suggests structural
causes such as feature checking. We do not, however, have an answer
to this question yet and leave it to further research.
Table 9: Distribution of the two variants in causal clauses with one verbal element in all
colonies separated according to the informants’ origin (obj=ObjNPs; part=particle)
If almost three quarters of all tokens in Mexico (and even more in the United
States) show the non-verb-final pattern, reanalysis seems to be the only pos-
sible explanation (cf. Kaufmann 2003: 188–189). We therefore have to reduce
the scope of the following analysis to the South American colonies. Even there,
however, the lowest share of the marked variant (Menno with 8 %) is comparable
with the highest share for the other types of clauses (the US-American Mennon-
ites with 8.4 %; in that category, Menno has a share of 0.7 %; cf. Table 1). If our
hypothesis with regard to the iconic relationship between V2-clauses and more
syntactic independence is correct (cf. Section 4.2.2), this relatively high share of
superficial V2-causal clauses correctly indicates that extraposed causal clauses
are indeed more independent and thus less deeply embedded than extraposed
complement clauses, let alone (non-extraposed) relative and conditional clauses.
A possible consequence of such a constellation is reanalysis: the higher the share
of V2-clauses, the higher the probability that superficial V2 turns into structural
V2. This has apparently happened in the MLG of most informants in Mexico and
the USA. In the South American colonies, the process has not yet reached the
point of no return, but even in the SHG-competent Paraguayan colonies, causal
clauses have to be analyzed independently. In these colonies, raising is frequent
in clusters with three verbal elements (cf. Section 4.3.2), but rare in clusters with
two verbal elements and virtually absent in dependent non-causal clauses with
one verbal element (cf. Table 1). Causal clauses in Paraguay, however, are gener-
ated much more frequently with the marked sequence verb-ObjNP and with the
VPR-variant in clusters with two verbal elements. Two examples are given in (23)
and (24). The examples in (23) feature a particle verb, the ones in (24) a simple
verb:
stimulus <23> Spanish: No te puede escuchar porque está sacando las cosas de la maleta
English: He can’t listen to you because he is unpacking his luggage
(23) a. dei kaun di nich hiere wiels dei riemt grad die Koffer üt (Fern-34; m/25/SHG)
he can you not hear because he packs-VERB just the suitcases out-PARTICLE
b. hei kaun [0.7] di nich hiere wiels hei sinen Koffer ütpackt
(Fern-11; m/44/SHG)
he can [...] you not hear because he his suitcase out-PARTICLE-packs-VERB
stimulus <24> Spanish: No está aquí porque está ayudando a tu padre
English: He is not here because he is helping your father out
(24) a. her is nicht hier wejens hei helpt sin [0.3] Voda (Men-24; m/25/MLG)
he is not here because he helps-VERB his […] father
b. her is nich hier wejens hei dinen Voda halp (Men-15; f/20/MLG)
he is not here because he your father helps-VERB
Table 10 shows the distribution of South American causal clauses with regard to
the different types of informants.
Table 10: Distribution of the two variants in dependent causal clauses with one verbal element
in the South American colonies separated according to the informants’ behavior in dependent
non-causal clauses with two verbal elements (vpr = verb projection raising; obj = ObjNPs;
part = particle)
The distribution shows a highly significant difference and the same concentra-
tions we have found in complement, relative and conditional clauses (cf. Table 4).
Informants who prefer the VPR-variant in dependent clauses with two verbal ele-
ments produce twelve of the 58 tokens of the marked variant (20.7 %; 3.8 expected
tokens) although they only have a share of 6.6 % of all tokens (27 out of 411). The
other extreme is again found among the informants who prefer the NR-variant
II. They produce 25 instead of 36.4 expected tokens of the marked variant (43.1 %
of the 58 tokens) although they are responsible for 62.8 % of all tokens (258 out
of 411). The other two types of informants again show intermediate shares of the
marked variant. The results for the quasi-interval index values for the informants’
raising and scrambling behavior are given in Table 11:
Table 11: Average values for verb projection raising and scrambling for the informants who
produce the two variants in dependent causal clauses with one verbal element in the South
American colonies
                          total     unmarked variant   marked variant
n                         454       392                62
verb projection raising   -0.157    -0.174             -0.053
n                         414       354                60
scrambling                -0.028    -0.013             -0.117
The results are highly significant for both raising (F (1,452) = 18.4, p = 0***) and
scrambling (F (1,412) = 7.7, p = 0.006**). This means that the raising-friendly and
scrambling-lazy informants again produce more tokens of the variant in question
than any other combination of these two dimensions. This adds one more piece of
independent evidence because (i) we have now analyzed causal clauses and not
complement, relative or conditional clauses and (ii) we have now analyzed the
data of less than half of all Mennonite informants. Only 143 of the 313 informants
come from the four South American colonies. In spite of these differences with
regard to origin and type of clause, the distributional facts are identical. To put
it more precisely: The behavior of the linguistically more conservative, i. e. more
SHG-like South American informants in less embedded causal clauses is exactly
like the behavior of the linguistically more progressive, i. e. less SHG-like North
American informants in more embedded complement, relative and conditional
clauses.
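The expected token counts cited in the discussion of Table 10 follow from a simple proportionality assumption: if the marked variant were spread evenly over all tokens, each type of informant would produce it in proportion to its share of the 411 tokens. The following minimal Python sketch reproduces that arithmetic from the figures given in the text; the group labels and function names are illustrative assumptions, not the author's terminology or procedure.

```python
# Token counts as reported in the text for the South American causal clauses.
TOTAL_TOKENS = 411   # all causal-clause tokens with one verbal element
MARKED_TOKENS = 58   # tokens of the marked variant

# Per informant type: (all tokens, marked tokens), as cited in the text.
# The dictionary keys are illustrative labels (an assumption of this sketch).
GROUPS = {
    "prefers VPR-variant": (27, 12),
    "prefers NR-variant II": (258, 25),
}


def expected_marked(group_tokens, total=TOTAL_TOKENS, marked=MARKED_TOKENS):
    """Marked tokens expected if the variant were spread proportionally."""
    return marked * group_tokens / total


def share(part, whole):
    """Percentage share of `part` in `whole`."""
    return 100 * part / whole


for label, (tokens, marked) in GROUPS.items():
    print(
        f"{label}: observed {marked}, "
        f"expected {expected_marked(tokens):.1f}, "
        f"share of marked tokens {share(marked, MARKED_TOKENS):.1f} %, "
        f"share of all tokens {share(tokens, TOTAL_TOKENS):.1f} %"
    )
```

Rounded to one decimal, the sketch yields 3.8 and 36.4 expected tokens, which matches the values reported above; the significance test itself is not reproduced here because the counts for the two intermediate types of informants are not given in the running text.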
5 Concluding remarks
We have accrued massive empirical backing for a rather old hypothesis put
forward by Den Besten and Broekhuis who claimed in 1989 that the VR-variant
in clusters with two verbal elements is the consequence of verb projection raising
and scrambling (cf. footnote 11). This assumption implies that the VPR-variant
is the consequence of raising without scrambling. This implication is supported
by our data at least with regard to MLG, a variety whose speakers are confronted
with the VPR-variant and the VR-variant all the time. This constant exposure to
both variants allows Mennonites to “consider” scrambling the distinguishing
factor between them. It does not necessarily imply, however, that the VR-variant
in a language like Standard Dutch is explainable in the same way (cf. Kaufmann
2007: 202), since Dutch speakers are not exposed to the VPR-variant to the same
degree as the Mennonite informants are. The findings of this article can be sum-
marized in the following way:
(i) Mennonite informants who prefer the VPR-variant in dependent clauses with two verbal
elements tend to produce the marked sequence verb-ObjNP/PP.
(ii) We have found much evidence for the assumption that this marked sequence is the con-
sequence of raising without scrambling.
(iii) As these informants prefer the VPR-variant and as syntactic preferences most probably do
not depend on the number of verbal elements, we assume that the VPR-variant in MLG is
also the consequence of raising without scrambling.
(iv) Mennonite informants who prefer the VR-variant in dependent clauses with two verbal
elements produce the marked sequence verb-ObjNP/PP to a much lower extent.
(v) As these informants share their proneness for raising with the informants who prefer the
VPR-variant (their clusters are only differentiated by the position of the ObjNP/PP, not by
the sequence verb1-verb2), the significant difference in the share of the marked variant must
be caused by a differing scrambling behavior.
(vi) Thus we conclude that the VR-variant in MLG is the consequence of raising plus scrambling.
Combining the analysis of the marked variant with the analysis of the more
frequent variation in MLG verb clusters with two verbal elements follows Rijkhoff’s
(2010: 223) wish according to which “[r]are linguistic features should play in [sic]
important role in grammatical theory, if only because a theory that can account
for both common and unusual grammatical phenomena is superior to a theory
that can only handle common linguistic properties”. Our analysis does exactly
this: it explains a common and a rare phenomenon by means of the same theoretical
assumptions.
The last point to discuss is whether the variation found in MLG is better
explained by a more system-based or a more usage-based approach to language
variation. In our view, the occurrence of the rare variant is the consequence of the
infrequent and innovative17 overgeneralization of a system-based preference with
regard to two syntactic movements, namely verb projection raising and scram-
bling. If the number of the marked ObjNP/PP-final variant rises in the future,
an effect on the formation of new linguistic systems may follow. One possible
consequence could be a change from OV to VO as in Old English. It is not certain,
however, whether we will need a usage-based approach in order to explain such
a frequency effect. It may be enough to fall back on Lightfoot’s transparency principle
in order to account for such a reanalysis. After all, Lightfoot (1999: 156) connects
reanalysis with quantification.
17 The marked variant can be called innovative because the Brazilian informants who use it
are on average 8.2 years younger than the ones producing the unmarked variant (F (1,350) =
3.3, p = 0.069(*)). In Mexico, the difference is 7.7 years (F (1,582) = 6.4, p = 0.012*); in the USA, the
difference of 3.4 years is smaller and not significant.
In light of these results, we judge the combination of modern syntactic
theory and variation linguistics to be a rather fruitful one. Newmeyer (2005: 160)
might be right when he says: “But it is a long way from there [importance of statistical
data from corpora] to the conclusion that corpus-derived statistical information
is relevant to the nature of the grammar of any individual speaker”. In
spite of this, we should not discount the possibility that although “corpus-derived
statistical information is [perhaps not] relevant to the nature of the grammar”, it
may be decisive for detecting “the nature of grammar”. Be this as it may, we do
hope to have come somewhat closer to a state of the art which Haider (2007: 389)
rightly demands for modern linguistics:
References
Auer, Peter (1998): Zwischen Parataxe und Hypotaxe: ‚abhängige Hauptsätze’ im gesprochenen
und geschriebenen Deutsch. Zeitschrift für Germanistische Linguistik 26: 284–307.
Barbiers, Sjef (2000): The right periphery in SOV languages: English and Dutch. In: Peter
Svenonius (ed.), The Derivation of VO and OV, 181–218. Amsterdam/Philadelphia: John
Benjamins.
Bennis, Hans (1992): Long Head Movement: The position of particles in the verbal cluster
in Dutch. In: Reineke Bok-Bennema and Roeland van Hout (eds.), Linguistics in the
Netherlands 1992, 37–47. Amsterdam/Philadelphia: John Benjamins.
Besten, Hans den and Hans Broekhuis (1989): Woordvolgorde in de werkwoordelijke eindreeks.
GLOT 12: 79–137.
Duden (2006): Die Grammatik: Unentbehrlich für richtiges Deutsch. Mannheim: Dudenverlag.
Eisenberg, Peter (in collaboration with Rolf Thieroff) (2013): Grundriss der deutschen
Grammatik – Volume 2: Der Satz. Stuttgart/Weimar: Metzler.
Haegeman, Liliane (1994): Verb raising as verb projection raising: some empirical problems.
Linguistic Inquiry 25/3: 509–522.
Haider, Hubert (2007): As a matter of facts – comments on Featherston’s sticks and carrots.
Theoretical Linguistics 33/3: 381–394.
Haider, Hubert (2010): The Syntax of German. Cambridge: Cambridge University Press.
Kaufmann, Göz (2003): The verb cluster in Mennonite Low German. In: Klaus J. Mattheier
and William Keel (eds.), German Language Varieties Worldwide: Internal and External
Perspectives, 177–198. Frankfurt a. M.: Peter Lang.
Kaufmann, Göz (2007): The verb cluster in Mennonite Low German: A new approach to an old
topic. Linguistische Berichte 210: 147–207.
Kaufmann, Göz (2011): Looking for order in chaos: Standard convergence and divergence in
Mennonite Low German. In: Mike Putnam (ed.), Studies on German-Language Islands,
187–230. Amsterdam/Philadelphia: John Benjamins.
Keller, Rudi (1993): Das epistemische weil. Bedeutungswandel einer Konjunktion. In: Hans
Jürgen Heringer and Georg Stötzel (eds.), Sprachgeschichte und Sprachkritik. Festschrift
für Peter von Polenz zum 65. Geburtstag, 219–247. Berlin/New York: de Gruyter.
Larrew, Olha (2005): Norm, Normen, Normabweichungen: Eine historische und empirische
Untersuchung zur wissenschaftlichen Bewertung morphosyntaktischer Konstruktionen im
Deutschen. Hamburg: Dr. Kovač.
Lightfoot, David (1999): The Development of Language: Acquisition, Change, and Evolution.
Oxford/Malden, MA: Blackwell Publishers.
Newmeyer, Frederick J. (2005): Possible and Probable Languages: A Generative Perspective on
Linguistic Typology. Oxford: Oxford University Press.
Rijkhoff, Jan (2010): Rara and grammatical theory. In: Jan Wohlgemuth and Michael Cysouw
(eds.), Rethinking Universals: How Rarities Affect Linguistic Theory, 223–239. Berlin/New
York: Mouton de Gruyter.
Schmitz, Katrin (2006): Zweisprachigkeit im Fokus: Der Erwerb der Verben mit zwei Objekten
durch bilingual deutsch-französisch und deutsch-italienisch aufwachsende Kinder.
Tübingen: Narr.
Wexler, Kenneth and Peter Culicover (1980): Formal Principles of Language Acquisition.
Cambridge, MA: MIT Press.
Zwart, Jan-Wouter (1996): Verb Clusters in Continental West Germanic Dialects. In: James
R. Black and Virginia Motapanyane (eds.), Microparametric Syntax and Dialect Variation,
229–258. Amsterdam/Philadelphia: John Benjamins.
Leonie Cornips, Meertens Instituut/Maastricht University
The no man’s land between syntax and
variationist sociolinguistics: The case of
idiolectal variability1
Abstract: The aim of this paper is to focus on the so-called no man’s land where
sociolinguistics and grammatical theory interact. It is argued that E-language
as a social and I-language as a psychological construct do not exist indepen-
dently, but influence each other. In other words, syntactic variation and change
are driven by social factors but constrained by the nature of possible grammars.
The interaction between the social meanings of linguistic forms on the one hand
and grammar on the other brings about complex and multi-layered relationships
between the individual and the group’s or societal grammar. This paper empha-
sizes how individuals are restricted by grammar but, at the same time, able to
overcome these restrictions in specific situated contexts through interactions.
This combined approach enables us to predict why some structures are more
resistant or vulnerable to syntactic variation and change than others and the
route(s) individuals may take to overcome syntactic “restrictions”. In this process
of interdependent relations between the I- and E-languages, the interpretation
and evaluation of linguistic forms through interaction is of crucial importance in
the realization of so-called “impossible” or “unrealized” constructions.
1 Introduction
According to the editors of this volume – Aria Adli, Marco García García and Göz
Kaufmann – the system-usage issue remains controversial in modern linguistics,
and a single axis with the endpoints “system-based” and “usage-based” does
not do justice to the complexity of linguistic reality. In this paper, I will try to go
beyond the system-usage dichotomy perspective by viewing the phenomenon of
syntactic variation as a crossroads where sociolinguistics and generative grammar
meet. Syntactic variation could be considered a multilayered phenomenon that
is the result of cognitive capacities (cf. Kayne 1994) and is strongly influenced by
social or linguistic practices (cf. Labov 1972, 1994; but more specifically Eckert
2000, 2008; Silverstein 1985). The attempt to let these two approaches interact
with each other (Adger 2006; Adger and Trousdale 2007; Adger and Smith 2010;
Cornips and Corrigan 2005a, 2005b; Wilson and Henry 1998, among many others)
should give us insight into how syntactic variation is driven by social factors but
constrained by the nature of possible grammars (cf. Wilson and Henry 1998: 82).
The aim of this paper is to emphasize that there are two crucial issues regard-
ing linguistic complexity that must be tackled within the grammar and usage
debate: the issue of idiolectal or intraspeaker variation, and the complex and mul-
tilayered relationships between the individual and the society that bring about
mutual effects on both the individual and the group grammar. In my opinion it is
at the level of individual speakers where we can best examine the locus and limits
of syntactic variation. It is here that we encounter the largest possible variation
space and its boundaries for syntactic variation and that we find answers to the
questions of whether, why and how speakers cross these boundaries.
I will present four case studies that will deepen our insight into why and
how speakers in situated contexts cross syntactic boundaries, showing complex
and multilayered relationships between the individual and the group grammar.
These case studies will show that speakers do not only differ from each other but
that they show intraindividual variation as well. Speakers select and incessantly
combine linguistic forms producing multilayered clusters of linguistic elements
in social and geographical space, but this only comes about because speakers (re)
interpret these forms continuously (Eckert 2008: 463). Thus, linguistic forms are
always available for reinterpretation and carry various social meanings through
discourse. However, the process of reinterpretation and, hence, of giving social
meaning to linguistic forms does not entail that speakers can select and combine
linguistic forms randomly or arbitrarily. It does imply, though, that ungrammatical
or unacceptable constructions may become acceptable and realized in the
process of interaction since speakers make meaning out of producing and interpreting
linguistic forms together.
The first case study, which focuses on word order alternations in a particular
three-verb cluster in Dutch dialects, shows that speakers prefer more than one
order in this type of cluster. However, the specific combinations of the possible
word orders are not distributed randomly – that is, there are combinations that
are categorically absent while others are distributed but restricted in geographi-
cal space. The second case study presents variation between overt and null forms
of the determiner. It reveals to what extent the individual speaker may accom-
modate and identify with his/her interlocutor resulting in intraindividual vari-
ation without any noticeable effort. The third case study deals with contemporary
urban vernaculars and is particularly interesting since syntactic restrictions are
loosened in these varieties. As a result, new types of constructions emerge as an
integral part of grammar even if the phenomenon in question does not seem to
allow for variation, as in the case of verb-second word order in Continental West
Germanic varieties. Youngsters in large cities may override verb-second word
order in situated contexts, but again this variation is not random since the syntac-
tic route to overcoming it is the same for every speaker. Finally, the case study on
the overuse of common gender determiners in Dutch shows that language is not
a neutral medium for communication, but rather a medium through which social
acts are accomplished. A speaker using common definite determiners instead of
the required standard neuter one explicitly argues that he cannot use the stand-
ard form because it would make a dumb impression on his peers.
The paper is organized as follows: The second and third sections present a
brief overview of the history of the generativist and sociolinguistic approaches
to variation and tackle the question of why they diverged and why they seem to
be converging now. These sections are important since generativist and sociolinguistic
theories have important interfaces explaining why not all linguistic
forms vary to the same degree (grammar component), and why speakers may
cross ‘boundaries’ regarding so-called ungrammatical or unacceptable construc-
tions, resulting in language variation and change (sociolinguistic component).
The fourth section elaborates on this interplay between I- and E-languages. The
fifth section presents the four case studies illustrating the intriguing interplay
between linguistic and social organization.
Cornips and Corrigan (2005a, 2005b) were of course not the first to point out
that researchers who espouse the frameworks encapsulated by the umbrella
terms “grammar” as in generative theory and “usage” as in quantitative or vari-
ationist sociolinguistics diverge quite rigidly in terms of both their methodo-
logical approaches and their theoretical persuasions. Although certain formal
resonances between the paradigms have endured since the early days of their
inception, the fundamental differences between them created a schism that has
persisted through most of the later twentieth century (see references in Cornips
and Corrigan 2005a, 2005b).
In the 1960s both sociolinguistic and formal syntax models contained formal
rules that could be applied obligatorily or optionally.2 Formal rules in the earliest
Chomskian model were transformations that connected “deep” structures with
“surface” structures on the basis of rewrite rules. Optional rules, for example,
derived passive, negative or question sentences from declarative sentences.
Labov introduced the concept of the variable rule as an extension of this optional
rule to include social and stylistic dimensions of language use along with linguis-
tic dimensions. However, both paradigms soon followed their own avenues, i. e.
the successive transformational models assume the existence of categorical rules
only, while variationist sociolinguistics has maintained the notion of the optional
rule. The two perspectives on the nature of formal rules reflect deep-seated differ-
ences between the two models. Variationists claim that the output of a linguistic
rule can be probabilistic rather than discrete and that a linguistic constraint can
have a quantitative rather than deterministic effect on the outcome of the process.
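The contrast between a categorical rule and a variable rule can be made concrete with a small sketch. One common formalization in the variationist tradition combines an overall input probability with factor weights on the log-odds scale (a logistic model); the text above does not commit to this particular formula, so the function below is an assumption made for illustration, and all probabilities and weights used with it are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


def categorical_rule(context_matches):
    """A categorical rule applies always or never: the output is discrete."""
    return 1.0 if context_matches else 0.0


def variable_rule(input_prob, factor_weights):
    """Probability that a variable rule applies in a given context.

    Assumed logistic formulation: the input probability and each factor
    weight (all between 0 and 1) are added on the log-odds scale. A weight
    above 0.5 favors application, a weight below 0.5 disfavors it.
    """
    log_odds = logit(input_prob) + sum(logit(w) for w in factor_weights)
    return inv_logit(log_odds)


# Hypothetical values: an input probability of 0.2, one linguistic
# constraint (weight 0.7) and one social or stylistic constraint (weight 0.6).
p = variable_rule(0.2, [0.7, 0.6])
print(f"probability of rule application: {p:.3f}")
```

In this reading, a constraint shifts the likelihood of an outcome without determining it, which is the sense in which the effect is quantitative rather than deterministic.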
However, Labov based his research on the generative linguistic model by
putting forward the variable rule as a means of accommodating interspeaker and
intraspeaker variation. The variationist sociolinguistic practice that has evolved
from studies of language variation and change since then assumes the principle
of accountability as a given (cf. Sankoff 1990: 296). This principle states that vari-
ants belonging to the same syntactic (linguistic) variable must be specified by the
total number of occurrences and the potential occurrences or non-occurrences
in the variable environment, i. e. the share of each variant ranges between 0 % and 100 % (cf. references
in Cornips and Corrigan 2005b). This guarantees that the entire range of variable
and categorical occurrences present in the data are taken into account. The notion
of the syntactic variable as a structural unit and the question of which variants
belong to this unit were based on the earliest generative assumption that vari-
ants have an identical underlying structure or representation which is subject to
variable surface realizations (Winford 1996: 177). The alternation between active
and passive constructions is an example of two different surface manifestations
of the same underlying or deep structure – that is, two variants belonging to the
same sociolinguistic variable (variable rule). The definition of the syntactic vari-
able as a structural unit inevitably follows the synonymy principle. This principle
is the prerequisite for variants to be assigned to the same linguistic variable; in
other words, only syntactic variants that are equivalent with regard to referential
meaning, i. e. variants that are “alternate ways of saying the ‘same’ thing” (Labov
1972: 118) belong to the same variable. However, the assignment of meaning or
function of syntactic variants was considered problematic soon after its introduc-
tion (Lavandera 1978). Moreover, it has been suggested that some types of mor-
phosyntactic variation rarely serve to differentiate social groups because of their
dependence on pragmatic and semantic conditioning (cf. references in Cornips
and Corrigan 2005b).
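The principle of accountability described above amounts to a simple bookkeeping requirement: every context in which the variable could occur is counted, and each variant is reported as a share of that total. A minimal Python sketch of this calculation follows; the function name and the token list are hypothetical and serve only to illustrate the counting logic, using the active/passive alternation mentioned in the text as labels.

```python
from collections import Counter


def variant_rates(tokens):
    """Share of each variant among all tokens of one linguistic variable.

    `tokens` lists every occurrence of the variable environment, labeled
    with the variant that was realized there, so that non-occurrences of
    one variant are counted as occurrences of another.
    """
    if not tokens:
        raise ValueError("at least one token of the variable is required")
    counts = Counter(tokens)
    total = sum(counts.values())
    return {variant: n / total for variant, n in counts.items()}


# Hypothetical data: ten contexts in which either variant could occur.
observed = ["active"] * 7 + ["passive"] * 3
for variant, rate in variant_rates(observed).items():
    print(f"{variant}: {100 * rate:.0f} %")
```

Because all potential occurrences enter the denominator, the shares of the variants necessarily sum to 100 %, and a categorical pattern simply appears as a share of 0 % or 100 %.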
In the meantime, in subsequent generative models, the idea of a derivational
model was abandoned in favor of a configurational model (most recently Mini-
3 Needed: Reconciling approaches to account for linguistic complexity
The next two sections form the backbone of this paper. Here, it is argued that
an ideal speaker-hearer provides us with a decontextualized view on syntactic
variation and change, while a speaker-hearer relationship is intrinsically social
by nature. We need both approaches to account for the fact that intra- and inter-
individual variation do not constitute a rare phenomenon but occur in normal,
daily situations. A combined approach is also needed in order to account for the
fact that syntactic variation and change happens all the time. A first task for a
combined approach of sociolinguistics and theoretical syntax to tackle linguistic
complexity would be to find out: (i) which grammatical considerations are rel-
evant to the definition of syntactic variables; (ii) whether grammatical theory can
predict the differential vulnerability of diverse linguistic forms for variation and
change; and (iii) how specific situated contexts influence individual grammars
(Eckert 2000; Eckert and McConnell-Ginet 1992, 1999; Meyerhoff 2002).
Of course, with respect to theoretical syntax and sociolinguistics, Wilson and
Henry (1998: 2) already noted that “there have been few real attempts to marry
these seemingly divergent positions” and Meechan and Foley (1994: 63) likewise
suggest that the two fields “rarely, if ever, cross paths”. However, since the nineties
and especially after the turn of the millennium, sincere attempts have been
undertaken to integrate grammar and usage.3 Wilson and Henry (1998: 82) argue
that syntactic variation is driven by social factors but constrained by the nature of
possible grammars. Cornips and Corrigan (2005a) emphasize that the generative
approach has much to gain from a perspective in which the organization of the
grammar is seen as somehow reflected in patterns of usage. Moreover, usage may
lend strong support to a structural analysis and usage may reveal “a glimpse of
grammatical structure” (Meechan and Foley 1994: 82). A core finding of sociolinguistics
is that usage data are not amorphous. Rather, as Guy (2005: 563)
notes, “linguistic diversity is well mannered and orderly, following observable
principles of social and linguistic organization” and “[g]rammar and usage both
exhibit structure and order, some of it probabilistic, some of it categorical”. The
central point here is that intuitions and usage data differ, but overlap as well.
Usage data differ from intuitions in that they do not provide direct insight into
which syntactic variants are ungrammatical. Moreover, they do not necessarily
occur in the contexts that are relevant for specific theoretical concerns. On the
other hand, generative practices also demonstrate that intuitions of individuals
who claim to speak the same variety may differ and that those individual intui-
tions may differ with respect to context as well as over time. Further, usage data
coincide with intuitions to a large extent and, even more relevantly, they may
contain entirely new phenomena not predicted by theory. Consequently, theory is
enhanced by analyzing and accounting for non-expected data and patterns (see
for example Jensen and Christensen 2013).
In fact, Adger (2006) proposes to account for variation in agreement patterns
not only formally, but also functionally: Variation is not only restricted to lin-
guistic representation but also related to language in use. In his view, variation in
agreement patterns is ultimately a matter of the properties of the lexicon of func-
tional categories. The variants that make up the linguistic variable are “not simply
determined by the linguistic context in which they appear, nor are they simply in
free variation. Rather, they are more or less likely to be selected depending on the
previous discourse, the speaker, the audience, and other psycholinguistics and
sociolinguistics factors” (Adger and Trousdale 2007: 268–269). Adger and Smith
(2010: 1109) argue that “the variability found in an individual speaker is two-
dimensional: it may involve varying featural specification of functional categories
and/or underspecification in the mapping between these categories and between
morphological forms; the former modeling the kind of variation usually thought
of as ‘parametric’ and the latter modeling the kind of variation usually captured
by the notion of linguistic variable”. In the lexical feature model, despite its
limited scope, the most recent formal insights are very well articulated. However,
although syntax is viewed as completely autonomous, sociolinguistic research
shows that syntax continuously varies and changes not only between generations
but also at the level of the individual speaker. The ways in which individuals speak
are constrained by grammar; thus, syntactic variation is certainly limited and some
parts of grammar are more resistant to variation than others (for instance, V2 in
Germanic languages). But the other side of the story is that individuals are able to
overcome so-called syntactic restrictions. In order to detect this, we need to study
these individuals in situated contexts in interaction with others.
3 See brief overviews in Adger 2006; Adger and Trousdale 2007; and Cornips and Corrigan 2005a.
These contexts show how individuals divide up the world in which they live
and how these oppositions obtain their shape linguistically. According to Eckert
(2008, 2012), linguistic variation constitutes a robust social semiotic system that
is able to express the full range of social concerns in a given community; variation
does not simply reflect, but constructs social meaning, hence it is a force in social
change. Speakers recognize syntactic variants as stereotypes and these may be
activated (or avoided) in public performances or otherwise in highly stylized uses
of local-sounding speech (Eckert 2000; Rampton 1995). Speakers do not simply
reflect “grammars” or “social categories” but are agents as well. Consequently,
the question of which linguistic element(s) will become socially meaningful is
dependent on the individual and wider societal, political and ideological context
of interaction (Cornips 2014). This may lead to individuals crossing or stretch-
ing the borders of the variation space of syntax in specific situated contexts and
social practices.
Syntactic variation at the level of the individual speaker and the community is
neither chaotic nor randomly distributed but is governed by social rules (Labov 1972,
1994 and many others). Important questions to be addressed are therefore: How
and to what extent do individual and group grammars influence each other? Is
it possible to predict which linguistic variants will become socially meaning-
ful in specific situated contexts and which lexical features will be spelled out
ultimately?
The relation between an individual and societal “grammar” is not one of
a simple dichotomy between I- versus E-language. I-languages are the product
of genetic endowment and individual experience (Chomsky 1995). People who
live closely together can understand each other because they share a common
genetic endowment (by virtue of being human) and a common (linguistic) expe-
rience. This experience, however, is not completely identical and, therefore, one
will always find some variation between the I-languages of people who claim to
speak the same dialect (see Adger and Trousdale 2007: 271). This corresponds to
Guy’s (2005: 562) summary of the principal findings of sociolinguistics in which
he states that individuals are grammatically (more or less) similar due to social
proximity. Hence, experience cannot take place in a social vacuum: It is shaped
in interaction with and by others. Consequently, the opposition between I-lan-
guage and E-language is not as watertight as suggested in the literature. Accord-
ing to Muysken (1999: 72, 2000: 41–43), the cognitive abilities which shape the
I-language determine the constraints found in E-language. Moreover, the norms
created within E-language make I-language coherent. Thus, it is untenable to
picture “grammar” as merely a transparent representation of inner mental events
since language is one of the most important mediums through which social acts,
including linguistic norms, are accomplished. Language use itself is a form of
social activity (Widdicombe and Wooffitt 1995: 1).
Henry (2005) has already pleaded for more attention to the phenomenon of
idiolectal (intraspeaker) variation by both sociolinguists and theoretical syntacti-
cians. Within the classical grammar perspective, variation refers to structural dif-
ferences between individual grammars (interspeaker, cross-linguistic variation or
variation between closely related dialects) but not within the individual grammar
itself. Central questions in current syntactic research are: (i) What are the limits of
syntactic variation for the individual speaker and in general? and (ii) What is the
locus of syntactic variation in the grammar model? In my opinion, intraspeaker
variation is the most challenging kind of variation to examine, and therefore
this type of variation may be the key to gaining generative insights and answer-
ing these questions. After sixty years of theory development about the internal
organization of grammar, the idealized speaker-hearer environment should be
left aside and, instead, generative insights should be tested in the realm of lan-
guage use where this internal organization has its most complex output. Hence,
E-language as a social and I-language as a psychological construct do not exist
independently of one another, but their interaction influences the grammars of
speakers and the way they speak. In turn, the multilayered relationships between
language as a social product and language as “grammar” continuously influence
language norms. These norms are crucial since they determine which linguistic
elements are selected (or not selected) by speakers in which contexts and, con-
sequently, relate to the central question of how people use language in their daily
lives (social practices) and how grammar is organized. The norms, the selection
of linguistic elements and the daily practices of people influence one another
continuously (see the “total linguistic fact” by Silverstein 1985). This “holistic”
view of language is the only one that can explain how individual grammars
are restricted and at the same time how individuals are able to overcome these
restrictions in specific situated contexts. This combined approach enables us to
predict why some structures are more resistant to syntactic variation and change
than others and the route(s) individuals may take to overcome these syntactic
“restrictions”. In this process, the interpretation and evaluation of linguistic
forms through interaction is of crucial importance in the acceptance of so-called
“impossible” or “unrealized” constructions.
Very likely, the domain of micro-parametric variation – the syntactic differ-
ences between closely related individual grammars in social and geographical
proximity – is the most eligible domain for addressing the questions raised
above. Micro-parametric variation research clearly shows that each speaker has
his/her own grammar that minimally differs from the grammar of everybody else
(Cornips and Poletto 2005; Barbiers, Cornips, and Kunst 2007). In this empiri-
cal domain, there has been a clear methodological shift in generative grammar.
Intuitions or native-speaker introspections in an idealized environment used to
be considered the only suitable tool in macro-parametric variation research. This
often meant that the resultant analyses reflected the grammaticality judgments of
the theorist who may have been unaware of the considerable degree of syntactic
variation which potentially exists within the same speech community (Cornips
and Corrigan 2005b). However, recent micro-parametric variation research inves-
tigating dialects in many parts of Europe has drawn on acceptability judgments
of non-standard speakers and, hence, it is in this domain that generative gram-
marians and sociolinguists are converging.
Regarding micro-parametric variation in social space, data can be collected
by the so-called sociolinguistic interview – that is, systematic recordings of conversations
between individuals. As a result, the analysis is based on socially situated
language samples. Of course, the setting of the sociolinguistic interview is
an experimental one (Labov 1972, 1975, 1994). The data for geographical micro-
parametric variation, which is the object of large-scale syntactic dialect atlases
such as the Syntactic Atlas of the Dutch Dialects (acronym SAND, cf. Barbiers,
Cornips, and Kunst 2007), the Northern Italian Syntactic Dialect Atlas (acronym
ASIS; cf. Poletto and Benincà 2007) and the Scandinavian Dialect Atlas (acronym
ScanDiaSyn; cf. Vangsnes 2007) must be systematically elicited from a sample
of community members (in a large geographical area) rather than derived from
linguists’ own introspections.4 This not only enhances the empirical basis of syntactic
theory, but also reduces the influence of prescriptive rules. The elicitation
methodology in micro-parametric variation research relies on prior knowledge of
variability within the speech community gained by observational methods, and it
is on this basis that hypotheses are formulated and tested. In the domain of
micro-parametric variation, systematic recordings of spontaneous speech and the elicitation of
acceptability judgments from speakers are both necessary and complementary.
4 More information about the ASIS, SAND and ScanDiaSyn projects can be found at: http://asis-
In this section, I will discuss four case studies revealing intraindividual variation.
The recent view on micro-parametric variation challenges the traditional idea of
idiolects being sufficiently similar. Kayne (1996: XV) asks: “Can anyone think of
another person with whom they agree 100 % of the time on syntactic judgments
(even counting only sharp disagreements)? Or more precisely, are there any two
people who have exactly the same syntactic judgments without exception?”
(cited in Adger and Trousdale 2007: 266).
The four case studies will show that speakers differ from each other and that
they show intraindividual variation even when considering word order or gram-
matical gender agreement. The first case study focuses on word order in the MOD-
AUX-Vpart three-verb cluster in Dutch dialects. The second case study deals with
variation between overt and null forms of the determiner (D) and reveals that
the individual speaker may accommodate and identify with his/her interlocutor,
resulting in intraindividual variation. The third case study shows that contact settings
between youngsters in cities may override the so-called verb-second
constraint, and the last case study shows how important intraindividual vari-
ation is for the construction of a streetwise identity.
Different types of three-verb clusters were presented to and evaluated by 370 native
speakers of local dialects throughout the Netherlands and Flanders (Belgium).
Social variables of the speakers were controlled for, thus homogenizing the
elicited data as much as possible with respect to age, mobility, the functional
domains in which the respective dialect is spoken, and education. Only in this
way is it possible to detect variation attributable to geographical differences (cf.
Barbiers, Cornips, and Kunst 2007: 60). Among other items, we administered the three-verb
cluster MOD-AUX-Vpart illustrated in (1):
(1) Jan weet dat hij voor drie uur de wagen moet hebben gemaakt 1-2-3 order
‘Jan knows that he before 3 o’clock the car must have made’
The cluster in (1) contains the modal moet ‘must’ as the hierarchically highest
verb (1) (MOD), perfective hebben ‘have’ as the infinitive (2) (AUX) and the
lexical verb, the past participle gemaakt ‘made’, as the hierarchically lowest
embedded verb (3) (Vpart). All six possible orders of the three verbs in (1) were
offered to the subjects in an indirect relative judgment task, as illustrated in
Figure 1, to collect data about variability in word orders in three-verb clusters. In
this task the 370 subjects were first asked to answer with ‘yes’ or ‘no’ whether they
encounter the orders (a) through (f) in their local dialect and, subsequently, to
rank these orders from most to least acceptable on a 5-point scale (representing *,
?*, ??, ?, ok). Thus, the subjects were instructed to consider all possible orders in
Figure 1 (see also Barbiers 2005; Cornips 2009):
     Order   Example                          Encounter   Uncommon-common
a.   1-2-3   …dat…moet hebben gemaakt         yes/no      1–2–3–4–5
b.   1-3-2   …dat…moet gemaakt hebben         yes/no      1–2–3–4–5
c.   2-1-3   …dat…hebben moeten gemaakt       yes/no      1–2–3–4–5
d.   2-3-1   …dat…hebben gemaakt moeten       yes/no      1–2–3–4–5
e.   3-1-2   …dat…gemaakt moet hebben         yes/no      1–2–3–4–5
f.   3-2-1   …dat…gemaakt hebben moet         yes/no      1–2–3–4–5
Figure 1: The six possible word orders in the verbal cluster MOD-AUX-Vpart
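The six candidate orders in Figure 1 are the permutations of the three verbs in (1). As a minimal sketch, they can be enumerated as follows; the names `VERBS` and `cluster_orders` are illustrative assumptions and not part of the SAND materials. The sketch keeps the verb forms of (1) constant, whereas the questionnaire items (c) and (d) in Figure 1 show the form moeten.

```python
from itertools import permutations

# Verbs of example (1), keyed by hierarchical position:
# 1 = modal (MOD), 2 = perfective auxiliary (AUX), 3 = participle (Vpart).
# The dictionary and function names are illustrative, not from the source.
VERBS = {1: "moet", 2: "hebben", 3: "gemaakt"}


def cluster_orders(verbs=VERBS):
    """Return (label, word string) pairs for every linear order of the cluster."""
    orders = []
    for perm in permutations(sorted(verbs)):
        label = "-".join(str(position) for position in perm)
        words = " ".join(verbs[position] for position in perm)
        orders.append((label, words))
    return orders


for label, words in cluster_orders():
    print(label, words)
```

Because `itertools.permutations` works through the sorted positions in lexicographic order, the labels come out in the same sequence as rows (a) through (f).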
Table 1 presents the quantitative results of the judgment task for the whole area,
i. e. the Netherlands and Dutch-speaking Belgium. It shows that the MOD-AUX-
Vpart cluster allows four different orders (in bold) when we include only the
“yes = 5” answers above a threshold of 10 % (n > 37) (cf. Cornips 2009). The differ-
ences in the absolute number in the columns “yes = 5” through “yes = 1” are due
to whether the construction is acceptable in a smaller or wider geographical area.
The 3-2-1 order, for example, appears to be restricted to the Frisian area, while
3-1-2 is quite common almost everywhere (in particular in the eastern part of the
Netherlands) (cf. Barbiers 2005).
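The 10 % threshold amounts to simple arithmetic: with 370 subjects, an order passes only if more than 37 subjects give it the top score. A minimal sketch of this filter follows. The counts in `example_counts` are invented placeholders for illustration only and are not the figures reported in Table 1; the function name is likewise an assumption.

```python
TOTAL_SUBJECTS = 370    # number of subjects reported in the text
THRESHOLD_PERCENT = 10  # threshold reported in the text


def orders_above_threshold(counts, total=TOTAL_SUBJECTS, percent=THRESHOLD_PERCENT):
    """Return the orders whose count strictly exceeds the given share of subjects.

    Integer arithmetic (n * 100 > total * percent) avoids floating-point rounding.
    """
    return [order for order, n in counts.items() if n * 100 > total * percent]


# Hypothetical "yes = 5" counts per order; NOT the data from Table 1.
example_counts = {
    "1-2-3": 120,
    "1-3-2": 60,
    "2-1-3": 37,   # exactly 10 %: does not pass, since the criterion is n > 37
    "2-3-1": 0,
    "3-1-2": 200,
    "3-2-1": 40,
}

accepted = orders_above_threshold(example_counts)
print(accepted)
```

The strict inequality matters at the boundary: a count of exactly 37 falls below the threshold, in line with the condition n > 37 given in the text.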
Table 2 shows that individual speakers accept not only one but also two or even
three different orders. If we establish a threshold of 10 % again, the MOD-AUX-
Vpart cluster allows up to three orders per subject and the percentages of inform-
ants accepting two orders are highest, followed by informants accepting three
orders:
Table 2: Idiolectal variability: the number of orders for the MOD-AUX-Vpart cluster accepted by
the same subject regardless of the acceptability scale value (cf. Cornips 2009: 219–220)
The specific combinations of two, three and four orders at the level of the individual
speaker are not distributed randomly. There are combinations such as the pair
1-2-3/3-2-1 that are categorically absent (n = 0 in Table 3). With respect to two
orders, individuals prefer the combination 1-2-3 and 3-1-2 (n = 82). With respect to
three possible orders, the subjects prefer the combination 1-2-3/1-3-2/3-1-2 (n = 71)
significantly more than other possible three-order combinations. The combination
1-2-3/1-3-2/3-1-2/3-2-1 is the most favored among informants who accept four different
orders (n = 22):
Table 3: Combinations of accepted orders by individual speakers for the MOD-AUX-Vpart cluster
(more than 4 subjects, cf. Cornips 2009: 220)
Importantly, the same preferences in Tables 1 and 3 were reflected in the sponta-
neous speech of 67 adult speakers in one location in the southeast Netherlands,
namely the city of Heerlen (cf. Cornips 2009). These 67 speakers produced the
order 3-1-2 in the MOD-AUX-Vpart cluster most frequently and they also produced
the combination 1-2-3/1-3-2/3-1-2. In this respect, spontaneous speech data in one
location converge with acceptability judgments in a large geographical area, i. e.
the Netherlands and the Dutch-speaking part of Belgium.
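The per-speaker combinations summarized in Table 3 can be thought of as a tally of identical sets of accepted orders across subjects. The following is a minimal sketch of such a tally; the speaker identifiers and judgments are invented for illustration and do not reproduce the SAND or Heerlen data, and the function name is an assumption.

```python
from collections import Counter


def combination_counts(accepted_by_speaker):
    """Count how many speakers accept each exact set of orders.

    Each set is normalized to a sorted, slash-separated label such as
    "1-2-3/3-1-2", so that the order of elicitation does not matter.
    """
    counts = Counter()
    for orders in accepted_by_speaker.values():
        counts["/".join(sorted(orders))] += 1
    return counts


# Hypothetical judgments per speaker; NOT taken from the source data.
example_judgments = {
    "speaker_A": {"1-2-3", "3-1-2"},
    "speaker_B": {"3-1-2", "1-2-3"},
    "speaker_C": {"1-2-3", "1-3-2", "3-1-2"},
    "speaker_D": {"3-2-1"},
}

tally = combination_counts(example_judgments)
for combination, n in tally.most_common():
    print(combination, n)
```

A combination that no speaker accepts, such as 1-2-3/3-2-1 in this toy data set, simply has a count of zero, which is how categorical absence would show up in such a tally.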
This consistency in idiolectal variation calls for a combined “grammar” and
“usage” approach. What is needed in order to account for the complexity of the
data at the individual level is a principled answer to the question of why certain
verb clusters and certain combinations of three-verb clusters are almost categor-
ically absent. Moreover, we must analyze the size of the variation space, i. e. the
most frequent and geographically constrained distributions of the various combinations
of three-verb clusters. For instance, the order 3-2-1 in the MOD-AUX-Vpart cluster
is most present in the northwest (Friesland) area, the combination of orders 1-3-2/3-1-2
primarily in Flanders, and the combination of orders 1-2-3/3-1-2 can be found
everywhere in the Netherlands.5 Thus on the one hand, the MOD-AUX-Vpart cluster
shows a bewildering variation with respect to different word order alternations,
which is a feature of linguistic complexity. On the other hand, this complexity
shows regular patterns between closely related individual grammars distributed
geographically and within one individual grammar; that is, they cluster together
in social and geographical space. Clearly, the specific geographical distributions
and specific word order combinations are the products of interactions between
different and changing groups of speakers. To be more specific, both the accepta-
bility judgments and spontaneous speech data (cf. Cornips 2009) provide such
orderly heterogeneity that speakers can be considered members of sociolinguistic
units (cf. Guy 2004). It is these clusters of individual speakers that may give rise to
labels such as “dialect” or “language”, for linguists as well as for laypersons.
In the strongest form of Minimalism (cf. Chomsky 1995), the Universal
Grammar hypothesis states that syntactic variation does not exist. Apparent syn-
tactic variation such as the different orders in the MOD-AUX-Vpart cluster should
be reducible to the lexicon, that is, parameterization of morphosyntactic fea-
tures, and to phonological form, thus, different ways to spell out one and the
same syntactic structure phonologically (Barbiers, Cornips, and Kunst 2007).
The “grammar” perspective should be able to explain why some combinations
of orders are preferred above others (see Table 3) by identifying a grammatical
factor which is responsible for (i) the preferred combination of 1-2-3, 1-3-2 and 3-1-2
and (ii) the exclusion of 2-3-1 and 3-2-1 orders.6 The problematic issue, of course,
is how to account for the fact that individual speakers can use several variants
assuming that the different word orders all belong to one syntactic variable. This
issue is related to questions within the successive generative models about the
locus of syntactic variation, its restrictions and predictions. In the literature, two
alternative approaches to this “choice” are suggested (Muysken 2005): Either the
“choice” is put outside the grammatical mechanisms (Adger and Smith 2005,
2010; Adger 2006; Kroch 1989) or it is put inside the grammar by reintroducing
optional rules (Henry 2002; Wilson and Henry 1998). In the same vein as Henry
(1995), Barbiers (2005) argues that the categorical absence or presence of specific
word orders in the verb clusters discussed above is the result of optional
movement in the syntactic component. An account in terms of optional move-
ment implies that various word orders differ only superficially from each other
(cf. Harris 1996: 32; Winford 1996; Cornips 1998). In a combined approach, it
should be examined how groups of speakers organize different word orders and
different combinations of word orders (possibly in combination with other lin-
guistic sets) as resources in ways that make sense for them under specific social
conditions (Jørgensen 2008: 167). Accordingly, this type of variation should also
be associated with particular “dialects”, social distinctions and values (Eckert
2000). Such an approach will give us a first glimpse into why inter- and intraspeaker
variation is so overwhelmingly present, and it
“challenges the separation of ‘variety’ and ‘practice’ approaches” (Rampton
to appear). Rampton insists that we look for the connections between dialect
systems, reflexivity and interaction. The selective targeting, isolation and formal
description of linguistic features remains an essential analytical task, but if we
are able to construe these features as ingredients in a style or register, then we
need to attend to the ways in which, with varying levels of awareness, their interactional
use contributes to participants’ agentive self-positioning in the social
world, aligning them with certain ideological typifications of language, speech
and ways of being (cf. references cited in Rampton to appear). Let us now turn to
a second case study which shows that acceptability judgments differ at the level
of the individual speaker.
6 In Cornips (2009), I argued that the 1-2-3 order as a basic structure is capable of accounting
for the empirical results. The word orders in the MOD-AUX-Vpart cluster and their co-occurrences
show that verb raising of the participle, resulting in the 1-3-2 and 3-1-2 orders, depends on the verbal or
adjectival status of the participle (see Den Besten and Broekhuis 1989).
In this section, variation between overt and null forms of a functional head
(D) will be considered. This phenomenon is, of course, very different in nature
from word order alternations in verbal clusters. However, both phenomena
show inter- and intraspeaker variation, in written elicitation (the verbal clusters
described above) as well as in oral interviews. In the phase of oral elicitation within
the Dutch syntactic atlas project (SAND), 250 locations were selected through-
out the Netherlands. We had a major problem in doing the fieldwork since the
large majority of the fieldworkers and Ph.D. students only speak the standard
variety (cf. Cornips 2006). Therefore, we let the subject recruit an acquaintance
as an “assistant interviewer” in order to be able to interview him/her in his/her
own dialect. This assistant interviewer was asked to translate a standard Dutch
structured elicitation task into his or her local dialect. These translations were
recorded. In a second session these recordings were played to the original subject.
In this second session, the entire conversation was restricted to the two dialect
speakers and the fieldworker did not interfere.
One of the locations involved in the oral interviews was Nieuwenhagen in the
southeast of Limburg province. In the local dialect, as elsewhere in that region,
proper names are obligatorily preceded by the definite determiner et or der ‘the’.
The presence of the definite determiner preceding a proper name, as in (2), is fully
ungrammatical in standard Dutch:
The recording of the first session between the standard Dutch-speaking field-
worker and the local assistant interviewer who translates standard Dutch into his
own dialect shows that the definite article in his translation is absent; the proper
names Wim and Els appear without it, as illustrated in (3):
(3) Ø Wim dach dat ich Ø Els han geprobeerd e kado te geve
Ø Wim thought that I Ø Els have tried a present to give
‘Wim thought that I tried to give a present to Els.’
In the second session, however, in which only the assistant interviewer inter-
views the dialect speaker in the local dialect, the latter utters the definite article
as “required”:
(4) Der Wim menet dat ich et Els e boek probeerd ha kado te geve.
the Wim thought that I the Els a book tried has present to give
‘Wim thought that I tried to give Els a book as a present.’
Hewitt 1986; Jaspers 2005; Rampton 1995, 2005, to appear; Quist and Svendsen
2010). These multilingual contact settings provide situated contexts in which syn-
tactic restrictions are loosened and new types of syntactic variation emerge as an
integral part of grammar. Freywald et al. (to appear) describe how in urban ver-
naculars in big cities in Norway, Sweden and Germany the so-called verb-second
constraint (V2) is overridden. Normally, in these languages (as also in Dutch and
other Germanic languages) only one constituent may precede the finite verb in
declarative clauses. Whenever a declarative clause begins with something other
than the subject, subject-verb inversion is required, as in the German example in
(5) (all examples from (5) through (7) are taken from Freywald et al. (to appear)):
The overall picture points to a systematic pattern from both a “grammar” and
a “usage” perspective. With respect to the former, the constituent that directly
precedes the finite verb is almost without exception the subject and these subject
constituents are in most cases pronominal. The adverbial in front of the subject
together with the finite verb situates the event in time; it helps to structure the
(narrative) discourse. The most common adverbials are the equivalents to ‘then’,
‘afterwards’, and ‘after this’ (see (6) and (7)). With respect to the latter, these
“violations” of V2, or to be more precise, Adv-S-Vfin instances, occur only in peer
conversations. They are remarkably rare or even entirely absent in interviews and
written texts (cf. Freywald et al. to appear and references cited).
This phenomenon is very rare in vernacular urban Dutch. However, on the
sporadic occasions on which it emerges, it obeys the same syntactic restrictions
as mentioned above: An adverbial and a pronominal subject are placed before the
finite verb, and the adverbial in front of the pronominal subject is a temporal one:
                 Definite determiners
Gender of noun   Singular   Plural
common           de         de
neuter           het        de
The results of experimental tasks have revealed that the monolingual acqui-
sition of the Dutch definite neuter determiner is a long process as children do
not acquire the target system before the age of seven (cf. Blom, Polišenská, and
Weerman 2008; Polišenská and Weerman 2008). Instead, monolingual children
overgeneralize the definite determiner de and use it incorrectly with neuter nouns
that require the definite determiner het. It takes bilingual children even longer
to acquire grammatical neuter gender in Dutch. But although the overgeneral-
ization of the definite determiner de constitutes a linguistic resource for every
bilingual child (and even for monolingual acquirers, cf. Cornips and Hulk 2008),
it only becomes socially meaningful in a process of ethnic and age identifica-
tion (cf. Cornips 2008; Cornips and Hulk 2013). Nortier and Dorleijn (2008) point
out that youngsters of Moroccan descent are learning both linguistic norms and
norms of stylistic appropriateness. The overgeneralization of common gender is
one example of this process, as illustrated by the following quotation from a con-
versation with a Moroccan informant S. from Rotterdam:
S: Yes, like
Int1: that little girl [line 7: die(common) meisje(neuter)]
S: I then say that house [line 8: die(common) huis(neuter)]. While I know I mean I certainly know
that it should actually be that house [line 9: dat(neuter) huis(neuter)] but it would make a dumb
impression if I would say
Int2: Yes
S: If I would say that house [line 11: dat(neuter) huis(neuter)] in the street
Int2: Yes, yes
S: It is just that house [line 13: die(common) huis(neuter)]. But when I speak with you two (the authors,
both Dutch and middle-aged, JN/MD) it is just that house [line 14: dat(neuter) huis(neuter)].
“The speaker in the quotation explicitly says that he has to make errors, deviations
from the standard norm, in order to be recognized as someone who is hanging out
with friends” (cf. Nortier and Dorleijn 2008: 132). This quote suggests that the
construction of identity takes place by means of a language which is governed by
linguistic norms of the group the speaker wants to belong to and identifies with
in contrast to others.7 Subsequently, the interplay between the individual and the
group is visible in the syntactic output, i. e. gender markings on forms bearing
gender, namely the determiner.
However, the overuse of common gender on determiners also shows up in
other “languages” whose grammatical gender system is similar to Dutch – that
is, languages that have (i) a distinction between neuter and common gender;
(ii) little evidence for grammatical gender, with gender not being marked on the
noun; (iii) robust frequency differences such that common nouns outnumber
neuter nouns; and (iv) very few morphophonological cues for the different gender
forms of the determiner. Quist (2008) and Kotsinas (2001: 150) note that in urban
vernacular speech in Copenhagen and Stockholm neuter indefinite and definite
determiners are being replaced by common determiners but not vice versa and
that the overuse of common gender always takes place in peer group settings:
7 Of course, this example also raises interesting points concerning language awareness.
The interplay between individual grammar and group grammar is in line with
Widdicombe and Woofitt’s (1995: 36) claim that a mental state, which a grammar
is, is socially shared and therefore common to different members in a given
society or group. Cognition, in their view, is an individual’s mental reconstruc-
tion of shared physical and social environment. Therefore, the creation of norms
and mental reconstruction are agentive processes. In other words, language is
not a neutral medium of transmission of values, attitudes and opinions about a
world of events “out there”, but rather a medium through which social acts are
accomplished; language use is thus itself a form of social activity (Widdicombe
and Woofitt 1995: 1–2).
It is in this no man’s land between syntax and variationist sociolinguistics –
that is to say, in the border zone constituting the mutual relationships between
the individual and the social – that we can find the answer to the questions of
why and how society influences the individual (grammar) and vice versa.
6 Summary
The aim of this paper was to emphasize the point that there are two crucial issues
with regard to linguistic complexity that must be tackled within the grammar and
usage debate: the issue of idiolectal or intraspeaker variation, and the complex
and multilayered relationships between the individual and society that bring
about mutual effects on individual grammar and group grammar. It is at the level
of individual speakers that we can best examine the locus and limits of syntactic
variation. Here, we encounter the largest possible variation space and answers to
the questions of whether, why and how speakers enlarge this space, and in doing
so, overcome syntactic constraints.
I have discussed word order alternations in three-verb clusters in Dutch
dialects, the variable use of the definite determiner preceding proper nouns, the
instances of Adv-S-Vfin in main declarative clauses in Germanic V2 languages
and the use of common determiners instead of neuter ones in Dutch. Syntactic
variation can be studied from two perspectives: a theoretical-analytical one in
which grammar relates to genetic endowment and cognitive capabilities, and a
social perspective in which language is a social praxis. Of course, these two sides
of language can be studied in isolation from each other. However, an approach
combining crucial theoretical insights and methodologies from each discipline
brings us a step further in disentangling and understanding the phenomenon of
new and complex patterns in language use data.
A combined approach of the two disciplines is the only one that enters into
this no man’s land between syntax and sociolinguistics where variation is driven
by social factors but constrained at the level of possible grammars. It will give us
insight into why inter- and intraspeaker variation is so overwhelmingly present
and why ungrammatical or impossible structures are realized. We get closer to
finding answers to the following questions: (i) What are the limits and loci of
syntactic variation? (ii) What is the reason for the fact that individual speakers do
not use all possible linguistic resources in the construction of regional and social
identities? (iii) How and why do speakers use ungrammatical constructions, i. e.
cross the borders of variation space?
References
Adger, David (2006): Combinatorial variability. Journal of Linguistics 42: 503–530.
Adger, David and Jennifer Smith (2005): Variation and the Minimalist Program. In: Leonie
Cornips and Karen P. Corrigan (eds.), Syntax and Variation. Reconciling the Biological with
the Social, 149–178. Amsterdam/Philadelphia: John Benjamins.
Adger, David and Jennifer Smith (2010): Variation in agreement: A lexical feature-based
approach. Lingua 120: 1109–1134.
Adger, David and Graeme Trousdale (2007): Variation in English syntax: theoretical
implications. English Language and Linguistics 11/2: 261–278.
Auer, Peter and İnci Dirim (2003): Socio-cultural orientation, urban youth styles and the
spontaneous acquisition of Turkish by non-Turkish adolescents in Germany. In: Jannis
K. Androutsopoulos and Alexandra Georgakopoulou (eds.), Discourse Constructions of
Youth Identities, 223–246. Amsterdam/Philadelphia: John Benjamins.
Barbiers, Sjef (2005): Theoretical restrictions on word order variation in three-verb clusters.
In: Leonie Cornips and Karen P. Corrigan (eds.), Syntax and Variation. Reconciling the
Biological with the Social, 233–264. Amsterdam/Philadelphia: John Benjamins.
Barbiers, Sjef, Leonie Cornips and Jan-Pieter Kunst (2007): The Syntactic Atlas of the Dutch
Dialects (SAND): A corpus of elicited speech as an on-line Dynamic Atlas. In: Joan C. Beal,
Karen P. Corrigan and Hermann Moisl (eds.), Models and Methods in the Handling of
Unconventional Digital Corpora. Volume 1: Synchronic Corpora, 54–90. Hampshire:
Palgrave-Macmillan.
Besten, Hans den and Hans Broekhuis (1989): Woordvolgorde in de werkwoordelijke eindreeks
[Word order in verb clusters]. Glot 12: 79–137.
Blom, Elma, Daniela Polišenská and Fred Weerman (2008): Articles, adjectives and age of
onset: The acquisition of Dutch grammatical gender. Second Language Research 24:
297–332.
Blommaert, Jan (2011): Supervernaculars and their dialects. Tilburg Papers in Culture Studies 9:
54–90.
Cornips, Leonie (1994): Syntactische variatie in het algemeen Nederlands van Heerlen
[Syntactic variation in Heerlen Dutch]. Ph.D. dissertation, University of Amsterdam.
Cornips, Leonie (1998): Syntactic variation, parameters and their social distribution. Language
Variation and Change 10/1: 1–21.
Cornips, Leonie (2005): Variation and formal theories of syntax, Chomskian. In: Keith Brown
(ed.), Encyclopedia of Language & Linguistics, 330–332. Oxford: Elsevier.
Norwegian, Swedish, German and Dutch. In: Jacomine Nortier and Bente A. Svendsen
(eds.), Language Youth & Identity in the 21st Century. Cambridge: Cambridge University
Press.
Guy, Gregory (2004): Dialect unity, dialect contrast: the role of variable constraints. Talk
presented at the Meertens Institute, Amsterdam August 9.
Guy, Gregory (2005): Grammar and usage: A variationist response. Language 81(3): 561–563.
Harris, John (1996): Syntactic variation and dialect divergence. In: Rajendra Singh (ed.),
Towards a Critical Sociolinguistics, 31–59. Amsterdam/Philadelphia: John Benjamins.
Henry, Alison (1995): Belfast English and Standard English: Dialect Variation and Parameter
Setting. Oxford: Oxford University Press.
Henry, Alison (2002): Variation and syntactic theory. In: Jack Chambers, Peter Trudgill and
Natalie Schilling-Estes (eds.), The Handbook of Language Variation and Change, 267–282.
Malden: Blackwell.
Henry, Alison (2005): Idiolectal variation and syntactic theory. In: Leonie Cornips and Karen
P. Corrigan (eds.), Syntax and Variation. Reconciling the Biological with the Social,
109–122. Amsterdam/Philadelphia: John Benjamins.
Hewitt, Roger (1986): White Talk Black Talk. Inter-racial Friendship and Communication amongst
Adolescents. Cambridge: Cambridge University Press.
Jaspers, Jürgen (2005): Linguistic sabotage in a context of monolingualism and standardization.
Language & Communication 25: 279–297.
Jensen, Torben Juel and Tanya Karoli Christensen (2013): Promoting the demoted: The
distribution and semantics of “main clause word order” in spoken Danish complement
clauses. Lingua 137: 38–58.
Jørgensen, Jens Normann (2008): Polylingual languaging around and among adolescents.
International Journal of Multilingualism 5: 161–176.
Kayne, Richard S. (1994): The Antisymmetry of Syntax. Cambridge, Mass.: MIT Press.
Kayne, Richard S. (1996): Microparametric syntax: some introductory remarks. In: James
R. Black and Virginia Motapanyane (eds.), Microparametric Syntax and Dialect Variation,
ix–xviii. Amsterdam/Philadelphia: John Benjamins.
Kotsinas, Ulla-Britt (2001): Pidginization, creolization and creoloids in Stockholm, Sweden. In:
Norval Smith and Tonjes Veenstra (eds.), Creolization and Contact, 125–156. Amsterdam/
Philadelphia: John Benjamins.
Kroch, Anthony (1989): Reflexes of grammar in patterns of language change. Language
Variation and Change 1: 199–244.
Labov, William (1972): Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.
Labov, William (1975): What is a Linguistic Fact? Lisse: Peter de Ridder Press.
Labov, William (1994): Principles of Linguistic Change. Internal Factors. Oxford: Blackwell.
Lavandera, Beatriz (1978): Where does the sociolinguistic variable stop? Language in Society 7:
171–182.
Meechan, Marjory and Michele Foley (1994): On resolving disagreement: linguistic theory and
variation – There’s bridges. Language Variation and Change 6: 83–85.
Meyerhoff, Miriam (2002): Community of practice. In: Jack Chambers, Peter Trudgill and Natalie
Schilling-Estes (eds.), Handbook of Language Variation and Change, 526–548. Malden:
Blackwell.
Muysken, Pieter (1999): Talen. De Toren van Babel [Languages. The Tower of Babel].
Amsterdam: Amsterdam University Press.
Muysken, Pieter (2000): Radical modularity and the possibility of sociolinguistics. Paper
presented at the Sociolinguistics Symposium 2000, 27–29 April, University of the West of
England, Bristol.
Muysken, Pieter (2005): A modular approach to sociolinguistic variation in syntax: the gerund
in Ecuadorian Spanish. In: Leonie Cornips and Karen P. Corrigan (eds.), Syntax and
Variation. Reconciling the Biological with the Social, 31–54. Amsterdam/Philadelphia:
John Benjamins.
Nortier, Jacomine and Margreet Dorleijn (2008): A Moroccan accent in Dutch: A sociocultural
style restricted to the Moroccan community? International Journal of Bilingualism 12:
125–142.
Pintzuk, Susan (1995): Variation and change in Old English clause structure. Language
Variation and Change 7: 229–260.
Poletto, Cecilia and Paola Benincà (2007): The ASIS enterprise: a view on the construction of a
syntactic atlas for the Northern Italian dialects. Nordlyd 34(1): 35–52.
Quist, Pia (2008): Sociolinguistic approaches to multiethnolect. International Journal of
Bilingualism 12(1–2): 43–62.
Quist, Pia and Bente Ailin Svendsen (eds.) (2010): Multilingual Urban Scandinavia: New
Linguistic Practices. Bristol: Multilingual Matters.
Rampton, Ben (1995): Styling the Other: Introduction. Journal of Sociolinguistics 3(4): 421–427.
Rampton, Ben (2005): Crossing: Language and Ethnicity among Adolescents. Manchester:
St. Jerome Publishing.
Rampton, Ben (to appear): Contemporary urban vernaculars. In: Jacomine Nortier and Bente
A. Svendsen (eds.), Multilingual Urban Sites. Structure, Activity and Ideology. Cambridge:
Cambridge University Press.
Sankoff, Gillian (1990): The grammaticalization of tense and aspect in Tok Pisin and Sranan.
Language Variation and Change 2: 295–312.
Sells, Peter, John R. Rickford and Thomas Wasow (1996): Variation in negative inversion in
AAVE: an optimality theoretic approach. In: Jennifer Arnold, Renee Blake, Brad Davidson,
Scott Schwenter and Julie Solomon (eds.), Sociolinguistic Variation: Data, Theory and
Analysis, 161–176. Stanford: Center for the Study of Language and Information.
Silverstein, Michael (1985): Language and the culture of gender. In: Elizabeth Mertz and Richard
J. Parmentier (eds.), Semiotic Mediation, 219–259. New York: Academic Press.
Vangsnes, Øystein Alexander (2007): Scandinavian dialect syntax (before and after) 2005.
Nordlyd 34(1): 7–24.
Widdicombe, Sue and Robin Wooffitt (1995): The Language of Youth Subcultures: Social Identity
in Action. New York: Harvester Wheatsheaf.
Wilson, John and Alison Henry (1998): Parameter setting within a socially realistic linguistics.
Language in Society 27: 1–21.
Winford, Donald (1996): The problem of syntactic variation. In: Jennifer Arnold, Renee Blake,
Brad Davidson, Scott Schwenter and Julie Solomon (eds.), Sociolinguistic Variation: Data,
Theory and Analysis, 177–192. Stanford: Center for the Study of Language and Information.
Aria Adli, University of Cologne
What you like is not what you do:
Acceptability and frequency in syntactic
variation
The two most important sources of evidence in grammar research are acceptability
judgments and corpus data. They are closely associated with specific theoretical
frameworks and traditions. Acceptability judgments are seen as the royal road
in generative grammar. They are considered a direct reflection of the real object
of interest: I-language or competence. Corpus data of actual language use are at
the center of interest in sociolinguistics and usage-based approaches. Apart from
theoretical differences between the frameworks, various approaches can also be
incompatible at the methodological level, illustrated by the following quotations.
On the one hand, Chomsky (1965: 191) claimed half a century ago:
“To maintain, on grounds of methodological purity, that introspective judgments
of the informant (often, the linguist himself) should be disregarded is, for the
present, to condemn the study of language to utter sterility”. This position is still
current among today’s generative syntacticians. On the other hand, Labov (1996:
83) states that “when the use of language is shown to be more consistent than
introspective judgments, a valid description of the language will agree with that
use rather than with intuitions”. The majority of sociolinguists share his critical
stance towards introspection.
Yet, these antagonistic positions have slightly softened – a development
sustained by generative studies on diachronic syntax and language acquisition.
Many linguists would agree that the choice of the type of evidence depends more
on the research question than on some inherent quality criterion of the data type
itself. Just as corpus data can be extracted in a more or less meaningful way,
introspection can be collected in a more or less convincing manner. Whatever
method we use, focus should be given to careful methodology and data handling.
Nevertheless, one important issue remains open: What is the relation between
introspection and language use and how can we model it? It is very important to
find out whether both empirical sources lead to the same answer on one theoret-
ical question. In many circumstances it is certainly advantageous to corroborate,
if possible, theoretical hypotheses in linguistics by means of different types of
empirical data. However, the road to such a complementary approach needs to
be better paved. More specifically, we need to have more precise knowledge on
the relation between acceptability and frequency to better interpret the results of
a study working with both types of data.
In order to do so, this study presents empirical findings on syntactic variation
in French wh-questions, using frequency as well as gradient acceptability data.
The unique aspect of the present study is the fact that both types of data were
collected from the same speakers.
The relation between acceptability and frequency is an under-studied issue.
Only a few empirical studies exist thus far.
Backus and Mos (2011) compare gradient acceptability judgments and
corpus data, however not with regard to word order but to two ways of expressing
potentiality, namely by a derivational morpheme equivalent to English -able or
occurring items (e. g. multiple questions) which native speakers nevertheless can
give surprisingly stable and even nuanced, gradient judgments about. What is
more, many constructions that are qualified by most speakers as natural and fully
acceptable can be fairly scarce in usage (e. g. wh-indirect object questions, see
below; see also the discussion in Sampson 2007).
Bader and Häussler (2010) also compare gradient judgments and corpus data
with respect to the order of subject and object and to verb-cluster linearization
in German. They observe a mismatch similar to the one reported by Kempen and
Harbusch (2008) and Adli (2011c), namely that constructions with a high level of
acceptability can vary greatly with respect to frequency (this “ceiling effect” has
previously been underestimated by Featherston 2005). At the same time, extreme
scarcity of a construction does not allow us to predict its level of acceptability.
We propose in Adli (2011c) the concept of a latent construction to refer to
those fully acceptable but extremely scarce or non-occurring constructions. Fur-
thermore, we propose a possible scenario for certain types of diachronic change,
involving the following steps: (i) A construction X is not available in grammar, (ii)
a construction X becomes available in grammar but is not used, (iii) X is used as
part of a set of optional syntactic variants, (iv) cases of unstable optionality are
dissolved, leaving only X (Adli 2011c: 398).
We also mention the early studies by Greenbaum (1976, 1977). He showed
a correlation between acceptability judgments and judgments on the assumed
frequency of the same constructions: native speakers believe that
the more acceptable a construction is, the more often it occurs. Interestingly, this
belief is mistaken, as we know today. One could also take Greenbaum’s
(1976, 1977) result as an indication that speakers are probably mostly unaware of
the large degree of variation in frequency among acceptable constructions.
Given the challenge to explain why certain constructions are acceptable
but hardly occur, there is a close link between the issue of the relation between
acceptability and frequency and the issue of data scarcity of specific construc-
tions in corpora. In this context, we also mention Pullum (2007), who discusses
rarity in corpora. Similarly, Foster (2007) and Ayres-Bennett (1994) discuss neg-
ative evidence in corpora (see also Stefanowitsch 2008). The issue of rare typo-
logical features from a generative point of view has been discussed by Newmeyer
(2010) and Rijkhoff (2010). The underlying problems raised above are not new
(though they have rarely been discussed in methodological terms): The question
to ask is whether rare constructions are also marked constructions. This has been
explicitly stated by Baayen et al. (1997: 14), and goes back to Greenberg (1966) and
Trubetzkoy (1939). On this matter, Haspelmath (2006: 33) argues for using
the notions “rare” or “frequent” directly instead of the fairly polysemous notions
of marked or unmarked.
(6b) qui les enfants voient -ils devant la fenêtre ? [complex inversion]
who the children_i see -they_i in front of the window
‘Who do the children see in front of the window?’
(9a) quand est-ce que c’est que tu fais le dessin ? [wh-ESQ cleft]
when est-ce que it is that you make the drawing
(9b) qui est-ce que c’est que tu vois devant la fenêtre ? [wh-ESQ cleft]
who est-ce que it is that you see in front of the window
3 Methodology
Sgs is a multilingual database that we have been constructing since 2004 (see
Adli 2011b). It contains data on four languages – French, Spanish, Catalan and
Persian – that have been collected using the same methodological protocol. Every
person was first recorded, then participated in a gradient acceptability judgment
test, and finally filled out an extensive social questionnaire. Spontaneous speech
data were obtained by recording interviewer and interviewee while they played
a specifically designed game. Essentially, the interviewee had to solve a fictive
murder case by speaking freely with the (native and well-trained) interviewer.
Most interviewees chose a non-formal, rather colloquial register, encouraged
by a previous warm-up or “ice-breaker” phase. We favored this game task over
2 The construction with an inverted weak subject pronoun as in (4a) and (4b) is often called
subject-clitic inversion (Auger 1994; Elsig 2009). A construction with an inverted non-pronomi-
nal subject (e. g. Quand fait Jean le dessin?) is often referred to as stylistic inversion (Kayne and
Pollock 1978; Drijkoningen and Kampers-Manhe 2008).
left for Paris?’ nor elliptical wh-questions such as quand? or quand ça? ‘When?’.
Furthermore, it only includes true information questions.3
[Table 1; columns: variant, n, percent]
Please note that the cells in the table are not fully comparable because they do not
represent an envelope of variation. Most notably, wh-subject questions (e. g. qui
fait le dessin ‘who does the drawing’), which can be assigned due to the surface
order to the wh-fronted as well as the wh-in-situ category, were by definition not
assigned to the wh-in-situ category. Furthermore, stylistic inversion (e. g. quand
dort Jean ‘when does Jean sleep’) and subject-clitic inversion (e. g. quand dort-il
‘when does he sleep’) are aggregated into one category because the table does
not differentiate between pronominal and non-pronominal subjects. Finally, it
includes one multiple wh-question.
Yet, Table 1 provides insights into the frequency of different wh-variants
in spontaneous speech: First, we observe that only four variants are really pro-
ductive: inversion questions (see (4a)/(4b)), the form with initial wh-element
followed by subject and verb (see (3a)/(3b)), the form with the est-ce que par-
ticle (see (2a)/(2b)), and – by far the most frequent variant – the wh-in-situ form
(see (1a)/(1b)). Second, we see that complex inversion (see (6a)/(6b)) is basically
absent – which is unsurprising given its high level of formality. Third, we
observe that wh-cleft constructions (see (7a) to (9b)) are extremely scarce.
3 Yet, there are no echo questions and only nine rhetorical questions in sgs – rhetorical in the
sense of utterances that are pragmatically equivalent to declaratives with the speaker knowing
the answer (see Prieto and Rigau 2007).
Our goal is to obtain a comparable set of constructions for the following anal-
yses. To this end, we further restrict the overall set of wh-questions in Table 1
by limiting ourselves to sentences with a pronominal subject. We know that
sentences with pronominal and lexical subjects are analyzed quite differently in
French: Weak pronouns in spontaneous French are clitics that can be analyzed
as mere verbal affixes (under this assumption colloquial French might in fact be
attributed properties of a null subject language, see e. g. Culbertson 2010). Fur-
thermore, limiting ourselves to pronominal subjects removes cases of stylistic
inversion from the whVS order and excludes cases of complex inversion like in
(6a)/(6b) (as a result, postverbal subjects will only occur as subject-clitic inver-
sions such as (4a)/(4b)). The result of this restricted set is shown in Figure 1, in
which the non-occurring wh-ESQ cleft constructions such as (9a)/(9b) and multi-
ple wh-questions are no longer represented. Figure 1 shows the number of tokens
for each word order variant in the French part of sgs, also further distinguishing
between wh-adjunct and wh-object questions. One should bear in mind that the
SVwhO order with wh-objects is not a zero frequency but an empty cell, because
this order is only defined for wh-adjunct questions (with transitive or ditransitive
verbs).
Figure 1 reveals a very clear distributional difference between wh-adjuncts
and wh-objects, which will be discussed in more detail further below.
[Bar chart; series: wh-adjuncts (abs. frequency), wh-objects (abs. frequency); x-axis: wh-in-situ, wh-ESQ, whSV, whVS, SVwhO, wh-in-situ cleft, wh-cleft]
Figure 1: Number of tokens of different word order variants of wh-adjunct and wh-object
questions with pronominal subject
occurrences of each variant in the entire corpus and then calculate their proportions.
This means that we treat the entire corpus as a single unit, disregarding
the level of individual speakers. This measure is called “single-text-value” in Adli
(2011a: section 6.2). Alternatively, we can first calculate the proportions for each
speaker and then calculate the mean value of the proportions for the sample (called
“speaker-sample-value” in Adli 2011a: section 6.2). The differences are shown in (10a) and
(10b). For example, if we calculate the relative frequency of our target-variant
whSV among the x = 7 variants of wh-adjunct questions of Figure 1 as a single-
text-value, we would first add up all occurrences of our target variant across
all n = 101 speakers and then divide this number by the sum of all x = 7 variants
across all n = 101 speakers. However, if we want to work with the speaker-sample-
value, we would first add up the relative frequencies of all n = 101 speakers for our
target-variant whSV and divide this number by n = 101.
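(10a) relative frequency (single-text-value):
\[
\frac{\sum_{i=1}^{n} N_{\text{TARGET-VARIANT}_{i}}}{\sum_{i=1}^{n} \sum_{j=1}^{x} \left( N_{\text{VARIANT}_{ji}} \right)}
\]
(10b) relative frequency (speaker-sample-value):
\[
\frac{\sum_{i=1}^{n} \left( \dfrac{N_{\text{TARGET-VARIANT}_{i}}}{\sum_{j=1}^{x} N_{\text{VARIANT}_{ji}}} \right)}{n}
\]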
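The two measures in (10a) and (10b) can be illustrated with a minimal Python sketch. The function names, the data layout (one dictionary of token counts per speaker) and the toy counts are assumptions made for illustration only; they are not taken from the sgs database. The sketch also assumes that every speaker produced at least one token, since (10b) is otherwise undefined for that speaker.

```python
from typing import Dict, List

# Hypothetical layout: one dict per speaker, mapping variant name -> token count.
Counts = Dict[str, int]


def single_text_value(speakers: List[Counts], target: str) -> float:
    """(10a): pool all tokens of all speakers, then take the target's share."""
    target_total = sum(s.get(target, 0) for s in speakers)
    grand_total = sum(sum(s.values()) for s in speakers)
    return target_total / grand_total


def speaker_sample_value(speakers: List[Counts], target: str) -> float:
    """(10b): compute each speaker's proportion first, then average over speakers.

    Assumes every speaker has at least one token (no zero denominators).
    """
    proportions = [s.get(target, 0) / sum(s.values()) for s in speakers]
    return sum(proportions) / len(proportions)


# Invented toy data: a talkative speaker and a speaker with very few tokens.
toy = [
    {"whSV": 1, "wh-in-situ": 9},
    {"whSV": 1, "wh-in-situ": 1},
]

print(f"single-text-value:    {single_text_value(toy, 'whSV'):.3f}")
print(f"speaker-sample-value: {speaker_sample_value(toy, 'whSV'):.3f}")
```

With these invented counts the pooled measure is 2/12 (about 0.167), whereas the mean of the individual proportions is (0.1 + 0.5)/2 = 0.3. The second measure gives every speaker the same weight regardless of how many questions he or she produced.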
[Bar chart; series: wh-adjuncts (rel. frequency), wh-objects (rel. frequency); x-axis: wh-in-situ, wh-ESQ, whSV, whVS, SVwhO, wh-in-situ cleft, wh-cleft]
[R2] Tous ont regardé qui ?   – +
[49] Quel pilote conduit quelle voiture dans le championnat ?   – +
[31] Tu emmènes qui en vacances ?   – +
Figure 3: Gradient acceptability judgment test
Figure 4: Relative frequency and gradient acceptability of different word order variants of wh-
adjunct and wh-object questions with pronominal subject
4 Comparison of frequency and acceptability of French wh-variants
While all wh-variants score high enough in the judgment test to be considered
within the range of acceptable constructions (i. e. neither ungrammatical nor
marginal), with differences being gradual in nature, we do find categorical dif-
ferences on the frequency side: We can distinguish occurring from (nearly) non-
occurring forms.
wh-clefts such as (8a) and (8b) and the marked wh-in-situ order SVwhO such
as (5) essentially do not occur at all. Several other constructions occur very rarely,
namely the subject-clitic inversion whVS such as (4a) and (4b), the whSV order
with wh-objects such as (3b) and the wh-ESQ form with wh-adjuncts such as
(2a). Table 1 and Figure 2 have shown that the preferred order in usage is wh-in-
situ. This observation is unambiguous for ordinary, non-clefted questions, and it
also seems to apply to clefted questions (though the very low number of clefts
makes this last claim somewhat speculative).
Yet, the frequencies suggest two hypotheses to be pursued in future research:
The extreme scarcity of wh-cleft questions (of any type) is somewhat puzzling.
Either contrastive focus itself is a very scarce phenomenon in spontaneous speech
or contrastive focus is mainly expressed by prosodic and not syntactic means in
French wh-questions. What we can observe is that a syntactic device, namely
clefting, exists in French grammar but is hardly ever put to use. If we assume that
contrastive focus as such is not an extremely scarce phenomenon in spontaneous
speech interrogatives, we have to conclude that wh-cleft constructions are not a
standard form of expressing contrastively focused wh-questions in French, con-
tradicting Zubizarreta and Vergnaud (2005). These questions call for research at
the syntax-phonology interface where the context of each sentence would be care-
fully analyzed, too. The question of whether contrastive focus in French wh-ques-
tions is marked by prosodic rather than by syntactic cues remains an open one.
The scarcity of the marked wh-in-situ order SVwhO with postposed object
as in (5), which – as the judgments show – is within the range of acceptable con-
structions, is also somewhat surprising. One possible analysis would be that
a construction like (5), repeated as (11a), is derived from a construction with a
right-dislocated object as in (11b) by omitting the coreferential clitic pronoun in a
process similar to topic drop.
The preferences for certain wh-variants revealed by the judgments (recall that
all are nuances within the range of acceptable constructions) do not match the
pattern in usage. This overall acceptability-frequency mismatch is most salient
for the whVS form (4a)/(4b) (subject-clitic inversion receives the highest accepta-
bility scores and hardly occurs in usage), and is also fairly clear for the wh-in-situ
form (1a)/(1b) (its very high frequency is not reflected in the acceptability scores).
Interestingly, these two variants have a “non-neutral” register or style value, with
the wh-in-situ form being [+colloquial] and the subject-clitic-inversion [+formal].
To make the picture complete: whSV (3a)/(3b) and SVwhO (5) are also [+collo-
quial], while wh-ESQ (2a)/(2b) is often described as “neutral” (Behnstedt 1973:
104; Coveney 1996: 98) in the sense that it fits into several registers. I represent the
register-neutrality of wh-ESQ by the presence of both [+colloquial] and [+formal].
Dufter (2008) shows that c’est clefts occur 2.5 times more often in corpus data of
spoken French compared to corpus data of written French. My interpretation of
his result is that wh-clefts, as scarce as they may be, are [+colloquial] (or better,
they tilt towards the [+colloquial] side). Hence, the two variants with the highest
acceptability values (wh-ESQ (2a)/(2b) and whVS (4a)/(4b)) are precisely those
forms which bear the [+formal] feature. What does this result mean for the relation
between acceptability and frequency? It seems that speakers cannot help taking
the normative perspective into consideration when making acceptability judgments.
Please recall that subjects were thoroughly instructed to leave aside the
normative perspective and to rely on colloquial language. I come back to this
point in Section 5.
reason questions, wh-time questions and other wh-adjunct questions. Each cat-
egory was further subdivided into questions with (phonologically lighter) simple
wh-words (e. g. pourquoi ‘why’, quand ‘when’, où ‘where’, comment ‘how’) and
(phonologically heavier) discourse-linked and/or prepositional wh-expressions
(e. g. pour quelle raison ‘for which reason’, dans quelle pièce ‘in which room’, de
quelle manière ‘in which way’). The separation by these categories was motivated
as follows: First, wh-time questions match the test sentences of the acceptability
judgments. Furthermore, there are good reasons to believe that time adjuncts are
placed higher in the syntactic tree than many other adjuncts (e. g. manner, place)
(see e. g. Rigau 2002, who adjoins the former to IP and the latter to VP). Second,
wh-reason adjuncts show a particular behavior in many languages: For example,
only wh-reason questions allow preverbal subjects (as opposed to unmarked
postverbal subjects) in all Spanish varieties (Torrego 1984; Gutiérrez-Bravo 2006;
Adli 2010b). Stepanov and Tsai (2008) argue in a cross-linguistic study that wh-
reason (and wh-purpose) questions differ from other wh-adjuncts by their very
high position in the tree. They place them in a high layer of the CP-system. Third,
all other wh-adjuncts remain aggregated in order to minimize problems of data
scarcity.
With regard to wh-objects, we distinguished (as with wh-adjuncts) between
simple wh-words and D-linked and/or prepositional wh-expressions. In addition,
we distinguished between [+human] and [-human] wh-objects. Please note that
the fine-grained analysis of wh-objects is based on the data of a subsample of
N=48 speakers – unlike the rest of our quantitative analyses, which build on the
results of 101 speakers: Animacy of referential expressions was later added to the
annotations, but only for roughly half of the sample. Nevertheless, 48 speakers
still represent a sufficient sample size. Moreover, since we compare arithmetic
means of individual relative frequencies below, the different sample sizes
are also unproblematic from this technical perspective. We find 196 instances of
[-human] wh-object questions within this subsample. However, [+human] wh-
objects (e. g. qui ‘who’ or quelle personne ‘which person’) only occur three times
and cannot be analyzed due to data scarcity.4 The extreme frequency discrepancy
between [+human] and [-human] wh-object questions might be – for reasons not
yet fully understood – a general yet surprising property of spontaneous speech:
our findings are in line with the distribution in the Ottawa-Hull corpus, where
que/quoi ‘what’ were identified by Elsig (2009: 157) 434 times compared to only
14 instances of qui ‘who(m)’.
4 We would like to add that wh-indirect questions would not be analyzable either because they
occur only three times in the entire corpus.
Table 2 shows a total of 37 wh-reason questions, most of them with pourquoi.
We also count 171 wh-time questions, most of which are realized by non-simple
forms (e. g. à quelle heure ‘at which hour’). Furthermore, there are 221 other wh-
adjuncts: The most frequent one in this category is the manner adjunct comment
‘how’, the second most frequent one is the place adjunct où (‘where’). There are
also some non-simple wh-elements, mostly place (e. g. à quel étage, dans quelle
pièce, dans quel domaine) and manner adjuncts (e. g. de quelle manière, dans
quelles circonstances, en quels termes).
Table 2: Relative and absolute frequencies of different types of wh-adjunct and wh-object
questions
variant      wh-reason               wh-time                 other wh-adjuncts       wh-object
             simple      non-simple  simple      non-simple  simple      non-simple  simple      non-simple
wh-in-situ   0.03 (5)    0.01 (1)    0.12 (24)   0.43 (94)   0.41 (94)   0.1 (29)    0.26 (54)   0.22 (42)
wh-ESQ       0.03 (3)    0 (0)       0.04 (11)   0.01 (3)    0.02 (7)    0 (0)       0.35 (79)   0 (1)
whSV         0.15 (19)   0.02 (3)    0 (1)       0.1 (26)    0.28 (73)   0.02 (7)    0 (0)       0.06 (13)
whVS         0.01 (5)    0.01 (1)    0 (0)       0.03 (10)   0.03 (6)    0.02 (3)    0.01 (3)    0.03 (4)
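As a plausibility check, the absolute counts in parentheses in Table 2 can be summed per wh-type. The sketch below is illustrative only. It assumes that the eight value columns are simple/non-simple pairs for wh-reason, wh-time, other wh-adjuncts and wh-objects, in that order. This column reading is inferred from the surrounding text and is not stated in the rows themselves. The names `WH_TYPES`, `TABLE2_COUNTS` and `totals_by_wh_type` are hypothetical.

```python
# Absolute token counts (the numbers in parentheses) of the four rows printed in Table 2.
# ASSUMPTION: columns are simple / non-simple pairs for the four wh-types below.
WH_TYPES = ["wh-reason", "wh-time", "other wh-adjuncts", "wh-objects"]

TABLE2_COUNTS = {
    "wh-in-situ": [5, 1, 24, 94, 94, 29, 54, 42],
    "wh-ESQ": [3, 0, 11, 3, 7, 0, 79, 1],
    "whSV": [19, 3, 1, 26, 73, 7, 0, 13],
    "whVS": [5, 1, 0, 10, 6, 3, 3, 4],
}


def totals_by_wh_type(counts):
    """Sum each simple/non-simple column pair over all variant rows."""
    totals = {}
    for k, wh_type in enumerate(WH_TYPES):
        totals[wh_type] = sum(row[2 * k] + row[2 * k + 1] for row in counts.values())
    return totals


printed_totals = totals_by_wh_type(TABLE2_COUNTS)
for wh_type, total in printed_totals.items():
    print(wh_type, total)
```

Under this reading the four printed rows add up to 37 wh-reason, 169 wh-time, 219 other wh-adjunct and 196 wh-object tokens. The text reports 37, 171, 221 and 196. The difference of four adjunct tokens is consistent with the four tokens that the footnote on the disregarded SVwhO and cleft rows mentions, although how those tokens are distributed over the columns is not stated.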
5 The rows for the SVwhO, wh-cleft and wh-in-situ cleft variants can be disregarded: With just
four tokens across all wh-types, their relative frequencies are mostly 0.00 (in two cells they are
0.01).
6 This reasoning in terms of true variation of course does not apply to the category of other wh-
adjuncts. The envelope in this case is more approximate.
The coarse-grained analysis in Figure 4 above reveals that the wh-in-situ order is
the most frequent variant for both wh-adjunct and wh-object questions. However,
the fine-grained analysis in Figure 5 now exhibits two constructions with a different
pattern, namely a dispreference for wh-in-situ: (i) the wh-reason question
with pourquoi is preferred with the whSV form, and (ii) the wh-object question
with que/quoi is preferred with the wh-ESQ form.
might be unstable phenomena, in which case a “small push” could bring them to
extinction. We conclude that est-ce que is limited to simple, i. e. phonologically
light, mostly monosyllabic wh-forms: The only clearly stable wh-construction
occurring with est-ce que is the wh-object question with que. In the course of its
grammaticalization since its appearance in the sixteenth century (Foulet 1921:
265), est-ce que has lost the meaning of the source construction in wh-questions
(Druetta 2003): Est-ce que wh-questions are no longer emphatic. This source
meaning only remains – to a minor extent – in present-day yes/no questions
(Mosegaard-Hansen 2001: 471). In wh-questions, est-ce que has thus become a
neutral, redundant interrogative particle (redundant because the interrogative
feature is already expressed by the wh-element). The fact that est-ce que is clearly
limited to que in modern spoken French suggests a further change: Est-ce que now
primarily functions as a morpho-phonological host for the wh-clitic que. Inter-
estingly, que differs from all other French wh-words in that it is not an indepen-
dent word but a proclitic requiring a host (Poletto and Pollock 2004): It can either
cliticize to a verb (qu’as-tu dit ‘what have you said’) – however, the whVS order
is, as we assume, not part of colloquial Modern French grammar. Alternatively, it
can cliticize to est-ce que – which is thus the only remaining option for fronting
que in the colloquial variety.
Finally, we notice that the mismatch between acceptability and frequency
for the whSV order shown in Figure 4 would be less pronounced if we took into
account the relative frequency of quand questions (see Figure 5). The patterns of
the very low relative frequency of quand and the low relative frequency of wh-
objects are similar to the acceptability values of whSV wh-adjunct and wh-object
questions: These judgment values are rather low within the overall picture of all
wh-variants. Furthermore, wh-adjunct questions have a slightly lower accepta-
bility value than wh-object questions in Figure 4, i. e. the directionality of the
acceptability-frequency mismatch between argument and adjunct questions dis-
appears if we restrict ourselves to quand questions in the corpus.
5 Discussion
being a cliticization host for que in colloquial French, it remains a broadly available,
optional interrogative particle in wh-questions in standard French. It is
interesting to note – also as an anecdote on standard French prescriptivism –
that est-ce que was only approved by the Académie Française in the 1930s – to
be “disapproved” again in 1987 (Grevisse 1993: 605/606). Nowadays, the wh-ESQ
question is neither considered elegant nor “popular” from a normative point of
view. Rather, it can be described as neutral. In Section 4.3, this register neutrality
has been represented as the coexistence of both a [+formal] and a [+colloquial]
feature. That being said, how does the normative influence act on the accepta-
bility judgments?
First, the fact that neutral wh-ESQ, like formal whVS, scores highest in
acceptability indicates that normative influence or bias on judgments does not
seem to act as a bonus for the standard variety, but rather as a malus/cost for con-
structions that are [+colloquial] only. Interestingly, colloquial wh-in-situ is not
among the constructions that scored highest, either.
The acceptability judgments on the highly formal complex inversion ques-
tion – which is not discussed in detail in the present study because it does not
belong to the envelope of variation – also support this assumption. wh-object
questions with complex inversion such as (6b) received a mean acceptability
score of 0.8, and wh-adjunct questions with complex inversion such as (6a)
received a score of 0.94 – yet they hardly ever occur (see Table 1). A compari-
son with the other ratings in Figure 4 reveals that (6a) receives a relatively high
score – irrespective of its particularly high level of formality. Importantly, norma-
tive influence does not explain any categorical difference in terms of acceptabil-
ity vs. unacceptability, but it is one of the factors behind the systematic nuances
within the range of acceptable constructions.
Second, we can observe a difference in the span of registers reflected by
acceptability and frequency data. While frequency data from spontaneous
speech (excluding highly formal speech contexts) provide insight into colloquial
language, acceptability data reflect the entire range of registers available to
a speaker. The results suggest that it is difficult for speakers to judge a variant
that does not exist in register x as unacceptable, as long as it exists in register y.
In other words, speakers seem to accept a construction if it belongs to any reg-
ister of their language. However, based on the present results it is still unclear
whether this phenomenon only occurs if register y is higher than register x, i. e.
whether speakers are only unable to disregard those constructions with a higher
stylistic value. The effect of register spanning in judgments is at least one part of
the explanation as to why certain constructions hardly occur although they are
rated as acceptable. It might also offer a diagnostic tool to distinguish between
diglossia and bilingualism. Bilingualism would be characterized by a better
References
Adli, Aria (2004): Grammatische Variation und Sozialstruktur. Berlin: Akademie Verlag.
Adli, Aria (2006): French wh-in-situ Questions and Syntactic Optionality: Evidence from three
data types. Zeitschrift für Sprachwissenschaft 25(2): 163–203.
Adli, Aria (2010a): Constraint Cumulativity and Gradience: Wh-Scrambling in Persian. Lingua
120(9): 2259–2294.
Adli, Aria (2010b): The Semantic Role of the Wh-Element and Subject Position in Spanish and
Catalan. STUF – Language Typology and Universals 63(2): 103–117.
Adli, Aria (2011a): Gradient Acceptability and Frequency Effects in Information Structure: a
quantitative study on Spanish, Catalan, and Persian. Habilitationsschrift, Universität
Freiburg.
Adli, Aria (2011b): A Heuristic Mathematical Approach for Modeling Constraint Cumulativity:
Contrastive Focus in Spanish and Catalan. The Linguistic Review 28(2): 111–173.
Adli, Aria (2011c): On the Relation between Acceptability and Frequency. In: Esther Rinke and
Tanja Kupisch (eds.), The development of grammar: language acquisition and diachronic
change – In honour of Jürgen M. Meisel, 383–404. Amsterdam/New York: John Benjamins.
Adli, Aria (2013): Syntactic Variation in French Wh-Questions: a quantitative study from the
angle of Bourdieu’s sociocultural theory. Linguistics 51(3): 473–515.
Auger, Julie (1994): Pronominal Clitics in Quebec Colloquial French: a Morphological Analysis.
PhD dissertation, University of Pennsylvania.
Ayres-Bennett, Wendy (1994): Negative Evidence: Or Another Look at the Non-Use of Negative
ne in Seventeenth-Century French. French Studies 48: 63–85.
Baayen, R. Harald, Cristina Burani and Robert Schreuder (1997): Effects of semantic
markedness in the processing of regular nominal singulars and plurals in Italian. In: Geert
Booij and Jaap van Marle (eds.), Yearbook of Morphology 1996, 13–33. Dordrecht: Kluwer.
Backus, Ad and Maria Mos (2011): Islands of (Im)Productivity in Corpus Data and Acceptability
Judgments: Contrasting Two Potentiality Constructions in Dutch. In: Doris Schönefeld
(ed.), Converging Evidence: Methodological and Theoretical Issues for Linguistic Research,
165–192. Amsterdam: John Benjamins.
Bader, Markus and Jana Häussler (2010): Toward a model of grammaticality judgments. Journal
of Linguistics 46(2): 273–330.
Bard, Ellen Gurman, Dan Robertson and Antonella Sorace (1996): Magnitude Estimation of
Linguistic Acceptability. Language 72(1): 32–68.
Behnstedt, Peter (1973): Viens-tu? Est-ce que tu viens? Tu viens ? Formen und Strukturen des
direkten Fragesatzes im Französischen. Tübingen: Narr.
Bybee, Joan L. and David Eddington (2006): A Usage-based Approach to Spanish Verbs of
‘Becoming’. Language 82(2): 323–355.
Chomsky, Noam (1965): Aspects of the Theory of Syntax. Cambridge: MIT Press.
Coveney, Aidan B. (1996): Variability in Spoken French. A Sociolinguistic Study of Interrogation
and Negation. Exeter: Elm Bank.
Coveney, Aidan and Laurie Dekhissi (2013): Variation dans l’emploi des interrogatives partielles
dans le cinéma de banlieue. Paper presented at ‘La syntaxe des interrogatives’, Neuchâtel.
Crocker, Matthew and Frank Keller (2006): Probabilistic grammars as models of gradience. In:
Gisbert Fanselow, Caroline Fery, Ralf Vogel and Matthias Schlesewsky (eds.), Gradience in
Grammar, 227–245. Oxford: Oxford University Press.
Culbertson, Jennifer (2010): Convergent evidence for categorial change in French: From subject
clitic to agreement marker. Language 86(1): 85–132.
Dekhissi, Laurie (in prep.): Un nouveau français populaire? PhD dissertation, University of
Exeter.
Déprez, Viviane, Kristen Syrett and Shigeto Kawahara (2013): The interaction of syntax,
prosody, and discourse in licensing French wh-in-situ questions. Lingua 124: 4–19.
Drijkoningen, Frank and Brigitte Kampers-Manhe (2008): On inversions and the interpretation
of subjects in French. Probus 20(2): 147–209.
Druetta, Ruggero (2002): Qu’est-ce tu fais? État d’avancement de la grammaticalisation de
est-ce que. Première partie. Linguae 2: 67–88.
Druetta, Ruggero (2003): Qu’est-ce tu fais? État d’avancement de la grammaticalisation de
est-ce que. Deuxième partie. Linguae 1: 21–35.
Druetta, Ruggero (2008): La question en français parlé : étude distributionnelle. Torino:
Trauben Edizioni.
Dufter, Andreas (2008): On explaining the rise of c’est-clefts in French. In: Ulrich Detges and
Richard Waltereit (eds.), The Paradox of Grammatical Change: Perspectives from Romance,
31–56. Amsterdam: John Benjamins.
Elsig, Martin (2009): Grammatical Variation across Space and Time – The French Interrogative
System. Amsterdam/Philadelphia: John Benjamins.
Escandell-Vidal, Victoria (2002): Echo-syntax and metarepresentations. Lingua 112: 871–900.
Featherston, Sam (2005): The Decathlon Model: Design features for an empirical syntax. In:
Stephan Kepser and Marga Reis (eds.), Linguistic Evidence: Empirical, Theoretical, and
Computational Perspectives, 187–208. Berlin/New York: Mouton de Gruyter.
Foster, Jennifer (2007): Real bad grammar: Realistic grammatical description with grammat-
icality. Corpus Linguistics & Linguistic Theory 3(1): 73–86.
Foulet, Lucien (1921): Comment ont évolué les formes de l’interrogation. Romania 47: 243–348.
Freyd, Max (1923): The graphic rating scale. Journal of Educational Psychology 14: 83–102.
Funke, Frederik (2010): Internet-Based Measurement With Visual Analogue Scales: An
Experimental Investigation. PhD dissertation, Universität Tübingen.
Gadet, Françoise (2007): La variation sociale en français (2ème édition). Paris: Ophrys.
Greenbaum, Sidney (1976): Syntactic Frequency and Acceptability. Lingua 40: 99–113.
Greenbaum, Sidney (1977): Judgments of Syntactic Acceptability and Frequency. Studia
Linguistica: Revue de Linguistique Generale et Comparee/Journal of General and
Comparative Linguistics 31: 83–105.
Greenberg, Joseph H. (1966): Language universals: with special reference to feature
hierarchies. The Hague: Mouton.
Grevisse, Maurice (1993): Le bon usage: grammaire française. Paris: Duculot.
Gutiérrez-Bravo, Rodrigo (2006): Structural Markedness and Syntactic Structure: A Study of
Word Order and the Left Periphery in Mexican Spanish. New York/London: Routledge.
Hamlaoui, Fatima (2011): On the role of phonology and discourse in Francilian French
wh-questions. Journal of Linguistics 47(01): 129–162.
Haspelmath, Martin (2006): Against Markedness (and What to Replace It With). Journal of
Linguistics 42(1): 25–70.
Kayne, Richard S. and Jean-Yves Pollock (1978): Stylistic inversion, successive cyclicity and
move-NP in French. Linguistic Inquiry 9: 595–621.
Kempen, Gerard and Karin Harbusch (2005): The Relationship between Grammaticality Ratings
and Corpus Frequencies: A Case Study into Word Order Variability in the Midfield of
German Clauses. In: Stephan Kepser and Marga Reis (eds.), Linguistic Evidence: Empirical,
Theoretical and Computational Perspectives, 329–349. Berlin/New York: Mouton de
Gruyter.
Kempen, Gerard and Karin Harbusch (2008): Comparing Linguistic Judgments and Corpus
Frequencies as Windows on Grammatical Competence: A Study of Argument Linearization
in German Clauses. In: Anita Steube (ed.), The Discourse Potential of Underspecified
Structures, 179–192. Berlin/New York: de Gruyter.
Labov, William (1982): Building on empirical foundations. In: Winfred P. Lehmann and Yakov
Malkiel (eds.), Perspectives on Historical Linguistics, 17–92. Amsterdam/Philadelphia:
John Benjamins.
Labov, William (1996): When Intuitions fail. In: Lisa McNair, Kora Singer, Lise M. Dolbrin and
Michelle M. Aucon (eds.), Papers from the Parasession on Theory and Data in Linguistics.
Chicago Linguistic Society 32, 77–106. Chicago: Chicago Linguistic Society.
Lambrecht, Knud (2001): A framework for the analysis of cleft constructions. Linguistics 39(3):
463–516.
Manning, Christopher D. (2003): Probabilistic Syntax. In: Rens Bod, Jennifer Hay and Stefanie
Jannedy (eds.), Probabilistic Linguistics, 289–341. Cambridge: MIT Press.
Meisel, Jürgen M., Martin Elsig and Matthias Bonnesen (2011): Delayed Acquisition of Grammar
in First Language Development: Subject-Verb Inversion and Subject Clitics in French
Interrogatives. Linguistic Approaches to Bilingualism 1(4): 347–390.
Mosegaard-Hansen, Maj-Britt (2001): Syntax in interaction: Form and function of yes/no
interrogatives in spoken standard French. Studies in Language 25(3): 463–520.
Newmeyer, Frederick J. (2010): Accounting for rare typological features in formal syntax: three
strategies and some general remarks. In: Jan Wohlgemuth and Michael Cysouw (eds.),
Rethinking Universals, 195–221. Berlin/New York: Mouton de Gruyter.
Poletto, Cecilia and Jean-Yves Pollock (2004): On wh-clitics and wh-doubling in French and
some North Eastern Italian dialects. Probus 16(2): 241–272.
Poplack, Shana (1989): The care and handling of a mega-corpus. In: Ralph Fasold and Deborah
Schiffrin (eds.), Language Change and Variation, 411–451. Amsterdam: John Benjamins.
Prieto, Pilar and Gemma Rigau (2007): The Syntax-Prosody Interface: Catalan interrogative
sentences headed by que. Journal of Portuguese Linguistics 6(2): 29–59.
Pullum, Geoffrey K. (2007): Ungrammaticality, rarity, and corpus use. Corpus Linguistics &
Linguistic Theory 3(1): 33–47.
Reis, Marga (1991): Echo-w-Sätze und Echo-w-Fragen. In: Marga Reis and Inger Rosengren
(eds.), Fragesätze und Fragen. Referate anlässlich der Jahrestagung der Deutschen
Gesellschaft für Sprachwissenschaft, Saarbrücken 1990, 49–76. Tübingen: Niemeyer.
Rigau, Gemma (2002): Els complements adjunts. In: Joan Solà, Maria-Rosa Lloret, Joan Mascaró
and Manuel Pérez-Saldanya (eds.), Gramàtica del català contemporani, 2045–2110.
Barcelona: Empúries.
Rijkhoff, Jan (2010): Rara and grammatical theory. In: Jan Wohlgemuth and Michael Cysouw
(eds.), Rethinking Universals, 223–239. Berlin/New York: Mouton de Gruyter.
Sampson, Geoffrey R. (2007): Grammar without grammaticality. Corpus Linguistics & Linguistic
Theory 3(1): 1–32.
Sobin, Nicholas (2010): Echo Questions in the Minimalist Program. Linguistic Inquiry 41(1):
131–148.
Stefanowitsch, Anatol (2008): Negative entrenchment: A usage-based approach to negative
evidence. Cognitive Linguistics 19(3): 513–531.
Stepanov, Arthur and Wei-Tien Dylan Tsai (2008): Cartography and licensing of wh- adjuncts: a
cross-linguistic perspective. Natural Language & Linguistic Theory 26(3).
Torrego, Esther (1984): On inversion in Spanish and some of its effects. Linguistic Inquiry 15(1):
103–129.
Trubetzkoy, Nikolaus (1939): Grundzüge der Phonologie. Göttingen: Vandenhoeck & Ruprecht.
Zribi-Hertz, Anne (2010): Pour un modèle diglossique de description du français: quelques
implications théoriques, didactiques et méthodologiques. Journal of French Language
Studies FirstView: 1–26.
Zubizarreta, Maria Luisa and Jean-Roger Vergnaud (2005): Phrasal Stress, Focus, and Syntax.
In: Martin Everaert and Henk van Riemsdijk (eds.), The Syntax Companion, Vol. 3, 522–568.
Malden: Blackwell.
WH-ESQ, (2a)/(2b)
  wh-adjunct:
    Quand est-ce que tu rends ton livre ?
    Quand est-ce que tu récupères ta voiture ?
    Quand est-ce que tu prends ton médicament ?
  wh-object:
    Qui est-ce que tu rejoins à la piscine ?
    Qui est-ce que tu amènes à la maison ?
    Qui est-ce que tu invites au cinéma ?

WH-cleft, (8a)/(8b)
  wh-adjunct:
    Quand c’est que tu remplis le formulaire ?
    Quand c’est que tu écris ton livre ?
    Quand c’est que tu répares la moto ?
  wh-object:
    Qui c’est que tu entends dans ce hall ?
    Qui c’est que tu conduis à l’aéroport ?
    Qui c’est que tu déranges à la bibliothèque ?

Items from the instruction phase:
    Qui ont-ils tous regardé ?
    Qui tous ont regardé ?
    Tous ont regardé qui ?
    Tous sont regardés qui ?
    Que sont-ils tous regardés qui ?

Items from the training phase:
    Qui c’est que tu accompagnes à la gare ?
    Que contrôle quel douanier à la frontière ?
    Tu fermes la porte quand ?
    Dis-moi, Jérémie a pas balayé quoi ?
Part 3: Grammar, evolution, and diachrony
Hubert Haider, University of Salzburg
“Intelligent design” of grammars – a result
of cognitive evolution1
1 I am very much indebted to Fritz Newmeyer and Jon Ringen for a careful and critical reading of
a draft version of this paper (previous title: “Cognitive evolution – why language systems are …”).
Their criticism, encouragement and suggestions have been very instrumental. Thanks galore to
Göz Kaufmann for re-checking the final version. Any remaining shortcomings are to be blamed
on the author, of course.
1 Introduction
We have always been functionalists, from ancient times2 until today: Why do sea-
dwelling mammals have fins for limbs? – In water, fins are more useful than legs.
Why do languages employ acoustic signs? – In order to be independent of sight
contact. Why are we fond of functional explanations? – Because we are social
animals whose minds apparently have a predominant disposition for analyzing
complex situations in terms of actors, intentions and purposes.
Functionalism is a deeply entrenched and instinctively attractive common
sense perspective on complex design. The appropriate scientific perspective is
intuitively less accessible, as revealed in the history of science. A fairly recently
lost bastion of functionalism is life science. An initially entirely functionalist
standpoint was given up for a less anthropocentric but more explanatory account,
namely adaptation by evolution. The functionalist predisposition is character-
istic of our explanation-seeking mind but we must not project this on the object
of our scientific enquiry. Our understanding is functionalist; the ontology of the
objects of enquiry is not.
Linguistics is faced with the very same problem that Darwin solved. The
basic question was and is this: What explains functional design in the absence
of a designer? In Dawkins’ (1996) words, the watchmaker in evolution is blind,
but his products are working aptly. There is “intelligent” design, but there is no
intelligence that designed it. This reads like a paradox and the anti-Darwinian
camps regarded this as a fatal defect of Darwin’s idea of evolution as fed by
random variation. Intuitively, order out of random variation seems to contradict
experience and the second law of thermodynamics. How could random processes
produce order rather than chaos? What this intuition overlooks is this: Variation
(“mutation”) indeed enhances entropy, but selection is the antagonistic feature.
It eliminates most of the variation. “Natural selection acts as a sieve; it does not
single out the best variations, but it simply destroys the larger number of those
which are, from some cause or another, unfit for their present environment” (De
Vries 1909: 70).
If order and complexity emerge without an organizing ordering force, this
is a result of evolution. The order parameters that happen to emerge are a non-
random result of selection processes (see Heylighen (1999) on emerging com-
2 Why do celestial bodies move? In Aristotle’s view, the source of movement of the outer sphere
that triggers the movements of all inner spheres is not the causal “unmoved movent” as defined
in Physics (VIII). It is “aspiration” [sic] as a final cause (causa finalis); see Metaphysics 1074b, 34.
3 “Major transcendentalist figures include Johann Wolfgang von Goethe, Etienne Geoffroy St.-
Hilaire, Louis Agassiz, and Richard Owen. Each advocated the primacy of structure or form over
function, of the unity of type over the conditions of existence”. (Amundson 1998: 155; emphasis
mine)
4 “Many functionalists see rejection of generative theory as a fundamental component of
functionalism”. (Dryer 2007: 245) Mainstream generative grammar with its numerous hypothetical,
hidden unities employed (e. g. overt and covert movement, covert lowering, remnant movement,
roll-up movement, strong or weak features, overt or covert checking of features, etc.) is a good
example of (transcendental) structural explanation attempts. With or without these transcendental
(i. e. empirically untestable) amendments, an account in terms of usage-independent principles
that determine structure and form would not be accepted as explanation by functionalists.
visible in language change (Newmeyer 2005: 175). For Newmeyer, atomistic func-
tionalism does not pass thorough empirical testing. He opts for the explanatorily
weaker theoretical position, namely holistic functionalism.
Dryer (2007: 247) objects to Newmeyer’s compromising withdrawal position
as it “seems to exclude an intermediate position, that in at least certain cases, a
property of a particular grammar is directly motivated by some functional con-
sideration, but that the locus of this functional explanation was at the level of
historical change. Such a position seems to be a coherent one and is likely to
be widely held by functionalists. […] In other words, Newmeyer’s characteriza-
tion obscures the distinction between two different issues; that is, it conflates
the questions whether there is a direct link between functional explanation and
grammatical properties and whether the locus of functional explanation is at the
level of historical change or somewhere else (such as at the level of language
usage)”.
What is meant by a direct link is a causal relationship between a functional
aspect and a grammatical property in a functional explication of the grammati-
cal property. Obviously, Dryer is willing to partially accept the stronger version,
i. e. atomistic functionalism, and considers it scientifically and empirically
appropriate.
Givón (undated: 7) is equally categorical on this issue, favoring atomistic
functionalism when he refers to “roughly isomorphic matching”: “The process of
change itself, the invisible teleological hand that guides the ever-shifting but still
roughly-isomorphic matching of structures and functions, is driven by adaptive
selection, i. e. by functional-adaptive pressures”.
Givón (undated: 1) presents Aristotle as the founding father of functional-
ism: “Aristotle outlined the governing principle of functionalism, the isomorphic
mapping between form and function”. He explicitly refers to, and emphasizes,
the teleological hand and functional-adaptive pressures. But, in reality, there is
no pressure and no pressure generating device, and there is no teleological hand,
as will be argued below.
The essential drawbacks of this kind of functionalism are exactly these two
leading ideas, namely the “invisible hand” that designs more adaptive forms and
the “pressure” to improve functionality by adopting these forms. An explanation
based on the notion that future functionality drives present changes is
invalid, however. This insight was established in biology more than a century
ago. Functionalist “explanations” are appealing narratives that please our func-
tion-addicted mind, but these narratives are what they are, namely narratives
rather than scientific explanations.
The scientifically correct core of these narratives is the purpose-free process
of evolution. There is no teleology, no final causes (i. e. in the sense of causa
finalis), and no pressure; there is merely variation and selection through constant
exposure to a given environment. Darwin’s breakthrough is a scientific
theory of adaptation without any functionalist narratives, a theory of evolution
with purpose-free random variation and purpose-free non-random selection as
major components. What appears to be functionality-driven is but the emergent
effect of adaptation by selection. Final causes are not part of this system; they are
merely part of the perceptual filter through which many researchers perceive their
simplified world.
Of course, when biologists tell popular short-cut versions of examples of
evolution, they often talk as if they are telling a functionalist narrative, but this is
merely a façon de parler. They implicitly understand that a functionalist render-
ing is easier to grasp. The basic story, however, is a causal explanation in terms of
variation and selection. In a profound study on Darwin and adaptive design, Ruse
(2003) analyzes the pitfalls of our commonsensical desire for functional under-
standing of complex adaptive design. Our common sense is creationist; it prefers
an engineering perspective on design and our favorite approach for understand-
ing it is (invalid) functionalist reasoning.
Darwin expelled the “argument from design” from biology. His theory does
not have adaptive design built in as a premise; it emerges when evolution does
its work. Evolution presupposes an independently structured system that is rep-
licating. This is the structural side.5 Adaptation by selection covers the apparent
functionalist side of evolutionary developments.
This paper contends that in linguistics, we foster the same fallacies as biologists
did more than a century ago. The point is not structuralism vs. functionalism.
It is “form meets function by means of evolution”. For grammar theory,
this mode of explanation is novel. But of course it is not novel at all, since it is the
standard mode of explanation in biology. The orthogonal viewpoints – structure-
geared vs. function-driven – are wrong if maintained in isolation or as opponent
positions. It is the synthesis in the concept of cognitive evolution that does justice
to the correct insights of each of the competing standpoints, without their respec-
tive drawbacks.
5 Here is a simple example. Flying has been ‘invented’ several times in several distinct forms,
e. g. bats, birds, bumblebees, etc. In each case, the predecessor structures originally had served
different functions (e. g. thermoregulation).
Here is a non-linguistic example to start with: The rhinovirus that successfully
recruited me as a host organism follows no teleology and is not pushed by
any functional-adaptive pressure. It just happens to be a virus variant that my
immune system is not prepared for. It might have successfully blocked other
variants, but not this one. The success of the rhinovirus family lies in its variability.
This feature guarantees its ability to infect hosts and spread again and again.
If one describes this as an arms race between the “attacking” virus and the
“defending” immune system, the description is a functionalist narrative. This
narrative is easy to grasp but misleading in a crucial respect.
It suggests a causal functional relation that is not causal at all: Indeed, the
virus changes its appearance quickly and regularly, but not in order to be able to
outfox the immune defense, and it is under no functional pressure to do so. The
functionality is not programmed in; it is emergent. If the virus did not have the
property of changing frequently, it would not be a successful rhinovirus. In other words,
what we see is adaptation, and what we misapprehend by overinterpreting it is a
conjectured functional causality, which, according to Darwin, is the exact opposite
of a functionalist narrative. There is adaptivity without teleology. The virus
does not change in order to spread; it spreads because it changes.
The rhinovirus is successful and its recipe for success is rapid change. It is
this property which proves successful in the selection process that we interpret
as an arms race between virus propagation and virus elimination by the
potential hosts’ immune systems. The immune system, on the other hand, is the
cause of the particular property of the virus, since the virus is adapting to it due
to selection. Our immune system spurs the rhinovirus into becoming the kind of
virus it is now. Neither the virus nor my immune system anticipates the other.
Although this is an accurate account, it does not fit into our functionalist
narrative scheme. In this scheme, there is always an agent that is coerced into
doing this or that in order to gain this or that. In these narratives there is no role
for an immune system that is as it is (due to an independent line of evolution)
and a virus that changes frequently. Neither the immune system nor the virus
would qualify as teleological agents, but their interaction can often be described
as if they were. If a virus is said to be “under functional pressure” to change, our
commonsensical understanding of functional connections is happy with it, but
it is merely an appealing metaphor. If the virus did not change, it would not have
the chance to spread, and this would be the end of the story. What we see is a
virus population that spreads. Hence it happens to have this property, but not “in
order to” spread. Crucially, our functionalist tunnel vision does not perceive the
numerous other virus populations that lost their chance to spread because they
were sieved out. The “in order to”-supposition is the superfluous and misleading
ingredient we add. If the virus did not have the property of being able to outwit
human immune reactions by constantly changing, it would not spread. Hence,
our supposition is that it changes in order to be able to spread. This is the mis-
attribution of a teleological component to a complex situation, simply because it
is easiest for our common-sense understanding of adaptivity.
Haider (1998) addressed the issue with the intentionally ambiguous title: “Form
follows function fails”. This should be read as “form follows function” fails as
opposed to “form follows, function fails”. The combined message is this: The idea
that form follows from function is going to fail since form follows, but functions
may fail. Purpose or potential for future use does not explain a design, except
for intentionally designed tools. For self-replicating systems with “intelligent
design”, functions may be used to describe them but not to explain the causality
of their design. This has become a commonplace in the theory of science since
the classical work of Hempel and Nagel (Cummins 1975: 742). Functional explana-
tions are not causal explanations.
The grammars of natural languages are good sources for examples of alternative
means of implementing identical functions. For instance, “parts of speech” may
be identified by morphological means (affixes), particles, or word order. Infor-
mation structure properties, like focusing, may be coded by particles or word
order or both, or merely by intonation. In each case, an identification function is
implemented in different ways by different forms.
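If one and the same function can be implemented by different forms, then observing the function does not identify the form. The logical side of this point can be checked mechanically. The following is a minimal sketch in Python, added for illustration only; the names implies, is_valid and rule are hypothetical helpers, F stands for "a given form is present" and R for "a given result obtains". Under the premise "if F, then R", the sketch enumerates all truth-value assignments and tests which inference patterns have no counterexample.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q


def is_valid(premises, conclusion):
    """An inference pattern is valid if no assignment of truth values
    makes every premise true while the conclusion is false."""
    for f, r in product([True, False], repeat=2):
        if all(premise(f, r) for premise in premises) and not conclusion(f, r):
            return False
    return True


# Shared premise (an illustrative assumption): if form F is present, result R obtains.
rule = lambda f, r: implies(f, r)

# From F infer R (the causal direction).
modus_ponens = is_valid([rule, lambda f, r: f], lambda f, r: r)
# From the absence of R infer the absence of F.
modus_tollens = is_valid([rule, lambda f, r: not r], lambda f, r: not f)
# From R infer F (the functionalist direction).
from_result_to_form = is_valid([rule, lambda f, r: r], lambda f, r: f)

print(modus_ponens, modus_tollens, from_result_to_form)
```

The only pattern that fails is the inference from R to F. Its counterexample is the assignment in which R holds although F does not, which is the situation of a function that is implemented by some other form.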
Why is a functional explanation not causal? In a causal explanation, we
hypothesize that a cause C produces an effect E. Whenever C applies, we expect E
(ceteris paribus). The inference from C to E is valid.6 On the other hand, from the
absence of E we infer the absence of C. Finally, if C applies and E is absent, the
particular causal explanation is wrong. In sum, we regard C as an explanation
for E. In a functional explanation we hypothesize that a system contains a func-
tionally characterized item F in order to produce a result R. Here, we regard R as
the cause for F. But in this case, R cannot be a causal explanation for F (because
this would amount to backward causation). The inference from R to F, given R, is
not valid. And there are indeed cases in which R is given, but F is not (because
R may be the effect of F′ ≠ F). We cannot readily falsify a functional explanation,
either. Modus tollens7 cannot be applied: If we correctly describe a given result R
and hypothesize a function F as the prerequisite for R, and it turns out that our
predicted F is empirically wrong, this has no consequence for R.8 Of course, the
inference in the other direction, viz. from F to R, would be the familiar causal
explanation of R as the effect of F, but its functionalist inverse is not. For functional
reasoning there is no logically valid mode of inference. Cummins (1975:
765), who tries to defend some version of functional analysis, offers this as his
conclusion:
A formal account of extraposition does not deny its functionality, but the
functionality is an epiphenomenon. The primary issue is explaining how a
grammar with extraposition differs from a grammar without extraposition.
The explanation of the extraposition phenomenon is the grammar device that
accounts for it. This is a causal explanation. Grammars that provide this device
will produce extraposition structures as a result. The grammar property is the
cause for extraposition as a language phenomenon. “Functionality” does not
play a role in the explanation. The issue of grammar-parser fit is a higher-order
question. It is relevant only when we compare grammars that provide extraposi-
tion with grammars that do not. The adaptivity of grammars is best accounted for
in a theory that does not assume a direct causal relationship between a process-
ing effect and a formal detail of a grammar. However, ease of processing plays an
important role in language change, and language change is part of the cognitive
evolution of grammars.
As described above, a functional explanation is not cogent. First, there are
“strict” OV languages that do not admit “extraposition” and have existed for
millennia (e. g. Japanese). This flatly contradicts the assumption of a direct functional
grip on grammar. An even greater embarrassment for a functional explanation is the
fact that there are languages with extraposition (e. g. German, Dutch) that do not
allow extraposing a class of items in spite of their functional similarity to extra-
posable items:
(1) a. Ist [die Erklärung, [die uns hier von ihm angeboten wurde]] wirklich richtig?
‘is [the explanation [that (to)us here by him offered was]] indeed correct’
b. Ist [die [uns hier von ihm angebotene] Erklärung] wirklich richtig?
‘is [the [(to)us here by him offered] explanation] indeed correct?’
c. Ist [die Erklärung] wirklich richtig [die uns hier von ihm angeboten wurde]?
d. *Ist [die Erklärung] wirklich richtig [uns hier von ihm angebotene]?
e. *Ist [die Erklärung [uns hier von ihm angebotene]] wirklich richtig?
The relative clause in (1a) may be extraposed as in (1c), but the complex participial
modifier in (1b), which is functionally (i. e. discourse-functionally) equivalent
to a relative clause, cannot be extraposed, either to the end of the clause
(1d) or to the end of the NP (1e). Obviously, it is the grammar that regulates
extraposition, and the reduction of center embedding does not dictate the particular
grammar design. Note that the participial attribute in (1b) is on a left branch in
the nominal phrase, in between the determiner and the head noun. Therefore it
is a much greater obstacle for the parser than the post-nominal relative clause in
(1a). However, extraposition is ruled out for (1b).
“[…] the only scientifically coherent account for the origin of any complexly organized
functionality is (ultimately) evolution by natural selection. […] All (non-chance)
functionality is ultimately attributable to the operation of adaptations, that is, naturally
selected innate aspects of the cognitive architecture. Cognitive science and the adaptationist
branches of biology are natural intellectual companions” (Tooby and Cosmides 1990: 761).
“It is magical thinking to believe that the “need” to solve a problem automatically endows
one with the equipment to solve it. For this reason, the invocation of social and practical
“needs”, pragmatic factors and acquired heuristics, or “functionalist” hypotheses to
explain language acquisition need to be reformulated in explicitly nativist terms”. (Tooby
and Cosmides 1990: 762)
The latter statement is correct, but too specific in one point, namely the reference
to “explicitly nativist terms”. This is an unnecessary restriction, and not a very
plausible one. It is far from clear that the biological evolution of the human brain
left enough time for it to become endowed with a sufficiently rich innate language
faculty. All our language capacities are parasitic on previously evolved capacities
of the brain. The human innovation is not so much one in terms of newly acquired
brain mechanisms but rather in terms of evolutionarily improved brain capacities
that already existed and of more efficiently crosslinking them (see Rauschecker
and Scott 2009).
10 Example: Er hat nicht damit gerechnet ⇒ Er hat nicht gerechnet damit. (‘He did not reckon
with-it’). ‘Damit’ (‘with-it’) is a single (compound) word. Both variants are equivalent under
complexity measures.
11 Example: The destroyed city has been partly rebuilt. It was destroyed by a flood triggered by
Hurricane Katrina.
Tooby and Cosmides think in terms of an obvious dichotomy for biologists,
namely invalid functionalism vs. valid evolutionary structuralism. In biology, a
structure is usually innately determined (by the genome). In our case of gram-
mars as cognitive information structures, the structure is a cognitive “organism”
that utilizes organically determined structures (i. e. our brain functions). There-
fore, the phenotype (i. e. language) is not exclusively dependent on innate qual-
ities. Cognitive evolution is an evolution that is principally independent of its
biological implementation and substrate.
Biological evolution on the genetic level is not the exclusive source of adap-
tive functionality.12 In the case of cognitive capacities, natural selection gets a
chance to operate on cognitive programs and their representations. The theory of
evolution as developed by Darwin is principally substance-neutral, although it
was developed and explicated as a theory of explaining the “origin of species by
means of natural selection” (Darwin 1859). Basically, all it requires is a replicating
system that produces enough variants that are constantly exposed to selection.
Biological selection (which is based on the reproductive success of the
phenotype) could not explain the intricate and biologically irrelevant grammar-
internal details of languages. Nevertheless, the idea of a piece-by-piece biological
evolution of grammar has been ventured by Pinker and Bloom (1990).13 However,
evolutionary success in biological evolution must be translated into reproductive
success. It is hard to see what an accidental change in a cognitively encapsu-
lated system of formal operations for symbol recombination could contribute to
the reproductive success of those whose brain supports the change compared to
those whose brain does not:14
12 “As a causal theory natural selection locates the causally relevant differences that lead to
differential reproduction. These differences are differences in organisms’ fitness to their
environment. Or, more fully, they are differences in various organismic capacities to survive and
reproduce in their environment”. (Stanford Encyclopaedia of Philosophy
<http://plato.stanford.edu/entries/natural-selection/>)
13 In the early days, (Friedrich) Max Müller tried to make the strongest possible point against
an all-encompassing concept of evolution. He emphasized the impossibility of the biological
evolution of language as a strong argument against Darwin’s theory of evolution: “Language is
the Rubicon which divides man from beast, and no animal will ever cross it [...] The science of
language will yet enable us to withstand the extreme theories of the Darwinians, and to draw a
hard and fast line between man and brute”. (Müller 1862: lecture IX)
14 See Bierwisch (2000) for a detailed discussion of the conundra and paradoxa of attempts to
Why are grammars luxurious15 and diverse? They are luxurious because the neural
substrate freely provides processing capacities for this luxury. What appears to be
a superfluous complexity is but the costless exploitation of the potential of the
human brain, which happens to be available for free. The “programmer”
of this potential is not an “invisible hand” and it is not a society-based net of
communicative needs.16 It is an ongoing process of cognitive evolution. Just like
17 The great-grand-mother of Icelandic and the modern Norwegian varieties is Old Norse. The
Icelandic settlers were Norwegians from North-West Norway.
and in fact the vertebrate eye. It suffers from an evident “constructional” defect.
Unlike the octopus (cephalopod) eye, its wiring design is the result of “tinkering”
design in Monod’s (1971) diction. The nerves approach the retinal cells from the
side at which the light arrives. The smarter “engineer” of the octopus eye correctly
placed the nerves on the dark sides of the cells. As a consequence of this design,
the human eye has a blind spot (scotoma).18
Functional reasoning may account for the advantage of (some) vision, but it
cannot explain the structures that enable vision and how they developed. Anal-
ogously, functional analysis may classify linguistic structures in terms of their
contexts of use, but this does not explain how they developed and why exactly
these structures are used and not others that would serve the same function even
better.
3 Uniform theory of evolution – different fields of application: Biological or cognitive
Already in 1871, Darwin pointed out that the theory of evolution is not substance-
dependent and consequently, the developments in languages appear to be par-
allel to biological evolution in terms of adaptation and “struggle for life” as a
consequence of variation and selection. In this publication on human physical
and cultural characteristics, evolution of culture and differences between sexes,
to name but a few topics, Darwin (1871: 59) made it clear that his theory of evolu-
tion is substance-neutral: “The formation of different languages and of distinct
species, and the proofs that both have been developed through a gradual process,
are curiously parallel. […] The survival or preservation of certain favored words in
the struggle for existence is natural selection”.
Intriguingly, linguists in those days (and in fact today, if we do not count
metaphorical allusions) did not take this eye-opener seriously.19 Instead, some
linguists attacked Darwin precisely on linguistic grounds, in complete misjudg-
ment of the nature of the problem (see Müller’s fierce attack, quoted in footnote
18 There is a blind spot at the back of each eye at the place where the optic nerve passes through
the eyeball since in this region there is no room for receptor cells. The brain computationally
eliminates the blind spot and we are not aware of it.
19 But biologists of today do. See Fitch (2007) for quantitative relationships between how
frequently a word is used and how rapidly it changes over time.
i. Evolution as such: The objects of the theory are not seen as constant or recently created nor
perpetually cycling, but rather as steadily changing. Organisms transform over time.
Linguistics: Grammars of languages are steadily changing, if not impeded by normative
efforts (schooling, script culture, etc.). Changes are not cycling but follow drifts. Acquisition
and language contact are the primary sources of grammar change. Another source of vari-
ation is the drive for linguistic in-group differentiation.
ii. Common descent: This theory states that each group of organisms descended from a
common ancestor, and that all groups of organisms, including animals, plants, and micro-
organisms, ultimately can be traced back to a single origin of life on earth.
20 It was a thoroughly history-minded and text-focused science that was gradually growing out
of philological disciplines. Ironically, a cornerstone of Indo-European studies turned out to be
identical with one of Darwin’s sub-theories, namely the theory of common descent.
21 See de Vogelaer (2007) for a confrontation in his explication of a particular grammar change.
Linguistics: Indo-European studies are a success story that illustrates this point. Languages
that descended from a single proto-language have spread as far as to Iceland in the North-
West and to the province of Xinjiang in China’s North-West (Tocharic). Today these languages
are different beyond superficial recognition, but they all are descendants of a single “mother
language”, with a research depth of about 3500 years.
iii. Multiplication of species: This theory explains the origin of the enormous organic diversity.
It postulates that species multiply, either by splitting into daughter species or by “budding”,
that is, by the establishment of founder populations that evolve into new species, if geo-
graphically isolated.
Linguistics: “Species” and “subspecies” translate as “language” and “varieties of a lan-
guage”. Latin, or more precisely its regional varieties, is the ancestor of a number of languages
(= species).22 What biologists call “budding”, is dialect split in language change. Geographic
isolation and “cross-fertilization” (i. e. language contact in bilingual brains) are catalysts for
“budding”.
iv. Gradualism: According to this theory, evolutionary change takes place through the gradual
change of populations and not by the sudden (saltational) production of new individuals
that represent a new type.
In linguistics, again, this is commonplace. Languages change over generations. Language
change is gradual, usually taking several generations and time spans of centuries. Changes
typically develop out of communities with dialectal variants co-existing for a long time.
v. Natural selection: According to this theory, evolutionary change comes about through the
proliferation of genetic variation in every generation. The individuals who thrive thanks
to a particularly well-adapted combination of inheritable characters give rise to the next
generation.
This last sub-theory is the crucial point. Linguists who would subscribe to i.–
iv. would not simultaneously assume natural selection to be the mechanism of
language change and the emergence of new species (= languages). What would
it mean that “individuals” survive and become the founding individuals of a new
“species” of language? All we have to do here is to step back and rethink the
analogies carefully. Of course it is not a question of survival and reproductive
success on the level of the human phenotype. However, there is an exact parallel
to biological evolution on a different and relevant level which has been hitherto
overlooked. It is the level of the cognitive evolution of cognitive representations
of replicating cognitive algorithms, namely grammars.
22 As linguists we know that there are many more descendant languages of Latin than merely
the ‘official’ Romance languages and the already extinct ones (like Dalmatian), from Sicilian,
Neapolitan, Istriot, to Friulian and Piemontese, to name just a few languages on the Apennine
peninsula.
23 A parallel in biology would be the direct influencing of a genome, for instance by ionizing
radiation.
Let us now try to answer the relevant questions: Which entities are selected and
how does selection work in the case of language and grammar? We have to make
clear the “what-is-what” in terms of a theory of evolution, namely what the vehicle
is, what the replicators are, what the interactors are and who is benefitting. The
minimally necessary background for applying these notions is easily accessible
thanks to Lloyd’s (2012) online contribution on units and levels of selection that
the following exposition takes advantage of:
Dawkins (1978) introduced “replicator” and “vehicle” to stand for different
roles in the evolutionary process. “Vehicle” is defined as “any relatively discrete
entity which houses replicators, and which can be regarded as a machine pro-
grammed to preserve and propagate the replicators that ride inside it” (Dawkins
1982b: 295). According to Dawkins (1982a: 62), most replicators’ phenotypic
effects are represented in vehicles, which are themselves the proximate targets of
natural selection. The term replicator, modified by Hull (1980), is used to refer to
any entity of which copies are made.
An “interactor” (Hull 1980: 318) denotes an entity which interacts, as a cohe-
sive whole, directly with its environment in such a way that replication is differ-
ential – in other words, an entity on which selection acts directly. The process of
evolution by natural selection is “a process in which the differential extinction
and proliferation of interactors cause the differential perpetuation of the replica-
tors that produced them” (Brandon 1982: 317–318).
24 For example:
i. Auch sie lachte (declarative) ‘also she laughed’
ii. Lachte sie auch? (interrogative) ‘laughed she also?’
In this simple German example (which can be replicated in any of the Germanic V2-languages),
ii. as a yes-no question is the word-by-word mirror image of i., but no child would wrongly jump
to the conclusion that questioning means mirroring the declarative order. No processing routine
of the brain would support this operation required by a rule of grammar.
Next, let us clarify what the corresponding referents are in the domain of cog-
nitive evolution with respect to language. Let us start with “grammar”, regarded
as a cognitively represented program for processing a given language. It is the
program that our language processing capacity for the given language is based on.
The grammar is the “replicator”. It is the entity that is replicated by language acqui-
sition based on productions of the grammar (utterances in the given language).
The replicator is the grammar of a language understood as an information
structure. The parallel to biology is very close. The genome is the information
structure that governs the make-up of an organism. Grammar is the information
structure that governs the operations of the language usage system (in acquisition,
production and reception). The grammar of a language is the system that
determines most properties of the utterances of a speaker of that language.
Next we have to identify the “interactor”. The interactor is the language as a
population of utterances. More precisely, it is the set of utterances the language
users produce and the set of utterances that are the input for language acquisition
by an individual. In other words, the grammatical properties of utterances that
serve as input for language acquisition are the basis for acquiring the grammar
that is responsible for the make-up of these utterances.
We now turn to the “vehicle”. It is the cognitive representation of the grammar
in the individual speaker’s brain. It is the cognitive “software” system that
enables us to produce and understand the language we have acquired. Impor-
tantly, we have to distinguish between grammar as an information structure and
its cognitive representation in the brain. Grammar as an information structure
is a cognitive virus, and the brain is the host. Like any virus, it needs the host
for reproduction: The cognitive grammar guides the brain in language process-
ing and acquisition. The produced language is the input for language acquisition
which carries the cognitive virus into the next language-acquiring brain.
Finally, we have to identify the selection environment. Remember that natural
selection acts directly on interactors and is the “process in which the differential
extinction and proliferation of interactors cause the differential perpetuation of
the replicators that produced them” (Hull 1980: 318).
The replicators are the grammars of a language in the population of users of
the particular language. It is not “the” grammar since the speech community of a
given language typically is not completely homogeneous. There is always variation,
and the variation corresponds to a set of grammar variants that differ minimally.
Selection has an effect on the differential perpetuation of these replicators.
This is cognitive selection. Some grammar variants win, while
some lose and become extinct.
What is responsible for this selection? It is the system of brain functions
that are recruited for language processing. Let me clarify this with an analogy of
25 Note that at this level of abstraction it is not essential to decide whether there exists a
domain-specific restriction on possible grammars (i. e. UG). It may exist or it may merely be the
reflection of just those restrictions that the neuro-cognitive architecture recruited for language
processing imposes on the kind of ‘cognitive programs’ it is able to support.
native grammar variant. The selecting system reacts passively; it merely is the
sieve. It does not actively influence the grammar package. Crucially it does not
restructure the grammar system – during or after acquisition – by improving its
compatibility with the general system properties. Things that work better gain
admittance, while things that do not work that easily are likely to be rejected
during acquisition.
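The passive “sieve” described here can be illustrated with a toy model. The following Python sketch is not part of the original argument: the variant names, the numerical `ease` scores and the update rule (a simple discrete replicator step) are assumptions chosen purely for illustration. The filter only reweights the variants that are already present in the input; it never alters a variant itself.

```python
def next_generation(shares, ease):
    """One round of acquisition: the share of each grammar variant in the
    input is weighted by a hypothetical processing-ease score and then
    renormalized. The variants themselves are never restructured."""
    weighted = {v: shares[v] * ease[v] for v in shares}
    total = sum(weighted.values())
    return {v: w / total for v, w in weighted.items()}


def simulate(shares, ease, generations):
    """Return the share distributions, generation by generation."""
    history = [dict(shares)]
    for _ in range(generations):
        shares = next_generation(shares, ease)
        history.append(shares)
    return history


# Hypothetical starting point: the innovative variant is rare but slightly
# easier to process (the numbers are assumptions, not measurements).
start = {"conservative": 0.95, "innovative": 0.05}
ease = {"conservative": 1.0, "innovative": 1.2}
history = simulate(start, ease, generations=60)
```

Under these assumed values the initially rare variant gains ground in every generation and eventually displaces its competitor, although nothing in the model "aims" at this outcome.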
26 This requirement may be masked by the pro-drop property: VO languages with pro-drop may
drop the unstressed pronominal subject, but they do not tolerate subjectless clauses, that is,
clauses without a subject argument. In VO languages without pro-drop, clauses without a sub-
ject argument require an expletive subject (see Haider 2010a: 35–38, 2013: 221–222). Null-subject
languages are languages in which the pronominal subject argument may be phonetically null,
but it is syntactically present. VO languages permit null subject clauses, but they do not permit
subjectless clauses.
Even if grammars are highly efficient, they nevertheless contain quite a few dys-
functional traits. The search for an optimal grammar would be in vain, however,
just like the search for the optimal animal. Efficiency is a matter of degree because
the selectors in the environments correspond to independent and hence some-
times conflicting demands. What is optimal for production may be suboptimal
for perception, and vice versa. What is optimal on the phonological level (e. g.
cluster reduction), may be suboptimal on the morphological level (e. g. cluster
reduction that produces non-distinct forms). This is a well-known and typical
situation for adaptation by selection. It is localistic and may create globally dys-
functional local maxima.
A strong case for adaptation by evolution and against society-driven func-
tionalism is the irreversibility of change. Interestingly, the irreversibility is
acknowledged by functionalists and declared a consequence of functionalism
(Givón undated: 8, on the “unidirectionality of change”). Functionalism does not
provide a demonstrative causality, however. The needs of a society are not coher-
ent and they may come and go, like trends in fashion. Language change, however,
is generally irreversible. When case morphology is gone it is not re-introduced
This is true of evolution on the level of the biological genotype as well as evolu-
tion on the level of a cognitive representation (viz. the cognitive representation
of grammar, with grammar as the “genotype” of the language it determines). In
each case, a reproductive system produces variation and this pool of variation is
exposed to blind selection. Selection is an environmental property. In biology, it is
the environment where the phenotype finds its resources, e. g. food. Analogously,
in cognitive selection the environment provides the resources for the phenotype.
The environment for cognitive evolution is the ensemble of brain resources for
language acquisition, production and perception. The brain resources constitute
the “biotope” in which the grammar “virus” resides after it has won the “struggle
for life” in the course of language acquisition.
Here is a concrete case for the sake of providing a more vivid impression,
given the abstract points raised above. It is the split of the Germanic language
family into a VO and an OV group during the time of the development of the Ger-
manic V2 property. This is a sketch of the crucial points only. For a detailed dis-
cussion please consult Haider (2010a) and Haider (2013: chapters 1 and 5).
In the Old Germanic languages, verb positioning was variable. Its base posi-
tion could be VP-final, VP-initial or VP-medial. Old English (Fischer et al. 2000:
51) is representative here (Haider 2010b; 2013, chapter 5).
The three alternative base positions for (5a) are indicated in (6). What this
amounts to is a high degree of indeterminacy for identifying the filler-gap relation
of the fronted finite verb.
(6) XP Vfin YP ZP
a. XP Vfin [YP ZP -i]
b. XP Vfin [YP -i ZP]
c. XP Vfin [-i YP ZP]
verb in the VP. The Northern group is head-initial (VO); the continental Western
group is head-final (OV). This is a unique split within a language family, and it is
parallel with the grammaticalization of V2. In fact, the latter change invited the
former.
The advantage of the change is obvious. It replaces a grammar with a high degree
of indeterminacy with a grammar with an easy to determine filler-gap relation.
The simpler grammar variant wins, and since there are two possible implementa-
tions (namely OV and VO, masked by V2), it comes as no surprise that both found
their way into the brains of language learners and users. The simpler grammar
wins because it suffices for processing the given language structure and there
is no necessity (in terms of a large residue of patterns that cannot be analyzed
with this grammar) for a more complex analysis that would be imposed by the
previous grammar. This is selection on the level of cognitive representation of
alternative grammar variants.
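The contrast between the indeterminate parse in (6) and a grammar with a fixed base position can be made concrete with a small sketch. This is an illustration only, not the author's formalism: the function name `candidate_parses`, the labels `final`, `medial` and `initial`, and the string notation are assumptions, and the "medial" slot is simplified to a single position between the VP constituents.

```python
def candidate_parses(constituents, base_positions):
    """Enumerate bracketings of 'XP Vfin [...]' with the gap '-i' of the
    fronted finite verb placed at each base position the grammar allows."""
    slots = {
        "initial": 0,                      # gap precedes the VP material (VO)
        "medial": len(constituents) // 2,  # gap between VP constituents (simplified)
        "final": len(constituents),        # gap follows the VP material (OV)
    }
    parses = []
    for name in base_positions:
        vp = list(constituents)
        vp.insert(slots[name], "-i")
        parses.append("XP Vfin [" + " ".join(vp) + "]")
    return parses


# Variable base position, as in (6a-c): three competing analyses.
variable = candidate_parses(["YP", "ZP"], ["final", "medial", "initial"])
# Fixed base position: exactly one analysis each.
rigid_ov = candidate_parses(["YP", "ZP"], ["final"])
rigid_vo = candidate_parses(["YP", "ZP"], ["initial"])
```

With a variable base position the same surface string supports three filler-gap analyses, whereas each of the two rigid options yields a single one, which is the sense in which the fixed-position grammars are the simpler variants.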
rigid head-positioning. Arguably, this is due to the adstrate effect of being embedded in
Slavic-speaking communities. Slavic languages are languages with flexible head-positioning. They all
show the variation illustrated in (3).
5.2 The target of evolution is not the utterance, but the grammar
Croft’s “theory of utterance selection” may account for the on-going process of
lexical changes (since this is by and large a Lamarckian kind of change), but not for
changes in the procedural system of language (viz. grammar change). Changes in
the lexicon are ubiquitous and continuous. This is not language change, however.
Language change is not so much a change in the inventory; it is a change in the
computational system. The token frequency of an utterance can only explain the
fossilization of an utterance30 as an idiom or the adoption of a novel lexeme. It is
neither the type frequency nor the token frequency of an utterance that matters.
What matters is the availability of an alternative structuring of an utterance. This
is not a question of token frequency, but rather a question of the size of a type set.
Dryer (2007: 245) emphasizes the importance of frequency: “One of the central
ideas of functionalist linguistics, especially over the past fifteen or so years, is
that frequency of usage plays a central role in explaining why languages are the
way they are”. Given the discussion above, it should be clear by now that we are
facing a chicken-and-egg problem. Adaptive success means higher utility, and
higher utility is likely to have an effect on the frequency of use. It is not frequency
that drives functionality. Causality goes in the other direction.
Newmeyer (1998: 124) puts it this way: “All linguists would agree that text
frequency is a response to a variety of factors, from cognitive complexity to
pragmatic usefulness. The question is to what extent frequency itself can legitimately
be called upon as an “explanation” for whatever phenomena seem to be sensitive
to it”.
29 2.9 million pages for (7a) versus 394,000 pages for (7b) (Google search on April 17th, 2012).
30 ‘Vater unser’ (the Lord’s Prayer, lit. ‘father our’) is ungrammatical in German, but it is the
first verse of the prayer every Christian knows. It is the direct translation of ‘pater noster’. Its
extremely high token frequency in German has not had any effect on the grammar of German,
though. Token frequency merely has the effect of fossilizing forms, but not of generalizing and
establishing the type.
“Ease” or “naturalness” in processing by the language processing brain is the
explanandum and frequency is the effect.31 In language change, higher frequency
may be a correlate and it may be part of an explanation, but it is not the cause.
The cause is a grammar change that makes a variant available that prevails over
“competing” variants. Prevailing may be reflected in frequency.
Frequency is a topic in first language acquisition, too. Here, an obvious
problem with frequency as a causal factor becomes perspicuous. The usual argu-
mentation uses frequency considerations as evidence and as a basis for a sus-
pected functional connection and overlooks that this explanatory move would
work only post hoc. If you know which functionality to look at, you can count
frequencies. But crucially, observing frequencies would not tell the child what to
do with frequency gradients. This is clearly stated by Yang (2004: 452): “Although
infants seem to keep track of statistical information, any conclusion drawn from
such findings must presuppose that children know what kind of statistical infor-
mation to keep track of. After all, an infinite range of statistical correlations exists”
and “statistics requires UG”. Statistical learning “is constrained by what appears
to be innate and domain-specific principles of linguistic structures, which ensure
that learning operates on specific aspects of the input” (Yang 2004: 455).
external and grammatical properties. The influence of the former on the latter
is played out in language use and acquisition and (therefore) language change
is manifested only typologically” (Newmeyer 2005: 175). This is exactly what we
expect for system adaptivity under natural selection. There is no direct influence
of function on form but nevertheless the system of forms will end up showing an
adaptive design.
Cognitive evolution provides a causal relation and predicts this overall
picture. The result of cognitive evolution is a family of adaptive changes in
grammar systems that may be described, but not explained, from a vantage point
of holistic functionalism. Cognitive evolution of grammars explains adaptation
without any functionalist backward causation and without any direct linkage
between properties of particular grammars and conjectured functional motiva-
tions for each of these properties.
It is needless to emphasize that the precise understanding of a theory of cog-
nitive evolution of grammars as cognitive systems replicated by brains is at least
as far away as it was for Darwin. He developed his theory without any precise
understanding of the real source of variation (i. e. genetic mutations), without any
idea of the real target of selection (viz. the genome), and had to link natural selec-
tion crucially with the idea of heredity long before the basic concepts of genetics
and inheritance were discovered. It took several generations of researchers until
the theory had arrived at its standard form as the modern evolutionary synthesis.
But he dealt with palpable entities, namely animals and plants.
We linguists deal with grammars. As in nineteenth-century biology, linguistics
will not make fundamental progress before neuro-cognitive
research offers us a more precise understanding of the selection environments
for various kinds of structurally different cognitive representations. For biolog-
ical evolution, the selection environment was much easier to estimate. Darwin
understood it to be everything that enhances or impedes reproductive success.
The environment for cognitive selection is not so easily accessible, although it is
very close to every theoretician since it is located somewhere between his right
and his left ear.
The other disadvantage in comparison to Darwin is equally basic, namely the
precise identification of the species and their formal distinctions. In biology, it
sufficed to observe, analyze and describe; in grammar theory, observation does
not help. At present, all we have are fewer than a dozen sufficiently analyzed grammar
systems. Typological cross-linguistic descriptions of human language grammars
are comparable to poor-quality photographs that barely suffice for telling apart a
gnu and a zebra.
For a sufficient understanding of cognitive selection we need a much better
understanding of the neuro-cognition of language processing and language learn-
References
Amundson, Ron (1998): Typology reconsidered: two doctrines on the history of evolutionary
biology. Biology and Philosophy 13: 153–177.
Beckner, Clay, R. Blythe, J. Bybee, M. H. Christiansen, W. Croft, N. C. Ellis, J. Holland, J. Ke,
D. Larsen-Freeman, T. Schoeneman (2009): Language is a complex adaptive system.
Language Learning 59: 1–26.
Bierwisch, Manfred (2000): Evolutionstheorie und Linguistik – Drei Probleme. In: Klaus Richter
(ed.), Evolutionstheorie und Geisteswissenschaften, 129–189. University Erfurt: Acta
Academia Scientiarum 5.
Brandon, Robert N. (1982): The levels of selection. Proceedings of the Philosophy of
Science Association 1: 315–323.
Croft, William (2000): Explaining language change: an evolutionary approach. London:
Longman.
Croft, William (2013): Language use and the evolution of languages. In: Kenny Smith and
Philippe Binder (eds.), The Language phenomenon, 93–120. Berlin: Springer.
Cummins, Robert (1975): Functional analysis. The Journal of Philosophy 72: 741–765.
Darwin, Charles (1859): On the origin of species by means of natural selection, or the
preservation of favoured races in the struggle for life. London: John Murray, Albemarle
Street.
Darwin, Charles (1871): The descent of man, and selection in relation to sex. London: John
Murray, Albemarle Street.
Dawkins, Richard (1978): Replicator selection and the extended phenotype. Zeitschrift für
Tierpsychologie 47: 61–76.
Dawkins, Richard (1982a): Replicators and vehicles. In: King’s College Sociobiology Group,
Cambridge (ed.), Current Problems in Sociobiology, 45–64. Cambridge: Cambridge
University Press.
Dawkins, Richard (1982b): The Extended Phenotype. Oxford: Oxford University Press.
Dawkins, Richard (1996) [1986]: The Blind Watchmaker. New York: W. W. Norton.
De Vogelaer, Gunther (2007): Innovative 2pl.-pronouns in English and Dutch – ‘Darwinian’ or
‘Lamarckian’ change? Studies van de BKL 2007 / Travaux du CBL 2007 / Papers of the LSB
(Linguistic Circle of Belgium) 2: 1–14.
De Vries, Hugo (1909): Variation. In: A. C. Seward (ed.), Darwinism and Modern Science, 66–84.
Cambridge: Cambridge University Press.
Dryer, Matthew S. (2007): Review of Newmeyer’s Possible and probable languages. Journal of
Linguistics 43: 244–252.
Fischer, Olga, Ans van Kemenade, Willem Koopman and Wim van der Wurff (2000): The syntax of
Early English. Cambridge: Cambridge University Press.
Fitch, W. Tecumseh (2007): An invisible hand. Nature 449: 665–666.
Givón, Talmy (undated): On the intellectual roots of functionalism in linguistics. Ms. University
of Oregon. URL <http://linguistics.uoregon.edu/wp-content/uploads/2013/03/FuncLing11Rus.pdf>
(18. 4. 2013)
Gould, Stephen Jay (2002): The structure of evolutionary theory. Cambridge, MA: Harvard
University Press.
Haider, Hubert (1997): Economy in Syntax is Projective Economy. In: Chris Wilder, Hans-Martin
Gaertner and Manfred Bierwisch (eds.), The Role of Economy Principles in Linguistic Theory
(Studia Grammatica 40), 205–226. Berlin: Akademie Verlag.
Haider, Hubert (1998): Form follows function fails – as a direct explanation for properties
of grammars. In: Paul Weingartner, Gerhard Schurz and Georg Dorn (eds.), The Role of
Pragmatics in Contemporary Philosophy, 92–108. Vienna: Hölder-Pichler-Tempsky.
Haider, Hubert (2001): Not every why has a wherefore – Notes on the relation between form
and function. In: Walter Bisang (ed.), Aspects of Typology and Universals, 37–52. Berlin:
Akademie-Verlag.
Haider, Hubert (2010a): The Syntax of German. Cambridge: Cambridge University Press.
Haider, Hubert (2010b): Wie wurde Deutsch OV? Zur diachronen Dynamik eines Strukturparameters
der germanischen Sprachen. In: Arne Ziegler (ed.), Historische Textgrammatik
und Historische Syntax des Deutschen – Traditionen, Innovationen, Perspektiven, 11–32.
Berlin/New York: De Gruyter.
Haider, Hubert (2013): Symmetry breaking in Syntax. Cambridge: Cambridge University Press.
Haspelmath, Martin (1999): Optimality and diachronic adaptation. Zeitschrift für Sprachwis-
senschaft 18: 180–205.
Hawkins, John A. (1994): A Performance Theory of Order and Constituency. Cambridge:
Cambridge University Press.
Hawkins, John A. (2004): Efficiency and complexity in grammars. Oxford: Oxford University
Press.
Heath, Jeffrey G. (1984): Language Contact and Language Change. Annual Review of
Anthropology 13: 367–384.
Hempel, Carl (1959): The logic of functional analysis. In: Llewellyn Gross (ed.), Symposium on
Sociological Theory, 271–307. New York: Harper & Row.
Heylighen, Francis (1999): The growth of structural and functional complexity during evolution.
In: Francis Heylighen, Johan Bollen and Alexander Riegler (eds.), The Evolution of
Complexity, 17–44. Dordrecht: Kluwer.
Hull, David L. (2001): Science and selection: Essays on biological evolution and the theory of
science. Cambridge: Cambridge University Press.
Lightfoot, David W. (2002): The development of language: Acquisition, change and evolution.
Oxford: Blackwell.
Lloyd, Elisabeth (2012): Units and levels of selection. In: Edward N. Zalta (ed.), The Stanford
Encyclopedia of Philosophy. URL (18. 4. 2013) <http://plato.stanford.edu/archives/
win2012/entries/selection-units/>
Mayr, Ernst (1991): One long argument. Cambridge, MA: Harvard University Press.
Millstein, Roberta L. (2006): Discussion of “four case studies on chance in evolution”:
Philosophical themes and questions. Philosophy of Science 73(5): 678–687.
Monod, Jacques (1971): Chance and necessity: An essay on the natural philosophy of modern
biology. New York: Alfred A. Knopf.
Müller, Friedrich Max (1862): Lectures on the Science of Language. London: Longman, Green,
Longman, and Roberts.
Nagel, Ernest (1961): The structure of science: Problems in the logic of scientific explanation.
New York/ Burlingame: Harcourt, Brace and World Inc.
Newmeyer, Frederick J. (1998): Language form and language function. Cambridge, MA: MIT
Press.
Newmeyer, Frederick J. (2001): ‘Where is functional explanation?’ In: Mary Andronis,
Christopher Ball, Heidi Elston and Sylvain Neuvel (eds.), Papers from the thirty-seventh
meeting of the Chicago Linguistic Society. Part 2: The Panels, 99–122. Chicago: Chicago
Linguistic Society.
Newmeyer, Frederick J. (2005): Possible and probable languages. Oxford: Oxford University
Press.
Pinker, Steven and Paul Bloom (1990): Natural language and natural selection. Behavioral and
Brain Sciences 13: 707–784.
Premack, David (1985): “Gavagai!” or the future history of the animal language controversy.
Cambridge, MA: MIT Press.
Rauschecker, Josef P. and Sophie K. Scott (2009): Maps and streams in the auditory cortex:
nonhuman primates illuminate human speech processing. Nature Neuroscience 12:
718–724.
Ruse, Michael (2003): Darwin and design: Does evolution have a purpose? Cambridge, MA:
Harvard University Press.
Tooby, John and Leda Cosmides (1990): Toward an adaptationist psycholinguistics. Behavioral
and Brain Sciences 13(4): 760–763.
Trudgill, Peter (2011): Sociolinguistic Typology: Social determinants of linguistic complexity.
Oxford: Oxford University Press.
Winford, Donald (2003): Contact-induced changes – classification and processes. In: Ohio State
University Working Papers in Linguistics 57: 129–150.
Yang, Charles D. (2004): Universal Grammar, statistics, or both? Trends in Cognitive Sciences 8:
451–456.
Guido Seiler, University of Munich
Syntactization, analogy and
the distinction between proximate and
evolutionary causations1
1 I thank Aria Adli, Marco García García and Göz Kaufmann – the organizers of the Grammar,
Usage, and Society workshop held in Freiburg, Germany in November 2011 – for giving me the
opportunity to discuss my ideas. I thank the workshop audience for fruitful discussions. I am
particularly grateful to my Freiburg colleague Daniel Jacob and to Peter Culicover who visited my
students and me in September 2011. During a seminar on Simpler Syntax, Daniel and Peter dis-
cussed the idea that autonomy might have to do with change rather than with innateness. They
thus formulated a thought which became central for the present paper.
problem of the “syntax only” approach is the fact that many things which happen
in syntax cannot be explained syntax-internally. For example, in languages with
relatively free constituent order it is impossible to explain the concrete choice of
an ordering without reference to extrasyntactic, semantic (definiteness, animacy)
or pragmatic (information structure) factors, i. e. factors which do not necessarily
lead to grammaticality contrasts but nevertheless to clear statistical preferences.
These generalizations about syntax are lost in a “syntax only” approach. Central
aims of the formalist approach can be summarized as follows:
A functional approach to language means, first of all, investigating how language is used:
trying to find out what are the purposes that language serves for us, and how we are able
to achieve these purposes through speaking and listening, reading and writing. But it also
means more than this. It means seeking to explain the nature of language in functional
terms: seeing whether language itself has been shaped by use, and if so, in what ways – how
the form of language has been determined by the function it has evolved to serve. (Halliday
1973: 7; emphasis mine)
3 As a reviewer points out, one might ask whether sociolinguistic (or stylistic) variant competition
and its respective frequency distribution can be regarded as epiphenomenal to communicative
function. At least for cases where the choice of a variant is related to social meaning we
would like to subsume the construction of social meaning under communicative function, too.
Again, one might ask to what degree particular theoretical beliefs (e. g. a motiva-
tionist view) justify a particular method of analyzing syntactic structure.
In practice, many linguists from both schools accept some assumptions or
insights from the other. For example, in the last two decades generative syntax
has made efforts to incorporate insights about information structure into the
analytical framework. However, the aspect which seems most difficult to compro-
mise on is the status of syntactic autonomy: Either you believe in autonomy or in
motivation. In the following we will argue that both autonomist and motivationist
explanations are right, but they explain different things.
The remainder of the present paper is structured as follows. In Section 2, I will
present the central argument which has at its core a novel interpretation of syn-
tactic autonomy and its relationship to mechanisms of change. The argument will
be illustrated in greater detail in Section 3, taking as an example prepositional
dative marking in Upper German dialects. In the concluding Section 4 we will
summarize the main insights and discuss a few enlightening analogies between
linguistics and evolutionary biology.
4 Croft defines arbitrariness as follows: “The syntactic component contains elements and rules
of combination that are not derivable from semantic or discourse categories and their combina-
tion” (Croft 1995: 494). Self-containedness describes the assumption that “the rules of the system
interact with each other but do not interact closely with the rules existing elsewhere” (Croft 1995:
495).
5 Newmeyer (1998) defines the three hypotheses of autonomy as follows:
Autonomy of syntax: “Human cognition embodies a system whose primitive terms are nonsemantic
and nondiscourse-derived syntactic elements and whose principles of combination
make no reference to system-external factors”. (Newmeyer 1998: 23)
Autonomy of knowledge of language: “Knowledge of language (‘competence’) can and should
be characterized independently of language use (‘performance’) and the social, cognitive, and
communicative factors contributing to use”. (Newmeyer 1998: 24)
Autonomy of grammar as a cognitive system: “Human cognition embodies a system whose prim-
itive terms are structural elements particular to language and whose principles of combination
make no reference to system-external factors”. (Newmeyer 1998: 24)
b. heute regnet es
today rains it
Subject expletives are inserted regardless of the position in the clause. However,
expletive es may also be inserted in sentences already containing a subject, yet
this kind of insertion is restricted to the prefield position (SpecCP, the position
before the finite verb in declarative main clauses). SpecCP expletives occur sys-
tematically in the impersonal passive (4) but are not limited to it (5):
(6) Agent > Beneficiary > Experiencer/Goal > Instrument > Patient/Theme > Locative
c. PATIENT:
die Tür geht auf
the door goes open
‘the door opens’
Crucially, subjects are not inherently linked to any specific thematic role. This
means that subjecthood does not contribute anything to semantic interpretation.
The subject condition is a constraint on clause structure as such. Similarly, sub-
jects are not inherently linked to any specific discourse function. If we adopt
Choi’s (1997: 75) distinction of Topic, Focus, and Tail in German (adapted from
Vallduví 1992), it follows that subjects may be linked to each of the discourse
functions:
b. Subject = Focus:
die nächsten Wahlen wird OBAMA gewinnen
the next elections will Obama win
c. Subject = Tail:
die nächsten Wahlen wird Obama SICHER gewinnen
the next elections will Obama surely win
SpecCP: Let us now turn to the other type of expletive insertion. German is a so-
called verb-second language (more precisely: a finite-second language). German
declarative main clauses obey a verb-second constraint: There is one (and only
one) constituent position (SpecCP, the German prefield position) before the finite
verb which must be filled. We have already seen that SpecCP is linked neither to
any syntactic function (subject, object, etc.) nor to any particular thematic role.
As regards information structure, the SpecCP position does not seem to be linked
to a particular discourse function: The relationship between discourse-pragmatic
functions and the respective means of expression is unusually complex, disparate
and partly contradictory in German (see Musan 2002 for an overview). However, it
is uncontroversial that the SpecCP position can be used for topicalization:
(9) a. [DP dieses Buch] hat Anna heute in die Bibliothek gebracht
this book has Anna today to the library brought
(9) d. [DP dieses Buch] [PP in die Bibliothek] hat Anna heute gebracht
this book to the library has Anna today brought
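The verb-second constraint as stated above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a clause is already segmented into a flat list of constituents and that the finite verb is known; the function names and the list encoding are hypothetical and not part of the analysis in the text. On the constraint as stated, a clause passes only if exactly one constituent precedes the finite verb.

```python
from typing import Sequence


def prefield_size(constituents: Sequence[str], finite_verb: str) -> int:
    """Count the constituents that precede the finite verb (hypothetical helper)."""
    return list(constituents).index(finite_verb)


def satisfies_verb_second(constituents: Sequence[str], finite_verb: str) -> bool:
    """True if one, and only one, constituent precedes the finite verb."""
    return prefield_size(constituents, finite_verb) == 1


# Constituent segmentation of the examples is assumed here, not derived.
EX_9A = ["dieses Buch", "hat", "Anna", "heute", "in die Bibliothek", "gebracht"]
EX_9D = ["dieses Buch", "in die Bibliothek", "hat", "Anna", "heute", "gebracht"]
EX_HEUTE = ["heute", "regnet", "es"]

if __name__ == "__main__":
    for label, clause, verb in [
        ("(9a)", EX_9A, "hat"),
        ("(9d)", EX_9D, "hat"),
        ("heute regnet es", EX_HEUTE, "regnet"),
    ]:
        print(label, satisfies_verb_second(clause, verb))
```

Note that the check makes no reference to the syntactic function, thematic role or discourse function of the prefield constituent, which is the point of the preceding paragraph: the condition is stated over clause structure alone.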
Case marking: Why does a language have case marking? Blake (2001: 1) defines
the function of case as “a system of marking dependent nouns for the type of relationship
they bear to their heads”. Applied to transitive predicates, this means
that nominative and accusative (or ergative and absolutive) have the function of
morphologically distinguishing subjects from objects. In a predicate HIT(Hans,
Peter) it is functional to be able to distinguish between the hitting and the hit par-
ticipant. This can be achieved by means of word order, verbal agreement, or mor-
phological or adpositional case marking. However, in many transitive predicates
grammatical means of subject-object distinction are completely useless:
object marking) like German: Here, objects are required to be expressed in the
accusative case regardless of potential or actual ambiguities. In a sense, strict
case marking languages like German are far too explicit with respect to the
subject-object distinction. It seems that these languages make use of the available
case morphology with no regard to communicative function but simply because
the structural configuration requires it. As for German, this observation holds
even when viewed from a different perspective. In Modern German, the nominative-accusative
distinction of noun phrases containing a determiner is very weak,
for the distinction is overtly expressed only in the masculine singular. Worse still,
German notoriously lacks any case morphology in proper
names. Yet in the written standard variety proper names are not accompanied
by a determiner. Nominative and accusative of proper names are thus identical
in their form. But proper names are high in both animacy and definiteness, so if
we had the freedom to distribute case morphemes over parts of the lexicon, we
would certainly give them to proper names in the first place! In sum, German case
is dysfunctional in two ways. On the one hand, German uses too much case (in
unambiguous transitive predicates); on the other hand, case is too limited (as far
as proper names are concerned).
To conclude this section, let us briefly return to the general issue
of (absence of) extrasyntactic functionality of the discussed examples. We have
argued that each example poses great difficulties in terms of functional moti-
vation as it is generally understood in the functionalist literature. However, we
are convinced that expletives, verb-second, case marking, etc. do fulfill a certain
function in the linguistic system – just not a function at the level of semantics,
pragmatics, or iconic encoding. In order to identify this function it is necessary
to overcome the bilateral-semiotic view as it is practiced e. g. in Construction
Grammar. At the core of the bilateral-semiotic view lies the assumption that
grammar at all of its levels consists of a collection of constructional schemes, i. e.
form–meaning pairings of varying size. The intuition is that grammar has the job
of packing particular meanings into particular constructions. But grammar has
other functions than just that. It guarantees structural well-formedness. Whereas
particular patterns of well-formedness are not related to any particular meaning
or communicative function, grammar as a whole has a function indeed: It makes
processing easier. Thus, ultimately the purpose of structural well-formedness as
such is to make communication more efficient. This can be achieved in various
ways, but it has to be done somehow. How exactly well-formedness is realized is
to a great extent specific to a language (as long as the language remains within
the constraints of possible cross-linguistic variation); some languages put their
objects in front of the verb, others after it – the important thing is that they have to
do it somehow. If we think of typological variation in terms of different rankings
link between structure and function can be constructed only via diachrony, i. e.
processes of variation and selection (1999a: 187–189). What is new in the present
proposal is the claim that functional factors may become obsolete over time, thus
enhancing autonomy. Our central assumptions, which will be exemplified in
detail in Section 3, are the following:
Ad (i): In Section 2.1 above, we defined autonomy of syntax as a cover term for
those aspects of syntax which cannot be motivated by anything extrasyntactic
such as meaning, communicative function or general cognitive principles (e. g.
iconicity). We thus assume that there are aspects of syntax which are arbitrary
from a functional perspective. Arbitrariness of the linguistic sign is one of the
most fundamental design features of human language (Hockett 1960) and one
of the central insights of modern structural linguistics. It is a surprising fact that
arbitrariness has been disputed at all in the area of syntax, given the fact that arbi-
trariness is a standard assumption about vocabulary, but also about morphology
(cf. morphomes, Aronoff 1994). Even in phonology we find arbitrary traits, i. e.
traits which cannot be motivated in terms of articulation or perception, namely
opacity (Kiparsky 1973). So, if arbitrariness is a fundamental property of human
language structure as a whole, why should syntax be an exception?
Ad (ii): Having accepted that syntactic autonomy exists, one has to ask where
it comes from. I do not see any obvious reason to infer from autonomy the
innateness of basic principles of syntactic organization. Rather, I propose again
that we should learn from phonology and morphology. Phonological opacity goes
back to formerly transparent, phonetically motivated alternations which persist
even at a time when the motivating factor is lost, thus gaining a certain degree of
autonomy, or phonetic arbitrariness. Morphomes, such as e. g. arbitrary inflec-
tional classes, are often the synchronic reflex of transparent, e. g. semantically
motivated distinctions at an earlier stage of the language. Again, the relevance
of the motivating factor has decreased or even been lost entirely. Phonological
and morphological autonomy have in common that they both emerge through
diachronic processes, namely diachronic processes of a special kind: loss of con-
ditioning environment (henceforth LOCE). LOCE is a very common pathway of
language change, and it would be surprising if syntax were an exception.
The loss of semantic or pragmatic conditioning in the development of syn-
tactic structure was a central observation in early grammaticalization research,
under the term “syntactization”. What Givón (1979) describes in the following
quotation can be understood as LOCE at the syntactic level: “Loose, paratactic,
‘pragmatic’ discourse structures develop – over time – into tight, ‘grammaticalized’
syntactic structures. […] Language […] takes discourse structure and condenses
it – via syntactization – into syntactic structure” (Givón 1979: 108; emphasis
mine). From the perspective of LOCE, we can paraphrase syntactization as
follows: At some earlier time, the use of an expression was dependent on the
presence of a particular pragmatic context. It required an extrasyntactic trigger.
Later, the expression gained a certain degree of autonomy with regard to extra-
syntactic factors.
Ad (iii): Strictly speaking, LOCE does not necessarily mean that once an
old distributional pattern is lost a new one emerges: The distribution of expres-
sions may become totally random from a synchronic point of view. However, in
the interesting cases the new distribution of expressions also follows a certain
pattern, but one which no longer reflects the old motivating factor. Thus, in order
for syntactization to work a new distributional pattern must be established which
is syntactic in essence. It is often the case that an expression is at first used only
under certain extrasyntactic contextual conditions, which are then dropped such
that the syntactic environment alone triggers the use of that expression. That is, a
grammatical pattern is extended from a source environment to other cases. This
is, of course, the classical definition of analogical extension. Analogical exten-
sion starts from a source context and affects items in a (larger) target context
which shows some functional or structural similarity to the source context. Ana-
logical extension may affect all items within a given target context, in which case
we speak about obligatorization, or syntactization as far as a syntactic pattern is
concerned. I assume that analogical extension is the mechanism by which auto-
nomous syntactic structure is implemented diachronically.
The term “analogy” describes both a cognitive mechanism and a common
pathway of diachronic change, as Bybee (2010) emphasizes: “It is important to
note that analogy as a type of historical linguistic change is not separate from
analogy as a cognitive processing mechanism” (Bybee 2010: 72). The literature
on analogy is abundant. There is some agreement among authors that analogy
is a more general, domain-independent cognitive principle (cf. Blevins and
Blevins 2009; Itkonen 2005; cf. also Gentner, Holyoak and Kokinov 2001, without
particular reference to language structure). Also, authors emphasize the impor-
tance of similarity relations in analogy (Itkonen 2005: chapter 1.1; Bybee 2010:
57; de Smet 2012: 603). Non-technically speaking, we might understand analog-
ical extension as an instance of the general tendency to use similar strategies
for similar tasks. If you have learned to eat spaghetti by rotating a fork you will
rotate the fork for linguine, too, thus eat linguine in analogy to spaghetti. More
It is worth noting that the concept of syntactic analogy is not new at all,
although its role has perhaps been underestimated. According to Percival (1971),
it goes back to Neogrammarian concepts of change, in particular to Blümel (1914).
What does the proposed scenario mean for the relationship between struc-
ture and function, and for the division of labor between formal and functional
explanations? Morphosyntactic change is driven by forces which are well under-
stood and described in functionalist terms, such as reanalysis, grammaticaliza-
tion, iconicity and analogical extension. It seems that (the direction of) change
is a direct reflection of the ways language is used by speakers to achieve their
communicative goals. However, frequent use of grammatical patterns may
entrench their structural makeup to such a degree that functional motivations
(which set the process in motion) become obsolete. What a concrete
example of such usage-driven syntactization (with autonomy as its result) might
look like will be discussed in greater detail in Section 3. If the proposal made here
is correct, it follows that functional explanations are especially powerful with
regard to patterns of variant selection and thus ongoing change, but too limited
for a deeper understanding of the resultant, synchronic grammatical structure. It
is primarily formal theories of syntax that are suited to predicting grammatical
well-formedness (and this, of course, is exactly what they are designed for).
factors which may become crucial for the selectional success of a variant (cf. de
Vogelaer 2007 for a similar point).
So, why is cross-dialectal variation important in this context? New variants
emerge at some place at some time (often via processes of relatively mechanical,
“blind” structural reanalysis, as we will see in the following section). They then
gradually spread over larger areas. The first consequence of variant spread for the
infected grammars is just the addition of a new option, i. e. spread leads to variant
competition in larger areas. However, different dialects often deal with a given set
of competing variants in different ways, according to social, functional or struc-
tural factors (one might say that dialects do different things with the same set of
available expressions). Dialects may develop different functional arrangements
between those variants. A variant may become obligatory in dialect A under
certain contextual conditions, but not in dialect B, whereas in dialect C other con-
ditions are relevant than in dialects A and B, etc. In short, dialects differ not only
in their inventories of variants, but also in the ways variants are implemented in
their respective systems of grammar.6 Therefore, cross-dialectal variation offers
us the most direct insight into the rise and fall of functional motivations of variant
selection.
6 We assume that different dialects have different grammars. Dialect variation is just cross-lin-
guistic variation between closely related languages.
Other observations suggest that the dative marker is less independent than other
prepositions. For example, it does not allow scope over two conjuncts. In Seiler
(2003: 148) I analyze the dative marker as an element of the class of prepositions,
whereby it is a special property of the dative marker that it is not able to project
a prepositional phrase but is rather head-adjoined to the following determiner.
As for the emergence of the dative marker, it is argued in Seiler (2003: 215) that
reanalysis of article forms plays a crucial part. Already in Middle High German,
dative article forms, e. g. the singular masculine dëme, formed fusional morphs
with prepositions, whereby the initial dental of the article was dropped:
(14) obem 1280, uf(f)em 1270, am 1277, im 1258, underm 1276, us(s)em 1409,
vom 1277, vorem 1280, hinderm 1403, bim 1280, zem 1245
(Idiotikon XIII: 1191–1192).
In Upper German the form without an initial dental has been generalized over
all other contexts, also in dialects without prepositional dative marking. Thus
(with the exception of extremely conservative dialects) the article form became
əm, with some variation in the vocalism. There exists a whole paradigm of
fusional morphs <preposition_article>, some of which are homophonous with
the bare dative article in unstressed position (namely the equivalents of Standard
German im ‘in_the’, am ‘at_the’; cf. Seiler 2003: chapter 8.1 for details). It is
a relatively obvious step to reanalyze a form əm, which is etymologically just <article>,
as having the morphological structure <preposition_article>. This is exactly what
happened in a subset of Upper German dialects which developed prepositional
dative marking. But why should this reanalysis take place at all?
According to Nübling (1992: 221), the most frequent and thus prototypical
context for datives is post-prepositional anyway. More than 90 % of datives are
governed by a “true” preposition in Upper German. Developing prepositional
dative marking means that the prototypical context for dative forms is general-
ized even over those contexts where no other preposition is there already (e. g. in
indirect object function). We might interpret this process as analogical extension:
Formerly bare datives are realized in analogy to the more frequent, i. e. post-prep-
ositional occurrence type. Reanalysis as a process of mechanical structural vari-
ation produces an element without any particular meaning or function, but with
a category label: the dative marker as an expletive preposition.
In light of the evolutionary framework as outlined in Section 3.1, language
change is a two-step process. Reanalysis simply adds a new variant; indeed, prep-
ositional dative marking and bare datives still coexist in most dialects. However,
different dialects deal with this variant competition in different ways, i. e. they
show different patterns of variant selection. Moreover, the distribution of the bare
vs. prepositional dative can be attributed to more general functional (extrasyn-
tactic) principles. I will focus on the influence of information structure and icon-
icity here (see Seiler 2003: chapter 7 for other factors).
In Alemannic dialects of northern Switzerland there is a strong tendency to
insert the dative marker only if the dative noun phrase is focused and bears main
sentence stress. It is not inserted if another constituent is focused (cf. Seiler 2003:
177–186):
b. Dative = focus:
ich han s buech i dr MARTE ggëë
I have the book DM the:Dsf Martha given
‘I gave the book to MARTHA’ (Alemannic: Schaffhausen)
the more marked situation (dative = focus) is expressed by means of the more
marked variant (= prepositional dative marking). This distribution is (construc-
tionally) iconic. A similar point is made by Lambrecht (1994) about the correla-
tion between prosodic prominence and communicative importance:
Thus, in northern Switzerland, where both bare and prepositional datives coexist,
their distribution nicely reflects extrasyntactic factors such as information struc-
ture and sentence stress, the concrete realization of which corresponds to more
general cognitive principles such as iconicity.
However, in other dialects prepositional dative marking is obligatory in all
contexts. This is the case e. g. in the Muotathal valley of central Switzerland. Here,
all dative noun phrases are preceded by the dative marker or another preposition,
regardless of discourse function, stress pattern or other factors (distinctiveness
of dative morphology, thematic roles, position, determiner category, etc.). The
dative marker serves as an expletive which is inserted whenever no other prepo-
sition is there already, without respect to any other (in particular extrasyntactic)
factor. We interpret this state of affairs as full implementation of prepositional
dative marking. Diachronically speaking, dative marker insertion is analogically
extended to all datives. Recall that, according to Kiparsky (1982, 2012), analogical
extension can be understood as grammar simplification since contextual con-
straints are dropped.7 A strategy is extended to the whole of a certain context –
and in our case this context is purely syntactic, i. e. the target environment of
analogical extension is defined on purely structural grounds. Assuming that
analogy relies on a similarity relation, similarity here is based on a purely struc-
tural description without any reference to function or meaning.
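The contrast between the two insertion patterns can be made explicit in a minimal sketch. The encoding is an assumption made for illustration only: the class DativeNP, its two Boolean attributes and the dialect labels are hypothetical, and the northern Swiss pattern, described above as a strong tendency, is idealized here as a categorical rule. It is also assumed that the dative marker never co-occurs with another preposition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DativeNP:
    """A dative noun phrase in its clause context (hypothetical encoding)."""
    governed_by_preposition: bool  # is another preposition there already?
    focused: bool                  # does the NP bear focus and main sentence stress?


def inserts_dative_marker(np: DativeNP, dialect: str) -> bool:
    """Idealized insertion condition for the expletive dative marker."""
    if np.governed_by_preposition:
        # Assumption: the marker only fills the slot of a missing preposition.
        return False
    if dialect == "schaffhausen":
        # Extrasyntactic condition: insertion depends on information structure.
        return np.focused
    if dialect == "muotathal":
        # Purely syntactic condition: every bare dative receives the marker.
        return True
    raise ValueError(f"unknown dialect label: {dialect!r}")


if __name__ == "__main__":
    bare_focused = DativeNP(governed_by_preposition=False, focused=True)
    bare_unfocused = DativeNP(governed_by_preposition=False, focused=False)
    for dialect in ("schaffhausen", "muotathal"):
        print(dialect,
              inserts_dative_marker(bare_focused, dialect),
              inserts_dative_marker(bare_unfocused, dialect))
```

On this idealization, the Muotathal rule is the Schaffhausen rule with the focus condition removed, which is one way of picturing analogical extension as the dropping of a contextual constraint.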
How do we get from the Schaffhausen to the Muotathal variant of prepositional
dative marking? Is there a way of motivating the analogical extension of the
variant that involves more phonological material? Perhaps it is due to the maxim
of “extravagance” which, according to Haspelmath (1999b), plays a central role
in grammaticalization processes. Haspelmath discusses why grammaticalization
is irreversible. Pursuing a usage-based approach to change in the spirit of Keller
(1994), he introduces a maxim of extravagance (“Extravagance: talk in such a
way that you are noticed”, Haspelmath 1999b: 1055), which may ultimately cause
grammaticalization processes as the unintended cumulative effect of communicative
actions: “Grammaticalization is a side effect of the maxim of extravagance,
that is, speakers’ use of unusually explicit formulations in order to attract attention”
(Haspelmath 1999b: 1043). As an unintended side-effect of increasing use,
the more explicit expression may become obligatory. Increasing obligatoriness,
however, is nothing other than what we called syntactization earlier, i. e. applied
to our case: dative marker insertion due to a purely morphosyntactic constraint
on possible environments of dative forms.
7 Whereas constraint removal can be understood as the impetus for analogy, its result may also
(and paradoxically) be a complexification of the system, as long as obligatorization is not yet
reached: “As every working historical linguist knows, analogical changes tend towards improving
the system in some way (even if incomplete regularization may paradoxically end up complicating
it)”. (Kiparsky 2012: 21)
In sum, every single step of the gradual implementation of prepositional
dative marking can be relatively easily motivated on the basis of very general,
extrasyntactic, highly usage-based mechanisms such as analogical extension,
iconicity and “extravagance”. However, the result of these processes cannot.
Muotathal speakers certainly do not focus their datives all the time. The example
of prepositional dative marking shows that functional factors provide a plausible
explanation for selectional preferences during a phase of variant competition and
for further implementation of the variant in question. At the same time, it is true
that functional motivations which promote the implementation of a variant may
become obsolete once the variant is implemented further. As for obligatory, fully
syntactisized prepositional dative marking, it is not only impossible to ascribe
any extrasyntactic function to it; it is also unnecessary. The dative marker is inserted
because the syntax wants it. Any search for a functional motivation within the
synchronic state of the language misses the generalization.
In this chapter it was argued that both formal and functional approaches in lin-
guistics are explanatory, but at different levels. It was shown that syntax con-
tains traits which cannot be motivated on the basis of extrasyntactic function
in any direct way. We called this class of phenomena syntactic autonomy. Meth-
odologically, it seems fully appropriate to us to make use of the analytical tools
provided by the formalist tradition in order to capture abstract, purely structural
regularities and relationships. Functionalist argumentations run the risk of overinterpreting
autonomous traits of syntax by searching for extrasyntactic motivations where none exist.
Based on the example of prepositional dative marking
in Upper German, we have shown that the patterns of variant selection found in
some dialects can indeed be motivated extrasyntactically whereas in other dia-
lects dative marker insertion is purely syntactically triggered, which makes the
search for a functional motivation not only a difficult, but also a pointless task:
Here, prepositional dative marking is due to syntactic well-formedness. We have
hypothesized that well-formedness as such does have a communicative function
insofar as it makes communication more efficient, yet the concrete instantiations
of well-formedness in a particular language are often independent of concrete
functional motivations.
According to our hypothesis, autonomy of syntax is the result of diachronic
development – processes of changes in variant selection which often reflect more
general, i. e. extrasyntactic cognitive principles such as analogical extension,
iconicity or “extravagance”. These must be understood in functionalist or usage-
based terms. Paradoxically, analogical extension may lead to syntactization
which makes the functional factors formerly promoting the selection of a particu-
lar variant obsolete: Whereas pathways of change may be motivated by language
use and communicative function, these processes may ultimately enhance syn-
tactic autonomy. If this reasoning is on the right track, it means that functional
explanations are actually diachronic explanations. Extrasyntactic motivations
are at play especially as long as a variant is not yet fully syntactisized.
Another consequence is the fact that the synchronic structural makeup of
a syntactic pattern is not determined by its function. Knowing the function of
a construction tells us little or nothing about its formal structure. Interestingly,
a similar point can be made from the perspective of evolutionary biology. Venomous
snakes use their poison primarily for hunting and digesting their prey.
It is functional for the snake not to waste the poison for defense. There are
two basic strategies which limit the use of poison for defense: camouflage and
warning. As for warning, different species display different patterns: warning
gestures (cobras), warning sounds (rattlesnakes), or warning colors (coral
snakes). Important in our context is the fact that the function of those patterns
does not determine their structural makeup and therefore leaves space for formal
variation.
Also, from a diachronic perspective, form may follow function only on the
basis of inherited traits. Languages can never invent things ex nihilo (even if that
would be extremely functional); they can only transform devices which are there
already. Most aspects of the structure of a language are determined by the fact that
they are inherited from the language spoken by the preceding generation, regard-
less of whether they are functional or not, whether they are good representatives
of a language universal or not, whether they reflect cross-linguistic preferences
or not. Only changes in that structure are in a more direct way interpretable as
adjustments towards more general, structural or functional tendencies. Things
cannot be invented ex nihilo in biology, either. The predecessors of sea urchins
were sessile and had no limbs. Later, pre-sea urchins began to move, perhaps as
a reaction to a change in their environment. Evolution did not invent new limbs
because there was nothing which could be transformed into limbs, due to the
pentaradial-symmetric structure of the pre-sea urchin’s body. But pre-sea urchins
had spines, and indeed today’s sea urchins use their spines for motion (Knop
2008: 9).
Finally, if the synchronic grammar allows for a great degree of autonomy, i. e.
independence of functional motivations, one question remains: Is all syntactic
structure just historical contingency? Given our assumptions, shouldn’t it be the
case that anything goes in syntax, without respect to limitations of possible cross-
linguistic variation? Probably not. First, certain types of change are likely to occur
and produce certain kinds of synchronic structure. This idea has been elaborated
in great detail in the field of phonology (Blevins 2004). According to Blevins’
theory of evolutionary phonology, cross-linguistically recurrent patterns are not
so much due to (innate) language universals but rather due to the fact that these
patterns are the results of common types of phonological change. It is worth con-
sidering to what extent this approach is applicable to syntax as well (evolution-
ary syntax in analogy to evolutionary phonology). Second, even linguists who
are generally skeptical about the idea of Universal Grammar must acknowledge
the striking fact that the syntaxes of all languages have something to say about
constituent structure, recursion, grammatical function, lexical classes and basic
principles of case marking and agreement. Whereas Culicover and Jackendoff
(2005) reject the concrete instantiation of Universal Grammar as suggested by
mainstream generativist linguistics in its technical detail, they nonetheless main-
tain the idea that limitations of cross-linguistic variation cannot be understood
without reference to a downsized version of Universal Grammar, which consists
exactly of the ingredients quoted above (Culicover and Jackendoff 2005: 40).
Let us now construct a last, more far-reaching analogy to evolutionary
biology. We have tried to show that both structure-driven and function-driven
explanations are justified in linguistics: Both structural and functional causa-
tions are at play in syntactic patterning, variation and change. Having accepted
that both explanations are necessary, a central question of theoretical linguistics
must be: In what ways do structure and function interact, and in what sense are
they independent of each other? How can we talk about structural and functional
causations in an objective, non-sectarian way? The answer is clear: by acknowl-
edging that they are complementary. Structural and functional approaches
explain different aspects of language. They are, so to speak, in complementary
distribution, and this is exactly the reason why they are ultimately compatible
with each other. Evolutionary biology could serve as a model for the integration
of different, but compatible levels of explanation.
According to Nesse (2009), biologists distinguish between two levels of
explanation – proximate and evolutionary – which coexist side by side and are
complementary to each other: “The most fundamental distinction in biology
is between proximate and evolutionary explanations. Proximate explanations
are about a trait’s mechanism […]. Evolutionary explanations are about how
the mechanism came to exist. These two kinds of explanation do not compete.
They are fundamentally different. Both are essential for a complete explanation”
(Nesse 2009: 158). Based on the fundamental distinction between proximate and
evolutionary explanations, Tinbergen (1963) even distinguishes between four
questions a biologist must deal with in order to arrive at a complete explanation
of a trait. Tinbergen’s questions enhance the proximate–evolutionary distinction
with the dimensions of ontogeny and phylogeny. They are “now nearly univer-
sal as a foundation for the study of animal behavior […]. Textbooks all begin by
explaining the need for all four kinds of explanation” (Nesse 2009: 159):
Every phenomenon or process in living organisms is the result of two separate causations,
usually referred to as proximate (functional) causations and ultimate (evolutionary)
causations. All the activities or processes involving instructions from a program are prox-
imate causations. [...] Ultimate or evolutionary causations are those that lead to the origin
of new genetic programs or to the modification of existing ones – in other words, all causes
leading to the changes that occur during the process of evolution. [...] It is nearly always
possible to give both a proximate and an ultimate causation as the explanation for a given
biological phenomenon. [...] Many famous controversies in the history of biology came
about because one party considered only proximate causations and the other party consid-
ered only evolutionary ones. (Mayr 1997: 67)
The debates in biology and linguistics do not, of course, match in detail. For
example, one might ask whether functional explanations in linguistics are anal-
ogous to evolutionary explanations of phylogeny in biology, to phenotypic plas-
ticity of organisms (van Buskirk and Schmidt 2000), or to both.8 However, the
fundamental structure of the debates in biology and linguistics is astonishingly
similar. In both disciplines, two schools defended their way of explaining aspects
of nature as the only possible one at their time: proximate vs. evolutionary in
biology, formal vs. functional in linguistics. The main difference between biology
and linguistics lies in the fact that the complementarity (and compatibility) of
the two kinds of explanation has been widely accepted by biologists since the
modern evolutionary synthesis some seventy years ago. A modern linguistic synthesis
is yet to come.
For linguists, this is not exactly something to be proud of.
8 Phenotypic plasticity leaves room for direct interactions between traits and environment,
whereas in phylogeny this interaction is mediated by evolution. Nevertheless, the existence of
phenotypic plasticity simultaneously calls for proximate and evolutionary explanations: How
does it work, and how did it come into being?
References
Aissen, Judith (2003): Differential object marking: Iconicity vs. economy. Natural Language and
Linguistic Theory 21: 435–483.
Aronoff, Mark (1994): Morphology by Itself: Stems and Inflectional Classes. Cambridge, MA: MIT
Press.
Bäbler, Heinrich (1949): Glarner Sprachschuel: Mundartsprachbuch für die Mittel- und
Oberstufe der Glarner Schulen. Glarus: Verlag der Erziehungsdirektion.
Blake, Barry (2001): Case. 2nd ed. Cambridge: Cambridge University Press.
Blevins, James P. and Blevins, Juliette (eds.) (2009): Analogy in Grammar. Form and Acquisition.
Oxford et al.: Oxford University Press.
Blevins, Juliette (2004): Evolutionary Phonology: The Emergence of Sound Patterns. Cambridge:
Cambridge University Press.
Blümel, Rudolf (1914): Einführung in die Syntax. Heidelberg: Winter.
Braune, Wilhelm (2004): Althochdeutsche Grammatik. Edited by Ingo Reiffenstein. Tübingen:
Niemeyer.
Bresnan, Joan (2001): Lexical-Functional Syntax. Malden, MA/Oxford, UK: Blackwell.
Bybee, Joan (2010): Language, Usage and Cognition. Cambridge et al.: Cambridge University
Press.
Choi, Hye-Won (1997): Optimizing Structure in Context. Scrambling and Information Structure.
Stanford: Center for the Study of Language and Information.
Croft, William (1995): Autonomy and functionalist linguistics. Language 71: 490–532.
Croft, William (2000): Explaining Language Change. Harlow et al.: Longman.
Culicover, Peter W. and Ray Jackendoff (2005): Simpler Syntax. Oxford et al.: Oxford University
Press.
de Smet, Hendrik (2012): The course of actualization. Language 88: 601–633.
de Vogelaer, Gunther (2007): Darwinian or Lamarckian change: innovative 2pl.-pronouns in
English and Dutch. In: Frank Brisard (ed.): Papers of the Linguistic Society of Belgium,
1–14. Bruxelles: Linguistic Society of Belgium.
Fanselow, Gisbert and Sascha W. Felix (1993): Sprachtheorie I: Grundlagen und Zielsetzungen.
3rd ed. Tübingen: Francke.
Gentner, Dedre, Keith J. Holyoak and Boicho N. Kokinov (eds.) (2001): The Analogical Mind:
Perspectives from Cognitive Science. Cambridge, MA/London: MIT Press.
Givón, Talmy (1979): On Understanding Grammar. New York: Academic Press.
Givón, Talmy (1984): Direct object and dative shifting: semantic and pragmatic case. In: Frans
Plank (ed.), Objects. Towards a Theory of Grammatical Relations, 151–182. London/New
York: Academic Press.
Halliday, Michael A. K. (1973): Explorations in the Functions of Language. London: Arnold.
Haspelmath, Martin (1999a): Optimality and diachronic adaptation. Zeitschrift für Sprachwis-
senschaft 18: 180–205.
Haspelmath, Martin (1999b): Why is grammaticalization irreversible? Linguistics 37:
1043–1068.
Hockett, Charles F. (1960): The origin of speech. Scientific American 203: 88–96.
Idiotikon (1881–): Schweizerisches Idiotikon. Wörterbuch der schweizerdeutschen Sprache.
Begonnen von Friedrich Staub und Ludwig Tobler und fortgesetzt unter der Leitung
von Albert Bachmann, Otto Gröger, Hans Wanner, Peter Dalcher, Peter Ott, Hanspeter
Schifferle. Frauenfeld: Huber.
Itkonen, Esa (2005): Analogy as Structure and Process: Approaches in Linguistics, Cognitive
Psychology and Philosophy of Science. Amsterdam/Philadelphia: John Benjamins.
Keller, Rudi (1994): Language Change: The Invisible Hand in Language. London: Routledge.
Kiparsky, Paul (1973): Abstractness, opacity and global rules. In: Osamu Fujimura (ed.), Three
Dimensions of Linguistic Theory, 57–86. Tokyo: Tokyo Institute for Advanced Studies of
Language.
Kiparsky, Paul (1982): Explanation in Phonology. Dordrecht: Foris.
Kiparsky, Paul (2012): Grammaticalization as optimization. In: Dianne Jonas, John Whitman and
Andrew Garrett (eds.), Grammatical Change: Origins, Nature, Outcomes, 15–51. Oxford
et al.: Oxford University Press.
Knop, Daniel (2008): Seeigel im Meerwasseraquarium. Münster: Natur und Tier.
Lambrecht, Knud (1994): Information Structure and Sentence Form. Topic, Focus, and the
Mental Representations of Discourse Referents. Cambridge: Cambridge University Press.
Mayr, Ernst (1997): This is Biology. The Science of the Living World. Cambridge, MA/London:
Harvard University Press.
Musan, Renate (2002): Informationsstrukturelle Dimensionen im Deutschen. Zur Variation der
Wortstellung im Mittelfeld. Zeitschrift für germanistische Linguistik 30: 198–221.
Nesse, Randolph M. (2009): Evolutionary and proximate explanations. In: David Sander and
Klaus R. Scherer (eds.), The Oxford Companion to Emotion and the Affective Sciences,
158–159. Oxford: Oxford University Press.
Nübling, Damaris (1992): Klitika im Deutschen. Tübingen: Narr.
Percival, Keith W. (1971): The Neogrammarian approach to syntactic change. Manuscript
presented at the Twenty-Fourth Annual University of Kentucky Foreign Language
Conference in Lexington, Kentucky, 22–24 April 1971. http://people.ku.edu/~percival/
NeogramSyntax.html.
Rosenbach, Annette (2008): Language change as cultural evolution: Evolutionary approaches
to language change. In: Regine Eckardt, Gerhard Jäger and Tonjes Veenstra (eds.),
Variation, Selection, Development: Probing the Evolutionary Model of Language Change –
Proceedings of Blankensee Colloquium 2005, 23–72. Berlin/New York: Mouton de Gruyter.
Schöpf, Johann Baptist (1866): Tirolisches Idiotikon. Innsbruck: Wagner.
Seiler, Guido (2002): Prepositional dative marking in Upper German: a case of syntactic
microvariation. In: Sjef Barbiers, Susanne van der Kleij and Leonie Cornips (eds.), Syntactic
Microvariation, 243–279. Amsterdam: Meertens Instituut. Available at: www.meertens.
knaw.nl/projecten/sand/synmic/.
Seiler, Guido (2003): Präpositionale Dativmarkierung im Oberdeutschen. Stuttgart: Steiner.
Seiler, Guido (2004): The role of functional factors in language change. An evolutionary
approach. In: Ole Nedergaard Thomsen (ed.), Competing Models of Linguistic Change.
Evolution and beyond, 163–182. (Current Issues in Linguistic Theory 279.) Amsterdam/
Philadelphia: John Benjamins.
Siewierska, Anna (1991): Functional Grammar. London: Routledge.
Vallduví, Enric (1992): The Informational Component. New York: Garland.
van Buskirk, Josh and Benedikt R. Schmidt (2000): Predator-induced phenotypic plasticity in
larval newts: trade-offs, selection, and variation in nature. Ecology 81: 3009–3028.
Rena Torres Cacoullos, Penn State University
Gradual loss of analyzability:
Diachronic priming effects
1 Introduction
How do grammatical units come about, and how can change in constituency be
observed? Reanalysis is widely invoked by linguists of otherwise different per-
suasions as a pivotal mechanism of syntactic change. Reanalysis is understood
to change underlying structure, including constituency and syntactic-category
labels (Campbell 1998: 284). For example, the English future auxiliary is said to
result from reanalysis of the purposive motion construction of main verb go with
a non-finite clause complement, as represented by rebracketing of some kind:
[BE going [to Verb]] > [BE going to Verb] (Hopper and Traugott 1993: 3). A material
indication of such reanalysis would be phonetic reduction of going to to gonna.
Reanalysis has been seen as abrupt, following from the view that each word
sequence must have a unique constituent analysis, which in turn follows from
the formalist (generative) view that proposed syntactic rules or constraints are
categorical and that syntactic categories, for example, main vs. auxiliary verb,
are discrete. But the facts of synchronic variation, as between going to and gonna
(e. g., Poplack and Tagliamonte 1999: 328–332), disturb an understanding of
grammatical change as abrupt reanalysis.
Latin did not have a dedicated morpheme or construction for progressive aspect,
the simple Present serving this function among others (Allen and Greenough 1916:
293, § 465). Probably the most common source for progressives crosslinguistically
is locative expressions (Bybee, Perkins and Pagliuca 1994: 127–133; cf. Comrie
1976: 98–105). Beginning from the earliest Spanish texts, we find gerunds (-ndo
forms) combining with finite forms of spatial (locational, postural or movement)
verbs. Besides estar ‘be (at)’, these were usually ir ‘go’, andar ‘walk, go around’,
venir ‘come’, salir ‘go out’, quedar ‘remain, stand still’. Examples are (1a), with ir,
and (1b), with venir.
Allen and Greenough (1916: 819, § 507) give a medieval Latin example of this
general Spatial Verb + Verb-ndo (gerund) construction, cum una dierum flendo
sedisset, quidam miles generosus iuxta eam equitando venit ‘as one day she sat
weeping, a certain knight came riding by’ (Gesta Romanorum, 66 [58]).
In Torres Cacoullos (2000) I adduced evidence for the origins of Spanish
Progressive ESTAR (< Latin stare ‘stand’) + Verb-ndo (gerund) as a locative
expression ‘be located somewhere Verb-ing’ from its early distributions across
co-occurring locatives (most frequently with en ‘in’) and verbs (most frequently
hablando ‘talking’, other verbs of speech, esperando ‘waiting’, and verbs of body
activity). These co-occurring elements are consonant with being stationary in a
given place. A 13th century example appears in (2). In contrast, gerund combina-
tions with motion verbs ir ‘go’ and andar ‘walk (around)’ tended to co-occur with
other kinds of locatives (a ‘to’, por ‘along’) and verb classes (motion, process,
general activity).
The key construct in variation theory is the linguistic variable (Labov 1969), a set
of variants which “are used interchangeably to refer to the same states of affairs”
(Weiner and Labov 1983: 31), i. e. “alternative ways of saying the same thing”
(Labov 1982: 22). In the pair of examples from a 19th century play in (3), the “same
thing”, or grammatical function, is present progressive and the “alternative
ways”, or variants, are the Progressive and simple Present forms. In the English
translation, PROG designates the Progressive – ESTAR + Verb-ndo – as in (3a),
PRS the simple Present, as in (3b). Both forms here express a situation in progress
at the moment of speech.
Table 1: Data for the study of Progressive – simple Present variation in present temporal reference contexts
No. texts           17          15        28
Word count          2,500,000   600,000   900,000
N Progressive       119         180       317
N simple Present    4291        564       663
All tokens of both forms were coded according to a number of hypotheses about
variant choice, operationalized as factors based on the presence or absence of lin-
guistic elements of the context in which the token occurs. Included in the factor
groups (independent or predictor variables, or constraints) are co-occurrence of
locative adverbials, aspectual reading and priming. The linguistic conditioning of
variant selection is instantiated in probabilistic associations of forms with con-
textual elements. A multivariate model of the variation is presented in section 4
ahead, after we first consider evidence from distributional analysis and frequency
counts, below.
Spanish Progressive ESTAR + Verb-ndo would seem a good candidate for change
via either reanalysis or loss of analyzability. The change in constituent structure
would be from a sequence of two independent units – a finite form of main verb
estar ‘to be (located)’ with a gerund -ndo complement – to a single periphrastic
unit, in which estar is an auxiliary and the gerund is the main verb (4).
1 For the 13th–15th century simple Present sample, tokens of lexical types appearing in the Pro-
gressive were not extracted from Grimalte y Gradissa and Crónica de los Reyes Católicos, for which
electronic versions were not available (three Progressive tokens each); also omitted were Present
tokens of frequent decir ‘say’ in Corbacho (of which there was one Progressive token). More on
the texts, the simple Present sampling and exclusions is given in Torres Cacoullos (2012).
Whereas in going to the items are contiguous, here we have a schematic construc-
tion with an intervening slot for the open class of items, the Verb. In the absence
of surface phonetic reduction, as with English future gonna, what evidence could
be assembled for the status of ESTAR + Verb-ndo as a unitary constituent?
We may take the obverse of analyzability to be unithood, operationalized as
the proportion of the instances of the construction in which the adjoining items
behave as a single unit, i. e. as one word. In previous work (e. g., Torres Cacoullos
2006; Torres Cacoullos and Walker 2011) we developed unithood indices from dis-
tributional analysis, which tracks proportions of tokens of an expression across
its contexts of occurrence. Increasing unithood of ESTAR + Verb-ndo has been
inferred from a decreasing proportion of occurrences with elements intervening
between estar and the gerund, with more than one gerund per estar, or with the
gerund preceding estar (Torres Cacoullos 2000: 31–55; Bybee and Torres Cacoul-
los 2009: 201–203; Torres Cacoullos 2012: 79).
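The distributional logic of such a unithood index can be illustrated with a minimal sketch. The `Token` record, its three flags and the toy sample below are hypothetical illustrations, not the coding scheme or the data of the studies cited; the sketch only assumes that each token of the construction is coded for the three configurations just listed and that unithood is the share of tokens showing none of them.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One coded occurrence of estar + gerund (hypothetical coding scheme)."""
    intervening: bool       # material stands between estar and the gerund
    multiple_gerunds: bool  # more than one gerund depends on a single estar
    gerund_first: bool      # the gerund precedes estar


def unithood(tokens):
    """Proportion of tokens in which estar and the gerund behave as one unit."""
    if not tokens:
        raise ValueError("cannot compute a proportion over zero tokens")
    unit_like = sum(
        1
        for t in tokens
        if not (t.intervening or t.multiple_gerunds or t.gerund_first)
    )
    return unit_like / len(tokens)


# Invented toy sample: two unit-like tokens out of four.
sample = [
    Token(False, False, False),
    Token(True, False, False),
    Token(False, False, False),
    Token(False, True, False),
]
print(f"{unithood(sample):.2f}")  # 0.50
```

Computing the same proportion separately for each period would yield the kind of diachronic trend from which increasing unithood is inferred.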
A more direct index of unithood is the positioning of object pronouns, which
precede finite verb forms in modern Spanish (Torres Cacoullos 1999b). In (5a) the
object pronoun (underlined) is postposed to the gerund (is telling him), in (5b)
it is preposed to estar (literally, ‘it are saying’). The latter configuration, known
as “clitic climbing”, has been viewed in generative syntax as a restructuring of a
series of verbs into a single verbal complex (e. g., Rizzi 1982). In a functionalist
view, “clitic climbing” has been seen as a manifestation of the grammatical-
ization of auxiliaries, as a verb comes to express grammatical (e. g., aspectual,
progressive) more than lexical (e. g., spatial, locative) meaning (Myhill 1988).
– Pero que nosotros tampoco les vamos a dar cien días. Vamos a decir lo que nos parezca
desde hoy.
– Ya lo estamos dicie-ndo.
already acc.3sg be.prs.1pl tell-ger
‘But we’re not going to give them a hundred days. We’re going to say what we think
starting today.’
‘We [it] are already saying it’
(20th c., CORLEC, CDEB014A, p215–p216)
In the 15th century example in (5a) ESTAR + Verb-ndo is compatible with locative
meaning, indicated by co-occurring allá ‘there’ in the same clause and the motion
verb voy ‘I go’ in a previous clause: the speaker will go to where the person rep-
resented metonymically by his heart is located (está…allá ‘is…there’). There is at
the same time aspectual meaning, as conveyed by se me figura ‘I can imagine’: the
situation referred to by the gerund is in progress at speech time. In comparison,
spatial meaning appears at best attenuated in the 20th century example in (5b),
where most prominent is aspectual meaning, indicated by co-occurring temporal
adverbial ya ‘already’: the speaker asserts that the verbal situation (diciendo
‘saying’) is in progress.
In Table 2, though the count of all eligible cases is low, there is a clear trend
of increased rates of placement before estar (proclisis). Increasing placement of
object pronouns before the whole complex (as with single finite verbs), rather
than attached to the gerund, can be taken as an indication of enhanced unithood.2
2 The 19th and 20th century rate of preposed object clitics shown in Table 2 is higher than for
all tenses of ESTAR + Verb-ndo (respectively, 70 %, 54/77 in the same texts (reported in Bybee
and Torres Cacoullos 2009: 203) and 89 %, 103/115 in Mexico City “habla popular” (UNAM 1976)
(reported in Torres Cacoullos 1999b: 146)). This is consonant with Progressive grammaticalization
advancing in present before past tenses (Torres Cacoullos 2012: 110, n. 3) (whereas habitual
markers are said to appear in past before generalizing to present temporal reference contexts
(Bybee et al. 1994: chapter 5)).
3 In χ² tests, difference between 17th and 19th p < 0.06 (n. s.), between 19th and 20th p < 0.05.
13th–15th and 17th century totals include object pronouns placed between estar and the gerund
(e. g., ell Aguila esta la remira<n>-do, GEII, fol. 189v) (N = 2, N = 3, respectively).
4 20th century = CORLEC (Marcos Marín 1992) “conversacional” portion (see Table 1 for remain-
ing data).
operations such as “movement” and “merge” (e. g., Roberts and Roussou 2003)
makes no predictions about frequency of use, under the assumption that usage
does not impinge on grammar (e. g., Newmeyer 2003).
Table 3 displays three frequency counts for ESTAR + Verb-ndo, one absolute,
i. e. the token frequency of the construction, and two relative, namely the propor-
tion it constitutes of gerund constructions vis-à-vis other spatial auxiliaries and
its rate relative to the simple Present. The first count, in the first row, is straight-
forward text frequency normalized per 100,000 words (based on the figures given
in Table 1 above), by which there is an evident rise (cf. Torres Cacoullos 2012: 77).
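The normalization behind the first count can be reproduced from Table 1. The sketch below is only an arithmetic illustration: the word counts and Progressive token counts are copied from Table 1, while the keys `column_1` to `column_3` are placeholder names for the three data columns of that table, and the helper `per_100k` is a name introduced here for illustration.

```python
# Word counts and Progressive token counts as given in Table 1.
# The column keys are placeholders for the table's three data columns.
TABLE_1 = {
    "column_1": {"words": 2_500_000, "progressive": 119},
    "column_2": {"words": 600_000, "progressive": 180},
    "column_3": {"words": 900_000, "progressive": 317},
}


def per_100k(count, words):
    """Token frequency normalized per 100,000 words of running text."""
    return count / words * 100_000


for column, data in TABLE_1.items():
    rate = per_100k(data["progressive"], data["words"])
    print(f"{column}: {rate:.1f} Progressive tokens per 100,000 words")
# column_1: 4.8, column_2: 30.0, column_3: 35.2
```

Normalizing in this way makes the three samples comparable despite their very different sizes.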
The second frequency measure is the proportion that ESTAR + Verb-ndo con-
stitutes as an instance of the general gerund construction with finite forms of
spatial verbs. Some of these gerund constructions (especially with ir and, in some
dialects, andar) remain robust in modern varieties of Spanish (Torres Cacoullos
1999a). Nevertheless, as Table 3 shows in the second row, the aspectual auxiliary
is increasingly more likely to be estar than another spatial verb (cf. Torres Cacoul-
los 2000: 55–60). We can think of this as a measure of string frequency, which
may indicate “chunk status” (Brown and Rivas 2011: 42–43).
5 20th century = CORLEC (Marcos Marín 1992) “conversacional” portion: word count 241K,
N ESTARPres + Verb-ndo 364.
6 Counts from a subset of texts in Table 1: for 13th–15th century from Calila, GE, Celestina, CRC
(andar (17), ir (39), venir (10)); 17th from Quijote (andar (12), ir (60), quedar (5)); 19th century, from
Pepita, Perfecta, Regenta, Pazos (andar (6), ir (39), seguir (5)); 20th century from CORLEC (ir (47),
seguir (20), venir (4)).
7 Count from 15th century Corbacho and Celestina only, furthermore not counting simple Present
decir ‘say, tell’ in the Corbacho.
The position of Weinreich, Labov and Herzog (1968: 101) is that “command of het-
erogeneous structures is not a matter of multidialectalism or ‘mere’ performance,
but is part of unilingual linguistic competence”. In this view, systematic variation
belongs to (a single) grammar (Cedergren and Sankoff 1974: 334).
However, variation and apparent gradualness may be attributed to com-
peting grammars underlying a given form. Abruptness and discreteness may be
upheld in the face of observed variation by viewing the aggregate data as reflect-
ing the coexistence of multiple (generative) grammars. Language change is then
modeled as modification in the distribution of competing grammars over time
(e. g., Yang 2000). For example, the spread of English do-support across differ-
ent sentence types (e. g., negative declaratives and affirmative wh-object ques-
tions) is seen as “surface manifestations of a single change in grammar” (such
as loss of Verb-to-Infl movement) (Kroch 1989: 199; but see Bybee 2010: chapter
7). Because change is postulated to be a single abrupt change in a parameter
setting and the gradual time course is seen as representing a shift from the old
invariant homogenous grammar to the new one, the contexts of the change must
be uniform in their effects.
According to this scenario a new structure such as do-support may be favored
earlier in some contexts than others and begin in those contexts with a higher
rate, but the rate of change is constant across contexts; in terms of linguistic con-
ditioning, the effects remain fixed in magnitude and direction as the change is
propagated (Kroch 1989: 206).
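What a constant rate across contexts amounts to can be made concrete with a small numerical sketch. It assumes one common formalization, a logistic curve per context with a shared slope and context-specific intercepts; the slope, the intercepts and the context labels are hypothetical values chosen for illustration, not estimates from any data set discussed here.

```python
import math


def logistic(t, intercept, slope):
    """Rate of the new variant at time t under a logistic model of change."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * t)))


def logit(p):
    """Log-odds of a proportion p."""
    return math.log(p / (1.0 - p))


SLOPE = 0.8  # hypothetical rate of change, shared by all contexts
INTERCEPTS = {"favoring": -1.0, "disfavoring": -3.0}  # hypothetical starting points


def log_odds_gap(t):
    """Difference in log-odds between the two contexts at time t."""
    favoring = logistic(t, INTERCEPTS["favoring"], SLOPE)
    disfavoring = logistic(t, INTERCEPTS["disfavoring"], SLOPE)
    return logit(favoring) - logit(disfavoring)


for t in range(6):
    rates = {c: round(logistic(t, k, SLOPE), 3) for c, k in INTERCEPTS.items()}
    print(t, rates, round(log_odds_gap(t), 6))
```

In this sketch the raw rates differ between contexts at every point in time, but the gap in log-odds stays fixed at the difference between the intercepts. This is the sense in which contextual effects remain constant in magnitude and direction while the change is propagated.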
Returning to ESTAR + Verb-ndo, if the change is abrupt and it is the propaga-
tion of change (e. g., across authors and texts) that is gradual, we should observe
that the frequency of the newer variant – the Progressive – relative to the older
one – the simple Present – increases at a constant rate, uniformly, across linguis-
tic contexts. But if the change itself is a gradual modification of the grammar – one
with inherent variability – it is possible that the linguistic contexts and conditions
could vary across the course of the change. That is, the rate of occurrence of the
Progressive could increase differentially across linguistic contexts.8 Does it?
2. direction of effect, with probabilities (or factor weights, shown in the bordered set of
columns) closer to 1 indicating a favoring, and closer to 0 a disfavoring, effect on ESTAR +
Verb-ndo. That is, the closer to 1 the probability, the greater the likelihood of the Progressive
in each of the contexts (factors) listed on the left;9
Gradual loss of analyzability: Diachronic priming effects 275
3. relative magnitude of effect, as assessed by the Range (shown in italics) between the favoring
and the disfavoring probability within each (binary) factor group.
For the Aspect factor group, I coded tokens of both the Progressive and simple
Present as ‘limited duration’ if the aspectual reading was one of progressive or
continuous (Comrie 1976: 33), as in the pair of examples in (3) and (6), above.
Limited duration also applies to stative predicates when the situation is temporally
circumscribed, or bound to speech time, again for both variants, as in (7).
‘Extended duration’, on the other hand, subsumes habitual aspect for dynamic
verbs, and states without temporal limits, which exist indefinitely, as in (8) (on
coding for aspect, see Torres Cacoullos 2012: 87–91).
8 I thank Greg Guy for help in formulating the competing predictions about contextual effects.
Thanks to Shana Poplack and Catherine Travis for extensive comments on an earlier version of this
paper, and also to the editors of the volume, Aria Adli, Göz Kaufmann and Marco García García.
9 Factor weights for non-significant groups, from the first “step down” run, in which all groups
are included in the regression, are provided within brackets to indicate direction of effect
(Poplack and Tagliamonte 2001: 93–94).
Table 4: Three independent Variable rule analyses of linguistic factors contributing to selection of the Progressive10

                                  13th–15th century      17th century           19th century
                                  Prob.  %     N         Prob.  %     N         Prob.  %     N
Range                             33                     54                     59
Priming
Preceding estar + X construction  .76    56 %  9/16      .69    55 %  13/26     [.49]  30 %  14/46
Preceding ‘Other’ tenses          .54    21 %  41/194    .53    27 %  70/256    [.56]  38 %  117/310
Preceding simple Present          .46    19 %  55/296    .47    20 %  84/430    [.46]  27 %  133/486
Range                             30                     22
Polarity – Sentence type
Affirmative declarative           .54    24 %  106/440   .57    27 %  164/598   .58    37 %  285/778
Negative, Interrogative           .31    13 %  11/86     .18    8 %   9/116     .18    13 %  20/160
Range                             23                     39                     40
Temporal co-occurrence
Present                           .76    33 %  25/76     [.54]  28 %  29/103    .60    37 %  50/135
Absent                            .45    20 %  93/469    [.49]  24 %  150/639   .48    32 %  266/839
Range                             31                                            12
Stativity
Dynamic predicate                 [.49]  21 %  93/435    [.52]  26 %  153/593   .51    34 %  282/840
Stative predicate                 [.56]  23 %  26/113    [.43]  18 %  27/151    .44    25 %  35/140
Range                                                                           7
10 Non-significant factors are shown within square brackets. Ns in some factor groups do not add up to total N because of excluded factors or uncodable
tokens.
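The Range can be recomputed from the factor weights printed in Table 4. The sketch below takes the Range to be the difference between the highest and the lowest weight in a factor group, expressed in hundredths, and compares the result with the printed Ranges. The period labels follow the three periods discussed in the text; the dictionary and function names are illustrative, and only significant factor groups whose rows appear in the table are included.

```python
# Factor weights as printed in Table 4 (significant factor groups only).
WEIGHTS = {
    "Priming": {
        "13th-15th c.": [0.76, 0.54, 0.46],
        "17th c.": [0.69, 0.53, 0.47],
    },
    "Polarity - Sentence type": {
        "13th-15th c.": [0.54, 0.31],
        "17th c.": [0.57, 0.18],
        "19th c.": [0.58, 0.18],
    },
    "Temporal co-occurrence": {
        "13th-15th c.": [0.76, 0.45],
        "19th c.": [0.60, 0.48],
    },
    "Stativity": {"19th c.": [0.51, 0.44]},
}

# Ranges as printed in Table 4, for comparison.
PRINTED = {
    "Priming": {"13th-15th c.": 30, "17th c.": 22},
    "Polarity - Sentence type": {"13th-15th c.": 23, "17th c.": 39, "19th c.": 40},
    "Temporal co-occurrence": {"13th-15th c.": 31, "19th c.": 12},
    "Stativity": {"19th c.": 7},
}

def factor_range(weights):
    """Highest minus lowest factor weight, in hundredths."""
    return round((max(weights) - min(weights)) * 100)

computed = {
    group: {period: factor_range(w) for period, w in periods.items()}
    for group, periods in WEIGHTS.items()
}

for group, periods in computed.items():
    print(group, periods, "matches printed:", periods == PRINTED[group])
```

For a factor group with more than two factors, such as Priming, the same definition applies: only the two extreme weights enter the Range.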
11 This is not a strict mathematical rule. The goal of the stepwise procedure in Variable-rule
analysis is to find the set of factor groups which jointly account for the variation and is not pri-
marily meant to order these factor groups according to any criterion. In the analyses shown in
Table 4, the order of the factor groups according to the Range is consistent with that suggested by
that in the 13th–15th century the Ranges are comparable (with a ratio of 33:43 =
0.8) but that in the 17th century the Range for Aspect is twice as great (54:29 =
1.9) and in the 19th century it is three times greater (59:19 = 3.1). Here we have the
answer to the question of whether the Progressive spreads differentially across
linguistic contexts. It does. This contradicts the hypothesis of gradualness in the
propagation of change but abruptness in grammatical change itself.
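The ratios quoted above are simple quotients of Ranges and can be checked directly. In the sketch below the first set of Ranges is the one the text attributes to Aspect; the second set is taken here to belong to Locative co-occurrence, which is an assumption based on the surrounding discussion. The variable names are illustrative.

```python
# Ranges quoted in the text, by period.
ASPECT_RANGE = {"13th-15th c.": 33, "17th c.": 54, "19th c.": 59}
# Assumption: the comparison group is Locative co-occurrence.
LOCATIVE_RANGE = {"13th-15th c.": 43, "17th c.": 29, "19th c.": 19}

ratios = {
    period: round(ASPECT_RANGE[period] / LOCATIVE_RANGE[period], 1)
    for period in ASPECT_RANGE
}

for period, ratio in ratios.items():
    print(period, ratio)
```

A ratio that stayed near one value across the three periods would be compatible with constant relative magnitudes of effect; the rising ratio is what the text reads as differential spread.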
The weakening of the favoring effect of co-occurring locatives, as the prob-
abilities get farther from 1 over time, may be taken as a measure of the loss of
source-construction meaning, known as semantic bleaching (“depletion” (Givón
1975: 94)), in the course of grammaticalization. Furthermore, we can note that
ESTAR + Verb-ndo is increasingly disfavored in reference to extended duration
(habitual, indefinitely-existing-state) situations, which are becoming more the
province of the simple Present, as probability values get closer to zero (at .35,
.16 and .12, in the 13th–15th, 17th and 19th centuries, respectively). This means that
an aspectual opposition with the simple Present has developed gradually: the
originally more locative construction is used more and more as an aspectual
expression of limited, as opposed to extended, duration. The developing – but
not (yet) obligatory – Progressive – simple Present opposition is illustrated in (9),
sleeping in progress (9a) vs. a habitual mode of sleeping (9b). Thus, in the course
of speakers’ recurrent choices of variants (as in (3), (6)–(8)), among which dis-
tinctions in aspectual function can be neutralized in discourse (Sankoff 1988a),
the newer and older variant may gradually become aspectually more distinct
(Torres Cacoullos 2012).
the significance of the change in log likelihood, from the first step of the “step down” procedure
when the least important group gets “cut”: most important in Old Spanish are Aspect, Locative
and Temporal (all p = .000), followed by Polarity-Sentence type (p = .001) and Priming (p = .008);
in the 17th century Aspect, Locative and Polarity-Sentence type (all p = .000), followed by Priming
(p = .032); in the 19th century Aspect and Polarity-Sentence type (both p = .000), followed by
Locative (p = .014), Stativity (p = .019), and Temporal (p = .040). The same holds according to the
order of selection in the “step up” procedure, another indication of relative magnitude of effect,
except for the selection of Aspect before Locative co-occurrence in Old Spanish and of Stativity
before Locative co-occurrence in the 19th century.
The priming effect shown in Table 4 is different. Here the question is whether
non-Progressive estar, i. e., in other than a gerund construction, “triggers” the
Progressive. Thus, we consider preceding use of non-Progressive estar construc-
tions of the schematic form ESTAR + X, including locative (11), predicate adjective
(12) and resultative (13) constructions.
lexical repetition, other preceding periphrastic forms, and the distance at which
a preceding ESTAR construction has an effect.12
I propose that by considering whether other estar constructions prime ESTAR
+ Verb-ndo we obtain a measure of its analyzability. If ESTAR + Verb-ndo is “ana-
lyzable” – with internal structure and component parts that are recognizable
as individual words (section 1), namely a finite form of estar and the gerund of
another verb – it should also be primed by other constructions composed of estar
and another unit. If, on the other hand, the Progressive is no longer analyzable,
having become, in reanalysis parlance, a single constituent or periphrastic unit
(section 3), no such priming effect should hold.
The multivariate analysis in Table 4 shows that the Progressive is favored
by a preceding non-Progressive estar construction in the 13th–15th and in the 17th
century. This corresponds to Szmrecsanyi’s (2005: 139) β-persistence. Such an
effect could be taken as lexical, estar to estar, or as structural, estar + X to estar +
X, priming (where X encompasses various word classes or syntactic roles). Either
way, we can think of this kind of priming as based on associations between sub-
units of constructions (for example, between English auxiliary go in the future
construction and lexical verb go in various motion constructions) as opposed to
priming based on syntactic identity of the entire unit (as when BE going to Verb
primes itself or, here, when ESTAR + Verb-ndo primes itself).
The priming by non-Progressive estar constructions is as expected, if the Pro-
gressive has an analyzable internal structure. It also suggests that, since estar has
independently increased in frequency to the detriment of copula ser ‘be’ in several
constructions (Silva-Corvalán 1994: 94–95 and references therein), priming may
be part of the explanation for the advancement of the Progressive in Spanish.
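As a rough companion to the factor weights, the raw rates of the Progressive after an estar + X construction and after a simple Present can be compared across periods. The counts below are used as printed in the N columns of Table 4, and the rates are recomputed from them rather than copied from the % columns. Raw rates are not a substitute for the multivariate weights; the names in the sketch are illustrative.

```python
# (Progressive tokens, all tokens) per preceding context, as printed in Table 4.
COUNTS = {
    "13th-15th c.": {"estar + X": (9, 16), "simple Present": (55, 296)},
    "17th c.": {"estar + X": (13, 26), "simple Present": (84, 430)},
    "19th c.": {"estar + X": (14, 46), "simple Present": (133, 486)},
}

def rate(progressive, total):
    """Share of Progressive tokens among all tokens in a context."""
    return progressive / total

# How many times more frequent the Progressive is after estar + X
# than after a simple Present, per period.
rate_ratio = {
    period: rate(*ctx["estar + X"]) / rate(*ctx["simple Present"])
    for period, ctx in COUNTS.items()
}

for period, value in rate_ratio.items():
    print(period, round(value, 2))
```

On these counts the advantage of the estar + X context shrinks from period to period, in line with the loss of a significant priming effect in the 19th century.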
In the 19th century, however, which, as we saw, is when choice of the Progres-
sive is most strongly favored by limited duration aspect (section 4.2), the priming
effect is no longer significant (nor is there a discernable direction of effect). This
disappearance of the earlier priming effect – ESTAR + X no longer triggers ESTAR
12 On the length of discourse over which priming may operate, see Labov 1994: 567, Travis 2007:
110, 128–129 and references therein. Patterns seemed to be the same when I excluded the first
instance of the variable in a speaker turn (in the plays) or in a stretch of discourse attributed to
a character (in the novels) as when I included such tokens, if the preceding finite verb appeared
in speech directed by the interlocutor to the speaker (for example, -¿Dónde está? –Está … está
viajando ‘Where is she? / She is…is travelling’ (Acertar errando, Act 3, Scene XV)) or if the token
is separated from the same speaker’s preceding finite verb by an interlocutor’s turn having no
finite verb (for example, – […] ¿Estás ya contento? – (Va a arrodillarse para besarle la mano.)
¡Padre mío! – ¿Qué haces, Eduardo? ‘Are you content now? / Dear father! (kneeling to kiss his
hand) / What are you doing, Eduardo?’ (Amor de padre, Act 2, Scene I)).
6 Conclusion
Change is the process of replacement, not the outcome of that process. When we study the
process directly we are immediately confronted with the heterogeneous character of lin-
guistic systems. Change implies variation; change is variation [italics in original]. […] [The
progress of change] is rarely represented by the categorical replacement of one form by
another, but normally by changes in the relative frequencies of the variants and changes in
their environmental constraints [my italics].
References
Allen, Henry J. and J. B. Greenough (1916): Allen and Greenough’s new Latin grammar for
schools and colleges, founded on comparative grammar. Boston et al.: Ginn.
Bock, Kathryn J. and Zenzi M. Griffin (2000): The persistence of structural priming: Transient
activation or implicit learning. Journal of Experimental Psychology: General 129(2):
177–192.
Brown, Esther L. and Javier Rivas (2011): Subject-verb order in Spanish interrogatives: a
quantitative analysis of Puerto Rican Spanish. Spanish in Context 8: 23–49.
Bybee, Joan (2003): Mechanisms of change in grammaticization: the role of frequency. In: Brian
D. Joseph and Richard D. Janda (eds.), The handbook of historical linguistics, 602–623.
Oxford: Blackwell.
Bybee, Joan (2010): Language, usage and cognition. Cambridge: Cambridge University Press.
Bybee, Joan, Revere Perkins and William Pagliuca (1994): The evolution of grammar: Tense,
aspect, and modality in the languages of the world. Chicago: University of Chicago Press.
Bybee, Joan and Rena Torres Cacoullos (2009): The role of prefabs in grammaticization: How
the particular and the general interact in language change. In: Roberta L. Corrigan, Edith
A. Moravcsik, Hamid Ouali and Kathleen Wheatley (eds.), Formulaic language: Volume 1.
Distribution and historical change, 187–217. Amsterdam: John Benjamins.
Campbell, Lyle (1998): Historical linguistics: an introduction. Cambridge, MA: MIT Press.
Cedergren, Henrietta J. and David Sankoff (1974): Variable rules: performance as a statistical
reflection of competence. Language 50(2): 333–355.
Comrie, Bernard (1976): Aspect. Cambridge: Cambridge University Press.
Croft, William and Alan D. Cruse (2004): Cognitive linguistics. Cambridge: Cambridge University
Press.
Givón, Talmy (1975): Serial verbs and syntactic change: Niger-Congo. In: Charles N. Li (ed.),
Word order and word order change, 47–112. Austin: University of Texas Press.
Hay, Jennifer (2001): Lexical frequency in morphology: is everything relative? Linguistics 39:
1041–1070.
Hopper, Paul J. (1991): On some principles of grammaticization. In: Elizabeth Closs Traugott
and Bernd Heine (eds.), Approaches to grammaticalization (Volume 1), 17–35. Amsterdam:
John Benjamins.
Hopper, Paul J. and Elizabeth Closs Traugott (1993): Grammaticalization. Cambridge: Cambridge
University Press.
Kroch, Anthony (1989): Reflexes of grammar in patterns of language change. Language
Variation and Change 1: 199–244.
Labov, William (1969): Contraction, deletion, and inherent variability of the English copula.
Language 45: 715–762.
Labov, William (1982): Building on empirical foundations. In: Winfred P. Lehmann and Yakov
Malkiel (eds.), Perspectives on historical linguistics, 11–92. Amsterdam: John Benjamins.
Labov, William (1994): Principles of linguistic change: Internal factors (Volume 1). Oxford:
Blackwell.
Langacker, Ronald (1987): Foundations of cognitive grammar: theoretical prerequisites (Volume
1). Stanford, CA: Stanford University Press.
Marcos Marín, Francisco (dir.) (1992): Corpus de Referencia de la Lengua Española Contemporánea
Peninsular (CORLEC), http://www.lllf.uam.es/ING/Info%20Corlec.html (accessed
April 2004).
Myhill, John (1988): The grammaticalization of auxiliaries: Spanish clitic climbing. Berkeley
Linguistics Society 14: 352–363.
Newmeyer, Frederick (2003): Grammar is grammar and usage is usage. Language 79(4):
682–707.
Poplack, Shana and Elisabete Malvar (2007): Elucidating the transition period in linguistic
change: The expression of the future in Brazilian Portuguese. Probus 19: 121–169.
Poplack, Shana and Sali Tagliamonte (1999): The grammaticization of going to in (African
American) English. Language Variation and Change 11: 315–342.
Poplack, Shana and Sali Tagliamonte (2001): African American English in the diaspora: tense
and aspect. Oxford: Blackwell.
Rizzi, Luigi (1982): A restructuring rule in Italian syntax. In: Luigi Rizzi (ed.), Issues in Italian
syntax, 1–48. Dordrecht: Foris.
Roberts, Ian and Anna Roussou (2003): Syntactic change: a minimalist approach to grammati-
calization. Cambridge: Cambridge University Press.
Sankoff, David (1988a): Sociolinguistics and syntactic variation. In: Frederick J. Newmeyer (ed.),
Linguistics: The Cambridge survey (Volume IV), 140–161. Cambridge: Cambridge University
Press.
Sankoff, David (1988b): Variable rules. In: Ulrich Ammon, Norbert Dittmar and Klaus J. Mattheier
(eds.), Sociolinguistics: An international handbook of the science of language and society,
984–997. Berlin/New York: Walter de Gruyter.
Sankoff, David, Sali Tagliamonte and Eric Smith (2005): GOLDVARB X: A multivariate analysis
application for Macintosh and Windows. <http://individual.utoronto.ca/tagliamonte/Goldvarb/GV_index.htm>.
Scheibman, Joanne (2000): I dunno but... a usage-based account of the phonological reduction
of don’t. Journal of Pragmatics 32: 105–124.
Silva-Corvalán, Carmen (1994): Language contact and change. Spanish in Los Angeles. Oxford:
Clarendon Press.
Szmrecsanyi, Benedikt (2005): Language users as creatures of habit: A corpus-based analysis
of persistence in spoken English. Corpus Linguistics and Linguistic Theory 1(1): 113–150.
Torres Cacoullos, Rena (1999a): Variation and grammaticization in progressives: Spanish -ndo
constructions. Studies in Language 23(1): 25–59.
Torres Cacoullos, Rena (1999b): Construction frequency and reductive change: diachronic
and register variation in Spanish clitic climbing. Language Variation and Change 11(2):
143–170.
Torres Cacoullos, Rena (2000): Grammaticization, synchronic variation, and language contact:
a study of Spanish progressive -ndo constructions. Amsterdam: John Benjamins.
Torres Cacoullos, Rena (2006): Relative frequency in the grammaticization of collocations:
nominal to concessive a pesar de. In: Timothy L. Face and Carol A. Klee (eds.), Selected
proceedings of the 8th Hispanic Linguistics Symposium, 37–49. Somerville, MA: Cascadilla
Proceedings Project.
Torres Cacoullos, Rena (2012): Grammaticalization through inherent variability: The
development of a progressive in Spanish. Studies in Language 36(1): 73–122.
Torres Cacoullos, Rena and James A. Walker (2011): Collocations in grammaticalization and
variation. In: Bernd Heine and Heiko Narrog (eds.), Handbook of Grammaticalization,
225–238. Oxford: Oxford University Press.
Travis, Catherine E. (2007): Genre effects on subject expression in Spanish: Priming in narrative
and conversation. Language Variation and Change 19(2): 101–135.
UNAM (1976): El habla popular de la Ciudad de México: Materiales para su estudio. Mexico City:
Centro de Lingüística Hispánica.
Weiner, Judith E. and William Labov (1983): Constraints on the agentless passive. Journal of
Linguistics 19(1): 29–58.
Weinreich, Uriel, William Labov and Marvin Herzog (1968): Empirical foundations for a theory of
language change. In: Winfred P. Lehmann and Yakov Malkiel (eds.), Directions for historical
linguistics, 95–188. Austin: University of Texas Press.
Yang, Charles D. (2000): Internal and external forces in language change. Language Variation
and Change 12(3): 231–250.
Appendix: Texts
Author/Title (listed in chronological order)
Source/Edition (unless otherwise indicated, electronic texts were downloaded from Biblioteca
Virtual Miguel de Cervantes, http://www.cervantesvirtual.com/)

Hernando del Pulgar, Crónica de los Reyes Católicos
    Source/Edition: Juan de Mata Carriazo (ed.). Madrid: Espasa Calpe, 1943
Fernando de Rojas, La Celestina
    Source/Edition: Dorothy S. Severin (ed.), Madrid: Cátedra, 1987
Íñigo López de Mendoza, marqués de Santillana, Sonetos
Juan del Enzina [14 Auctos, Eglogas, Representaciones]
Miguel de Cervantes Saavedra, Don Quijote de la Mancha; Comedia famosa de La casa de los
celos y selvas de Ardenia
Lope de Vega, La dama boba; Comedia del Príncipe Ynocente; La vengadora de las mujeres
Guillén de Castro, El amor constante
Ruiz de Alarcón y Mendoza, La Amistad castigada
Tirso de Molina, Don Gil de las calzas verdes; Por el sótano y el torno; Amor y celos hacen
discretos [Act I]; La villana de Sagra
Gaspar de Ávila, La boca y no el corazón o Fingir por conservar
Calderón de la Barca, Casa con dos puertas mala es de guardar; La dama duende; Amor honor
y poder
Leandro Fernández de Moratín, La comedia nueva; El sí de las niñas
    Source/Edition: J. Dowling & R. Andioc (eds.). Madrid: Castalia, 1975
Manuel Bretón de los Herreros, A Madrid me vuelvo
José María de Carnerero, El afán de figurar
Ventura de la Vega, Acertar errando, o El cambio de diligencia
José de Mariano Larra, Los inseparables
Duque de Rivas, Don Álvaro o la fuerza del sino
Francisco Martínez de la Rosa, La boda y el duelo; Amor de padre
Pablo Alonso de la Avecilla, Los presupuestos
Luis Mariano de Larra, La paloma y los halcones; La primera piedra
Manuel Tamayo y Baus, No hay mal que por bien no venga
Juan Valera, Pepita Jiménez
Abstract: This paper evaluates the relationship between usage and systematicity
in language from the perspective of usage-based linguistics. In particular,
it investigates the diachronic effects of the phenomena of entrenchment and
persistence on the development of morphosyntactic alternations. Both entrench-
ment and persistence depend on a language user’s experience with language:
They lead to a (temporary) strengthening of the cognitive representation of a
linguistic item. For this reason, both processes can lead to the conservation of
disappearing grammatical constructions. In order to evaluate this hypothesis, a
quantitative analysis of the historical changes in Spanish auxiliary selection is
proposed. There is a higher probability for speakers to select ‘be’ over ‘have’ as a
perfect auxiliary if ‘be’ + participle (PtcP) has already appeared in the preceding
co-text. Over time, this effect becomes stronger. The greater dependence of ‘be’
selection on persistence effects in later stages of the process by which ‘be’ was
replaced with ‘have’ suggests that the cognitive mechanism of persistence can be
understood as a type of weak entrenchment with a conserving effect.
1 Introduction
From the perspective of usage-based linguistics (UBL, cf. Langacker 1987; Bybee
2006, 2007, 2010), there is a strong relationship between usage and systematicity.
Whereas many traditional approaches assume that linguistic structure is system-
atic in order to allow for communication, UBL suggests that because a language
is used as a means of communication, its structures acquire systematicity.2 For
UBL, usage frequency is of crucial importance in this process. For one thing,
1 I wish to express my gratitude to the organizers of the Freiburg conference for the invitation.
In addition, I would like to thank the participants and in particular, my reviewers Rena Torres
Cacoullos and Göz Kaufmann, for helping me to develop my ideas.
2 Consequently, it is necessary to distinguish “systematicity” from “system” in the structuralist
sense (a set of paradigmatic oppositions through which (grammatical) meaning arises). System-
aticity is a matter of degree: Some grammatical functions can be expressed more systematically
than others. For instance, the Modern French “system” of intransitive auxiliary selection is rather
290 Malte Rosemeyer
inconsistent, with some verb classes – such as verbs that express a change of location – typically,
but not always, selecting être (‘be’) (for an overview, see Kailuweit 2011).
How usage rescues the system: Persistence as conservation 291
summary of the findings and relates them to one of the general questions evalu-
ated in this volume, i. e. what is the link between language use and the system?
Several contributions in this volume (e. g., Haider) argue that there is a close
relationship between systematicity and language change. As already apparent in
Coseriu’s (1974) discussion on Saussure’s attitude to language change, languages
are historical objects. Coseriu argues that the reification of language (“langue”)
as an abstract system that exists independently of its speakers leads to insur-
mountable problems in the description of language change. If the functionality
of a language is the result of its systematicity, language change cannot be due
to system-internal factors. Consequently, from the perspective of Saussurian
structuralism, language change is an “unreal phenomenon caused by ‘external
factors’” (Coseriu 1974: 23, translation M. R.). However, languages do change.
Coseriu (1974: 23–24) argues that the “apparent aporia of language change” arises
from a confusion of perspectives. Languages are not functional because they
are systematic – rather, their functionality creates systematicity. Consequently,
languages change because speakers want to continue to be able to express their
thoughts with them: “A language, however, that is continuously [...] determined by
its function, is not complete, but perpetually emerging from concrete linguistic
actions: it is not εργον [work] but ενέργεια [working]” (Coseriu 1974: 24, trans-
lation M. R.). In contrast, dead languages like Latin are no longer functional
because they have stopped changing. In Coseriu’s view, change is thus an intrin-
sic property of language.
Coseriu’s approach to language change can be directly related to the more
recent concept of “emergentism” in linguistics (Hopper 1987; Bybee and Hopper
2001b; MacWhinney 2006). Regarding grammar, Hopper’s (1987: 142) concept of
Emergent Grammar has become very influential. In Hopper’s words,
[t]he notion of Emergent Grammar is meant to suggest that structure, or regularity, comes
out of discourse and is shaped by discourse as much as it shapes discourse in an on-
going process. Grammar is hence not to be understood as a pre-requisite for discourse, a
prior possession attributable in identical form to both speaker and hearer. Its forms are
not fixed templates but are negotiable in face-to-face interaction in ways that reflect the
individual speakers’ past experience of these forms and their assessment of the present
context, including especially their interlocutors, whose experiences and assessments may
be quite different. Moreover, the term Emergent Grammar points to a grammar which is not
abstractly formulated and abstractly represented, but always anchored in the specific con-
crete form of an utterance.
3 Note that this use of the term “creativity” is not synonymous with creativity in Generative
Grammar. At least in one acceptation of the term, creativity in Generative Grammar refers to
the fact that a language user can only ever experience a fraction of all the possible sentences
in a language. However, he can “on the basis of this finite linguistic experience […] produce an
indefinite number of new utterances which are immediately acceptable to other members of his
speech community” (Chomsky 1975: 61). In contrast, the term “creativity” as used here refers
to an individual’s capacity to use a certain linguistic element in a novel function. This reinter-
pretation of the function of a linguistic element – fundamental to historical processes such as
grammaticalization – is possible due to analogical reasoning processes, and is motivated by con-
siderations of expressiveness.
passive is rather quick because of the previous acquisition of the formally similar
perfect construction. Since the learning of grammatical categories is crucially
dependent on highly frequent chunks, grammatical categories are organized in
terms of prototypicality (Goldberg 2006).
The skewed distribution of grammatical categories across lexical items can
have an influence on the directionality of language change. Thus, it has been
argued that entrenchment leads to the loss of analyzability of complex linguistic
items (Bybee and Hopper 2001a; Bybee 2006, 2007, 2010): “The more a sequence
of morphemes or words is used together, the stronger the sequence will become
as a unit and the less associated it will be to its component parts” (Bybee 2010:
48). This loss of analyzability can lead to the conservation of highly frequent
syntagms in processes of language change. Entrenchment causes highly
frequent syntagms to grow more and more autonomous from the constructions
to which they originally belonged. In extreme cases, the paradigmatic relation
between syntagm and mother construction may be severed. If the mother con-
struction is subject to a grammatical change, highly frequent syntagms belong-
ing to that construction will be less affected by that change than other related
but less frequent syntagms: “[…] frequent forms resist regularizing or other morphological
change with the well-known result that irregular inflectional forms
tend to be of high-frequency. Assuming that regularization occurs when an
irregular form is not accessed and instead the regular process is used, it is less
likely that high-frequency inflected forms would be subject to regularization”
(Bybee 2010: 25). Processes of the analogical generalization (in Bybee’s terms,
“regularization”) of a construction leading to the disappearance of another
construction are counteracted by entrenchment. The intrusion of a new con-
struction into the usage contexts of another construction will first affect those
specific syntagms which are used less, and only afterwards specific syntagms
with a high absolute frequency of use. The global disappearance process related
to the analogical transfer of the competing construction is stalled in specific
instances. Consequently, a disappearing construction can survive in particular
instantiations until very late.
A second important aspect of the conservative language behavior of speakers
is covered by the concept termed “persistence” by Szmrecsanyi (2005, 2006).
Persistence refers to the notion of “production priming” in psycholinguistics
and “repetitiveness” in discourse analysis (Szmrecsanyi 2005: 116). Produc-
tion priming has been shown to be important in lexical (Neely 1977, 1991; Hoey
2004, 2005), phonological (Baddeley 1966; Griffin 2002) and syntactic domains
(Gries 2005; Travis 2007; Travis and Torres Cacoullos 2012). Put simply, the use
of a linguistic element raises the probability of the use of a formally or func-
tionally similar element in the following discourse. Persistence thus influences
the speaker’s choice between linguistic elements that compete within a certain
envelope of variation.4 Consequently, “while it is corpus-linguistic standard prac-
tice to view successive occurrences of a variable as independent binomial trials
(like independent, unrelated throws of a dice), there may, in fact, exist inter-
actions between neighboring variables, depending on the syntagmatic proximity
between them” (Szmrecsanyi 2005: 115).
In his analysis of alternations such as the English future markers be going
to and will, Szmrecsanyi (2005) shows that the use of one variant in the preceding
co-text significantly increases the probability of a speaker selecting the
same variant in the later co-text over the competing variant. Moreover, he demonstrates
that this effect crucially depends on the textual distance between the
persisting element and the text passage where the envelope of variation applies
(Szmrecsanyi 2005: 119–120). Persistence effects thus decrease as the temporal
distance to the original stimulus increases: The stimulus becomes less and less
salient to the speaker. Szmrecsanyi argues that these observations have far-reaching
consequences for quantitative analyses of alternations in language, since
“system-internal” factors governing the speaker’s choice of one variant or the
other may, in fact, in some contexts be neutralized by persistence. Failing to take
into account the co-dependence between earlier and later utterances may distort
statistical models of constraints on language use.
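A small simulation may help to show why successive choices cannot safely be treated as independent trials once persistence is at work. In the sketch below, the probability of choosing variant A is a base rate plus a boost that fades with the distance (counted in tokens of the variable) from the last use of A. The exponential form of the decay, all parameter values and the function names are assumptions made for illustration; they are not taken from Szmrecsanyi (2005).

```python
import math
import random

def choice_probability(base, boost, distance, decay):
    """P(variant A), given the distance in tokens since the last A.

    distance is None when A has not occurred yet; the boost decays
    exponentially with distance (an illustrative assumption).
    """
    if distance is None:
        return base
    return min(1.0, base + boost * math.exp(-distance / decay))

def simulate(n, base=0.3, boost=0.4, decay=2.0, seed=1):
    """Generate n successive choices; True stands for variant A."""
    rng = random.Random(seed)
    sequence, last_a = [], None
    for i in range(n):
        distance = None if last_a is None else i - last_a
        is_a = rng.random() < choice_probability(base, boost, distance, decay)
        sequence.append(is_a)
        if is_a:
            last_a = i
    return sequence

def rate_after(sequence, previous):
    """Rate of A among tokens whose immediately preceding token equals `previous`."""
    following = [cur for prev, cur in zip(sequence, sequence[1:]) if prev == previous]
    return sum(following) / len(following)

seq = simulate(20000)
print("P(A | previous token was A):", round(rate_after(seq, True), 3))
print("P(A | previous token was B):", round(rate_after(seq, False), 3))
```

If the trials were independent, the two conditional rates would be approximately equal. Under the assumed persistence effect the rate of A is higher directly after A, so a model that ignores the preceding co-text would attribute part of this clustering to other factors.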
Crucially, Szmrecsanyi recognizes that entrenchment and persistence result
from the same cause, namely the activation of cognitive representations of lin-
guistic experiences:
ability of use in the subsequent discourse. Repetition does not lead to a qual-
itatively different phenomenon, but merely reinforces this effect.
This paper aims at evaluating this assumption. As argued above, due to the
high strength of the cognitive representation of entrenched linguistic elements
these elements are less susceptible to ongoing language change than other elements
belonging to the same construction. Since persistence also leads to a (temporarily)
higher strength of the cognitive representation of a linguistic element,
it can be hypothesized that persistence effects play a conservative role in lan-
guage change. In particular, persistence can be shown to conserve the use of a
grammatical construction whose usage frequency is declining. In disappearance
processes, a construction becomes gradually restricted to specific usage contexts.
Its syntactic productivity declines; the construction typically only appears in the
form of singular specific syntagms. Due to this growing restrictedness, the pro-
ductivity of the construction increasingly relies on persistence effects: Whether or
not a persisting token occurs in the preceding co-text becomes a better predictor
of the occurrence of tokens of the disappearing construction.
If this hypothesis is correct, the preceding discourse contexts of late tokens
of a disappearing construction (i. e., when the construction is already scarcely
attested) should have a higher probability of containing a token of the same con-
struction than early tokens (i. e., when the construction is still widely used, and
the change is only incipient).
The remainder of this paper is dedicated to the evaluation of this prediction
for split auxiliary selection in Spanish. Old Spanish possessed two auxiliaries for
compound tense constructions in which the participle (PtcP) was formed from
intransitive verbs, aver (‘have’) and ser (‘be’) (Benzing 1931; Yllera 1980; Elvira
González 2001; García Martín 2001; Aranovich 2003; Romani 2006; Mateu 2009,
among others). As shown in (1–2), participles formed from predicates involving a
change of state typically select ‘be’, whereas participles formed from predicates
that denote unbounded activities or states typically select ‘have’:
5 In the examples, the source texts are indicated with the abbreviations in square brackets. For
a list of the source texts and the abbreviations, cf. the appendix.
This study relies on a corpus of 3,732 auxiliary + PtcP tokens from 41 Spanish
historiographical texts dated between 1270 and 1650. The selection of the texts
closely followed the guidelines regarding the authenticity of the source texts’
manuscripts established by Fernández-Ordóñez (2006).6 The majority of the edi-
tions used are from the Corpus Diacrónico del Español (CORDE, Real Academia
Española 2010), with the exception of parts from the Gran Conquista de Ultramar
(Admyte 1992) and the Spanish translation of the Roman de Troie by the order of
Alfonso XI (Parker 1977). In his study, Szmrecsanyi (2005) compares a much wider
range of data, including different registers and varieties of English (Szmrecsanyi
2005: 121). Since he shows persistence effects to be relevant for all of these dif-
ferent language varieties, the restriction of the present study to historiographical
texts is not expected to distort the results.
The tokens were selected and annotated manually by searching for parti-
ciples. In these queries, the great orthographic variation in the historical texts
was accounted for. This concerned especially the alternations between <b,v,u>,
<z,sz,sc,ç>, <f,ff,h>, <i,y,j,u>, <r,rr>, <s,ss>, and <n,nn,ñ>. Since the query syntax
in CORDE is sensitive to capitalization, additional queries for capitalized partici-
ples were conducted.
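A query that tolerates the spelling alternations listed above can be sketched as a regular expression. This is a minimal illustration only: the function name `variant_pattern` and the greedy grapheme matching are assumptions of this sketch, not the query syntax of CORDE or the procedure actually used for the corpus. Because <u> belongs to two of the alternation sets, the pattern over-generates for that letter.

```python
import re

# Alternation sets as listed in the text.
CLASSES = [
    ("b", "v", "u"),
    ("z", "sz", "sc", "ç"),
    ("f", "ff", "h"),
    ("i", "y", "j", "u"),
    ("r", "rr"),
    ("s", "ss"),
    ("n", "nn", "ñ"),
]
# Longest graphemes first, so that e.g. "rr" is recognized before "r".
GRAPHEMES = sorted({g for cls in CLASSES for g in cls}, key=len, reverse=True)


def variant_pattern(word):
    """Build a case-insensitive regex matching spelling variants of `word`."""
    word = word.lower()
    parts = []
    i = 0
    while i < len(word):
        for g in GRAPHEMES:
            if word.startswith(g, i):
                # Union of all sets that contain this grapheme.
                alts = {a for cls in CLASSES if g in cls for a in cls}
                ordered = sorted(alts, key=lambda a: (-len(a), a))
                parts.append("(?:" + "|".join(re.escape(a) for a in ordered) + ")")
                i += len(g)
                break
        else:
            # Characters outside the alternation sets are matched literally.
            parts.append(re.escape(word[i]))
            i += 1
    return re.compile("".join(parts), re.IGNORECASE)


pattern = variant_pattern("venido")
print(bool(pattern.fullmatch("uenjdo")), bool(pattern.fullmatch("Benido")))
```

The `re.IGNORECASE` flag covers capitalized participles in one pass, which a case-sensitive query interface would require separate queries for.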
The study includes 43 verb lemmata from a wide range of semantic classes of
intransitive verbs: change of location (volver ‘return’, venir ‘come’, etc.), change
of state (morir ‘die’, espantar(se) ‘become frightened’, crecer ‘grow’, etc.), prolon-
gation of a pre-existing state (quedar ‘stay’, fincar ‘stay’, etc.), and state (yacer
‘lie’, etc.). Very frequent verb lemmata were randomized. Thus, the upper limit
of tokens collected per verb and century was defined as 50, since this quantity
allows for statistical modeling. Because CORDE does not offer an automatic ran-
domization procedure for queries in single books, the randomization was done
manually by selecting random tokens from each section of a book.
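The cap of 50 tokens per verb and century can be illustrated with a short sketch. It assumes tokens stored as dictionaries with hypothetical keys `lemma` and `century`, and it draws a simple random sample per cell; the study itself selected tokens manually from each section of a book, so this is an automated analogue rather than the procedure used.

```python
import random
from collections import defaultdict


def cap_per_cell(tokens, cap=50, seed=0):
    """Keep at most `cap` randomly chosen tokens per (lemma, century) cell."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    cells = defaultdict(list)
    for tok in tokens:
        cells[(tok["lemma"], tok["century"])].append(tok)
    sample = []
    for key in sorted(cells):
        group = cells[key]
        # Small cells are kept whole; large cells are down-sampled.
        sample.extend(group if len(group) <= cap else rng.sample(group, cap))
    return sample


# Hypothetical input: one frequent and one rare verb in the same century.
tokens = [{"lemma": "venir", "century": 13, "id": i} for i in range(120)]
tokens += [{"lemma": "yacer", "century": 13, "id": i} for i in range(10)]
print(len(cap_per_cell(tokens)))
```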
Each token was annotated for persistence effects in the following fashion.
Szmrecsanyi’s (2005, 2006) work shows that persistence effects crucially hinge
on temporal distance because the effect of the original stimulus decays over
time. Consequently, persistence was modeled as a categorical variable uniting
the factors of the presence/absence of a persisting token and, in the case of the
presence of such a token, the distance between the token and the auxiliary + PtcP
construction. Thus, the variables “PERSIST_BE” and “PERSIST_HAVE” received
the value 0 if no persistence-triggering ‘be’ + PtcP or, respectively, ‘have’ + PtcP token was
present in the preceding co-text.7 If such a token was present within the 1–200 words
of the preceding co-text, the respective variable received a value between 1 and 4.
The value was chosen on the basis of the quartiles of the distribution and represents
the distance in words between the closest ‘be’ + PtcP or ‘have’ + PtcP
token with temporal function and the anchor token.8 Thus, the value “1” represents
a very large number of intervening words, whereas “4” represents a very
small number of intervening words, with “2” and “3” as intermediate values.
7 Only ‘be’ + PtcP and ‘have’ + PtcP tokens that fall in the envelope of variation were annotated
as persistence triggers. For instance, ‘be’ + PtcP constructions could have a passive function
in Old Spanish. It is often difficult to distinguish between passive ‘be’ + PtcP tokens and ‘be’ +
PtcP tokens with a temporal function (for instance, the verb morir could appear both with an
intransitive verb meaning (‘die’) and a transitive verb meaning (‘be killed’)). In considering only
persistence effects due to form and function priming, the study limits persistence to
Szmrecsanyi’s (2005) notion of “α-persistence”. Thus, persistence effects due to purely formal similarity
(“β-persistence”) are not taken into account.
Although PERSIST_BE and PERSIST_HAVE gave the best results regarding the
synchronic influence of persistence on Spanish auxiliary selection, they proved
to be too fine-grained for the diachronic statistical analysis. This is due to the
fact that both of these variables have a total of five levels (0, 1, 2, 3, 4). In many
instances, there were not enough tokens in one time point to yield a minimum of
occurrences for each of these levels. For this reason, a second set of persistence
variables was created. PERSIST_BE_BIN and PERSIST_HAVE_BIN are binary variables
referring only to the presence/absence of a persisting ‘be’ + PtcP or ‘have’
+ PtcP token in the preceding co-text. As an illustration of this coding procedure,
consider example (3).
The ‘be’ + PtcP token eran fincados is preceded by the ‘be’ + PtcP token fue partido
which is similar in function. Consequently, a persistence effect is assumed and
the example receives the value “TRUE” for the variable PERSIST_BE_BIN. There
are 15 intervening words between the first and the second mention of ‘be’. Because
this distance falls within the smallest distance band (1–31 words), the example
receives the value “4” on the variable PERSIST_BE. Note that there is no ‘have’ + PtcP token in example (3).
Neither is there a ‘have’ + PtcP token in the rest of the preceding co-text (not
given in (3)). Consequently, example (3) receives the value “FALSE” for the vari-
able PERSIST_HAVE_BIN, and “0” on the variable PERSIST_HAVE.
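The coding scheme can be summarized in a few lines of code. The sketch below uses the distance bands reported in the descriptive analysis (Tables 1 and 2); the function and constant names are illustrative assumptions, and the original annotation was carried out manually.

```python
# (lower bound, upper bound, level) in intervening words; bands from Tables 1 and 2.
BE_BANDS = ((1, 31, 4), (32, 66, 3), (67, 111, 2), (112, 200, 1))
HAVE_BANDS = ((1, 29, 4), (30, 62, 3), (63, 108, 2), (109, 200, 1))


def persist_level(distance, bands):
    """Five-level variable (PERSIST_BE / PERSIST_HAVE).

    `distance` is the number of words to the closest persisting token,
    or None if there is no such token in the preceding co-text.
    """
    if distance is None:
        return 0
    for low, high, level in bands:
        if low <= distance <= high:
            return level
    return 0  # outside the 200-word window: treated as no persisting token


def persist_bin(distance, bands):
    """Binary variable (PERSIST_BE_BIN / PERSIST_HAVE_BIN)."""
    return persist_level(distance, bands) > 0


print(persist_level(40, BE_BANDS), persist_bin(40, BE_BANDS))
print(persist_level(None, HAVE_BANDS), persist_bin(None, HAVE_BANDS))
```

Keeping the bands separate for the two auxiliaries reflects the point made in footnote 8: the same level corresponds to slightly different word distances for PERSIST_BE and PERSIST_HAVE.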
8 As a result, the values of the variables PERSIST_BE and PERSIST_HAVE represent slightly differ-
ent distances in words between stimulus and anchor token.
4 Descriptive analysis
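Table 1: Auxiliary selection by PERSIST_BE
Level | ‘have’ + PtcP (n) | % | ‘be’ + PtcP (n) | % | Total (n) | %
0: No persisting ‘be’ + PtcP token | 1453 | 49.0 | 1513 | 51.0 | 2966 | 100
1: Textual distance 112–200 words | 46 | 24.5 | 142 | 75.5 | 188 | 100
2: Textual distance 67–111 words | 50 | 26.0 | 142 | 74.0 | 192 | 100
3: Textual distance 32–66 words | 52 | 26.9 | 141 | 73.1 | 193 | 100
4: Textual distance 1–31 words | 49 | 25.4 | 144 | 74.6 | 193 | 100

Table 2: Auxiliary selection by PERSIST_HAVE
Level | ‘have’ + PtcP (n) | % | ‘be’ + PtcP (n) | % | Total (n) | %
0: No persisting ‘have’ + PtcP token | 573 | 34.7 | 1080 | 65.3 | 1653 | 100
1: Textual distance 109–200 words | 231 | 44.6 | 287 | 55.4 | 518 | 100
2: Textual distance 63–108 words | 267 | 50.8 | 259 | 49.2 | 526 | 100
3: Textual distance 30–62 words | 273 | 52.9 | 243 | 47.1 | 516 | 100
4: Textual distance 1–29 words | 306 | 59.0 | 213 | 41.0 | 519 | 100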
The percentages of use of ‘have’ + PtcP and ‘be’ + PtcP vary within a range of
about 25 percent according to whether or not a persistence-triggering ‘have’ +
PtcP or ‘be’ + PtcP token is present in the preceding co-text. Table 1 demonstrates
that in the absence of a persisting ‘be’ + PtcP token, the distribution of ‘have’ +
PtcP and ‘be’ + PtcP is rather balanced (49 percent vs. 51 percent). However, in
tokens where a persisting ‘be’ + PtcP token is present, ‘have’ + PtcP is much less
frequent than ‘be’ + PtcP (approximately 26 percent vs. 74 percent). Note that con-
trary to the expectation, a smaller distance between a persisting ‘be’ + PtcP token
and anchor token does not appear to reinforce this tendency.
Table 2 demonstrates that the persistence effect operates for both alternatives.
In the absence of a persisting ‘have’ + PtcP token, ‘have’ + PtcP is less frequent
than ‘be’ + PtcP (34.7 percent vs. 65.3 percent). If a ‘have’ + PtcP token is present
however, ‘have’ + PtcP is more frequent than ‘be’ + PtcP. This effect increases with
a decreasing distance in words and is strongest in condition 4, with the smallest
distance in words (1–29 words), where the relative frequency of ‘have’-selection is
59 percent and the relative frequency of ‘be’-selection is 41 percent.
Note that in absolute numbers, persistence-triggering ‘have’ + PtcP tokens are
almost three times as frequent as persistence-triggering ‘be’ + PtcP tokens:
2,079 of the 3,732 tokens are preceded by a persisting ‘have’ + PtcP
token, whereas only 766 tokens are preceded by a persisting ‘be’ +
PtcP token. This observation is unsurprising given that ‘have’ + PtcP gradually
became the more frequent variant, replacing ‘be’ + PtcP. In addition, the relative
scarcity of tokens involving a persisting ‘be’ + PtcP token may explain why the
descriptive analysis does not demonstrate a word distance effect for the variable
PERSIST_BE.
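The figures discussed in this section follow directly from the row counts in Tables 1 and 2. As a check on the arithmetic, the sketch below recomputes them; the dictionary layout and function names are illustrative, while the counts are copied from the tables.

```python
# Persistence level -> (n 'have' + PtcP, n 'be' + PtcP), from Tables 1 and 2.
PERSIST_BE = {0: (1453, 1513), 1: (46, 142), 2: (50, 142), 3: (52, 141), 4: (49, 144)}
PERSIST_HAVE = {0: (573, 1080), 1: (231, 287), 2: (267, 259), 3: (273, 243), 4: (306, 213)}


def persisting_total(table):
    """Number of tokens preceded by a persisting token (levels 1 to 4)."""
    return sum(have + be for level, (have, be) in table.items() if level > 0)


def pooled_share_be(table):
    """Percentage of 'be'-selection pooled over levels 1 to 4."""
    have = sum(h for level, (h, b) in table.items() if level > 0)
    be = sum(b for level, (h, b) in table.items() if level > 0)
    return 100 * be / (have + be)


n_be = persisting_total(PERSIST_BE)      # tokens with a persisting 'be' + PtcP
n_have = persisting_total(PERSIST_HAVE)  # tokens with a persisting 'have' + PtcP
print(n_be, n_have, round(n_have / n_be, 2))
print(round(pooled_share_be(PERSIST_BE), 1))
```

The ratio of the two totals is what underlies the statement that persistence-triggering ‘have’ + PtcP is almost three times as frequent, and the pooled share corresponds to the approximate 74 percent of ‘be’-selection after a persisting ‘be’ + PtcP token.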
With the exception of the effect of the distance between stimulus and anchor
token on ‘be’-selection, the data from Spanish auxiliary selection meets the
expectations regarding the synchronic influence of persistence gathered from
the discussion of Szmrecsanyi’s (2005, 2006) analysis. These descriptive findings
thus illustrate the influence of usage on morphosyntactic phenomena such as
auxiliary selection. Spanish auxiliary selection is crucially conditioned by the
verb lemma from which the participle is formed. However, the writers of the his-
toriographical texts gathered in the corpus did not base their decision to use one
auxiliary over another one solely on factors such as the semantics of the auxili-
ated verb. The existence of a persistence effect in the data suggests a view on com-
petence that is highly dependent on contextual factors, particularly frequency of
occurrence. Persistence effects represent a direct influence of a speaker’s expe-
rience with language on his/her language use.
In order to measure the diachronic development of the influence of these
persistence effects, it is necessary to establish a chronology of the disappearance
of ‘be’ + PtcP in the data. This study employs the variability-based neighbor clus-
tering (VNC) method developed in Gries and Hilpert (2008) and Hilpert and Gries
(2009).9 VNC offers a data-driven method to statistically identify qualitatively
9 All statistical tests and plots presented in this paper were conducted using the open-source
statistical software R (R Development Core Team 2012).
Figure 1: Variability-based neighbor clustering (VNC) analysis for the development of the
percentage of use of ‘be’-selection in the corpus of historiographical texts
(axis label: distance in summed standard deviations)
In Figure 1, the line with breakpoints in the background plots the frequency of
‘be’-selection relative to ‘have’-selection in each of the eight time periods per
million words. The dendrogram in the foreground illustrates the clustering pro-
posed by VNC on the basis of the data. The dendrogram suggests two temporal
clusters whose distance measured in summed standard deviation is greatest: a
first cluster spanning the period from the thirteenth century until the mid-fif-
teenth century, and a second cluster spanning the period from the mid-fifteenth
century until the mid-seventeenth century. In line with the description given in
Section 2, the pace of the replacement of ‘be’ + PtcP with ‘have’ + PtcP did not
accelerate until after the beginning of the fifteenth century. Based on the VNC
analysis, the data was therefore divided into two time periods: Old Spanish (1270–
1424) and Early Modern Spanish (1425–1650).
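The core idea of neighbor clustering, namely that only temporally adjacent periods may be merged, can be shown in a simplified sketch. This is not Gries and Hilpert's implementation: it assumes the standard deviation of the pooled values as the dissimilarity measure, and the eight percentages are hypothetical, not the corpus values.

```python
from statistics import stdev


def vnc(values, n_clusters=2):
    """Merge adjacent periods until `n_clusters` remain.

    Returns the clusters as (first index, last index) spans over `values`.
    """
    clusters = [[v] for v in values]
    spans = [(i, i) for i in range(len(values))]
    while len(clusters) > n_clusters:
        # Dissimilarity of each pair of neighbors: SD of their pooled values.
        dists = [stdev(clusters[i] + clusters[i + 1]) for i in range(len(clusters) - 1)]
        k = dists.index(min(dists))  # most similar neighboring pair
        clusters[k:k + 2] = [clusters[k] + clusters[k + 1]]
        spans[k:k + 2] = [(spans[k][0], spans[k + 1][1])]
    return spans


# Hypothetical percentages of 'be'-selection in eight consecutive periods.
HYPOTHETICAL_BE_SHARE = [60, 58, 55, 30, 12, 8, 5, 4]
print(vnc(HYPOTHETICAL_BE_SHARE))
```

Unlike ordinary hierarchical clustering, the restriction to neighbors guarantees that every cluster is a contiguous stretch of time, which is what makes the result usable as a periodization.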
[Figure; y-axis: percentage selection BE over HAVE]
The distance between the two lines (referring to the percentage of ‘be’-selection
in tokens where PERSIST_BE_BIN = TRUE at a point in time, and the percentage
of ‘be’-selection in tokens where PERSIST_BE_BIN = FALSE at a point in time)
gradually becomes greater. As expected, this effect increases in strength only after
the beginning of the fifteenth century: From 1425 onwards ‘be’ + PtcP tokens that
are preceded by a persisting ‘be’ + PtcP token are relatively more frequent than
‘be’ + PtcP tokens for which no persisting ‘be’ + PtcP is attested in the co-text.
Consequently, the increasing dependence of ‘be’ + PtcP tokens on persistence
appears to be related to the process of disappearance of ‘be’ + PtcP.
5 Multivariate analysis
(b) the word distance does not appear to increase the effect of persistence on
‘be’-selection.
As a last predictor variable, an interaction term between TIME and PERSIST_BE_BIN
was included. This interaction term measures whether the probability
of a persistence effect for ‘be’ + PtcP tokens increased or decreased in Early
Modern Spanish in comparison to Old Spanish. Table 3 summarizes the variables
employed in the regression model.
RANDOM EFFECTS | VERB LEMMA | Factor | 43 values (i. e., the 43 verb lemmata from which the participles are formed)
Using Pinheiro et al.’s (2009) nlme package in R, this statistical setup yielded
the regression formula lmer(BE ~ TIME + PERSIST_BE_BIN + TIME:PERSIST_BE_BIN
+ (1 | VERB LEMMA), data = file, family = "binomial"). As evident in the
formula, the model was set to assume a binomial distribution because essentially,
it is a logistic regression model with a binary outcome. Table 4 summarizes the
results from the regression model.
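Written out, the formula above corresponds to a logistic model with a random intercept for each verb lemma. The symbols $\beta_0,\dots,\beta_3$, $u_v$ and $\sigma^2_{\mathrm{VERB}}$ are notation introduced here for illustration; they do not appear in the original.

```latex
\operatorname{logit} P(\mathrm{BE}_i = 1)
  = \beta_0
  + \beta_1\,\mathrm{TIME}_i
  + \beta_2\,\mathrm{PERSIST\_BE\_BIN}_i
  + \beta_3\,\bigl(\mathrm{TIME}_i \times \mathrm{PERSIST\_BE\_BIN}_i\bigr)
  + u_{v(i)},
\qquad
u_v \sim \mathcal{N}\!\left(0,\ \sigma^2_{\mathrm{VERB}}\right)
```

Here $v(i)$ is the verb lemma of token $i$, and $\beta_3$ captures how much the persistence effect differs between the two time periods.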
Before the description of these results in the next section, a short evaluation
of the model fit is in order. The model scores high for the C index of
concordance (0.90 of 1) and Somers’ Dxy (0.80 of 1). Although all of the predictors
significantly enhance the model fit, the good score of the model is above all a
result of the random effect VERB LEMMA. The model calculates a high degree of
variance (4.51) for the random effect VERB LEMMA. As predicted by the literature
reviewed in Section 2, auxiliary selection is determined much more by the verb
lemma from which the participle is formed than by the author of the source text. For
instance, the fact that the event structure template of verbs such as morir (‘die’)
involves a transition is a very potent predictor of Spanish auxiliary selection.
Figure 3 illustrates this fact. It gives the by-word random intercepts calculated
by the model for each verb. Each point in the plot refers to one verb. Its value
on the y-axis represents the adjusted intercept value for each of the values of
the variable VERB LEMMA with regard to the dependent variable BE. Thus, verbs
with a random intercept higher than 0 typically appear in the ‘be’ + PtcP con-
struction, whereas verbs with a random intercept lower than 0 typically appear
in the ‘have’ + PtcP construction. For the sake of clarity, the names of some of
the highest- and lowest-ranking verbs are given next to the points they are rep-
resented by.10
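The two fit statistics reported above are closely related: for a binary outcome, Somers' Dxy equals 2(C − 0.5), so a C index of 0.90 corresponds to a Dxy of 0.80. The sketch below computes C by comparing all pairs of one ‘be’ token and one ‘have’ token. The function names and the four example tokens are illustrative assumptions; the values in the paper were computed in R.

```python
def concordance(probs, outcomes):
    """C index: share of (positive, negative) pairs ranked correctly.

    `probs` are predicted probabilities of 'be'-selection, `outcomes` are
    1 for 'be' and 0 for 'have'. Ties count as half a concordant pair.
    """
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    score = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                score += 1.0
            elif p == q:
                score += 0.5
    return score / (len(pos) * len(neg))


def somers_dxy(c_index):
    """Rescale C from the range [0.5, 1] to [0, 1]."""
    return 2 * (c_index - 0.5)


# Hypothetical predictions for four tokens.
c = concordance([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
print(c, somers_dxy(c))
```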
The description of the results summarized in Table 4 focuses on two values for
each effect: the odds ratio (OR) and the p-value (P). P evaluates the degree of sta-
tistical significance of an effect. Each effect to which the regression model assigns
a p-value lower than the threshold value of 0.05 can be assumed to be statistically
significant. The OR, by contrast, evaluates the strength and direction of the corre-
lation between the predictor variable and the dependent variable. ORs assume a
value between 0 and ∞. If the OR is below 1, a positive value on the predictor vari-
able lowers the probability of a positive value on the dependent variable (in this
case, ‘be’-selection). If the OR is above 1, a positive value on the predictor variable
raises the probability of a positive value on the dependent variable. Crucially, the
strength of an OR does not imply statistical significance as such: An effect with a
very high or very low OR might not reach statistical significance.
The model demonstrates a strong effect of TIME on auxiliary selection.
Thus, in comparison to tokens from source texts before 1425, the usage frequency
of ‘be’ + PtcP in comparison to ‘have’ + PtcP drops significantly after 1425
(OR = 0.062, P < 0.001).
Although PERSIST_BE_BIN only reaches marginal statistical significance in
the regression model, subsequent analyses over a larger corpus in Rosemeyer
(2014) have shown that if a greater number of examples is included, the main
effect of persistence reaches statistical significance. If a ‘be’ + PtcP token that
falls in the envelope of variation occurs in the preceding co-text of an auxiliary +
PtcP token, ‘be’-selection becomes more probable (OR = 1.390, P < 0.1).
The interaction between TIME and PERSIST_BE_BIN has a significant pos-
itive influence on the probability of ‘be’-selection. Although the usage frequency
of ‘be’-selection decreases rapidly in Early Modern Spanish, the negative effect of
TIME on ‘be’-selection is to a certain extent cushioned by PERSIST_BE_BIN. As
predicted, late ‘be’ + PtcP tokens are more likely to involve a persisting ‘be’ + PtcP
token in the preceding co-text than early ‘be’ + PtcP tokens (OR = 2.143, P < 0.01).
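Because odds ratios are multiplicative, the three reported effects can be combined to see what the interaction means in practice. The sketch below uses the ORs given above; the baseline odds of 1.0 (a 50 percent chance of ‘be’-selection for an early token without a persisting ‘be’ + PtcP token) is a hypothetical value chosen for illustration, since the model intercept and the verb-specific intercepts are not reproduced here.

```python
import math

# Odds ratios as reported in the text.
OR_TIME = 0.062         # Early Modern Spanish vs. Old Spanish
OR_PERSIST = 1.390      # persisting 'be' + PtcP token present
OR_INTERACTION = 2.143  # TIME : PERSIST_BE_BIN

BASELINE_ODDS = 1.0     # hypothetical baseline, see lead-in


def be_odds(late, persist, baseline=BASELINE_ODDS):
    """Odds of 'be'-selection for a token with the given predictor values."""
    odds = baseline
    if late:
        odds *= OR_TIME
    if persist:
        odds *= OR_PERSIST
    if late and persist:
        odds *= OR_INTERACTION
    return odds


def odds_to_prob(odds):
    return odds / (1 + odds)


# Persistence effect within each period: ratio of odds with and without a prime.
early_effect = be_odds(False, True) / be_odds(False, False)
late_effect = be_odds(True, True) / be_odds(True, False)
print(round(early_effect, 3), round(late_effect, 3))

# The same effects on the log-odds scale on which the model is estimated.
print(round(math.log(OR_PERSIST), 3), round(math.log(OR_PERSIST * OR_INTERACTION), 3))
```

Under these figures, a persisting ‘be’ + PtcP token multiplies the odds of ‘be’-selection by 1.39 in the early period but by roughly 2.98 in the late period, which is the sense in which the decline of ‘be’-selection is cushioned by persistence.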
10 See Rosemeyer (2012a) for a more comprehensive discussion of the influence of verb seman-
tics on Spanish auxiliary selection.
The results from the regression model suggest that from a diachronic perspective,
persistence influences a language’s systematicity. In particular, persistence has a
conserving effect: If a token of the disappearing ‘be’ + PtcP construction is used,
the probability that ‘be’ + PtcP is used in the following discourse rises. This leads
to “islands of use” of ‘be’ + PtcP in the texts. Rather than being scattered over
a text, later examples of ‘be’ + PtcP are clustered in specific text passages. Within
these text passages, the use of ‘be’ + PtcP is conserved.
Although entrenchment and persistence both have been shown to fulfill a
conserving function in diachronic processes, the findings suggest a difference
between conserving effects due to entrenchment and conserving effects due to
persistence. This difference concerns the question of syntactic productivity (in
the sense of Barðdal 2008). Entrenchment always affects specific linguistic ele-
ments: The repeated use of a specific linguistic element leads to a stronger cogni-
tive representation of that item. As a result, processes of language change operat-
ing on its paradigm have less of an effect on highly frequent linguistic elements.
Although this process conserves systematicity in the sense that an alternation is
conserved, it also creates irregularity in that the paradigm of the disappearing
construction becomes fractured. In the late stages of Spanish auxiliary selection,
some verbs denoting a change of location usually select ‘be’, while others select
‘have’.
By contrast, the conserving effect of persistence does not create this type
of irregularity. Crucially, the mixed-effect regression modeling proposed in this
section controls for verb-specific differences. Although a quantitative correlation
between frequency of use and persistence could be assumed (linguistic elements
that appear more frequently also trigger more persistence effects), the persistence
effects observed in this study do not have a different strength for different lin-
guistic elements, but rather work globally. This is because the persisting token
need not exactly match the ‘be’ + PtcP token it triggers. Persistence consequently
involves processes of pattern recognition, i. e. analogy. In contrast, analogical
thinking is rather irrelevant for entrenchment processes where the cognitive rep-
resentation of the exact linguistic item is strengthened. Due to this difference in
the conceptual nature of entrenchment and persistence, it can be argued that
whereas the conserving effect of entrenchment creates irregularity in the para-
digm of a disappearing construction, the conserving effect of persistence affects
all instantiations of a disappearing construction alike. This is an empirical ques-
tion that could be addressed and elaborated in future research on frequency
effects in language change.
This paper has given further evidence of a strong relationship between usage and
systematicity in language. With entrenchment and persistence, two processes
crucial for the rise and conservation of systematicity have been described. Since
both entrenchment and persistence (temporarily) strengthen the cognitive rep-
resentation of a linguistic element, they lead to conserving effects in diachronic
processes in which a construction is disappearing from use. As a case study, split
auxiliary selection in Spanish was investigated. It was shown that later tokens of
the disappearing ‘be’ + PtcP construction are more likely to involve a persisting
‘be’ + PtcP token in the preceding co-text than earlier ‘be’ + PtcP tokens. The
use of ‘be’ + PtcP thus appears to rely increasingly on persistence effects,
which is why persistence is argued to have a conserving effect on disappearing
grammatical constructions.
The analysis proposed in this paper thus emphasizes the similarities between
entrenchment and persistence with regard to their effect. However, it is also sug-
gested that the two processes may have different diachronic effects on the sys-
tematicity of the disappearing construction’s paradigm. Whereas conservation
due to entrenchment affects specific linguistic elements and therefore leads to
irregular and fractured paradigms, conservation due to persistence acts globally.
Due to the reliance of persistence on analogical thinking, the conserving effect
of persistence is expected to affect all linguistic elements belonging to a certain
construction alike. I leave the investigation of this hypothesis to further research.
In summary, this paper has illustrated the benefits of a usage-based approach
to historical linguistics. A speaker’s linguistic behavior is crucially determined
by his or her experience with language. The effect of linguistic experience is not
restricted to very recent language events (persistence), but can accumulate over
time (entrenchment). Acknowledging the intricate relationship between the use
of a language and its systematicity offers an explanation for quantitative effects
in linguistic data that are unexpected from the perspective of an abstract “system-
oriented” approach.
References
Abbot-Smith, Kirsten and Heike Behrens (2006): How known constructions influence
the acquisition of other constructions: The German periphrastic passive and future
constructions. Cognitive Science 30: 995–1026.
Admyte (1992): Archivo digital de manuscritos y textos españoles, Tomo 0. Madrid: Micronet.
Aranovich, Raúl (2003): The semantics of auxiliary selection in Old Spanish. Studies in
Language 27(1): 1–37.
Baayen, Harald (2008): Analyzing Linguistic Data. A Practical Introduction to Statistics Using
R. Cambridge: Cambridge University Press.
Baddeley, Alan D. (1966): Short-term memory for word sequences as a function of acoustic,
semantic and formal similarity. Quarterly Journal of Experimental Psychology 18: 362–365.
Barðdal, Jóhanna (2008): Productivity: Evidence from Case and Argument Structure in
Icelandic. Amsterdam: John Benjamins.
Behrens, Heike (2009): Usage-based and emergentist approaches to language acquisition.
Linguistics 47(2): 383–411.
Benzing, Joseph (1931): Zur Geschichte von ser als Hilfszeitwort bei den intransitiven Verben im
Spanischen. Zeitschrift für romanische Philologie 51: 385–460.
Bybee, Joan L. (2002): Sequentiality as the basis of constituent structure. In: Talmy Givón
and Bertram F. Malle (eds.), The Evolution of Language from Pre-Language, 109–132.
Amsterdam/Philadelphia: John Benjamins.
Bybee, Joan L. (2003): Mechanisms of change in grammaticization: the role of frequency. In:
Richard Janda and Brian Joseph (eds.), The Handbook of Historical Linguistics, 624–647.
Oxford: Blackwell.
Bybee, Joan L. (2006): From usage to grammar: the mind’s response to repetition. Language 82
(4): 711–733.
Bybee, Joan L. (ed.) (2007): Frequency of Use and the Organization of Language. Oxford: Oxford
University Press.
Bybee, Joan L. (2010): Language, Usage, and Cognition. Cambridge/New York: Cambridge
University Press.
Bybee, Joan L. and Paul J. Hopper (eds.) (2001a): Frequency and the Emergence of Linguistic
Structure. Amsterdam: John Benjamins.
Bybee, Joan L. and Paul J. Hopper (2001b): Introduction to frequency and the emergence
of linguistic structure. In: Joan L. Bybee and Paul J. Hopper (eds.), Frequency and the
Emergence of Linguistic Structure, 1–26. Amsterdam: John Benjamins.
Bybee, Joan L. and Rena Torres Cacoullos (2009): The role of prefabs in grammaticization: how
the particular and the general interact in language change. In: Roberta Corrigan, Edith
A. Moravcsik, Hamid Ouali and Kathleen M. Wheatley (eds.), Formulaic Language, Volume
I: Distribution and Historical Change, 187–217. Amsterdam: John Benjamins.
Chomsky, Noam (1975): The Logical Structure of Linguistic Theory. New York: Plenum Press.
Coseriu, Eugenio (1974): Synchronie, Diachronie und Geschichte: Das Problem des
Sprachwandels: Übersetzt von Helga Sohre. München: Fink.
Ellis, Nick (1996): Sequencing in SLA: phonological memory, chunking and points of order.
Studies in Second Language Acquisition 18: 91–126.
Elvira González, Javier (2001): Intransitividad escindida en español: El uso auxiliar de ser en
español medieval. EdLing 15: 201–245.
Fernández-Ordóñez, Inés (2006): La Historiografía medieval como fuente de datos lingüísticos.
Tradiciones consolidadas y rupturas necesarias. In: José Jesús Bustos Tovar and José Luis
Girón Alconchel (eds.), Actas del VI Congreso Internacional de Historia de la Lengua
Española, 1779–1807. Madrid: Arco Libros.
García Martín, José María (2001): La formación de los tiempos compuestos del verbo en
español medieval y clásico. Aspectos fonológicos, morfológicos y sintácticos. València:
Universitat de València.
Goldberg, Adele E. (2006): Constructions at Work: the Nature of Generalization in Language.
Oxford: Oxford University Press.
Romani, Patrizia (2006): Tiempos de formación romance I: Los tiempos compuestos. In:
Concepción Company Company (ed.): Sintaxis histórica de la lengua española, 243–348.
México: Universidad Nacional Autónoma de México.
Rosemeyer, Malte (2012a): Auxiliary selection in Spanish. Gradience, gradualness, and
conservation. Ph.D. thesis, Albert-Ludwigs-Universität, Freiburg.
Rosemeyer, Malte (2012b): How to measure replacement: Auxiliary selection in Old Spanish
bibles. Folia Linguistica Historica 33(1): 135–174.
Rosemeyer, Malte (2013): Tornar and volver: The interplay of frequency and semantics in
compound tense auxiliary selection in Medieval and Classical Spanish. In: Jóhanna
Barðdal, Elly van Gelderen and Michela Cennamo (eds.), Argument Structure in Flux,
435–458. Amsterdam/Philadelphia: John Benjamins.
Rosemeyer, Malte (2014): Auxiliary Selection in Spanish. Gradience, Gradualness, and
Conservation. Amsterdam/Philadelphia: John Benjamins.
Szmrecsanyi, Benedikt (2005): Language users as creatures of habit: A corpus-based analysis
of persistence in spoken English. Corpus Linguistics and Linguistic Theory 1(1): 113–150.
Szmrecsanyi, Benedikt (2006): Morphosyntactic Persistence in Spoken English. A Corpus
Study at the Intersection of Variationist Sociolinguistics, Psycholinguistics, and Discourse
Analysis. Berlin/New York: Mouton de Gruyter.
Tomasello, Michael (1992): First Verbs: A Case Study of Early Grammatical Development.
Cambridge: Cambridge University Press.
Travis, Catherine E. (2007): Genre effects on subject expression in Spanish: Priming in narrative
and conversation. Language Variation and Change 19(2): 101–135.
Travis, Catherine E. and Rena Torres Cacoullos (2012): What do subject pronouns do in
discourse? Cognitive Linguistics 23(4): 711–748.
Yllera, Alicia (1980): Sintaxis histórica del verbo español: Las perífrasis medievales. Zaragoza:
Universidad de Zaragoza.
Abbreviation | Title | Date | Author | Source | Edition
EDEI | Estoria de Espanna que fizo el muy noble rey don Alfonsso, fijo del rey don Fernando et de la reyna ... | 1270 | Alfonso X | CORDE | Pedro Sánchez Prieto, Alcalá de Henares: Universidad de Alcalá de Henares, 2002
EDEII | Estoria de España, II | 1275 | Alfonso X | CORDE | Lloyd A. Kasten; John J. Nitti, Madison: Hispanic Seminary of Medieval Studies, 1995
GEI | General Estoria. Primera parte | 1277 | Alfonso X | CORDE | Pedro Sánchez-Prieto Borja, Alcalá de Henares: Universidad de Alcalá de Henares, 2002
GEIV | General Estoria. Cuarta parte. | 1280 | Alfonso X | CORDE | Pedro Sánchez-Prieto Borja, Alcalá de Henares: Universidad de Alcalá, 2002
GCU | Gran Conquista de Ultramar | 1293 | Anonymous | ADMYTE | ADMYTE
CSA | Crónica de Sancho IV. Ms. 829 BNM | 1340 | Anonymous | CORDE | Pedro Sánchez-Prieto Borja, Alcalá de Henares: Universidad de Alcalá de Henares, 2004
RDT | Roman de Troie | 1345 | Anonymous | BOOK | Kelvin M. Parker, Illinois: Applied Literature Press, 1977
SUM | Sumas de la historia troyana de Leomarte | 1350 | Anonymous | CORDE | Robert G. Black, Madison: Hispanic Seminary of Medieval Studies, 1995
GCE1 | Gran crónica de España, III. BNM, ms. 10134 | 1384 | Fernández de Heredia, Juan | CORDE | Juan Manuel Cacho Blecua, Zaragoza: Universidad de Zaragoza, 2003
GCE2 | Gran crónica de España, I. Ms. 10133 BNM | 1385 | Fernández de Heredia, Juan | CORDE | Regina af Geijerstam, Madison: Hispanic Seminary of Medieval Studies, 1995
CDP | Crónica del rey don Pedro | 1400 | López de Ayala, Pero | CORDE | Germán Orduna, Buenos Aires: SECRIT, 1994
11 When the date of a source book was given as an approximate time span, the mean of that time
span was used as the date. For instance, the Atalaya corónicas [ATA] were supposedly written
between 1443 and 1454. Therefore, tokens from this source book were assigned the date 1449.
DTL | Traducción de las Décadas de Tito Livio | 1400 | López de Ayala, Pero | CORDE | Curt J. Wittlin, Barcelona: Puvill, 1982
TAM | Historia del gran Tamorlán. BNM 9218 | 1406 | González de Clavijo, Ruy | CORDE | Juan Luis Rodríguez Bravo; María del Mar Martínez Rodríguez, Hispanic Seminary of Medieval Studies (Madison), 1986
CRR | Crónica del rey don Rodrigo, postrimero rey de los godos (Crónica sarracina) | 1430 | Corral, Pedro de | CORDE | James Donald Fogelquist, Madrid: Castalia, 2001
VIC | El victorial | 1440 | Díaz de Games, Gutierre | CORDE | Rafael Beltrán Llavador, Madrid: Taurus, 1994
ATA | Atalaya corónicas. British L 288 | 1449 | Martínez de Toledo, Alfonso | CORDE | James B. Larkin, Madison: Hispanic Seminary of Medieval Studies, 1985
GJU | Guerra de Jugurtha de Caio Salustio Crispo. Escorial G. III.11 | 1450 | Ramírez de Guzmán, Vasco | CORDE | Jerry R. Rank, Madison: Hispanic Seminary of Medieval Studies, 1995
REP | Repertorio de príncipes de España | 1471 | Escavias, Pedro de | CORDE | Michel García, Madrid: Instituto de Estudios Giennenses, 1972
IBF | Istoria de las bienandanzas e fortunas | 1474 | García de Salazar, Lope | CORDE | Ana María Marín Sánchez, Madrid: Corde, 2000
ENRC | Crónica de Enrique IV de Castilla 1454–1474 | 1482 | Anonymous | CORDE | María Pilar Sánchez Parra, Madrid: Ediciones de la Torre, 1991
CRCP | Crónica de los Reyes Católicos (Hernando del Pulgar) | 1482 | Pulgar, Hernando del | CORDE | Juan de Mata Carriazo, Madrid: Espasa-Calpe, 1943
CVC | Claros varones de Castilla | 1486 | Pulgar, Hernando del | CORDE | Óscar Perea Rodríguez, Madrid: Universidad Complutense, 2003
CBC | Compilación de las batallas campales | 1487 | Rodríguez de Almela, Diego | CORDE | Lago Rodríguez López, Madison: Hispanic Seminary of Medieval Studies, 1992
ENRE | Crónica de Enrique IV | 1492 | Enríquez del Castillo, Diego | CORDE | Aureliano Sánchez Martín, Valladolid: Universidad de Valladolid, 1994
MAE | Hechos del Maestre de Alcántara don Alonso de Monroy | 1492 | Maldonado, Alonso | CORDE | Antonio Rodríguez Moñino, Madrid: Revista de Occidente, 1935
TCAF | Traducción de la Corónica de Aragón de fray Gauberto Fabricio de Vagad | 1499 | García de Santa María, Gonzalo | CORDE | José Carlos Pino Jiménez, Madison: Hispanic Seminary of Medieval Studies, 2002
TCAL | Traducción de la Crónica de Aragón de Lucio Marineo Siculo | 1524 | Molina, Juan de | CORDE | Óscar Perea, Madrid: Universidad Complutense de Madrid, 2003
CBE | Crónica burlesca del emperador Carlos V | 1527 | Zúñiga, Francés de | CORDE | José Antonio Sánchez Paso, Salamanca: Universidad de Salamanca, 1989
NAU | Los Naufragios | 1541 | Núñez Cabeza de Vaca, Alvar | CORDE | Enrique Pupo-Walker, Madrid: Castalia, 1992
HDI | Historia de las Indias | 1544 | Casas, Fray Bartolomé de las | CORDE | Paulino Castañeda Delgado, Madrid: Alianza Editorial, 1994
CEC | Crónica del Emperador Carlos V | 1550 | Santa Cruz, Alonso de | CORDE | Ricardo Beltrán y Antonio Blázquez, Madrid: Real Academia de la Historia, 1920
ANA | Anales de la corona de Aragón. Primera parte | 1562 | Zurita, Jerónimo | CORDE | Ángel Canellas López, Zaragoza: CSIC, 1967
GCP | Las guerras civiles peruanas | 1569 | Cieza de León, Pedro | CORDE | Carmelo Sáenz de Santamaría, Madrid: CSIC, 1985
CNE | Historia verdadera de la conquista de la Nueva España | 1572 | Díaz del Castillo, Bernal | CORDE | Carmelo Sáenz de Santa María, Madrid: CSIC, 1982
QUI | Quinquenarios o Historia de las guerras civiles del Perú (1544–1548) y de otros sucesos de las Indias | 1576 | Gutiérrez de Santa Clara, Pedro | CORDE | Madrid: Ediciones Atlas, 1963
GCG | Guerras civiles de Granada. 1ª parte | 1595 | Pérez de Hita, Ginés | CORDE | Shasta M. Bryant, Newark, Delaware: Juan de la Cuesta, 1982
HHC | Historia general de los hechos de los castellanos en las islas y tierra firme. Década primera | 1601 | Herrera y Tordesillas, Antonio de | CORDE | Ángel de Altolaguirre y Duvale, Madrid: Real Academia de la Historia, 1934
HVH | Historia de la vida y hechos del Emperador Carlos V | 1611 | Sandoval, Fray Prudencio de | CORDE | Alicante: Universidad de Alicante, 2003
HFE | Historia de Felipe II, rey de España | 1619 | Cabrera de Córdoba, Luis | CORDE | José Martínez Millán y Carlos Javier de Carlos Morales, Salamanca: Junta de Castilla y León, 1998
HDC | Historia y descripción de la antigüedad y descendencia de la Casa de Córdoba | 1625 | Fernández de Córdoba, Francisco | CORDE | Córdoba: Boletín de la Real Academia de Córdoba, 1954
HCA | Historia de los movimientos, separación y guerra de Cataluña | 1645 | Melo, Francisco Manuel de | CORDE | Joan Estruch Tobella, Madrid: Castalia, 1996