Christian Chiarcos, Berry Claus, Michael Grabski (Editors): Salience: Multidisciplinary Perspectives on Its Function in Discourse (Trends in Linguistics. Studies and Monographs)

Salience

Trends in Linguistics
Studies and Monographs 227

Editor
Volker Gast
Founding Editor
Werner Winter
Editorial Board
Walter Bisang
Hans Henrich Hock
Matthias Schlesewsky
Niina Ning Zhang

Editor responsible for this volume


Walter Bisang

De Gruyter Mouton
Salience
Multidisciplinary Perspectives
on its Function in Discourse

Edited by
Christian Chiarcos
Berry Claus
Michael Grabski

De Gruyter Mouton
ISBN 978-3-11-024072-6
e-ISBN 978-3-11-024102-0
ISSN 1861-4302

Library of Congress Cataloging-in-Publication Data

Salience : multidisciplinary perspectives on its function in discourse /
edited by Christian Chiarcos, Berry Claus, Michael Grabski.
p. cm. – (Trends in Linguistics. Studies and Monographs ; 227)
Includes bibliographical references and index.
ISBN 978-3-11-024072-6 (alk. paper)
1. Discourse analysis. 2. Computational linguistics. 3. Psycholinguistics.
I. Chiarcos, Christian. II. Claus, Berry. III. Grabski, Michael.
P302.S24 2011
401'.41–dc22
2011000754

Bibliographic information published by the Deutsche Nationalbibliothek


The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

© 2011 Walter de Gruyter GmbH & Co. KG, Berlin/New York

Typesetting: PTP-Berlin Protago TEX-Production GmbH, Berlin
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
∞ Printed on acid-free paper
Printed in Germany.
www.degruyter.com
Contents

Introduction: Salience in linguistics and beyond
Christian Chiarcos, Berry Claus, and Michael Grabski

Part I. Entity-based salience in discourse

Demonstratives and salience: Towards a functional taxonomy
Olga Krasavina
Parenthetical agent-demoting constructions in Eastern Khanty:
Discourse Salience vis-à-vis referring expressions
Andrey Y. Filchenko
Joint information value of syntactic and semantic prominence
for subsequent pronominal reference
Ralph L. Rose
The Mental Salience Framework:
Context-adequate generation of referring expressions
Christian Chiarcos

Part II. Beyond entities in discourse

Discourse-structural salience from a cross-linguistic perspective:
Coordination and its contribution to discourse (structure)
Wiebke Ramm
Rhetorical relations and verb placement in Old High German
Roland Hinterhölzl and Svetlana Petrova

Part III. Beyond purely linguistic salience

Visual salience and the other one
John D. Kelleher
Salience in hypertext:
Multiple preferred centers in a plurilinear discourse environment
Birgitta Bexten
Establishing salience during narrative text comprehension:
A simulation view account
Berry Claus

Index
Language index
Index of determinants, manifestations and aspects of salience
Subject index
Introduction: Salience in linguistics and beyond

Christian Chiarcos, Berry Claus, and Michael Grabski

1. Introduction

quod punctum salit iam et movetur ut animal


Aristotle, Hist. Anim. 6.3

“A point that hops and jumps like a living being.” The Latin translation of
Aristotle’s description of the heart of an embryo in a hen’s egg (a red point that
stands out from the yellow yolk) has been the source of metaphors throughout
the languages of Europe. The salient point, le point saillant, and der springende
Punkt all refer to things that are particularly important or relevant. Starting from
this metaphor, salient and salience in English came to denote things that stand
out from the ground, can be easily recognized, are in the focus of attention, or
are foremost in a person’s state of mind.1
This volume addresses perspectives and functions of salience in discourse.
The volume emanates from the 6th International Workshop on Multidisciplinary
Approaches to Discourse, which was held in October 2005 in Chorin, Germany,
under the theme Salience in Discourse. The goal of the workshop was
to illustrate the differences and commonalities in research and perspectives on
salience within and between various areas of research, including computational
linguistics, discourse studies, and psycholinguistics.
Looking for a general definition of what it means for an entity to be salient, one
may start with the familiar function salience has in linguistics. In the context of
the anaphoric binding of noun phrases, the salience of possible antecedents is an
important contextual feature that may be established by complex linguistic means.
A related phenomenon is the salience of discourse segments, which allows
subsequent segments to be linked to them by means of some discourse relation.
Moreover, the use of the term salience can be stretched further to extra-linguistic
entities: for example, non-linguistic objects can be described as salient, as they
restrict linguistic reference to real entities in the discourse-external context. As a
generalization over these different uses of ‘salience’, a working definition emerged
from discussions at the MAD ’05 workshop:
“Salience defines the degree of relative prominence of a unit of information, at
a specific point in time, in comparison to the other units of information.”
The contributions in this volume are organized in accordance with the different
views on salience in discourse identified above: the salience of entities in
discourse (Part I, see Section 2 for an overview), discourse-structural salience
(Part II, see Section 3 for an overview), and aspects of salience beyond linguistics
(Part III, see Section 4 for an overview).

1. For a detailed description of the history of the term and the metaphor see von
Heusinger (1997).

1.1. Input and output contexts


A useful picture that has evolved in discourse semantics is that a text utterance
and the presuppositions that it makes exploit an input context that is induced
by the preceding text. This relates to anaphora, which were given the status
of presupposing expressions in the influential work of van der Sandt (1992).
As an utterance always adds some information, it also provides an output context
that can be exploited by subsequent text. This function of an utterance has
been stressed in dynamic semantics, which identifies an utterance’s meaning
with its Context Change Potential (e.g., Groenendijk and Stokhof, 1991; Kamp,
1981; Lascarides and Asher, 2007; Muskens, 1991). The salience of noun phrase
anaphora can be related to this picture as follows: at an utterance U that contains
some anaphoric expression, the relative salience of any possible antecedent is
determined by the input context for U.
The picture also explains why the salience of linguistic expressions is subject to
change: as the text proceeds, input contexts change, and with them the salience
status that the expression in question may have as an antecedent in some
anaphoric relation.
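
The input/output picture can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a context is modeled as a mapping from referents to numeric salience scores, each utterance halves the scores of older referents and resets mentioned ones to 1.0, and the function names `update` and `resolve` are hypothetical. None of this is proposed in the literature cited above.

```python
def update(input_context, mentioned, decay=0.5):
    """Map an input context to an output context (illustrative assumption).

    Referents already in the context lose salience by the factor `decay`;
    referents mentioned in the current utterance are set to full salience.
    """
    output_context = {ref: score * decay for ref, score in input_context.items()}
    for ref in mentioned:
        output_context[ref] = 1.0
    return output_context


def resolve(input_context, candidates):
    """Pick the candidate antecedent that is most salient in the input context."""
    available = [c for c in candidates if c in input_context]
    if not available:
        return None
    return max(available, key=input_context.get)


# Each output context serves as the input context of the next utterance.
ctx1 = update({}, ["Toni", "earthquake"])
ctx2 = update(ctx1, ["Toni", "Oakland"])
ctx3 = update(ctx2, ["earthquake"])

print(resolve(ctx2, ["Toni", "earthquake"]))
print(resolve(ctx3, ["Toni", "earthquake"]))
```

The two calls to `resolve` use the same candidates but different input contexts, which is the point of the picture: the salience status of an antecedent depends on where in the text the anaphoric expression occurs.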

1.2. Entity-based salience in discourse


For many researchers, the notion of salience is particularly closely related to
discourse referents and their realization in discourse. This field of research has
been particularly productive in the last 20 years, because much of the community
concentrated on the computational treatment of referring expressions
during the 1990s.
In part, these developments culminated in Centering Theory (Grosz, Joshi and
Weinstein, 1995): here, a number of linguistic phenomena pertaining to the
appearance of discourse referents, as well as the notion of local coherence, were
traced back to the concept of centers (discourse referents) and their salience
ranking. Centering also formalized an important insight: entities in discourse
have both a backward-looking and a forward-looking aspect, which correspond
to the schema of input and output contexts of a given utterance.

1.3. Salience and information structure


Information structure pertains to the functional structuring of utterances into
partitions that are attributed different textual functions, such as topic/comment,
focus/background, theme/rheme, given/new, etc. (e.g., Krifka, 2007; Molnár,
1993; Vallduví, 1992; Vallduví and Engdahl, 1996). In the literature, this
partitioning has partly been defined in terms of salience, or even reduced to it, as
proposed in Sgall, Hajičová, and Panevova (1986). But the general relationship is
far from clear. One point is that such partitions may concern sentence constituents
that do not exclusively refer to entities. For instance, the focus part of a sentence,
which is often considered to be more salient than the background part, may very
well be a VP that does not contain an NP. Salience in information structure then
extends beyond entity-based salience.
On the other hand, a definition of salience might start with these partitions,
taking, e.g., topic and focus constituents as candidates for salient expressions
(Arnold, 2005). But then, these constituents are salient in different ways, as the
respective partitions to which they belong are interpreted differently. A further
problem is that the two parts are very often complementary. As a consequence,
salient material would cover the whole clause even in rather ordinary utterances,
rendering the notion of salience void.
As an alternative, different dimensions of salience may be distinguished. In
this regard, Centering Theory, with its differentiation between forward-looking
and backward-looking functions of centers, provides a promising point of
departure. These functions specify two dimensions of salience that linguistic
objects regularly have in discourse and that could be exploited for definitions of
information-structural distinctions. In Subsection 2.3, the multidimensionality of
salience in discourse will be specifically discussed with respect to referring
expressions.

1.4. Salience and discourse structure


Another concept of salience deals with structures at or above clause level. We
then have relative salience of discourse segments, seen as either surface text
utterances or as their semantic counterparts, such as propositions or corresponding
semantic objects. The global structure of discourse is often represented by
means of a hierarchical structure, e.g., a tree (Grosz and Sidner, 1986; Mann
and Thompson, 1988). Discourse segments correspond to nodes in such a tree
and are related by coordinating or subordinating discourse or coherence relations.
Discourse segments that are subordinated by such relations are assumed
to be less salient, in that they are less accessible or less important, than
non-subordinated segments (Brandt, 1996).
From a processing point of view, the identification of discourse structure
then bears a remarkable resemblance to tasks like the resolution of anaphoric
references: for a given utterance U, a discourse segment has to be identified to
which this utterance may be attached. With the prior analysis of the discourse
as input context, the attachment operation modifies the discourse structure and
creates an updated output context. In this sense, the salience of discourse segments
or propositions and the salience of entities may be compared with each other. The
parallel between these different kinds of salience in discourse is further
described in Section 3.

1.5. Extra-linguistic salience


Section 4 below extends the scope beyond purely linguistic phenomena. An
obvious example where non-linguistic factors play a role in language use is
situated communication, i.e., physically grounded communication. In situated
communication, the interlocutors typically refer to objects that are actually present
in the immediate environment. Hence, it can be expected that situated language
production and comprehension are determined by conditions of the physical
environment, such as its visual properties.
A second, less obvious, example that is dealt with in Section 4 is visual
text marking. For instance, words in written texts may be typographically
highlighted to signal information focus (McAteer, 1992). Furthermore, visual
marking can also indicate a particular functionality, as is the case with the visual
marking of cross-references in dictionaries and of links in hypertexts.
The third example, which may be surprising from a linguistic perspective,
is the effect of properties of the situation that is described in a text, in addition
to genuine textual factors. The salience of an entity that is mentioned in a text
may depend not only on its linguistic salience but also on its salience in the
situation being described. At least, this is what is suggested by empirical results
that demonstrate effects of situational factors such as spatial and temporal
variables (e.g., Glenberg, Meyer, and Lindem, 1987; Kelter, Kaup, and Claus,
2004).

2. Entity-based salience in discourse

Early research on salience and referentiality includes Fillmore’s (1977) Saliency
Hierarchy and its relevance to the assignment of grammatical roles, Lewis’
(1979) characterization of definite expressions on the basis of salience
considerations, Osgood and Bock’s (1977) psycholinguistic study of salience and
sentencing, and Hajičová and Vrbova’s (1982) computational model of salience in
the stock of shared knowledge, i.e., the common ground established between
hearer and speaker. These approaches, however, must be seen against a
background of considerable terminological diffusion, with related conceptions
existing under different terms, e.g., referential activation (Chafe, 1976), familiarity
(Prince, 1981), topicality (Givón, 1983), accessibility (Ariel, 1990), or givenness
(Gundel, Hedberg, and Zacharski, 1993).

2.1. Centering Theory


Nowadays, Centering Theory (Brennan, Friedman, and Pollard, 1987; Grosz,
1981; Grosz et al., 1995; Joshi and Weinstein, 1981; Sidner, 1978, 1981, 1983;
Walker, Joshi, and Prince, 1998) is probably the most influential account
of entity-based salience in discourse. Its attraction and spread across several
communities are due in particular to the introduction of an independent,
self-contained terminology.
Centering operates on the assumption that attention has to be focused, or
“centered”, in discourse. While this insight also underlies the definition of
“topic” (Tomlin, 1995), Centering theorists used it to develop a theory-specific
terminology. Important notions in their theory are (following Grosz et al., 1983,
1995):

centers: entities that serve to link an utterance $U$ with other utterances

forward-looking centers: the set $C_f(U)$ of centers that are realized in $U$ and
that can be referred to in subsequent utterances

backward-looking center: a unique center $C_b(U_k)$, defined for each utterance
$U_k$ (except the segment-initial one), that refers back to a forward-looking
center of the preceding utterance $U_{k-1}$ and that, intuitively, represents the
discourse entity which is the center of attention at the utterance of $U_k$

The backward-looking center is selected from the forward-looking centers of
the preceding utterance; it thus anchors an utterance in the preceding discourse.
Now, the fundamental claim of Centering is that in order to establish (local)
coherence in discourse, speakers need to make sure that the backward-looking
center can be identified.
Forward-looking centers are thus organized in a salience ranking that reflects
their likelihood to serve as the backward-looking center of the following
utterance. The backward-looking center $C_b(U_k)$ is identified with the most salient
forward-looking center of the preceding utterance $U_{k-1}$ that is realized in $U_k$.

salience ranking of forward-looking centers: The elements of $C_f(U_{k-1})$ are
organized in a partial order according to their realization in $U_{k-1}$, i.e., their
grammatical roles:
subject > object > other

preferred center: The highest-ranking element of $C_f(U_{k-1})$ is defined as its
“preferred center”, $C_p(U_{k-1})$; intuitively, it is the most likely candidate for the
backward-looking center of the following utterance, $C_b(U_k)$.
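
The definitions above can be sketched in a few lines of Python. This is an illustrative sketch, not an implementation from the Centering literature: the `Mention` class and the function names are hypothetical, each utterance is assumed to be given as a list of (referent, grammatical role) pairs, and ties in the partial order are broken by order of mention, which is an assumption the definitions themselves do not make.

```python
from dataclasses import dataclass

# Ranking from the definitions above: subject > object > other.
ROLE_RANK = {"subject": 0, "object": 1, "other": 2}


@dataclass(frozen=True)
class Mention:
    referent: str  # hypothetical identifier of a discourse referent
    role: str      # "subject", "object", or anything else (treated as "other")


def forward_centers(utterance):
    """Return the referents realized in an utterance, most salient first."""
    best = {}
    for m in utterance:
        rank = ROLE_RANK.get(m.role, ROLE_RANK["other"])
        best[m.referent] = min(rank, best.get(m.referent, rank))
    # sorted() is stable, so equally ranked referents keep their mention order.
    return sorted(best, key=lambda ref: best[ref])


def preferred_center(utterance):
    """Highest-ranking forward-looking center, or None for an empty utterance."""
    cf = forward_centers(utterance)
    return cf[0] if cf else None


def backward_center(previous, current):
    """Most salient forward-looking center of `previous` that is realized in `current`."""
    realized = {m.referent for m in current}
    for referent in forward_centers(previous):
        if referent in realized:
            return referent
    return None


u_prev = [Mention("Toni", "subject"), Mention("Oakland", "other"),
          Mention("earthquake", "other")]
u_curr = [Mention("apartment", "subject"), Mention("Toni", "subject"),
          Mention("sister", "other")]
print(forward_centers(u_prev), backward_center(u_prev, u_curr))
```

Because the ranking is only a partial order, `preferred_center` returns one of several equally ranked subjects when an utterance has more than one; a fuller treatment would have to return all of them or apply further ranking criteria.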
Assuming that a hearer needs to keep track of the backward-looking center,
Centering then specifies the conditions that establish (local) coherence in
discourse. These conditions include preferences among transitions between
adjacent utterances and constraints on the realization of the backward-looking
center.
Grosz et al. (1995) posit two rules which represent concise predictions of
Centering Theory. These concern the usage of pronouns and aspects of local
referential coherence.

Rule 1 (Pronominalization rule) If any element of $C_f(U_k)$ is realized by a
pronoun in $U_{k+1}$, then $C_b(U_{k+1})$ must be realized by a pronoun also.

Rule 1 formulates the insight that the most salient referent, the attentional center
of the hearer, is likely to appear as a pronominal expression. If, thus, the
identification of the current backward-looking center is crucial for the flow of
attention in discourse, then pronominalization of another element that is different
from the backward-looking center could produce misleading interpretations
of an utterance.
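
Rule 1 can be stated as a simple check. The sketch below is a minimal illustration under stated assumptions: the forward-looking centers of the preceding utterance are given as a list ordered by salience, the referents of the current utterance are given as sets, and the function name `satisfies_rule1` is hypothetical.

```python
def satisfies_rule1(cf_previous, realized_current, pronominalized_current):
    """Check the pronominalization rule for one pair of adjacent utterances.

    cf_previous            -- referents of the preceding utterance, most salient first
    realized_current       -- set of referents realized in the current utterance
    pronominalized_current -- subset of realized_current that is realized by pronouns
    """
    # The backward-looking center is the most salient forward-looking center
    # of the preceding utterance that is realized in the current one.
    cb = next((ref for ref in cf_previous if ref in realized_current), None)
    any_cf_pronominalized = any(ref in pronominalized_current for ref in cf_previous)
    if cb is None or not any_cf_pronominalized:
        return True  # the condition of the rule is not met, so it holds vacuously
    return cb in pronominalized_current


cf_prev = ["Toni", "Oakland", "earthquake"]
print(satisfies_rule1(cf_prev, {"apartment", "Toni", "sister"}, {"Toni"}))
print(satisfies_rule1(cf_prev, {"Toni", "earthquake"}, {"earthquake"}))
```

The second call models the misleading case described above: a less salient referent is pronominalized while the backward-looking center is not, so the check fails.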
As for local referential coherence, Centering postulates preferences on the
transitions between adjacent utterances. Following Grosz et al. (1995), the
following types of transitions are to be distinguished:

center transitions
continue: The backward-looking center of the previous utterance is maintained
and it is the preferred center of the current utterance.
retain: The backward-looking center of the previous utterance is maintained,
but it is not the preferred center of the current utterance.
shift: The backward-looking center of the current utterance differs from
that of the preceding utterance.

Rule 2 (Transition rule) Sequences of continue are preferred over sequences
of retain; and sequences of retain are preferred over sequences of shift.

Rule 2 states a direct relationship between center indication and the local
coherence of two utterances: shifts of attention (i.e., shifts of the backward-looking
center, shift) have to be minimized, and the continuity of the current
backward-looking center has to be signaled by cohesive means (i.e., continue > retain).
Both requirements support the identification of the backward-looking center.
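
The three transition types and the preference ordering of Rule 2 can be sketched as follows. The function names are hypothetical, and two simplifications are assumptions of this sketch rather than part of the rule as stated: an undefined backward-looking center in the previous utterance (the segment-initial case) is treated as maintained, and the preference is applied to single transitions rather than to sequences.

```python
# Preference ordering of Rule 2, most preferred first.
PREFERENCE = ["continue", "retain", "shift"]


def classify_transition(cb_previous, cb_current, cp_current):
    """Classify the transition into the current utterance.

    cb_previous -- backward-looking center of the previous utterance (or None)
    cb_current  -- backward-looking center of the current utterance (or None)
    cp_current  -- preferred center of the current utterance
    """
    if cb_current is None:
        return None  # no backward-looking center, so no transition is assigned here
    if cb_previous is not None and cb_current != cb_previous:
        return "shift"
    # The backward-looking center is maintained (or was previously undefined).
    return "continue" if cb_current == cp_current else "retain"


def preferred(transitions):
    """Pick the most preferred transition according to Rule 2."""
    return min(transitions, key=PREFERENCE.index)


print(classify_transition("Toni", "Toni", "Toni"))
print(classify_transition("Toni", "Toni", "apartment"))
print(classify_transition("earthquake", "Toni", "Toni"))
```

A resolver can use `preferred` to choose among competing analyses of the same utterance pair, for instance when grammatical roles alone leave the backward-looking center undetermined.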

2.2. An example
(1) (a) For insurance agent Toni Johnson, dealing with the earthquake
has been more than just a work experience.
(b) She lives in Oakland, a community hit hard by the earthquake.
(c) The apartment she shares with her sister was rattled, but nothing
was severely damaged.
RST Discourse Treebank (Carlson, Marcu, and Okurowski, 2003), file 3
(slightly simplified)

The centers in this short text are the discourse referents Toni, the earthquake,
Oakland, the apartment, and Toni’s sister. Consider sentences (1.b) and (1.c).
Sentence (1.b) contains three forward-looking centers, Toni, Oakland, and
the earthquake. Of these, both Toni and the earthquake are equally feasible
candidates for the backward-looking center. Thus, the backward-looking
center $C_b(U_{1.b})$ cannot be identified on grounds of grammatical roles alone.
However, this uncertainty can be resolved by means of Rule 2, as Toni is the
backward-looking center of (1.c). Identifying Toni with the backward-looking
center $C_b(U_{1.b})$ thus results in a continue transition between (1.b) and (1.c),
while assuming the earthquake to be $C_b(U_{1.b})$ means that a shift occurred.
Thus, Rule 2 predicts that the (preferred) backward-looking center of (1.b)
is Toni. In fact, this prediction is also consistent with Rule 1, as the referent
pronominalized in (1.b) is Toni.
With respect to the forward-looking aspect of Centering, Toni is realized as
subject referent, the other referents being oblique, and thus, Toni represents the
preferred center.

(1.b) $C_f$: {Toni} > {Oakland, earthquake}; $C_b$: Toni; $C_p$: Toni;
transition: continue

Sentence (1.c) realizes the apartment, Toni and Toni’s sister, with Toni being
clearly identifiable as the backward-looking center. The backward-looking center
is thus maintained from the preceding utterance, and the transition
between (1.b) and (1.c) is to be regarded as continue.
As for the $C_f$ ordering, Toni and the apartment are realized as subject referents
in (1.c) and are thus more salient than Toni’s sister.

(1.c) $C_f$: {Toni, apartment} > {Toni’s sister}; $C_b$: Toni; $C_p$: Toni or apartment;
transition: continue
In this way, Centering represents a model of local discourse coherence (Centering
transitions), the assignment of grammatical roles ($C_f$ ordering), the
establishment of the topic ($C_b$), and pronominalization preferences (Rule 1), thus
grounding these phenomena in utterance-internal salience.
Centering has attracted considerable attention in recent decades, as it provides
a consistent terminological framework in a field notoriously plagued by
terminological difficulties. Moreover, it was formulated as an operationalizable
framework for the (heuristic) treatment of discourse phenomena, thus
encouraging the formulation of algorithms, e.g., for anaphor resolution (Brennan et
al., 1987). For both reasons, it was widely adopted throughout different
communities in linguistics and has also stimulated psycholinguistic research (e.g.,
Almor, 1999; Brown-Schmidt, Byron, and Tanenhaus, 2005; Gordon, Grosz,
and Gilliom, 1993; Gordon and Hendrick, 1998; Stevenson, 2002). With its wide
acceptance in different linguistic communities, however, the formulation and
understanding of Centering have also evolved. Centering Theory
developed into a family of theories that differ with respect to certain assumptions
and parameters. These parameters include definitional issues (‘utterance’,
‘center’, ‘realized in an utterance’, ‘pronoun’), the assumed salience rankings
(e.g., based on word order and the types of referring expressions rather than
grammatical roles), but also fundamental claims (e.g., whether or not a unique
backward-looking center is assumed, what types of transitions are considered,
whether the backward-looking center is to be sought only in the immediately
preceding utterance). Some of these parameters have been studied by Poesio,
Stevenson, Di Eugenio, and Hitzeman (2004).

2.3. Multidimensional models of salience


Besides the investigation of parameters of salience and its realization in
discourse, the nature of salience itself has also been studied. In the light of newer
experimental and empirical findings, this issue has evolved into a major research
question in salience research (e.g., Arnold, 2005; Kaiser, 2006; Mulkern, 2007;
Navaretta, 2002). This research addresses the questions of whether a
single dimension of salience is to be assumed as the basis for the choice of
referring expressions, whether salience involves multiple functional aspects that
are inherently independent of each other, and whether other factors besides
salience have to be taken into consideration.
Influential models such as the Givenness Hierarchy proposed by Gundel et al.
(1993), but also Centering Theory, postulate one single scale or hierarchy of
degrees of salience along which discourse referents are organized, but alternatives
to these unidimensional models of salience were also proposed very
early.
Givón’s concept of topicality (1983, 2001) involves two dimensions (‘anaphoric
topicality’ and ‘cataphoric topicality’) that represent distinct functional
and cognitive aspects of the processing of referring expressions in discourse.
‘Anaphoric topicality’ concerns memory representations of the structure of the
preceding discourse, and is therefore comparable to the notion of salience in
Centering Theory. ‘Cataphoric topicality’ relates to the structure of the subsequent
discourse and the speaker’s current focus of attention. Both dimensions
can be compared to the differentiation between backward-looking and
forward-looking aspects of centers (referring expressions), but are formalized here by
means of different conceptions of topicality or salience.
This functional differentiation between two dimensions of salience was later
recast as similar dichotomies between inherent salience and imposed
salience (Clamons, Mulkern, and Sanders, 1993; Mulkern, 2007), activation and
prominence (Chafe, 1994), or givenness and relevance (Gundel and Mulkern,
1997; Pattabhiraman and Cercone, 1990).
Such multi-dimensional models of salience receive support from newer
experimental and empirical studies. There is evidence that pronouns and
demonstratives differ systematically in their sensitivity to different salience factors,
as shown for pronominals and demonstratives in Finnish and Dutch (Kaiser
and Trueswell, 2004, 2011; see also Brown-Schmidt et al., 2005). If such
differences between salience factors are preserved, their cognitive
representations cannot be reduced to a single dimension
of salience.
Navaretta (2002), Arnold (2005), and Kaiser (2006) showed that functionally
different constructions, i.e., topic-marking constructions as opposed
to focus-marking constructions, have similar implications for the choice of
pronominal as opposed to nominal expressions in the forthcoming discourse.
This is taken as evidence that both topic-marking and focus-marking constructions
have a specific forward-looking function that exists independently of
backward-looking aspects of salience. Moreover, a psycholinguistic study by
Stevenson, Crawley, and Kleinman (1994; see also Miltsakaki, 2007) points to
a crucial role of semantic factors in focusing, such as the assignment of thematic
roles, which can be interpreted as a forward-looking aspect of discourse
coherence.

2.4. The contributions


The section on entity-based salience comprises four contributions that document
the recent trend to investigate the multidimensionality of salience in
discourse.
Olga Krasavina’s contribution concerns the characterization of Russian
demonstratives in terms of salience (activation). A standard assumption in
unidimensional accounts of salience in discourse (e.g., Gundel et al., 1993) is
that demonstratives are characterized by a degree of salience intermediate between
(highly salient) pronominals and (non-salient) non-demonstrative nominals.
This hypothesis is analyzed and further developed on the basis of a case
study of narratives. In addition, a series of experiments is described that assesses
factors that affect the use of demonstratives. One main conclusion of
Krasavina’s study is that the relationship between salience and the use of Russian
demonstrative NPs seems to be rather loose.
Andrey Filchenko’s contribution is a typological study that presents material
that can be interpreted in terms of Centering Theory. However, Centering
was originally restricted to textual cues and expressions of salience, whereas
the contexts that are looked at in Filchenko’s contribution are highly
situation-dependent. He observes the use of certain grammatical functions by speakers of
Khanty, a Uralic language in North Siberia, in situations where speakers refer
to themselves and at the same time are interested in concealing their agenthood
in certain activities. To do this, they use a locative-marked ergative construction
to express an ‘intransitive/transitive subject relation’, in the terms of Dixon
(1994). Pragmatically, this makes up a ‘demoting’ construction, as compared
with an agentive subject construction: it allows their agenthood to be
inferred while their reports avoid expressing agenthood directly.
The motivation is that, in the situations described, expressing agenthood directly
would break a cultural taboo. The use of the specific locative-marked ergative
constructions can thus be analyzed as being motivated by an act of intentionally
diminishing the salience of an entity (i.e., the speaker, in the examples brought
forward).
Ralph Rose’s contribution addresses the question whether a discourse entity’s
syntactic salience is indeed the major determinant of the use of referring
expressions. Starting from the observation that in English, syntactic role is often
confounded with semantic role, he conducted a corpus study to investigate the
relative contribution of syntactic and semantic factors to the use of pronominal
reference. Semantic role was operationalized in terms of FrameNet (Baker,
Fillmore, and Lowe, 1998) as well as in terms of PROTO-role entailments (Dowty,
1991). To analyze the corpus data, Rose adopted the notion of information value
from Information Theory (Shannon, 1948). The results indicate that syntactic
prominence and semantic prominence independently affect the salience of
discourse entities.
Finally, Christian Chiarcos’ contribution introduces the Mental Salience
Framework, a computational framework of salience metrics for the
context-adequate generation of referring expressions, i.e., the choice of referring
expressions, the assignment of grammatical roles, and word order preferences. The
Mental Salience Framework describes the realization of referring expressions
on the basis of two dimensions of salience, backward-looking (hearer-centered)
salience and forward-looking (speaker-centered) salience. On this basis, an
integrated architecture for aspects of attention control in discourse is proposed
which provides necessary preconditions for the technical operationalization of
multi-dimensional accounts of salience in discourse.

3. Beyond entities in discourse

3.1. Salience of propositions


We turn now to salience of text utterances. The similarity of these (or rather
their semantic counterparts, propositions) to entities becomes apparent when
their processing is modeled by means of a hierarchically structured text repre-
sentation where so-called rhetorical relations (or discourse relations) specify
the linking of each text utterance to the preceding text (Asher and Lascarides,
2003; Lascarides and Asher, 2007; Mann and Thompson, 1988). Whenever a
text utterance is processed, its attachment site has to be determined, i.e., some
node in the representation constructed so far. Attachment to that node may
not be trivial, and in fact resembles the choice of a salient antecedent.
Salience of propositions is regulated both by input and output contexts. On
the one hand there is a ‘backward orientation’: choice of an attachment site
means exploiting salience contours of the actual input context. The input context
consists of a tree-like structure, to one of whose nodes the constituent is to be
attached. An important hypothesis here is that salient nodes are on the rightmost
branch of the tree, which leads down from the root node to the node k of the
constituent that has been attached just before. This branch makes up the so-
called Right Frontier of the actual text representation (cf. Webber, 1991). The
most salient node on this branch is k, its ‘leaf’. Nodes that are above k in the
tree can likewise be attachment sites, but require that a more general level of
the text is aimed at, a step called discourse popping. Nodes that are not on the
Right Frontier can be used for attachment only by specific linguistic means,
one of them being the use of it-clefts (cf. Knott, Oberlander, O’Donnell, and
Mellish, 2000). Clearly such nodes are much less salient within the salience
contour that is presented by an input context.
As will become apparent in what follows, the output contexts of text utterances
are also determined with respect to coherence structure.

3.2. An example
Consider the following text, taken from Lascarides and Asher (2007):
(2) a. John had a great evening last night.
b. He had a great meal.
c. He ate salmon.
d. He devoured lots of cheese.
e. He won a dancing competition.
f. ??It was a beautiful pink.
To describe the coherence structure in (2) we use the terminology from Seg-
mented Discourse Representation Theory (SDRT; cf. Asher, 1993; Asher and
Lascarides, 2003; Lascarides and Asher, 2007). SDRT combines classical Dis-
course Representation Theory (DRT; Kamp, 1981; Kamp and Reyle, 1993) with
discourse relations that determine, in a stepwise processing of a text, the type
of coherence of each of its utterances with a preceding text segment.
In (2) the first relation to be established is between text utterances (2.a) and
(2.b). In this case, the fact can be exploited that (2.a) presents an event of which
the event mentioned in (2.b) is a part. This normally licenses the derivation of the
relation Elaboration (cf. Mann and Thompson, 1988). Utterances (2.c) and
(2.d) both can attach to (2.b), using the latter as an input context for linking
them, again by Elaboration. There is also a temporal sequence of the events
mentioned in (2.c) and (2.d), which establishes a Narration relationship (Asher,
1993). A discourse structure as in (3) is obtained, with the four utterances each
abbreviated by a characteristic noun. These utterance contents specify the nodes
ka–kd of a graph structure, the subordinating relation Elaboration creating a
dominance relation between ka and kb, etc. In contrast, Narration, which is
coordinating, establishes a sister relation between kc and kd:

(3)

Now, in (2.e) the text mentions an event that cannot be interpreted as a sub-event
of the event in (2.b). But the input context made up by (2.a–2.d), and represented
in (3), offers the possibility to interpret it as a sub-event of the event mentioned
in (2.a). The node ke introduced by (2.e) is therefore attached, by Elaboration,
to the higher node ka, i.e., ‘discourse popping’ occurs. The event is also related
to (2.b) by Narration, cf. (4).

(4)

The discourse structure obtained so far poses a problem for attachment of (2.f).
By its content, it is best interpreted as elaborating the ‘Salmon constituent’,
(2.c). But that constituent is beyond the Right Frontier, which by now consists
of the branch ka–ke. The oddness of (2.f) is explained by the specific input
context that has been built up so far.
We may now sketch the effect of the choice of specific coherence relations
on output contexts. In several approaches, relations are classified according to the
‘weight’ of their arguments. In Rhetorical Structure Theory (RST; cf. Carlson
and Marcu, 2001; Mann and Thompson, 1988), most relations distinguish
between their nucleus and satellite arguments; in SDRT relations are either
coordinating, i.e., relate sister nodes (Narration, Result etc.), or are subordinat-
ing, i.e., relate a daughter node to its mother node (Elaboration, Explanation
etc.). This difference shapes the salience contour of the output context
in the following way: with a coordinating relation, the newly attached node becomes
the lowest element of the Right Frontier and the most salient node for the
next discourse constituent to be attached. In the example above, this happened
when (2.d) was attached by Narration, that is, by a coordinating relation. In (3)
the node kd has become the most salient node, and kc, being no longer on the
Right Frontier, has lost its salience for subsequent text utterances. In contrast,
a subordinating relation preserves the position of an attachment site k on the
Right Frontier, as instantiated by the link of (2.b) to (2.a) by Elaboration; cf.
the position of nodes ka and kb in (3) and (4).
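
The effect of the two relation types on the Right Frontier can be illustrated with a small sketch. The following Python fragment is a toy model under simplifying assumptions, not an implementation of SDRT: the names Node, attach, right_frontier and example_frontiers are hypothetical, the relation sets contain only the relations named above, and a coordinating relation is modeled simply as adding a right sister to the attachment site.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Relation classes as named in the text; the sets are not exhaustive.
SUBORDINATING = {"Elaboration", "Explanation"}
COORDINATING = {"Narration", "Result"}


@dataclass(eq=False)
class Node:
    label: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def attach(new: Node, site: Node, relation: str) -> None:
    """Attach `new` as a daughter of `site` (subordinating relation)
    or as a right sister of `site` (coordinating relation)."""
    if relation in SUBORDINATING:
        mother = site
    elif relation in COORDINATING:
        if site.parent is None:
            raise ValueError("this toy tree cannot add a sister to the root")
        mother = site.parent
    else:
        raise ValueError(f"unknown relation: {relation}")
    new.parent = mother
    mother.children.append(new)


def right_frontier(root: Node) -> List[str]:
    """Labels on the rightmost branch, from the root down to the leaf k."""
    labels = [root.label]
    node = root
    while node.children:
        node = node.children[-1]  # the most recently attached daughter
        labels.append(node.label)
    return labels


def example_frontiers() -> Tuple[List[str], List[str]]:
    """Rebuild the structure for example (2) step by step."""
    ka, kb, kc, kd, ke = (Node(x) for x in ("ka", "kb", "kc", "kd", "ke"))
    attach(kb, ka, "Elaboration")  # (2.b) elaborates (2.a)
    attach(kc, kb, "Elaboration")  # (2.c) elaborates (2.b)
    attach(kd, kc, "Narration")    # (2.d) becomes a sister of (2.c)
    after_d = right_frontier(ka)
    attach(ke, kb, "Narration")    # (2.e) becomes a sister of (2.b) under ka
    after_e = right_frontier(ka)
    return after_d, after_e
```

In this sketch the frontier is ka, kb, kd after (2.d) and ka, ke after (2.e). The node kc is then no longer available as an attachment site, which mirrors the oddness of (2.f).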

3.3. Correspondence between salience of entities and of propositions


Indirectly, an interaction between salience of propositions and of entities has
been acknowledged, as rhetorical structure restricts anaphor resolution across
discourse constituents (Asher, 1993; Cristea, Ide, and Romary, 1998; Fox, 1987;
Grosz and Sidner, 1986; for an overview see Chiarcos and Krasavina, 2008).
Classical Discourse Representation Theory (DRT; Kamp, 1981; Kamp
and Reyle, 1993) already discusses restrictions on the accessibility of antecedents due
to sentence-internal embedding of clauses.
Conversely, anaphor resolution has been used as a testbed for classifying
specific discourse relations as being coordinating or subordinating (Asher and
Vieu, 2005). Also, anaphoric accessibility has been used for the automated pars-
ing of discourse structure (Schauer, 2000).
Yet a proposal that relates the salience of entities and of propositions in a principled
way is still a desideratum. Knott et al. (2000) and von Heusinger
(2007) provide promising first attempts in this regard.

3.4. The contributions


In the present volume, two papers address issues of salience in discourse struc-
ture and its relationship to the salience of entities.
The contribution by Wiebke Ramm addresses the translation of discourse
relations as expressed by a given connective. In a corpus of translations be-
tween Norwegian and German, she looks at the role of the ‘additive’ Norwe-
gian connective og (Engl. and) and its German counterpart, und. Both og and
und express coordinating relations. But interestingly, the actual distribution of
the two connectives differs considerably between the two languages. In many cases
Norwegian og corresponds to a different connective in German, one that sig-
nals a subordinating relation. Coordination and subordination being correlated
with salience differences, these facts invite questions about the universality of a
direct coupling of connectives and discourse relations, and a possible influence
of literary style.
In the contribution by Roland Hinterhölzl and Svetlana Petrova, a different
linguistic means, the position of verb arguments, is related to both the salience of
entities and that of propositions. Analyzing the Old High German Tatian translation
from the 9th century, they observe a correlation between fronting of NPs and
their status of being discourse-old, i.e., given. As an effect, V2 constructions
arise. This contrasts with the regularity that discourse-new entities obviously are
referred to in a post-verbal position, yielding V1 constructions. Interestingly,
there is a second observation, that the coordination/subordination difference
between discourse relations may override the first regularity: V2 appears in
subordinated constituents, V1 in coordinated ones. The paper thus discusses
an aspect of the relationship between two major types of salience: utterance-internal
salience that plays a role in information structure and in the ranking of
forward-looking centers in centering, and the salience of whole utterances that
serves to establish coherence relations within a text.

4. Beyond purely linguistic salience

The issue of salience is not specific to linguistic information processing. Salience
has been a topic of research in a wide range of disciplines and areas of
study, including judgments of similarity (e.g., Tversky, 1977), social cognition
(e.g., Higgins, 1996), causal attribution (e.g., Taylor and Fiske, 1978), and mu-
sic perception (Parncutt, 1994; see also Noll, 2005).
In fact, salience is of relevance for all kinds of mental processes. A common
problem in linguistic as well as in non-linguistic cognition is capacity over-
load due to too much information. Hence, for all mental processes, there is the
need to select which part of the available information is to be processed at a
given point in time. It is a common assumption that this selection is affected
by the salience of stimuli. For instance, some stimuli may “pop out” because
they are novel or one-of-a-kind. Being more salient, these stimuli will capture
attention.

4.1. Visual salience and situated language processing


The need for selection of information is particularly important in visual pro-
cessing. Thus, it is not surprising that the issue of salience has been a topic
of much research in the area of visual perception. Many studies investigated
patterns of eye movements, based on the fact that eye movements are an overt
manifestation of selective attention.
A highly influential account of visual salience is the biologically motivated
saliency map approach of Itti and Koch (2000). In their computational model,
salience is established in a bottom-up fashion from perceptual factors such as
contrast in color, intensity, and orientation. Indeed, there is empirical support
that such purely stimulus-driven factors affect the location of gaze fixations
in free viewing of meaningless visual patterns (Parkhurst, Law, and Niebur,
2002). However, selective attention in visual processing is by no means gov-
erned solely by bottom-up salience. Rather, attention allocation in scene view-
ing appears to be affected by both perceptual, bottom-up factors and cognitive,
top-down factors. Recent evidence suggests that during active viewing of mean-
ingful scenes, fixation location is primarily determined by cognitive factors,
such as scene knowledge and task knowledge (Henderson, Brockmole, Castel-
hano, and Mack, 2007; see also Einhäuser, Rutishauser, and Koch, 2008).
A naturally occurring activity that involves active viewing of meaningful
scenes is situated language processing. Findings of studies using the so-called
visual-world paradigm point to an interaction between visual processing and
language processing during situated language comprehension (e.g., Ellsiepen,
Knoeferle, and Crocker, 2008; Kamide, Altmann, and Haywood, 2003; Tanen-
haus, Spivey-Knowlton, Eberhard, and Sedivy, 1995). On the one hand, utter-
ance comprehension directs attention in the visual scene and on the other hand,
scene information guides the interpretation of utterances. Remarkably, there is
also empirical evidence that language processing in a situated context is not
only affected by visual information per se but also by action-based affordances
of the situation. It was found that the interpretation of noun phrases depends on
the compatibility between to-be-performed actions and objects within the sit-
uation (Chambers, Tanenhaus, Eberhard, Filip, and Carlson, 2002; Chambers,
Tanenhaus, and Magnuson, 2004).
Moreover, experimental studies on situated communication show that when
interlocutors refer to objects of the immediate physical environment, the choice
of the referring expression is determined by visibility and spatial distance. For
example, the proportion of deictic expressions is considerably higher when
the reference objects are visible to both interlocutors compared with when
they are not visible to the addressee (Clark and Krych, 2004), and the use of
deixis accompanied by pointing gestures increases when the reference objects
are spatially close (at arm’s length) compared with when they are further away
(Bangerter, 2004).

4.2. Non-linguistic salience and text comprehension


What might be somewhat surprising at first glance is that non-linguistic salience
plays a role not only in situated communication but also in plain, non-situated
text comprehension. We consider two very different examples of this:
typographical properties and properties of the described situation.
Typographical marking (e.g., underlining, boldface, italicization, and capi-
talization) is a means for accentuation in written language, similar to prosodic
marking in spoken language. Experimental findings indicate that both
typographical marking (italicization) and prosodic marking (focus-driven word
stress) capture attention and enhance depth of processing (A.S.J. Sanford,
A.J. Sanford, Molle, and Emmott, 2006). There is also empirical evidence that
different kinds of typographical marking have different functions – capitaliza-
tion signals modulatory stress whereas italicization signals contrastive stress
(McAteer, 1992). According to the saliency map approach (Itti and Koch,
2000), typographically marked words are highly visually salient – they stand out
from the rest of the text. However, that the visual marking of a word conveys its
informational salience is – beyond bottom-up, perceptual salience – driven by a
top-down factor, that is, the readers’ knowledge concerning the communicative
function of typographic signals. This is especially true for particular cases of ty-
pographically marked information, namely cross-references in dictionaries and
hyperlinks in electronic documents. Readers of dictionary entries or hypertexts
expect that the target texts of cross-references or hyperlinks contain information
that is related to the words that are marked as cross-references or hyperlinks –
and this expectation is based upon their knowledge of conventions.
We now turn to the second example of non-linguistic salience in text comprehension,
properties of the described situation. As was mentioned before, the
resolution of ambiguous pronominal anaphoric reference is not only determined
by structural factors but also by other factors such as the thematic roles of the
possible antecedents (Stevenson et al., 1994). Stevenson and her co-authors
attribute this finding to the focusing of entities in comprehenders’ models of
described events by proposing a default focus on the thematic role that is asso-
ciated with the endpoint of the event. Results from other studies demonstrate
that anaphor resolution is also affected by properties of the described situation,
for example, the presence of the reference entity in the described situation (e.g.,
Glenberg et al., 1987) or the temporal distance between events in the described
world (Kelter et al., 2004). Hence, empirical evidence points to effects of the
salience of entities in the described situation over and above linguistic salience.
This is perfectly in line with the simulation view of language comprehension
(e.g., Barsalou, 1999; Glenberg, 2007; Zwaan, 2004), a theoretical approach
that has gained increasing importance in language comprehension research in recent
years. A core assumption of this view is that representations of described
situations share the same representational format and recruit the same mental
subsystems as representations that are constructed during perception and inter-
action with the world. A corollary of this assumption is that factors that affect
perception and action (such as spatial and temporal variables) should have par-
allel effects on language comprehension.

4.3. The contributions


The contribution by John Kelleher is concerned with dialogue situated in a visual
context. He devises an approach to integrate linguistic salience marking
and visual salience with regard to their impact on reference resolution. In a situated
dialogue, a referring expression can refer anaphorically to aforementioned
entities or denote entities of the visual context that have not been introduced
during previous discourse. To account for reference resolution in a situated di-
alogue, Kelleher proposes a framework that uses integrated scores of linguistic
and visual salience for each object that is located in the visual context. In this
framework, linguistic salience is computed by an algorithm that is inspired by
Centering Theory and is based on the forward-looking centers of the preceding
discourse. The visual-salience algorithm is based on two factors that determine
the visibility of the objects in the scene, object size and distance from the point
of focus. These basic saliencies are weighted according to the given referring
expression to be resolved – taking into account both its content and form. The
result is an integrated, reference-relative salience score for each of the objects
in the dialogue situation, with the highest-scoring object being selected as the
referent of the referring expression. Hence, in Kelleher’s framework, the referential
resolution process exploits both the linguistic context and the visual
context of the situated dialogue.
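
The general idea of such an integrated score can be sketched as follows. The fragment is only an illustration of the description given here and should not be read as Kelleher's algorithm: the weighted sum, the way size and distance are combined, the weight values, and all names (FORM_WEIGHTS, SceneObject, visual_salience, integrated_score, resolve) are assumptions made for the example. The content of the referring expression is ignored; only its form selects the weights.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical weights (linguistic, visual) per form of referring expression.
# The values are illustrative assumptions, not figures from Kelleher's paper.
FORM_WEIGHTS: Dict[str, Tuple[float, float]] = {
    "pronoun": (0.8, 0.2),
    "demonstrative": (0.2, 0.8),
}


@dataclass
class SceneObject:
    name: str
    linguistic: float  # salience from the preceding discourse, assumed in [0, 1]
    size: float        # share of the view taken up by the object, in [0, 1]
    distance: float    # normalized distance from the point of focus, in [0, 1]


def visual_salience(obj: SceneObject) -> float:
    """Assumed combination: larger objects nearer the focus are more salient."""
    return obj.size * (1.0 - obj.distance)


def integrated_score(obj: SceneObject, form: str) -> float:
    """Weighted sum of linguistic and visual salience for a given form."""
    w_ling, w_vis = FORM_WEIGHTS[form]
    return w_ling * obj.linguistic + w_vis * visual_salience(obj)


def resolve(objects: List[SceneObject], form: str) -> str:
    """Return the name of the highest-scoring object in the scene."""
    return max(objects, key=lambda o: integrated_score(o, form)).name


scene = [
    SceneObject("mentioned_box", linguistic=0.9, size=0.4, distance=0.5),
    SceneObject("central_box", linguistic=0.0, size=0.8, distance=0.0),
]
```

With these assumed weights, resolve(scene, "pronoun") selects the previously mentioned object, whereas resolve(scene, "demonstrative") selects the visually prominent one.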
Birgitta Bexten’s contribution focuses on hypertexts. She proposes an in-
tegrated account of both linguistic and hypertextual salience marking within
the framework of Centering Theory. A characteristic feature of hypertexts is
their branching structure of information through hypertext links. Readers of
hypertexts act on the assumption that the target text of a hypertext link provides
additional information on the link-marked entity. To put it in Bexten’s
words, a hypertext link allows for predictions of the content of the target text.
In this respect, a hypertext link resembles the Centering Theory’s notion of a
forward-looking center. It is by virtue of this parallel that Bexten proposes that
hypertext links should be regarded as preferred centers as well. What makes
a hypertext an interesting theoretical case is that hypertext links do not neces-
sarily coincide with linguistic criteria that define forward-looking centers. As a
consequence, a single utterance of a hypertext can contain more than one preferred
center. In her contribution, Bexten develops a descriptive approach to
coherence in hypertexts, taking into account the commonalities and differences
between hypertextual and linguistic salience marking.
The contribution by Berry Claus takes a look at the issue of salience from
the perspective of the simulation view of language comprehension. Proponents
of this view (e.g., Barsalou, 1999; Glenberg, 2007; Zwaan, 2004) assume that
language comprehension is tantamount to mentally simulating the actual ex-
perience of the described situations. In her contribution, Claus proposes that
(non-linguistic) salience of discourse entities derives from the mental simula-
tions constructed during language comprehension. Her considerations are con-
fined to narrative text comprehension and she argues that what makes an en-
tity salient – over and above linguistic factors – may depend on facets of the
narrated situation. The main implication of the simulation view with regard to
non-linguistic salience is that the most salient entities are those that are present
in the described situation. Claus reviews empirical support for this notion stem-
ming from studies that investigated whether the mental accessibility of entities
mentioned in narrative text is affected by properties of the described situation.

References

Almor, Amit
1999 Noun-phrase anaphora and focus: the informational load hypothesis.
Psychological Review 106: 748–765.
Ariel, Mira
1990 Accessing noun-phrase antecedents. London: Routledge.
Arnold, Jennifer
2005 Marking salience: The similarity of topic and focus. [On-line] Available:
http://www.unc.edu/~jarnold/papers/top.foc.html
Asher, Nicolas
1993 Reference to abstract objects in discourse. Dordrecht: Kluwer.
Asher, Nicolas and Alex Lascarides
2003 Logics of conversation. Cambridge: Cambridge University Press.
Asher, Nicolas and Laure Vieu
2005 Subordinating and coordinating discourse relations. Lingua 115: 591–
610.
Baker, Collin F., Charles J. Fillmore, and John B. Lowe
1998 The Berkeley FrameNet project. In Proceedings of the COLING-ACL
’98 Conference, 86–90. Montreal: Association for Computational
Linguistics.
Bangerter, Adrian
2004 Using pointing and describing to achieve joint focus of attention in
dialogue. Psychological Science 15: 415–419.
Barsalou, Lawrence W.
1999 Perceptual symbol systems. Behavioral and Brain Sciences 22: 577–
660.
Brandt, Margarethe
1996 Subordination und Parenthese als Mittel der Informations-strukturie-
rung in Texten. In Ebenen der Textstruktur. Sprachliche und kom-
munikative Prinzipien, Wolfgang Motsch (ed.), 211–240, Tübingen:
Niemeyer.
Brennan, Susan E., Marilyn W. Friedman, and Carl J. Pollard
1987 A Centering approach to pronouns. In Proceedings of the 25th Annual
Meeting of the Association for Computational Linguistics, 155–163.
Stanford, Cal.
Brown-Schmidt, Sarah, Donna K. Byron, and Michael K. Tanenhaus
2005 Beyond salience: interpretation of personal and demonstrative pro-
nouns. Journal of Memory and Language 53: 292–313.
Carlson, Lynn and Daniel Marcu
2001 Discourse tagging manual. ISI Tech Report ISI-TR-545.
Carlson, Lynn, Daniel Marcu, and Mary Ellen Okurowski
2003 Building a discourse-tagged corpus in the framework of rhetorical
structure theory. In Current directions in discourse and dialogue, Jan
van Kuppevelt and Ronnie Smith (eds.), 85–112. New York: Kluwer
Academic Publishers.
Chafe, Wallace
1976 Givenness, contrastiveness, definiteness, subjects, topics, and point of
view. In Subject and topic, Charles N. Li (ed.), 25–56. New York:
Academic Press.
Chafe, Wallace
1994 Discourse, consciousness, and time. The flow and displacement of conscious
experience in speaking and writing. Chicago: University of
Chicago Press.
Chambers, Craig G., Michael K. Tanenhaus, Kathleen M. Eberhard, Hana Filip, and
Greg N. Carlson
2002 Circumscribing referential domains during real-time language com-
prehension. Journal of Memory and Language 47: 30–49.
Chambers, Craig G., Michael K. Tanenhaus, and James S. Magnuson
2004 Actions and affordances in syntactic ambiguity resolution. Journal
of Experimental Psychology: Learning, Memory, and Cognition 30:
687–696.
Chiarcos, Christian and Olga Krasavina
2008 Rhetorical distance revisited: A parameterized approach. In Con-
straints in Discourse, Anton Benz and Peter Kühnlein (eds.), 97–115,
Amsterdam: John Benjamins.
Clamons, C. Robin, Ann E. Mulkern, and Gerald Sanders
1993 Salience signaling in Oromo. Journal of Pragmatics 19: 519–536.
Clark, Herbert H. and Meredyth A. Krych
2004 Speaking while monitoring addressees for understanding. Journal of
Memory and Language 50: 62–81.
Cristea, Dan, Nancy Ide, and Laurent Romary
1998 Veins Theory: A model of global discourse cohesion and coherence.
In Proceedings of the 36th Meeting of the Association for Computa-
tional Linguistics and 17th Conference on Computational Linguistics,
281–285, San Francisco.
Dixon, Robert M. W.
1994 Ergativity. Cambridge: Cambridge University Press.
Dowty, David
1991 Thematic proto-roles and argument selection. Language 67: 547–619.
Einhäuser, Wolfgang, Ueli Rutishauser, and Christof Koch
2008 Task-demands can immediately reverse the effects of sensory-driven
saliency in complex visual stimuli. Journal of Vision 8: 1–19.
Ellsiepen, Emilia, Pia Knoeferle, and Matthew W. Crocker
2008 Incremental syntactic disambiguation using depicted events: Plausi-
bility, co-presence and dynamic presentation. In Proceedings of the
30th Annual Conference of the Cognitive Science Society, Brad C.
Love, Ken McRae, and Vladimir M. Sloutsky (eds.), 2398–2403.
Austin, TX: Cognitive Science Society.
Fillmore, Charles J.
1977 Topics in lexical semantics. In Current issues in linguistic theory,
Roger W. Cole (ed.), 76–138. Bloomington: Indiana University Press.
Fox, Barbara A.
1987 Discourse structure and anaphora: Written and conversational En-
glish. Cambridge: Cambridge University Press.
Givón, Talmy
1983 Introduction. In Topic continuity in discourse: a quantitative cross-
language study, Talmy Givón (ed.), 5–41. Amsterdam: John Ben-
jamins.
Givón, Talmy
1995 Functionalism and grammar. Amsterdam: John Benjamins.
Givón, Talmy
2001 Syntax (2nd edition). Amsterdam: John Benjamins.
Glenberg, Arthur M.
2007 Language and action: Creating sensible combinations of ideas. In Ox-
ford Handbook of Psycholinguistics, Gareth Gaskell (ed.), 361–370.
Oxford, UK: Oxford University Press.
Glenberg, Arthur M., Marion Meyer, and Karen Lindem
1987 Mental models contribute to foregrounding during text comprehen-
sion. Journal of Memory and Language 26: 69–83.
Gordon, Peter C., Barbara J. Grosz, and Laura A. Gilliom
1993 Pronouns, names, and the centering of attention in discourse. Cognitive
Science 17: 311–347.
Gordon, Peter C. and Randall Hendrick
1998 The representation and processing of coreference in discourse. Cog-
nitive Science 22: 389–424.
Groenendijk, Jeroen and Martin Stokhof
1991 Dynamic predicate logic. Linguistics and Philosophy 14: 39–100.
Grosz, Barbara
1981 Focusing and description in natural language dialogues. In Elements
of discourse understanding, Aravind K. Joshi, Bonnie L. Webber, and
Ivan A. Sag (eds.), 85–105. Cambridge: Cambridge University Press.
Grosz, Barbara J., Aravind K. Joshi, and Scott Weinstein
1983 Providing a unified account of definite noun phrases in discourse. In
Proceedings of the 21st Annual Meeting of the Association of Com-
putational Linguistics, 44–50.
Grosz, Barbara J., Aravind K. Joshi, and Scott Weinstein
1995 Centering: A framework for modelling the local coherence of dis-
course. Computational Linguistics 21: 203–225.
Grosz, Barbara J. and Candace L. Sidner
1986 Attention, intentions, and the structure of discourse. Computational
Linguistics 12: 175–204.
Gundel, Jeanette K., Nancy A. Hedberg, and Ron Zacharski
1993 Cognitive status and the form of referring expressions in discourse.
Language 69: 274–307.
Gundel, Jeanette K. and Ann Mulkern
1997 Relevance, referring expressions and the Givenness Hierarchy. In Proceedings
of the Workshop on Relevance Theory. University of Hertfordshire.
Hajičová, Eva and Jarka Vrbova
1982 On the role of the hierarchy of activation in the process of natural
language understanding. In COLING 82 – Proceedings of the Ninth
International Conference of Computational Linguistics, Jan Horecký
(ed.), 107–113. Amsterdam: North Holland.
Henderson, John M., James R. Brockmole, Monica S. Castelhano and Michael Mack
2007 Visual saliency does not account for eye movements during visual
search in real-world scenes. In Eye movements: A window on mind
and brain, Roger P. G. van Gompel, Martin H. Fischer, Wayne S.
Murray, and Robin L. Hill (eds.), 537–562. Oxford: Elsevier.
von Heusinger, Klaus
1997 Salienz und Referenz: Der Epsilonoperator in der Semantik der Nomi-
nalphrase und anaphorischer Pronomen. Berlin: Akademie Verlag.
von Heusinger, Klaus
2007 Accessibility and definite noun phrases. In Anaphors in text. Cogni-
tive, formal and applied approaches to anaphoric reference, Monika
Schwarz-Friesel, Manfred Consten, and Mareile Knees (eds.), 123–
144. Amsterdam: John Benjamins.
Higgins, E. Tory
1996 Knowledge activation: Accessibility, applicability, and salience. In
Social psychology: Handbook of basic principles, E. Tory Higgins
and Arie W. Kruglanski (eds.), 133–168. New York: The Guilford
Press.
Itti, Laurent and Christof Koch
2000 A saliency-based search mechanism for overt and covert shifts of vi-
sual attention. Vision Research 40: 1489–1506.
Joshi, Aravind K. and Scott Weinstein
1981 Control of inference: the role of some aspects of discourse structure-
centering. In Proceedings of the Seventh International Joint Confer-
ence on Artificial Intelligence, 385–387.
Kaiser, Elsi
2006 Effects of topic and focus on salience. In Proceedings of Sinn und
Bedeutung 10, Christian Ebert and Cornelia Endriss (eds.), 139–154.
Berlin: ZAS Working Papers in Linguistics, Vol. 44.
Kaiser, Elsi and John Trueswell
2004 The referential properties of Dutch pronouns and demonstratives: Is
salience enough? In Proceedings of the Conference „sub8 – Sinn und
Bedeutung“, Cecile Meier and Matthias Weisgerber (eds.), 137–149.
Universität Konstanz: Arbeitspapier Nr. 177, FB Sprachwissenschaft.
Konstanz, Germany.
Kaiser, Elsi and John Trueswell
to appear 2011 Investigating the interpretation of pronouns and demonstratives in
Finnish: Going beyond salience. In The processing and acquisition of
reference, Edward Gibson and Neal J. Pearlmutter (eds.). Cambridge,
Mass.: MIT Press.
Kamide, Yuki, Gerry T. M. Altmann, and Sarah L. Haywood
2003 The time-course of prediction in incremental sentence processing:
Evidence from anticipatory eye movements. Journal of Memory and
Language 49: 133–156.
Kamp, Hans
1981 A theory of truth and semantic representation. In Formal methods in
the study of language, Jeroen A.G. Groenendijk, Theo M. V. Janssen,
and Martin B. J. Stokhof (eds.), 277–322. Amsterdam: Foris.
Kamp, Hans and Uwe Reyle
1993 From discourse to logic. Dordrecht: Kluwer.
Kelter, Stephanie, Barbara Kaup, and Berry Claus
2004 Representing a described sequence of events: A dynamic view of nar-
rative comprehension. Journal of Experimental Psychology: Learn-
ing, Memory, and Cognition 30: 451–464.
Knott, Alistair, Jon Oberlander, Mick O’Donnell, and Chris Mellish
2000 Beyond elaboration: The interactions of relations and focus in coher-
ent text. In Text representation: linguistic and psycholinguistic as-
pects, Ted Sanders, Joost Schilperoord, and Wilbert Spooren (eds.),
181–196. Amsterdam: John Benjamins.
Krifka, Manfred
2007 Basic notions of information structure. In Interdisciplinary studies
of information structure 6, Caroline Féry and Manfred Krifka (eds.),
Potsdam.
Lascarides, Alex and Nicolas Asher
2007 Segmented Discourse Representation Theory: Dynamic semantics with
discourse structure. In Computing meaning, Volume 3, Harry Bunt
and Reinhard Muskens (eds.), 87–124. Berlin: Springer.
Lewis, David
1979 Scorekeeping in a language game. In Semantics from different points
of view, Rainer Bäuerle, Urs Egli, and Arnim von Stechow (eds.),
172–187. Berlin: Springer.
Mann, William C. and Sandra A. Thompson
1988 Rhetorical structure theory: Toward a functional theory of text orga-
nization. Text 8: 243–281.
McAteer, Erica
1992 Typeface emphasis and information focus in written language. Ap-
plied Cognitive Psychology 6: 345–359.
Miltsakaki, Eleni
2007 A rethink of the relationship between salience and anaphora resolu-
tion. In Anaphora: Analysis, algorithms and applications, António
Branco (ed.), 91–96. Berlin: Springer.
Molnár, Valéria
1993 Zur Pragmatik und Grammatik des TOPIK-Begriffes. In Wortstellung
und Informationsstruktur, Marga Reis (ed.), 155–202. Tübingen:
Niemeyer.
Mulkern, Ann
2007 Knowing who’s important: Relative discourse salience and Irish
pronominal forms. In The grammar-pragmatics interface: Essays in
honor of Jeanette K. Gundel, Nancy A. Hedberg and Ron Zacharski
(eds.), 113–142. Amsterdam: John Benjamins.
Muskens, Reinhard
1991 Anaphora and the logic of change. In Logics in AI, Proceedings of
JELIA ’90, Volume 478 of Lecture Notes in Computer Science, Jan
van Eijck (ed.), 412–427. Berlin: Springer.
Navaretta, Costanza
2002 Combining information structure and Centering-based models of sa-
lience for resolving intersentential pronominal anaphora. In Proceed-
ings of the 4th Discourse Anaphora and Anaphora Resolution Collo-
quium, Tony McEnery, António Branco, and Ruslan Mitkov (eds.),
135–140. Lisbon: Edições Colibri.
Noll, Thomas
2005 Salience and musical discourse. In Salience in discourse. Proceed-
ings of the 6th Workshop on Multidisciplinary Approaches to Dis-
course, Manfred Stede, Christian Chiarcos, Michael Grabski, and
Luuk Lagerwerf (eds), 5–6. Amsterdam: Stichting Neerlandistiek;
Münster: Nodus Publikationen.
Osgood, Charles E. and J. Kathryn Bock
1977 Salience and sentencing: Some production principles. In Sentence
production: Developments in research and theory, Sheldon Rosen-
berg (ed.), 89–140. Hillsdale: Erlbaum.
Parkhurst, Derrick, Klinton Law, and Ernst Niebur
2002 Modeling the role of salience in the allocation of overt visual atten-
tion. Vision Research 42: 107–123.
Parncutt, Richard
1994 A perceptual model of pulse salience and metrical accent in musical
rhythms. Music Perception 11: 409–464.
Pattabhiraman, Thiyagarajasarma and Nick Cercone
1990 Selection: Salience, relevance and the coupling between domain-level
tasks and text planning. In Proceedings of the 5th International Work-
shop on Natural Language Generation, 79–86. Pittsburgh.
Poesio, Massimo, Rosemary Stevenson, Barbara Di Eugenio, and Janet Hitzeman
2004 Centering: a parametric theory and its instantiation. Computational
Linguistics 30: 309–364.
Prince, Ellen F.
1981 Toward a taxonomy of given-new information. In Radical Pragma-
tics, Peter Cole (ed.), 223–256. New York: Academic Press.
van der Sandt, Rob
1992 Presupposition projection as anaphora resolution. Journal of Seman-
tics 9: 333–377.
Sanford, Alison J. S., Anthony J. Sanford, Jo Molle, and Catherine Emmott
2006 Shallow processing and attention capture in written and spoken dis-
course. Discourse Processes 42: 109–130.
Schauer, Holger
2000 From elementary discourse units to complex ones. In Proceedings of
1st SIGdial Workshop on Discourse and Dialogue, Laila Dybkjær,
Kôiti Hasida and David Traum (eds.), 46–55. Hong Kong.
Sgall, Petr, Eva Hajičová, and Jarmila Panevová
1986 The meaning of the sentence in its semantic and pragmatic aspects.
Dordrecht: Reidel.
Shannon, Claude E.
1948 A mathematical theory of communication. The Bell System Technical
Journal 27: 379–423, 623–656.
Sidner, Candace L.
1978 The use of focus as a tool for disambiguation of definite noun phrases.
In Theoretical issues in natural language processing, TINLAP 2,
David L. Waltz (ed.), 86–95. Association for Computing Machinery,
University of Illinois at Urbana-Champaign.
Sidner, Candace L.
1981 Focusing for the interpretation of pronouns. American Journal of
Computational Linguistics 7: 217–231.
Sidner, Candace L.
1983 Focusing in the comprehension of definite anaphora. In Computa-
tional models of discourse, Michael Brady and Robert Berwick (eds.),
267–330. Cambridge: MIT Press.
Stevenson, Rosemary
2002 The role of salience in the production of referring expressions. In In-
formation sharing: Reference and presupposition in language gener-
ation and interpretation, Kees van Deemter and Rodger Kibble (eds.),
167–192. Stanford: CSLI Publications.
Stevenson, Rosemary J., Rosalind A. Crawley, and David Kleinman
1994 Thematic roles, focus and the representation of events. Language and
Cognitive Processes 9: 519–548.
Tanenhaus, Michael K., Michael J. Spivey-Knowlton, Kathleen M. Eberhard, and
Julie C. Sedivy
1995 Integration of visual and linguistic information in spoken language
comprehension. Science 268: 1632–1634.
Taylor, Shelley E. and Susan T. Fiske
1978 Salience, attention, and attributions: Top of the head phenomena. In
Advances in experimental psychology – Vol. 11, Leonard Berkowitz
(ed.), 249–287. New York: Academic Press.
Tomlin, Russell S.
1995 Focal attention, voice, and word order: An experimental,
cross-linguistic study. In Word order in discourse, Mickey Noonan and Pamela
Downing (eds.), 517–554. Amsterdam: John Benjamins.
Tversky, Amos
1977 Features of similarity. Psychological Review 84: 327–352.
Vallduví, Enric
1992 The informational component. New York: Garland.
Vallduví, Enric and Elisabeth Engdahl
1996 The linguistic realization of information packaging. Linguistics 34:
459–519.
Walker, Marilyn A., Aravind K. Joshi, and Ellen F. Prince
1998 Centering theory in discourse. Oxford: Clarendon Press.
Webber, Bonnie Lynn
1991 Structure and ostension in the interpretation of discourse deixis. Nat-
ural Language and Cognitive Processes 6: 107–35.
Zwaan, Rolf A.
2004 The immersed experiencer: toward an embodied theory of language
comprehension. In The psychology of learning and motivation,
Brian H. Ross (ed.), 35–62. New York: Academic Press.
Part I.
Entity-based salience in discourse
Demonstratives and salience:
Towards a functional taxonomy

Olga Krasavina

Abstract. The current article focuses on the use of demonstratives in Russian within
the broader discourse phenomenon of referential choice. It has been repeatedly claimed
that referential choice and the activation level (salience) of a referent in the memory of
the speaker/listener are interconnected (e.g. Chafe 1994; Tomlin and Pu 1991; Kibrik
1996; 2000). In this article, the cognitive-psychological model of Gundel et al. (1993,
2001) that specifies this connection is assessed and formulated more precisely. Following
this analysis, I determine several cases of demonstrative NP use and summarize them
as a model of referential choice. This study combines corpus analysis with experimental
methods and presents the results obtained in three experiments. The experiments employ
a questionnaire method, and involve a forced choice task and a written text-continuation
exercise. The results of Experiment 1 demonstrate the effect of a time shift factor on
demonstrative noun phrase (NP) use; the results of Experiments 2 and 3 cast doubt on
the prevalent hypothesis that it is the prospective relevance of a discourse-new referent
that stimulates demonstrative NP use when mentioned for the second time.

1. Introduction

In the last few decades, studies of referential choice¹ have enjoyed great
popularity. This article specifically examines the discourse use of demonstratives
in Russian. Despite the considerable progress made in explaining the use of full
noun phrases (NPs) and pronouns, the use of demonstratives has not received
adequate attention. The question of what specific factors stimulate the use
of demonstrative noun phrases remains open.
The subject of this study was restricted to proximal demonstratives, since
the factors leading to the use of proximal and distal demonstratives may differ
in nature. Proximal demonstratives have an important place in the model
of referential choice – in Russian they are the third most frequent anaphoric
device² after non-demonstrative full NPs and personal pronouns. This article
is based on Russian data, and I cite only Russian examples below. It is quite
likely, however, that some results obtained here may be relevant for a more
general understanding of the use of demonstratives in discourse.

1. Referential choice is a selection of one referential device from a number of
referential devices available (e.g., the student – this student – he) made by a
speaker at the moment of utterance.
The current article presents a combination of corpus and experimental ap-
proaches. Unlike the previous studies of demonstratives in Russian, which used
a limited number of casual examples (Padučeva 1985, Boguslavskaja and Mu-
rav’jeva 1987), the present study involves several hundred examples from nat-
ural discourse and aims at describing all these examples. To verify the role of
some potential factors, several psycholinguistic experiments were conducted.
The term demonstrative NP (ètot X for short) is used in this paper for a
non-pronominal NP in anaphoric use³ that takes one of two forms: 1) the proximal
demonstrative ètot ‘this’ in adnominal use followed by a head noun with or
without attributes, e.g. ètot mal’čik ‘this boy’, or 2) the proximal
demonstrative ètot ‘this one’ in nominal use with or without attributes.⁴ The
term “pronoun” will refer to the third-person pronouns (on).⁵ For
non-demonstrative full NPs, I will henceforth use the term “plain NP” for the
sake of brevity. The paradigm of ètot is presented below in Table 1.
Some influential explanatory models of referential choice emerged within
the cognitive approach to discourse (Chafe 1976, Givón 1983, Gundel et al.
1993, 2001, Ariel 1994, Grosz et al. 1995). Within this approach it is claimed
that the choice of a referential expression is constrained by the speaker’s
evaluation of the referent’s representation in memory, or activation (salience),
in the mind of the listener. Gundel et al. (1993: 275) argue that each “memory
and/or attention state” in the Givenness Hierarchy (see below) “is a necessary
and sufficient condition for appropriate use of a different form or forms”,
and that demonstrative NPs correspond to the medium memory and/or attention
state.

2. “Anaphoric device” is defined in this work as a form used to refer to a recently
mentioned referent.
3. “Anaphora” is understood as a cohesive relation which points back to some
previous item (Halliday and Hasan 1976).
4. Demonstratives in both syntactic uses have the same morphological forms in Rus-
sian.
5. The system of the third-person pronouns in Russian is represented by three different
forms for three genders: on (masculine), ona (feminine) and ono (neuter); pl. oni
‘they’ for all genders. Russian grammatical gender is not semantically transparent,
so on, for example, translates into English as ‘he’, ‘she’, or ‘it’ depending on the
grammatical gender of the corresponding noun.

Table 1. Paradigm of ètot.

               Singular                              Plural
               masculine     neuter    feminine
Nominative     ètot          èto⁶      èta           èti
Accusative     ètot/ètogo    èto       ètu           èti/ètix⁷
Dative         ètomu         ètomu     ètoj          ètim
Instrumental   ètim          ètim      ètoj          ètimi
Locative       ètom          ètom      ètoj          ètix
Genitive       ètogo         ètogo     ètoj          ètix

Since activation level is a cognitive category, it cannot be seen or directly
measured (at least without developing online testing methods). Attempts to explain
the use of a certain referential expression on the basis of an activation level and
vice versa often result in circular reasoning. In the qualitative model of Kibrik
(1996, 2000 for Russian and English), judgments as to the activation level are
based on independent factors. An activation level receives a numerical value.
A value from zero to one is attached to the factors that prove to be
relevant for the use of pronouns/full NPs, such as the distance to the antecedent,
the referent’s role as a protagonist, the grammatical role of the antecedent, etc.
The activation level is the sum of these values. This model is designed to explain
and predict the use of a referential device on the basis of a set of observable
factors.
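
The additive idea behind such a model can be illustrated with a minimal Python sketch. The factor names echo the examples mentioned above, but the numerical values, the dictionary FACTOR_VALUES, and the function activation_level are assumptions made purely for illustration; they are not the values or the procedure of Kibrik (1996, 2000).

```python
# Hypothetical factor values between zero and one (illustration only).
FACTOR_VALUES = {
    "short_distance_to_antecedent": 0.5,
    "referent_is_protagonist": 0.25,
    "antecedent_is_subject": 0.25,
}


def activation_level(active_factors):
    """Sum the values of the factors that hold for a given mention."""
    return sum(FACTOR_VALUES[name] for name in active_factors)


# A referent mentioned shortly before, whose antecedent was a subject.
score = activation_level(
    ["short_distance_to_antecedent", "antecedent_is_subject"]
)
print(score)  # prints 0.75
```

The point of the sketch is only that the activation level is computed from observable properties of the context, so it can be established independently of the referential form that was actually used.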
This work follows Kibrik (2000) in adopting a number of basic assumptions
(for the model of referential choice as a whole):
– referential choice is a process, conducted by a speaker at a certain moment;
– referential choice is a multi-factorial process;
– the model of referential choice should be a) predictive, and b) oriented to-
wards the cognitive structures of the speaker at the moment of utterance.
The basic questions addressed in this article are:
1) Is the conclusion of Gundel et al. (1993) valid as far as demonstratives are
concerned?
2) If not, what are the factors determining the use of a demonstrative NP?
The structure of this article is as follows. In Section 2 a connection between
activation level and demonstrative NPs is investigated, and the results of the corpus
study are summarized. In Section 3 the cases that remained unexplained in the
corpus study are considered and the experiments that served to verify several
hypotheses are discussed. The final conclusions are reported in Section 4.

6. The form èto in nominal use is a specialized device which has been excluded from
consideration in this work for the sake of simplicity. The nominal èto often denotes
events and situations rather than referents, e.g. Ja byl gotov k otvetu, i èto pridavalo
mne uverennosti. ‘I was prepared to answer, and this made me feel confident.’
Nominal èto should not be confused with the neuter form of ètot. Nominal èto has been
studied by Padučeva (1985).
7. The accusative forms ètot/ètogo and èti/ètix correspond to inanimate and animate
objects, respectively.

2. A corpus study of demonstrative NP uses

The corpus study used texts by Russian authors such as F. Iskander, K. Simonov,
etc. (see the list of authors at the end of this article). All texts were
written fiction. The study sample consisted of 254 examples, of which 217
were demonstrative NP uses and 37 were plain NP uses that are interchangeable
with a demonstrative NP. A demonstrative NP proved to be a relatively
infrequent device compared to the basic referential devices – plain NPs
and pronouns (see Table 2 for the numbers). Examples that were functionally
marked and demanded separate consideration were excluded. The excluded
cases were mostly lexicalized expressions like na ètot raz ‘this time’, na
ètot sčet ‘in this respect’, and occurrences of demonstrative NPs in contexts of
internal and direct speech.⁸
Table 2. The frequency of demonstrative NP, plain NP, and pronoun uses per 1000
words (in a sample text).

plain NP    pronoun    ètot X
180         39         9

2.1. Gundel et al. (1993) evaluation


The Givenness Hierarchy by Gundel et al. (1993) (see also Gundel et al. 2001
for the use of demonstratives) is a well-known approach which claims to
provide an explanation of how referential expressions are chosen. I focus on this
model in this study because of its general character. The approach in question
suggests six cognitive statuses, or memory and/or attention states, of referent
representation, each status being necessary and sufficient for the appropriate
use of a certain referential device. The opposite is also true: the use of a
certain referential form suggests a certain cognitive status of a referent, so that the
addressee is informed about it. The formal means corresponding to a certain
cognitive status can vary in different languages.

8. More than 30% of demonstrative NP uses were excluded in total.
According to the Givenness Hierarchy (see Table 3), demonstratives overtly
signal a medium referent activation in memory (e.g. at most “familiar” or “ac-
tivated”) being situated between pronouns that correspond to the highest ac-
tivation (“in focus”) and plain NPs that correspond to the lowest activation
(“uniquely identifiable”, “referential”, “type identifiable”).
Table 3. The cognitive statuses and the corresponding referential forms in Russian (from
Gundel et al. 1993: 284).

Statuses   in focus  >  activated    >  familiar          >  uniquely identifiable > referential > type identifiable
Forms      Ø;           èto ‘this’;     ètot X ‘this X’;     Ø X
           on ‘he’      to ‘that’       tot X ‘that X’
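
One way to read Table 3 is as an ordered scale on which each form requires a minimum status. The following Python sketch illustrates that reading under stated assumptions: the names Status, REQUIRED_STATUS, and licensed are hypothetical, only four of the forms are included, and placing the plain NP (Ø X) at the lowest status is an assumption about the layout of the table.

```python
from enum import IntEnum


class Status(IntEnum):
    """Cognitive statuses, ordered from lowest to highest."""
    TYPE_IDENTIFIABLE = 1
    REFERENTIAL = 2
    UNIQUELY_IDENTIFIABLE = 3
    FAMILIAR = 4
    ACTIVATED = 5
    IN_FOCUS = 6


# Minimum status assumed for each form (illustrative reading of Table 3).
REQUIRED_STATUS = {
    "on": Status.IN_FOCUS,
    "èto": Status.ACTIVATED,
    "ètot X": Status.FAMILIAR,
    "Ø X": Status.TYPE_IDENTIFIABLE,  # assumption: lowest status
}


def licensed(form, status):
    """A form is licensed if the referent has at least the required status."""
    return status >= REQUIRED_STATUS[form]


print(licensed("ètot X", Status.IN_FOCUS))     # prints True
print(licensed("ètot X", Status.REFERENTIAL))  # prints False
```

Under this reading, ètot X is predicted to be appropriate for familiar, activated, and in-focus referents, and inappropriate for referents whose status is lower than familiar.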

Some examples in my corpus appear to confirm this generalization. According
to Gundel et al. (1993), the referent is “familiar” if “the addressee is able to
uniquely identify the intended referent because he already has a representation
of it in memory (in long-term memory if it has not been recently mentioned
or perceived or in short-term memory if it has)”. The referent is “activated”
if it is “represented in current short-term memory”; “activated representations
may have been retrieved from long-term memory, or they may arise from the
immediate linguistic or extra-linguistic context”. Referents “in focus”
include “at least the topic of the preceding utterance, as well as any still
relevant higher-order topics” (topic is understood as “what the speaker intends a
sentence to be primarily about”; Gundel et al. 1993: 278–279).⁹ The occurrence
of ètot X in the following example appears to conform to what one can assess
as at least “familiar”:
(1) Oni emu ne poverili, no očen’ obradovalis’, kogda on, naščupav v
karmane šineli banku konservov, predložil im poest’ pered dorogoj. V banke
okazalis’ kil’ki, i oni vtroem s”eli èti kil’ki bez xleba i vody ([3]).¹⁰
‘They didn’t believe him but they were very glad when he, having
discovered a tin in the pocket of his overcoat, suggested that they eat
before the journey. In the tin there were sprats, and the three of them ate
these sprats without bread and water.’

9. Gundel et al. (1993, 2001) often make judgements as to what memory and/or
attention state a referent may have in a given case, on the basis of the context or
the referent’s properties, as quoted above. A general model of identifying cognitive
statuses for certain cases has not been proposed, though.
10. I do not use word-by-word glosses here since the details of the Russian examples are
irrelevant for the topic of this article.
Gundel et al. (1993) claim that the referent of ètot must be at least familiar, so
this condition would also be met by the higher cognitive statuses (“activated”
or “in focus”). This claim is supported by my material (see (2), where ètot X is
used for a referent that is “in focus” in the terms of Gundel et al. (1993)). It is
not difficult, however, to find contexts in which a demonstrative NP is used
under a low activation level, where only a plain NP is expected (“uniquely
identifiable”, “referential”, or “type identifiable”). This happens in examples
where the distance from a demonstrative NP to its antecedent is large (more
than three clauses)¹¹ or in examples like (3):
(2) Mne stalo stydno za sebja, potomu čto ni razu v žizni ja ne projavil
nastojaščego interesa k tomu, što on delal. Kak i vse my, pogloščennyj
svoimi zabotami, ja ne pridaval dolžnogo značenija žiznennoj celi ètogo
ognennogo mečtatelja [1].
‘I felt ashamed for myself because I didn’t express real interest in what
he was doing, not even once. Like all of us swallowed by everyday prob-
lems, I never attached due importance to the life goal of this fired-up
dreamer.’
(3) – Na sever sejčas nevozmožno – čerez šosse ne proskočim. Noči nado
ždat’ … I xoronit’ aby kak – grešno. My što – boimsja ix? Ili udiraem?
Ladno, davaj na xutor k Šandoru Borce.
(Beginning of a new chapter) Ètogo vengra xorošo znali vse okrest. On
razvodil xmel’, deržal neskol’ko korov … [7].
‘– [To go] to the north is impossible now – we’ll not make it through the
highway. We need to wait for the night … And to bury [him] improperly
is no good. We are not afraid of them, are we? Or are we running away?
Okay, let’s go to the farm of Shandor Borec.
[Beginning of a new chapter] This Hungarian was famous all around
(lit. This Hungarian [Obj] knew everyone [Subj] around). He cultivated
hops, had a few cows …’
In (3), the referent Hungarian was mentioned only once before it was mentioned
by means of a demonstrative NP: by a proper name, at the end of the preceding
chapter, in a non-prominent syntactic position. Ètot X
occurred at the beginning of the first sentence of a new chapter.¹² The nominal
part of the demonstrative NP is not the same as that of the antecedent. Thus,
the identity of the referents encoded by the proper name and the demonstrative
NP is far from obvious. This identity may be obvious for the speaker (the writer,
in this case), but not for the addressee. Moreover, the two mentions of the referent
are separated by the chapter border, which lowers the referent’s activation level
even more. The demonstrative NP in question corresponds to what Gundel et
al. (1993) would call “referential”: the speaker intends to say something about
the particular Hungarian.

11. I avoid citing such examples here in order to save space: these examples are
extremely lengthy.
As seen in (3), ètot X can be used to encode referents which have either a
higher than “familiar” status (consistent with the prediction of Gundel et al.
(1993)) or a lower than “familiar” status. That means that the scope of the
possible appropriate uses of ètot X is wider than is predicted by Gundel et al. (1993).
Thus, within the model of Gundel et al. (1993) a more precise statement
regarding the use of demonstrative NPs, at least the Russian ètot X, needs to be made.
The resulting model can be summarized as follows (see Figure 1).

Figure 1. The prototypical correspondences between cognitive statuses and referential
forms according to Gundel et al. (1993), and directions in which the forms can expand
the scope of their appropriate uses. [Diagram labels: low activation, middle
activation, high activation; plain NPs, demonstrative NPs, pronouns.]

Low activation is necessary and sufficient for the appropriate use of a plain NP,
but a plain NP can also expand to referents that have middle or high activation.
High activation is necessary and sufficient for the appropriate use of pronouns.
Middle activation is necessary and sufficient for the appropriate use of a
demonstrative NP ètot X, yet ètot X can expand the scope of its possible uses in
both directions, that is, it can also be appropriately used for coding referents
enjoying either high or low activation. In other words, a pronoun cannot expand
the scope of its possible appropriate uses, while a demonstrative NP and a plain
NP can – to all other statuses.
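
The revised correspondences can be restated as a small lookup, given here as a Python sketch. The three-way split into "low", "middle", and "high" follows the paragraph above; the names PROTOTYPICAL, EXPANDS_TO, and appropriate_forms are hypothetical and introduced only for illustration.

```python
# Prototypical activation level of each referential form.
PROTOTYPICAL = {
    "plain NP": "low",
    "demonstrative NP": "middle",
    "pronoun": "high",
}

# Levels to which each form can expand beyond its prototypical level.
EXPANDS_TO = {
    "plain NP": {"middle", "high"},
    "demonstrative NP": {"low", "high"},
    "pronoun": set(),  # a pronoun cannot expand its scope
}


def appropriate_forms(level):
    """Return the forms that can appropriately code a referent at this level."""
    return sorted(
        form
        for form in PROTOTYPICAL
        if PROTOTYPICAL[form] == level or level in EXPANDS_TO[form]
    )


print(appropriate_forms("low"))
# prints ['demonstrative NP', 'plain NP']
print(appropriate_forms("high"))
# prints ['demonstrative NP', 'plain NP', 'pronoun']
```

The sketch makes the asymmetry visible: the demonstrative NP and the plain NP are available at every level, whereas the pronoun is available only at high activation.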

12. Long distances (e.g. in clauses, sentences, over paragraph borders) are important
factors lowering the referent activation (cf. Fox 1987; Givón 1983, 1990; Kibrik
1996, 2000). Chapter borders may be an even stronger factor than paragraph borders.

Now, it seems clear that activation level is irrelevant for the use of
demonstrative NPs. Rather, demonstrative NPs appear to occur under very different
circumstances in terms of activation level. In this respect, they differ from plain
NPs and pronouns, assuming that the distribution of the latter is determined by
the activation level of a referent (Chafe 1994; Tomlin and Pu 1991; Kibrik 1996,
2000). Therefore, the question remains: what specific factors govern the use of
a demonstrative NP?

2.2. Results of the corpus study of demonstrative NP uses


In their study of demonstratives in Dutch, Maes and Noordman (1995)
hypothesize that a demonstrative NP modifies the representation of the referent
that existed in the preceding discourse, activating relevant contextual
properties of the referent. This modification is reflected in the formal-lexical
relationship between the antecedent and anaphor NPs. After adapting this idea to
the Russian data and considering the vast number of cases where demonstratives
are used (for more details see Krasavina 2004), I came to the conclusion
that a number of ètot X occurrences can be explained by a modification
procedure, as in (2). The fact that the referent was a fired-up dreamer is activated by
mentioning it within the demonstrative NP.
Yet, in contrast to the theory of Maes and Noordman (1995), not all exam-
ples in my sample could be explained this way. I propose an alternative typology
which consists basically of two classes: functionally determined uses and sub-
stitutionally determined uses. As functionally determined I consider the cases
in which the use of ètot NPs is basically determined by one of its fundamental
functions: 1) selection of one element from a set (as in (4)); 2) pejorative func-
tion (as in (5)), and 3) identification (of not obviously identical referents) (as in
(6) and (7)). The functionally determined uses make up 45% of all cases.
(4) Nemcy mogli sbrasyvat’ trupy počti u vxoda, a ètot ležal daleko ([3]).
‘The Germans could throw out the corpses beside the entrance, but
this (one) lay in the distance ([7]).’
(5) Matematičke kakoj-to binom Njutona dorože vsej poetiki Puškina …
I nikto ne podumaet, što ètot binom, možet, nikogda emu v žizni ne pon-
adobitsja …([4]).
‘For a math teacher some Binomial theorem is of a higher value than
Pushkin’s poetry…And no one would think that this Binomial theorem
may never be of any use to him [=the student] in his life…’
(6) Poka čto on byl spisan i čislilsja voenrukom v Tyrnyauze, v škole. Iz
ètoj školy prišel i Griša ([5]).

‘So far, he was retired and was listed as a military teacher in Tyrnyauz,
at school. From this school came Grisha as well.’
(7) I tut že migom odna ruka lezet v ledjanoj sumrak za butylkoj, v to vremja
kak drugaja vytiraet trjapkoj prilavok, pološčet gromadnuju lituju
kružku s žul’ničeskim tolstym dnom, perevoračivaet ètu kružku i so
stukom stavit ee pered pokupatelem ([6]).
‘And immediately one hand reaches into the icy darkness for the bot-
tle, while the other wipes the counter with a cloth, rinses a huge cast
mug with a thick false bottom, turns this mug over and puts it loudly in
front of the client.’
In (4), nominal ètot points out that specifically this corpse is meant. Thus the
corpse in question is contrasted to those that could have been thrown out beside
the entrance. In (5), the negative evaluation of the Binomial theorem and the
speaker’s ironical attitude to its exaggerated importance to a math teacher is
expressed by means of ètot X. Russian does not use articles to denote whether
the NP refers to a unique object or any object of the sort. Consider (6), where at
its first mention, the school can be interpreted as referring either to some school
or to a specific school. Ètot X can only refer to a specific school. Moreover, ètot
can be substituted by ètot že ‘the same’, without a change of meaning. Thus the
use of ètot marks that the referents in the first and the second predications in (6)
are identical. In (7), all three devices are interchangeable – a demonstrative
NP, a pronoun, and a plain NP. The use of a demonstrative NP in this example
shows the speaker’s intention to identify the mug in question with the one that
is “huge”, “cast”, and “with a thick false bottom”, rather than some other mug.
Another class of demonstrative NP uses is that of substitutionally
determined uses (41% of all cases). In these cases a demonstrative NP is used when
neither a pronoun nor a plain NP can be used for a certain reason. In other
words, some restricting factors prevent the use of both basic referential devices
(the use of the latter on the basis of the activation level is explained in detail in
Kibrik 1996; 2000). As a result, a demonstrative NP substitutes for the
corresponding device, as for example in (8). A plain NP cannot be used in (8) because
of a stylistic constraint in Russian on full repetition of a plain NP at short distances.
The distance at which this constraint functions requires further investigation.
If the activation level (see Kibrik 1996, 2000) is calculated for examples of the
type illustrated in (8), its value would be high enough for
a pronoun to be used. The introductory character of the antecedent, however,
lowers the activation level. For this reason it is unlikely that a pronoun can be
used.

(8) Ja raskryl knigu, kotoruju prines s soboj. Èto byl kakoj-to
xrestomatijnyj učebnik dlja vtorogo ili tret’ego klassov s nebol’šimi otryvkami
iz klassičeskix rasskazov i povestej. Ja stal gromko čitat’ èti otryvki
(*nebol’šie otryvki, ? otryvki, *ix) isključitel’no dlja togo, čtoby obratit’
vnimanie učitelej na beglost’ moego čtenija ([1]).
‘I opened the book that I brought with me. It was some text-
book for the second or the third form containing short extracts from
some classic short stories and novels. I began reading these extracts
(*the short extracts, ? the extracts, *them) loudly, in order to turn the
teachers’ attention to the fluency of my reading.’
Figure 2 below summarizes the model of referential choice which accounts for
functionally and substitutionally determined uses:

Figure 2. Referential choice heuristics.

Step 1: This stage involves checking if there are any conditions for a demon-
strative NP to be used in one of its functions. If such conditions are present
and no blocking filters are present, a demonstrative NP is used (see examples
(4) – (7)). If there are no such conditions or if such conditions are present, but
blocking filters are also present, then one proceeds to Step 2.
Step 2: According to the activation level, either a plain NP or a pronoun is
chosen (see Kibrik 1996, 2000).
Step 3: If there are no filters that can block the use of a pronoun or a plain NP,
then a corresponding referential device is used. If such filters are present, then
one proceeds to Step 4.
Step 4: A substitutionally determined demonstrative NP is used (see (8)).
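
The four steps can be sketched as a single decision function in Python. This is a minimal illustration under stated assumptions: the Context fields, the function name choose_device, and in particular the numerical threshold separating pronouns from plain NPs are hypothetical, since the source does not specify how filters or thresholds are represented.

```python
from dataclasses import dataclass

# Hypothetical cut-off between pronoun and plain NP (not given in the source).
PRONOUN_THRESHOLD = 0.7


@dataclass
class Context:
    functional_trigger: bool     # selection, pejorative use, or identification applies
    demonstrative_blocked: bool  # a filter blocks the functional demonstrative NP
    activation: float            # activation level of the referent
    pronoun_blocked: bool        # a filter blocks the pronoun
    plain_np_blocked: bool       # a filter blocks the plain NP (e.g. close repetition)


def choose_device(ctx: Context) -> str:
    # Step 1: functionally determined use, unless a filter blocks it.
    if ctx.functional_trigger and not ctx.demonstrative_blocked:
        return "demonstrative NP (functional)"
    # Step 2: pick a basic device according to the activation level.
    basic = "pronoun" if ctx.activation >= PRONOUN_THRESHOLD else "plain NP"
    # Step 3: use the basic device if no filter blocks it.
    blocked = ctx.pronoun_blocked if basic == "pronoun" else ctx.plain_np_blocked
    if not blocked:
        return basic
    # Step 4: substitutionally determined demonstrative NP.
    return "demonstrative NP (substitutional)"


# A case resembling (8): no functional trigger, and the basic devices are blocked.
example = Context(False, False, 0.4, pronoun_blocked=True, plain_np_blocked=True)
print(choose_device(example))  # prints demonstrative NP (substitutional)
```

The sketch deliberately leaves one case unhandled: it always returns a demonstrative NP in Step 4, even though the demonstrative NP itself might be blocked there.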

At this point a question may arise: what happens if not only the basic referential
devices are blocked, but a demonstrative NP as well? Other ways of referring
to the target object may exist. At this time no definitive answer to this question
is available.

2.3. Unaccounted cases


In Section 2.2, I have demonstrated that functionally and substitutionally deter-
mined uses make up most of the cases of demonstrative NP use. There are some
cases, however, that are neither functionally, nor substitutionally determined,
namely 14% of all cases. For example, what governs the referential choice in
cases like (9)?

(9) Poezd opazdyval, i ja progulivalsja po perronu, razgljadyvaja
okružajuščix. V osnovnom èto byli dačniki i studenty. Sredi tolpy ja
razgljadel neskol’ko turistov s sobakoj. Na ètix turistax (=nix) byla jarkaja
sportivnaja odežda i professional’nye botinki na tolstoj podošve.
‘The train was delayed, and I walked along the platform watching the
people around me. For the most part these were summer residents and
students. Among the crowd I noticed several tourists with a dog. These
tourists (= they) wore bright tracksuits and professional shoes with a
thick sole.’

Many claims have been made concerning the use of demonstrative NPs that
signal different kinds of shifts (cf. Kleiber 1988, Gundel 1993, Himmelmann
1996, Cornish 1999), with the only difference from plain NPs being that plain
NPs have a presupposition of uniqueness, whereas “the notion of contrast is
built into the conditions of use regulating demonstratives” (Cornish 1999:59).
In my study sample, 3% of demonstrative NPs occur under the condition of a
time shift.
It has also been repeatedly argued, on the other hand, that various kinds of
shifts are the typical conditions for plain NP use as contrasted to pronoun use
(cf. Walker 1998; Cornish 1999). Most likely such contradictions in conclu-
sions are due to the difference in unaccounted variables that could have had
different values in the texts tested. One needs to carefully consider all condi-
tions that could have had any weight in referential choice and make the material
as uniform as possible.
In Section 3, I consider the cases where 1) a demonstrative NP is used to
indicate a topic shift, and 2) a demonstrative NP is used after a time shift. In the
first case, the animate and inanimate referents were considered separately, in
two different experiments: animate and inanimate referents can have different
activation levels.13 In the second case, referents were locations.
Since this study takes a production-oriented perspective, the experimental
tasks were designed to imitate the process of utterance production. The
questionnaire method was used. Experiments 1 and 2 involved a forced-choice
task, and Experiment 3 a written text-continuation exercise. The following
section is devoted to these experiments.

3. Experimental evidence

3.1. Experiment 1. The use of a demonstrative NP after a time shift


3.1.1. Purpose of experiment
The purpose of this experiment was to discern whether the factor of time shift
between the antecedent clause and the anaphor clause affects the use of a demon-
strative NP as an anaphor.
According to Padučeva (1985), a shift in the temporal perspective can be a
reason for a demonstrative NP to be used in (11) as contrasted to (10):
(10) Za rekoj rasstilalsja lug. Na lugu paslis’ korovy.
‘Across the river there was a meadow. Cows grazed on the meadow.’
(11) Za rekoj rasstilalsja lug. Na ètom lugu v prošlom godu paslis’ korovy.
‘Across the river there was a meadow. Last year cows grazed on this
meadow.’
In the examples from my sample, all usages of ètot X under consideration have
two common features:
1) the antecedent of the demonstrative NP is the introductory mention of the
referent;
2) the predications preceding the one containing the demonstrative NP are
connected by the rhetorical relation of “sequence”14 (in terms of Rhetorical
Structure Theory; see Mann, Matthiessen, and Thompson 1992).
The hypothesis tested in this experiment can be formulated as follows. Let us
assume that the context meets criteria 1) and 2). Then a demonstrative NP is
used after a time shift (Condition 1). If there is no time shift, mostly plain NPs
are used (Condition 2).

13. As mentioned before, referential choice is sensitive to the activation level
of a referent. Animacy contributes to the activation level.

Condition 1. Nakonec, kogda počti sovsem stemnelo, on vyšel iz lesa na
perekopannoe protivotankovym rvom pole. Ix otrjad kopal ètot rov včera noč’ju.
‘Finally, when it got almost completely dark, he went out of the forest to the
field with an anti-tank ditch. Their detachment had been digging this ditch the
night before.’
Condition 2. Nakonec, kogda počti sovsem stemnelo, on vyšel iz lesa na
perekopannoe protivotankovym rvom pole. On perebralsja čerez rov i došel do
kakix-to vyselok – trex domikov s tjanuvšimisja szadi nix pletnjami.
‘Finally, when it got almost completely dark, he went out of the forest to the
field with an anti-tank ditch. He got over the ditch and reached some
settlement – there were three little houses with wicker fences stretching
behind them.’

3.1.2. Participants
The participants were 18 to 32 years old. They were divided randomly into
two parts of 16 people each. Only individuals with no linguistic background
were admitted. This was meant to prevent problems connected with the so-called
“effect of study”, that is, the possibility that a participant would figure
out exactly what the experimenter expected to see and would unconsciously
either help the experimenter to obtain the expected result or hinder it.

3.1.3. Materials and procedure


In this experiment, each participant was involved in tasks based on two different
conditions. One part of the participants was tested independently of the other
part. The purpose of this division was to ensure that no subject variables were
disregarded. Each text was presented in two variants corresponding to the two
above-mentioned conditions. Participants were not given two versions of the
same text, so they had no chance to compare these two variants while choosing
a referential device.
Each experimental sheet consisted of four texts, one of which was the target
text with or without a time shift; the three others were filler texts.
The filler texts imitated the target texts. Each experimental text consisted of
two sentences. In the first sentence the target referent was mentioned for the
first time, and in the second sentence for the second time. At the second
mention of the referent, the variants of choice (a plain NP and a demonstrative
NP) were given in parentheses. Thus the task involved a forced choice between
these variants.
Each participant received four test sheets, two with a time shift and two with
no time shift. This yielded 4 (text types) × 32 (participants) = 128 test
sheets. The test texts and filler texts were typed out in random order on the
test sheets. The time for completing the task was not limited. Some of the
texts used were original extracts from the corpus, and some were constructed
for this experiment.
Below is an example of an experimental sheet. The target text is the second
one from the top. This text includes a time shift. The other texts are filler
texts.

14. The sequence relation specifies that a succession relationship must exist
between the related spans.

Task: Below you see some extracts from Russian fiction. Please underline the variant
given in parentheses that you think sounds best in this context. Do the tasks
in order. Proceed to the following task; do not return to any of the previous ones.
Nakonec vygljanulo solnce. (Ono, èto solnce) osvetilo komnatu, brosaja vyzov
ploxomu nastroeniju.
‘At last the sun came out of the clouds. (It, this sun) lit up the room, challenging
the bad frame of mind.’
Miša vošel v komnatu i podošel k stolu. Kogda-to za (ètim stolom, stolom) sidel ego
deduška.
‘Misha entered the room and came to the table. Some time ago his grandfather used
to sit at (this table, the table).’
Vernuvšis’, ja zastal ix za jarostnym sporom. Ponabl’udav nemnogo za (ètoj scenoj,
nej), ja rešil vmešat’sja.
‘As I returned, I found them in a furious dispute. After watching (this scene, it) for a
while, I decided to interrupt.’
Odin student rešil podšutit’ nad svoim odnokursnikom. (Ètot student, on) napisal
čto-to na kločke bumagi i položil v sumku svoemu drugu.
‘Once a student wanted to make fun of his classmate. (This student, he) wrote
something on a sheet of paper and dropped it into his friend’s bag.’
Figure 3. Example of an experimental sheet (translation from Russian)

3.1.4. Results and discussion


The results conformed to the expectation (see Table 4): under Condition 1
demonstratives were chosen in 69% of cases and plain NPs in only 31% of cases.
Conversely, under Condition 2 ètot X was chosen in only 34% of cases, whereas
plain NPs were used in 66% of cases.
The use of the referential devices (demonstrative vs. plain NP) differs
significantly depending on the presence or absence of a time shift
(χ²(1) = 6.96, p < 0.01). Thus, the results obtained in this experiment
support the hypothesis that a time shift between sequential events favors the
use of a demonstrative NP. The results obtained in Experiment 1 with the second
part of the participants were similar to those obtained with the first part,
and for this reason are not presented here. The similarity of the results
suggests that no subject variables were disregarded.
Table 4. Experiment 1 with the first part of the participants. The frequency of
demonstrative and plain NP uses under the conditions of presence and absence of
a time shift.

                               ètot X      plain NP
Condition 1 (time shift)       22 (69%)    10 (31%)
Condition 2 (no time shift)    11 (34%)    21 (66%)
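
The test statistic can be retraced from the counts in Table 4. The following Python sketch computes a chi-square statistic for a 2 × 2 table, with and without Yates' continuity correction; the function name `chi_square_2x2` is introduced here purely for illustration. Since the text does not state which variant of the test was applied, the output need not coincide exactly with the value reported above.

```python
import math

def chi_square_2x2(table, yates=False):
    """Return (chi2, p) for a 2x2 table of counts, with df = 1."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(table[i][j] - expected)
            if yates:
                # Continuity correction: shrink each deviation by 0.5.
                diff = max(diff - 0.5, 0.0)
            chi2 += diff ** 2 / expected
    # For df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts from Table 4: rows = conditions, columns = (etot X, plain NP).
table4 = [[22, 10], [11, 21]]
for yates in (False, True):
    chi2, p = chi_square_2x2(table4, yates=yates)
    print(f"Yates correction={yates}: chi2(1) = {chi2:.2f}, p = {p:.4f}")
```

Under either variant the difference between the two conditions comes out as significant at the 0.05 level.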

3.2. Experiment 2. Demonstrative NP use in cases with an animate referent


3.2.1. Purpose of experiment
The use of a demonstrative NP after the first mention of a referent is considered
to be a common strategy for establishing important discourse participants. As
soon as a new participant is established, further mentions are coded by means of
third-person pronouns and plain NPs. For example, as Himmelmann (1996:229)
indicates, in languages both with and without definite articles demonstrative
NPs are used after the first mention of a thematically prominent referent that will
be mentioned again in the subsequent discourse. In the following experiment I
test this observation on Russian material.

3.2.2. Participants
The participants involved in this experiment were the same individuals who
were involved in Experiment 1. The subjects were randomly subdivided into two
parts of 16 subjects each, so that each part received different versions of
the same texts. These two versions corresponded to the two conditions
described in Section 3.2.3.

3.2.3. Experimental design and procedure


The target context can be described as follows: the referent is animate, and
its first mention in the discourse occurs one discourse unit before its second
mention. At its second mention a pronoun or a demonstrative NP can be
employed. A plain NP cannot be used because of certain constraints in Russian.
The hypothesis tested is as follows: demonstrative NPs are used when a
new referent important for the subsequent discourse is established. I assume
that the importance of the referent for the subsequent discourse correlates with
the presence of subsequent mentions of this referent in the discourse.
Thus we have two conditions. In Condition 1 (thematically prominent), the
referent is mentioned again in the context following its second mention. In
Condition 2 (not thematically prominent), the referent is not mentioned in the
following context. The target referent is double-underlined in the examples:
(12) Kak-to večerom ja pošel guljat’ s sobakoj. Projdjas’ po parku, ja pošel
obratno, podumav, čto pora vozvraščat’sja domoj. Vdrug otkuda-to iz
kustov vyšel kakoj-to čelovek. (Ètot čelovek, on) sprosil u menja dokumenty,
pokazav mne milicejskoe udostoverenie.
‘One evening I went to walk the dog. I made a walk around the park and
went back with the intention of going home. Suddenly a man stepped
out from somewhere in the bushes. (This man, he) showed me a police
card and demanded my identity document.’ {The continuation follows
according to either Condition 1 or Condition 2}.

Condition 1 (the referent is mentioned in the following context)
Konečno že, ja ničego ne vz’al s soboj, no on ne stal vyslušivat’ menja.
‘Certainly I had no document with me, but he wouldn’t listen to me.’
Condition 2 (the referent is not mentioned in the following context)
Sueta suet, vse sueta, podumal ja. Xodim, čto-to delaem, suetimsja, a vo vsem
ètom net nikakogo smysla.
‘Vanity of vanities, I thought. We’re moving around, doing so much useless
stuff, and all this makes no sense.’

Each participant received a test sheet with four test texts on it, each 45–60
words long: two texts according to Condition 1, and two according to
Condition 2. Filler material was not used. At the target position there were
two variants of choice. Participants were instructed to choose one, either a
pronoun or a demonstrative NP. This yielded 4 (texts) × 32 (participants) =
128 test texts.
The hypothesis would be supported if demonstrative NPs were chosen noticeably
more frequently under Condition 1, while under Condition 2 demonstrative NPs
and pronouns were chosen with equal frequency or pronouns prevailed.
The participants were given the test sheets for both experiments at the same
time. The test tasks were carried out consecutively, without any time break.

3.2.4. Results and discussion


In this experiment I found no significant difference between the uses of the
referential devices under the two conditions (χ²(1) = 4.04, n.s.). On the
whole, pronouns were preferred (73% of all cases under Condition 1 and 81%
under Condition 2; see Table 5). The preference for pronouns may have been
caused by a factor that was not considered when the experiment was planned:
participants could have made their choices before they read the text to its end.
Thus there was no guarantee that the information located in the further
context affected the subjects’ choice. To avoid this effect, the task was
changed: the subjects were specifically instructed to read the texts to the end
before making their choice.

Table 5. The number of ètot X and on.

                                            ètot X      on
Condition 1 (thematically prominent)        17 (27%)    47 (73%)
Condition 2 (not thematically prominent)    12 (19%)    52 (81%)

Twelve new individuals took part in a modified version of Experiment 2. All
conditions remained the same.
The results of this modified version of the experiment (see Table 6), like the
results of the initial experiment, showed that at the second mention of an
animate referent there was no significant difference between the uses of the
referential devices under the two conditions (χ²(1) = 0.4, n.s.). The
hypothesis regarding the influence of the “thematic prominence” factor on the
choice “demonstrative NP/pronoun” was thus not supported. One can observe a
tendency: the number of demonstrative NPs used under Condition 1 is larger
than the number of demonstratives used when the referent is not mentioned in
the further context. Still, the results showed that a referent’s relevance to
the subsequent discourse was not sufficient to warrant that a demonstrative NP
would be used at the second mention of this referent. Moreover, Table 7
provides another illustration that pronouns on the whole were used more often
than demonstrative NPs at the second mention of an animate referent (pronouns
made up more than two thirds of the chosen devices). Either there are some
additional, more powerful factors that regulate the choice between a pronoun
and a demonstrative NP in this situation, probably depending on the speaker’s
intention, the genre, etc., or a different experimental design needs to be
chosen in order to ensure that the participants understand the relevance of
the target referent for the subsequent discourse.

Table 6. Frequency of demonstrative NP and pronoun use under Conditions 1 and 2.

                                            ètot X      on
Condition 1 (thematically prominent)        8 (33%)     16 (67%)
Condition 2 (not thematically prominent)    6 (25%)     18 (75%)

Table 7. Summary frequency of ètot X and pronoun use at the second mention of the
referent under Conditions 1 and 2.

                              ètot X      on          Total
Condition 1 or Condition 2    14 (29%)    34 (71%)    48 (100%)
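
The pooled counts in Table 7 follow directly from Table 6. As a minimal illustration, the Python sketch below sums the two conditions and converts the counts to rounded percentages; the helper names `pool` and `shares` are assumptions introduced here, not part of the study.

```python
# Counts from Table 6 (modified Experiment 2): (etot X, on) per condition.
table6 = {
    "Condition 1": (8, 16),
    "Condition 2": (6, 18),
}

def pool(rows):
    """Sum the per-condition counts column by column."""
    return tuple(sum(col) for col in zip(*rows))

def shares(counts):
    """Convert counts to percentages of their total, rounded to integers."""
    total = sum(counts)
    return tuple(round(100 * c / total) for c in counts)

pooled = pool(table6.values())
print("pooled counts:", pooled, "total:", sum(pooled))
print("pooled shares (%):", shares(pooled))
for name, counts in table6.items():
    print(name, "shares (%):", shares(counts))
```

The pooled share of pronouns can then be compared with the two-thirds threshold mentioned in the text.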

3.3. Experiment 3. Demonstrative NP use in cases with an inanimate referent

3.3.1. Purpose of experiment
The hypothesis tested in this experiment is similar to that in the previous
experiment. There were two important differences, however. First, this
experiment tested contexts with an inanimate referent. Second, the
experimental design was changed: the experiment involved a story continuation
task.
In the original text, the referent ‘hole’ was realized as a pronoun at its
second mention (see (13)). According to the hypothesis, a pronoun and not a
demonstrative was used because the referent was not mentioned again in the
following context. So, most likely the pronoun was used because the speaker
evaluated this referent as irrelevant for the development of the further discourse.
(13) Vskore ona naučilas’ vypolnjat’ nesložnuju xozjajstvennuju rabotu:
molotit’ kolotuškoj kukuruzu, taskat’ na mel’nicu meški…Pod otkrytym
navesom, gde ona žila, Zana vyryla jamu, obložila ee paporotnikom i,
takim obrazom, ustroila sebe dovol’no ujutnuju spal’nju ([2]).
‘Soon she learned to do simple household chores: threshing maize
with a mallet, dragging sacks to the mill…Under the open shed where
she lived, Zana dug a hole and put some ferns around it, and thus she
made herself quite a cozy bedroom.’

3.3.2. Participants
12 subjects took part in the experiment. The subjects had never taken part in this
kind of experiment. The participants met the same criteria as the participants in
Experiments 1 and 2.

3.3.3. Material and experimental procedure


Each participant was given one test text consisting of two sentences. The
target referent was introduced in the second sentence. The participants were
instructed to continue this text with two or three sentences. The time for task
completion was not limited. No hints were given about which referent should be
used: participants were free to choose what they wrote about, to mention or not
to mention a particular referent, and, if they did mention it, to do so as
many times as they wished.
This yielded 1 × 12 = 12 test sheets. The question was by what means the
further mentions of the inanimate referent would be coded. The predictions were
as follows:
(1) If the referent was mentioned only once in the following context, namely
in the next sentence, then a pronoun would be used;
(2) If the referent was mentioned more than once in the following context, then
at the first of these mentions (i.e. at the second mention altogether) this
referent would be coded by a demonstrative NP.
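
The two predictions can be restated as a simple decision rule. The Python sketch below is only an illustration: the function `predicted_device` and its input format (a list of sentence offsets at which a continuation mentions the referent again, with 1 standing for the sentence immediately following the given text) are assumptions made here, not part of the experimental materials.

```python
def predicted_device(mention_offsets):
    """Predict the device at the referent's second mention overall.

    mention_offsets: sentence offsets (1 = next sentence) of the
    mentions of the target referent in a participant's continuation.
    Returns "pronoun", "demonstrative NP", or None when neither
    prediction applies.
    """
    if len(mention_offsets) > 1:
        # Prediction (2): several further mentions of the referent.
        return "demonstrative NP"
    if mention_offsets == [1]:
        # Prediction (1): a single mention, located in the next sentence.
        return "pronoun"
    return None  # no mention, or a single mention further away

print(predicted_device([1]))        # single mention in the next sentence
print(predicted_device([1, 2, 3]))  # referent stays in the discourse
print(predicted_device([]))         # referent is dropped
```

Continuations for which the rule returns no prediction correspond to cases that the two predictions leave open.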
In Figure 4 I present the test sheet used in this experiment. Before the test
text there was a short “introduction”, in which the protagonist was introduced.
Task: continue the story in 2 or 3 sentences.

Kogda mir zalixoradilo poiskami snežnogo čeloveka, Viktor Maksimovič neredko
prixodil v kofejnju s žurnalami ili gazetnymi vyrezkami, v kotoryx govorilos’ ob ètom.
I vdrug èti poiski obrušilis’ na Abxaziju. Okazyvaetsja, v abxazskom sele Txina v
prošlom veke pojmali dikuju, ili lesnuju, kak govorjat abxazcy, ženščinu.
Vskore ona naučilas’ vypolnjat’ nesložnuju xozjajstvennuju rabotu: molotit’
kolotuškoj kukuruzu, taskat’ na mel’nicu meški … Pod otkrytym navesom, gde ona žila,
Zana vyryla jamu … ([2]).
‘When the world was in snowman-search fever, Viktor Maximovich often came to the
coffeehouse with magazines or newspaper excerpts about it. Suddenly the search fell
upon Abkhazia. It turned out that in the Abkhazian village called Thina, a wild, or
forest woman was caught in the past century. She was named Zana.
Soon she learned to do simple household chores: threshing maize with a
mallet, dragging sacks to the mill … Under the open shed, where she lived, Zana dug
a hole …’

Figure 4. Example of the test sheet (translation from Russian)

The results were as follows: the subjects either did not mention the target
referent at all, or used relative pronouns or adverbial demonstratives for coding
it (see Table 8). Consequently, the test material was modified in the following
way: at the end of each text that was to be continued, there was an instruction in
parentheses as to what the end of the story should be. The referent jama ‘hole’
was not explicitly mentioned. The instruction stated, “Describe her place” or
“Describe her subsequent actions.” Initially 16 subjects participated in the
experiment; the number of participants was increased until there were 16 relevant
referent mentions through a demonstrative NP or a pronoun. The final number
of participants was 70. In half of the experimental texts, subjects were asked to
continue the text after a comma; in the other half, after a full stop. The
participants disregarded the punctuation marks, contrary to the instructions:
sometimes they started a new sentence after a comma and continued the previous
one after a full stop. Introduction and experimental texts remained the same.

Table 8. Referential devices used by the subjects for referring to the referent ‘the hole’
in the first variant of Experiment 3.

Referential devices                           Number    %
Plain NP                                      0         0
Pronoun                                       1         8
Demonstrative NP                              0         0
Adverbial demonstrative (such as “there”)     2         17
Relative pronoun (i.e. “where, which”)        8         67
Referent is not mentioned anymore             1         8

3.3.4. Results and discussion


The final results can be seen in Tables 9 and 10.

Table 9. Experiment 3. Text continuations.

                                       Number    %
*Plain NP at linear distance = 1       7         10
*Plain NP at linear distance > 1       9         13
Pronoun                                10        14
Demonstrative NP                       6         9
*Adverbial demonstrative               7         10
*Relative pronoun                      8         11
*Target referent is not mentioned      23        33
Total                                  70        100

Table 10. Correlation of pronoun use and referent mention in the following context
(16 cases in total).

                                                      Total    pronoun    demonstrative
referent is mentioned in the following context        5        3 (60%)    2 (40%)
referent is not mentioned in the following context    11       7 (64%)    4 (36%)

Table 9 represents the formal referential devices that the participants used
when they mentioned ‘hole’. The devices not relevant for our discussion and
excluded from consideration here are marked by an asterisk. As for the relevant
uses, the number of cases with more than one subsequent mention of ‘hole’ was
only 5 (31%) out of 16, three of which were realized as pronouns and two as
demonstrative NPs (Table 10, first row). Under the condition of no further
mentions of the referent, demonstratives were still used (Table 10, second
row), although the number of pronouns used under this condition was higher
(7 cases, or 44% of all 16 cases) than the number of demonstratives (4 cases,
or 25%). At the same time, the number of demonstratives on the whole (6 cases,
or 38%) was lower than the number of pronouns (10 cases, or 63%).
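
The figures in this paragraph follow from the counts in Tables 9 and 10. The Python sketch below recomputes them as a check; the dictionary layout and the variable names are assumptions made for illustration only.

```python
# Counts from Table 9; True marks devices treated as relevant
# (the rows without an asterisk).
table9 = {
    "plain NP, linear distance = 1": (7, False),
    "plain NP, linear distance > 1": (9, False),
    "pronoun": (10, True),
    "demonstrative NP": (6, True),
    "adverbial demonstrative": (7, False),
    "relative pronoun": (8, False),
    "target referent not mentioned": (23, False),
}

total = sum(n for n, _ in table9.values())
relevant = {k: n for k, (n, keep) in table9.items() if keep}
n_relevant = sum(relevant.values())

print("all continuations:", total)
print("relevant mentions:", n_relevant)
for device, n in relevant.items():
    print(f"{device}: {n} ({100 * n / n_relevant:.1f}% of relevant)")

# Table 10 splits the relevant mentions by whether the referent is
# mentioned again: (pronoun, demonstrative NP) counts per condition.
table10 = {"mentioned again": (3, 2), "not mentioned again": (7, 4)}
for condition, counts in table10.items():
    print(condition, [f"{100 * c / n_relevant:.1f}%" for c in counts])
```

The percentages printed for Table 10 are taken over all 16 relevant mentions, which is the base used in the running text; the percentages inside Table 10 itself are taken within each condition.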
The statistical analysis shows that there is no significant difference between
pronoun and demonstrative NP use under the two conditions (χ²(1) = 0.02, n.s.).
This suggests that the “thematic prominence” of a new discourse referent in the
subsequent discourse does not affect the choice of demonstratives.
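
With only 16 relevant observations, several expected cell counts in Table 10 fall below 5, where the chi-square approximation is usually considered rough. As a hedged cross-check (not an analysis reported in this study), the Python sketch below computes a two-sided Fisher exact test for a 2 × 2 table from the hypergeometric distribution; the function name `fisher_exact_2x2` is introduced here for illustration.

```python
from math import comb

def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell,
        # with all margins held fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Counts from Table 10: rows = referent mentioned again / not mentioned again,
# columns = (pronoun, demonstrative NP).
table10 = [[3, 2], [7, 4]]
print(f"Fisher exact p = {fisher_exact_2x2(table10):.3f}")
```

For these counts the observed table is the most probable one given the margins, so the exact p-value is 1, in line with the non-significant result above.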
The number of demonstrative NPs turned out to be quite low. As can be seen
in Table 9, jama ‘hole’ was often referred to by means of adverbial
demonstratives and plain NPs at the linear distance of one clause, despite the
constraint on full NP repetition at such a distance. This constraint can be
lifted in cases where the referent is a location, as in the case considered in
Experiment 1. The high occurrence of adverbial demonstratives and plain NPs
here can be explained in terms of referent semantics.

4. Conclusions

The present study has focused on the use of an important but little-studied
referential device in Russian, the demonstrative NP ètot X. One of the most
essential points made in this study is the clarification of the connection between
demonstrative NPs and activation level. This connection turned out to be quite
weak: demonstrative NPs can be used under any activation level of a referent. It
was demonstrated that ètot X can encode referents whose status is higher than
“familiar” (in line with the prediction of Gundel et al. (1993, 2001)) as well
as referents whose status is lower than “familiar”. The latter can be
interpreted as an important addition to the theory of Gundel et al. (1993,
2001). Using the corpus material, two classes covering most of the
demonstrative NP uses were singled out. In the first class, demonstratives are
used when neither of the basic referential devices can be used: certain factors
block the use of a pronoun under a high activation level and of a plain NP
under a low activation level of a referent, so that the only possible
referential form left is a demonstrative NP. The second class is represented by
cases that can be accounted for by one of the basic functions underlying the
use of demonstratives: selection from a set, identification, or the pejorative
function.
After the examination of several hypotheses concerning the factors that lead
to the use of demonstrative NPs, the experimental studies indicated that the
demonstrative NP is the preferred referential device after a temporal shift.
An important “negative” result was also obtained: the experiments indicate
that the referent’s relevance in the following context does not affect the
choice between a demonstrative NP and a pronoun.
This study calls for further empirical studies in the use of Russian demon-
stratives. The questions that remain to be answered are:
– What governs the referential choice in cases like those described in Experi-
ments 2 and 3?
– Within the class of substitutional uses, what happens if not only a pronoun
and a plain NP are blocked, but a demonstrative NP as well?
Additionally, the nature of constraints on the use of referential expressions re-
quires further investigation. For example, the distance from the antecedent at
which a plain NP is blocked needs to be clarified. Future work will include a
study of a larger data set, as well as reaction time experiments.

Acknowledgements
The research presented in this paper was supported by grant 03-06-80241a of
the Russian Fund of Fundamental Research.

Abbreviations
The literature from which the examples were taken (URL: http://www.lib.ru):
[1] Iskander F. “Školnyj val’s, ili Energija styda”;15
[2] Iskander F. “Stojanka čeloveka”;
[3] Simonov K. “Živye i mertvye”;
[4] Bykov V. “Obelisk”;
[5] Vizbor J. “Legenda sedogo El’brusa”;
[6] Kataev V. “Beleet parus odinokij”;
[7] Akimov I. “Legenda o malom garnizone”.

15. Examples from this and the other sources used in this study were abridged as
far as possible.

References

Ariel, Mira
1994 Interpreting Anaphoric Expressions: A cognitive versus a pragmatic
approach. In: Journal of Linguistics 30, 3–42.
Boguslavskaja, Olga and Muravjeva, Irina
1987 Mexanizm anaforičeskoj nominacii. In: Modelirovanie jazykovoj
dejatel’nosti v intellektual’nyx sistemax (red. A.E. Kibrik, A.S. Narinjani).
Moskva: Nauka, 78–127.
Chafe, Wallace
1976 Givenness, contrastiveness, definiteness, subjects, topics and point of
view. In: C.N. Li (ed.), Subject and topic. New York: Academic Press,
25–55.
Chafe, Wallace
1994 Discourse, Consciousness, and Time. The Flow and Displacement of
Conscious Experience in Speaking and Writing. Chicago: University
of Chicago Press.
Cornish, Francis
1999 Anaphora, Discourse and Understanding. Evidence from English and
French. Oxford: Clarendon.
Fox, Barbara
1987 Discourse Structure and Anaphora. Cambridge: Cambridge University
Press.
Givón, Talmy, ed.
1983 Topic Continuity in Discourse: A quantitative cross-linguistic study.
Amsterdam: Benjamins.
Givón, Talmy
1990 Syntax: A Functional-typological Introduction. Vol. 2. Amsterdam
and Philadelphia: John Benjamins.
Grosz, Barbara, Weinstein, Scott and Joshi, Aravind
1995 Centering: A framework for modeling the local coherence of discourse.
In: Computational Linguistics 21 (2), 203–225.
Gundel, Janette, Hedberg, Nancy and Zacharski, Ron
1993 Cognitive status and the form of referring expressions in discourse.
In: Language 69 (2), 274–307.
Gundel, Janette, Hedberg, Nancy and Zacharski, Ron
2001 Cognitive Status and Definite Descriptions in English: Why Accom-
modation is Unnecessary. In: English Language and Linguistics 5,
273–295.
Halliday, Michael and Hasan, Ruqaiya
1976 Cohesion in English. London: Longman.
Himmelmann, Nikolaus
1996 Demonstratives in Narrative Discourse: a taxonomy of universal uses.
In: B. Fox (ed.), Studies in Anaphora. Amsterdam and Philadelphia:
John Benjamins, 205–254.
Kibrik, Andrej
1996 Anaphora in Russian narrative prose: A cognitive account. In: B. Fox
(ed.), Studies in Anaphora. Amsterdam and Philadelphia: John Benjamins,
255–303.
Kibrik, Andrej
2000 A Cognitive Calculative Approach towards Discourse Anaphora. In:
Paul Baker, Andrew Hardie, Tony McEnery and Anna Siewierska
(eds.), Proceedings of the Discourse Anaphora and Reference Resolution
Conference (DAARC 2000). Lancaster University: University Centre
for Computer Corpus Research on Language, Technical Papers 12,
72–82.
Kleiber, Georges
1988 Sur l’anaphore démonstrative. In: G. Maurand (ed.), Nouvelles
recherches en grammaire: Actes du Colloque d’Albi, Université de
Toulouse-Le Mirail, 52–74.
Krasavina, Olga
2004 Upotreblenije ukazatel’noj imennoj gruppy v russkom pis’mennom
narrativnom diskurse. Voprosy jazykoznanija 3.
Maes, Alfons and Noordman, Leo
1995 Demonstrative nominal anaphors: A case of nonidentificational
markedness. Linguistics 33, 255–282.
Mann, William, Matthiessen, Christian and Thompson, Sandra
1992 Rhetorical structure theory and text analysis. In W. Mann and
S. Thompson (eds.), Discourse Description. Diverse Linguistic Anal-
yses of a Fund-raising Text. Amsterdam and Philadelphia: John Ben-
jamins, 39–78.
Padučeva, Elena
1985 Vyskazyvanije i ego sootnesennost’ s dejstvitel’nostju. (Utterance and
its interrelationship with reality). Moskva: Nauka.
Tomlin, Russell and Pu, Ming
1991 The management of reference in Mandarin discourse. Cognitive
Linguistics 2, 65–93.
Walker, Marilyn, Joshi, Aravind and Prince, Ellen (eds.)
1998 Centering Theory in Discourse. Oxford: Clarendon Press.
Parenthetical agent-demoting constructions
in Eastern Khanty: Discourse salience
vis-à-vis referring expressions1

Andrey Y. Filchenko

Pragmatic salience of referents in natural discourse is a matter of gradience and
dynamics. Such dynamics in the pragmatic salience of discourse entities appears
to be manifest in their morphosyntactic form. Most frequently, this
pragmatic and syntactic dynamics is directed towards promotion, that is, a
discourse entity becomes more prominent in discourse and receives
correspondingly prominent formal syntactic coding: a reduced referring
expression (cf. Rose, this vol.; Lambrecht 1994, inter alia) or particular word
order alterations (Hinterhölzl and Petrova, this vol.). The opposite dynamics,
that is, towards demotion, is also possible, rendering a referent decreasingly
salient in a stretch of discourse. Such demotion is often temporary: an entity’s
discourse prominence or its semantic properties (particularly agenthood) may be
parenthetically disrupted or contested, which is reflected in a special
morphosyntactic arrangement: Agent-demoting constructions. These constructions
signal dissonance in the canonical cross-mapping of the referent’s semantic
roles, grammatical relations and pragmatic status.

1. Introduction

Khanty is one of the indigenous languages traditionally genetically affiliated
with the Uralic language family. It is spoken by fewer than 7000 indigenous
hunter-gatherers and reindeer herders in north-western Siberia. The present-day
territory settled by the Khanty lies to the east of the Ural Range along the
Ob’ river in the Tyumen and Tomsk regions of Russia. Though considered to
be a single language, Khanty is a dialectal continuum with a large conventional
division into western and eastern (Decsy 1965; Jääsalmi-Krüger 1998). The
dialects of interest in this study are the adjacent eastern-most river varieties of
Vasyugan, Alexandrovo and Vakh, totaling fewer than 200 speakers. All Eastern
Khanty speakers are bilingual, with Russian being the language of daily
communication across ethnic groups. The functional sphere of Khanty is steadily
shrinking; the language is reserved primarily for occasional family use
and rare peer communication. These dialects are particularly interesting as they
are the least documented and reportedly represent more archaic and richer
systems (Gulya 1970, Honti 1982, Kulonen 1989, Decsy 1990). Dialectal variation
in Khanty is so considerable that even within the cluster of Eastern
Khanty many varieties are mutually incomprehensible to the speakers. In
typological terms the variation is extensive, with the Vasyugan, Alexandrovo and
Vakh dialects demonstrating the most distinct features: Vowel Harmony vs.
Vowel Length of western dialects; up to 12 grammatical cases vs. 5 in other
eastern and vs. 3 in western Khanty; 5 individual tense forms vs. 3 in other
dialects; “ergative-like” (Comrie 1975; Kulonen 1989) or Loc-Agent (Filchenko
2006) constructions apart from a set of passive constructions in other dialects;
conceptual variation in numerals: “18” in the east: jöG¨@rki n1l@G ‘eight after/over
ten’ vs. “18” in the west nijEl-xus ‘8 towards 20’; unique eastern Khanty use of
the nominalizer taG1 ‘place’ (Potanina 2007).

1. The work leading to this publication was supported by the NEH-NSF Documenting
Endangered Languages Fellowship, 2005–2006. Any views, findings, conclusions,
or recommendations expressed in this publication do not necessarily reflect those of
the National Endowment for the Humanities and the National Science Foundation.

Figure 1. North-western Siberia, Eastern Khanty language area.


Parenthetical agent-demoting in Eastern Khanty 59

The empirical base for the discussion is a corpus of Eastern Khanty
narratives, collected and transcribed between 2000 and 2003, supplemented by some
Eastern Khanty texts published between 1900 and 1995.
In the discussion of the structural properties and information structuring of
Eastern Khanty clauses, I will differentiate the grammatical relations from
the semantic roles of the arguments of propositions, and from the pragmatic
functions of the referents and the pragmatic operations they are involved in.
Grammatical relations are henceforth indicated following Dixon (1994) as:
S – intransitive subject; A – transitive subject; O – transitive non-subject.
The main semantic roles of the arguments of propositions (relevant for the
purpose of discussion) are generally defined after Van Valin and LaPolla (1997)
as either Agent or Target, each representing a host of semantic features, or
entailments (cf. also Rose, this vol., for a more in-depth discussion).
The notions of discourse-pragmatic status (salience), pragmatic operations and
topicality are central to the discussion of the features of the constructions in the
narrative. A referent is defined as pragmatically central if it shows the
following properties: it belongs to the presuppositional part of the proposition; it is
contextually accessible and active; in dislocation tests (“as for” and “about”)
it produces the target clause; it normally does not carry the clause accent; and
the rest of the proposition appears to carry a relation of “aboutness” towards it
(Strawson, 1964; Kuno, 1972; Gundel, 1976; Lambrecht, 1994). In the
discussion of the pragmatic structure of the data, I will also use the terminology and
premises of the Centering framework as in Grosz et al. (1995); cf. the
Introduction of this volume for a more detailed discussion of Centering. I will also
use the less conventional terms foregrounded center and backgrounded center to
mark special pragmatic states resulting from the voice operations in the Eastern
Khanty parenthetical constructions (cf. Hinterhölzl and Petrova, this vol. for
similar notions). By the backgrounded center in the Eastern Khanty
parenthetical Agent-demoting constructions I will understand a primary topical
referent (Cb) whose pragmatic status is temporarily demoted (backgrounded)
(C[backgrnd] = C[n−1] (= C[n+1]) ≠ C[n+2]), while another referent with a
competing discourse status but with a non-typical semantic role appears temporarily
promoted (foregrounded) as a pragmatic center for an interval of 1–2 clauses, as
the foregrounded center (C[foregrnd] ≠ C[n−1] (= C[n+1]) ≠ C[n+2]).
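The pattern of temporary foregrounding described above can be illustrated with a minimal sketch, which is not part of the original analysis. It assumes that each clause of a narrative has been hand-annotated with a single label for its current pragmatic center; the function name find_parenthetical_spans, the parameter max_len and the labels "A" and "B" are hypothetical. The sketch looks for runs of 1–2 clauses whose center differs from that of the preceding clause and after which the earlier center resumes.

```python
from typing import List, Tuple


def find_parenthetical_spans(centers: List[str], max_len: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) clause indices (end inclusive) of short runs in which
    the center departs from the preceding center, which then resumes."""
    spans = []
    i = 1
    while i < len(centers):
        prev = centers[i - 1]
        if centers[i] != prev:
            # Try the shortest interruption first: 1 clause, then up to max_len.
            for length in range(1, max_len + 1):
                end = i + length - 1
                resume = end + 1
                if resume >= len(centers):
                    break
                interrupted = all(c != prev for c in centers[i:end + 1])
                if interrupted and centers[resume] == prev:
                    spans.append((i, end))
                    i = resume  # continue scanning after the resumption
                    break
        i += 1
    return spans


# Hypothetical annotation: topic "A" is interrupted for one clause by "B".
print(find_parenthetical_spans(["A", "B", "A"]))       # [(1, 1)]
print(find_parenthetical_spans(["A", "B", "B", "A"]))  # [(1, 2)]
```

A sequence in which the earlier center never resumes yields no span, which matches the idea that the demotion is temporary rather than a lasting topic change.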
Section 2 will provide a brief outline of the canonical Eastern Khanty
active-direct clause and consider the main features of the organization of
grammatical relations important for the discussion to follow. Section 3 will
introduce the Eastern Khanty agent demotion constructions, listing first the so-called
“ergative constructions” (3.1), immediately compared to the “agented passive
constructions” (3.2). The description of the key morphosyntactic, semantic and
pragmatic features of these constructions will be supplemented in section 4 with
insight into their cultural contexts in naturally occurring discourse,
offering a possible pragmatic explanation of their usage. In conclusion,
section 5 will provide a multi-factorial comparison of these
constructions and posit their common nature, namely the demotion of the agent
participant of the event (cf. Krasavina, Kelleher, Rose, Chiarcos, this vol. for
related multi-dimensional models).

2. Canonical Active-Direct Constructions

The typical Eastern Khanty simple clause shows a general tendency towards the SOV
pattern. The unmarked neutral simple clause is verb-final, and the subject
generally precedes the object. The position of the O constituent may vary depending
on its pragmatic properties: brand new, inactive, unidentifiable O referents
are rigidly fixed in the SOV order, whereas pragmatically active
and identifiable O referents may also yield the orders OSV and SVO.
The semantic role of Agent is typically mapped to the A grammatical relation,
while the semantic role mapped to the O relation is typically the Target, an entity
saliently affected in the event. The semantic role mapped to the S grammatical
relation is understood as that of the single core NP of an intransitive verb.

(1) a. mä m@n-l-im
1sg walk-prst-1sg
‘I walk’
↓ ↓
b. mä ajr1t-äm t1Gl-a qar1-mta-s-1m
1sg canoe-1sg/sg det-ill pull-intn-pst2-1sg/sg
‘I pulled my canoe here’
The referent with the semantic role of Agent appears clause-initially, expressed
by the argument in the S/A relation, which is marked by Nom. case and controls the
S/A-V agreement inflection on the predicate. S/A-V agreement and O-V agreement refer to
the Khanty single and double conjugation, i.e. co-referential agreement inflection
on the transitive predicate controlled by the S/A relation and the O relation
respectively. Agreement controlled by the S/A grammatical relation is obligatory (1a–b).
Agreement between the grammatical relation of O and the transitive predicate is
contingent upon the pragmatic properties of the O (1b), i.e. pragmatic
identifiability and activation of this referent in the interlocutors’ discourse universe.

The argument in the S/A grammatical relation normally has all the tradi-
tional subjecthood properties, such as control over referential relations clause-
internally and -externally: control over embedded non-finite clauses; control
over zero anaphora across conjoined clauses; control over reflexivization; quan-
tifier movement control.
Eastern Khanty nouns in O relation are zero-marked for case, i.e. morpho-
logically indistinguishable from the Nominative. In the pronouns, the Accusa-
tive case has a marker /-t/. Thus: mä ämp tuG@m ‘I brought a dog’ vs. ämp mänt
por ‘a dog bit me’.
Pragmatically, a new referent is introduced or re-activated by a full NP or
a free pronoun in the S/A grammatical relation. Once the referent is identifi-
able as topic, its discourse salience is then expressed by elision and by verbal
agreement – a preferred topic expression (Filchenko 2006).
There is a strong correlation, Topic = S/A = ClauseInitial = Agent, which is
in line with a general typological information-structural pattern (Lambrecht, 1994).
Topicality associates strongly with minimal morphological complexity and
implicitness, while new information and focus correlate with morphological
explicitness. In Centering framework terms (Grosz et al. 1995), the Eastern Khanty
canonical clause adheres well to the cross-linguistic prototypical correlation of
formal referring expressions to attention state and inference load: in center
continuation, a persisting preferred center Cp(n) is typically realized by a zero
pronoun and is strongly associated with the grammatical relation of S/A and
subjecthood, clause-initiality and pronominalization. In center shifts, on the
other hand, a preferred center Cp(n) that associates with C(n+1) is typically
realized by a morphologically explicit full NP.
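As a rough illustration, which is not part of the original text, the association between Centering transitions and referring form can be sketched as follows. The three transition types follow the general formulation in Grosz et al. (1995), with Cb and Cp used in the standard Centering sense (backward-looking center and preferred center). The function names and the EXPECTED_FORM table are hypothetical, and the table records only the two tendencies stated above; no claim is made for center retaining.

```python
from typing import Optional

# Tendencies stated in the text for the canonical Eastern Khanty clause.
# "retain" is deliberately absent: the text makes no claim about it.
EXPECTED_FORM = {
    "continue": "zero pronoun",
    "shift": "full NP",
}


def classify_transition(cb_prev: str, cb: str, cp: str) -> str:
    """Classify the transition into the current utterance.

    cb_prev: backward-looking center of the previous utterance
    cb:      backward-looking center of the current utterance
    cp:      preferred (highest-ranked) center of the current utterance
    """
    if cb != cb_prev:
        return "shift"
    return "continue" if cb == cp else "retain"


def expected_form(transition: str) -> Optional[str]:
    """Return the typical realization of the preferred center, if stated."""
    return EXPECTED_FORM.get(transition)


print(classify_transition("1sg", "1sg", "1sg"))  # continue
print(expected_form("continue"))                 # zero pronoun
```

The referent labels passed to the function are arbitrary strings; any consistent annotation scheme would do.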

3. Parenthetical Agent-demoting Constructions

3.1. “Ergative” Construction


Khanty ergative constructions display structural similarity to the canonical
active-direct clause type, with an important exception: the S/A argument is always
overt and inflected for Loc case, which typically marks spatial and temporal
relations: kat-n@ ‘in (the) house’, it-n@ ‘in the evening’.
(2) a. tS1laGt-@s-m rut’ saG1: “medved!”
cry-pst2-1sg Russian manner “bear”
‘I cried2 in Russian “bear!”’

2. The small caps are used to denote the clause constituent bearing the clausal accent.
b. moS@t j1G1-n@ qol-waGta-l-1l
“maybe” 3pl-loc hear-atten-prst-3pl/sg
‘Maybe they would hear it’
c. nu jem-aki, j1Gata-l-1m, aGa, wajaG
“nu” good-prd look-prst-1sg/sg OK, animal
‘Ok, I look, there it is, the animal’

In (2a), the canonical active-direct clause with the elided 1sg topic is followed
by the ergative (2b) with a new 3pl referent. Counter to the preferred topic
expression pattern discussed in the section above, the topical Agent referent
‘they’ in (2b) is coded by a free Loc-marked pronoun in the S/A relation and
by S-V agreement inflection. The message “Bear!” now has high discourse
activation status, marked in (2b) by 3sg O-V agreement on the predicate. However,
the narrative resumes canonically in the immediately subsequent active-direct
clause (2c), where the topical status of the 1sg referent has the preferred topic
expression. That is, the demoted 1sg topic referent of (2a) reappears in (2c),
expressed by elision and co-referential agreement inflection on the predicate, thus
retaining its overall discourse salience.
Superficially, nothing in the structure/grammar of these clauses, and their
immediate discourse environment, precludes the use of the canonical active-
direct construction type to express the same semantic content. These features
correlate to general pragmatics of Topic-comment (2a) vs. marked predicate-
focus (2b) with the Loc-marked S/A agent pronoun.
With respect to discourse information structuring, the ‘ergative’ events appear
parenthetical and consequential (reactive) in nature, representing a cause-effect
or action-consequence dependence upon the event in the preceding
active-direct clause: (2a) implies the projected effect of (2b).
Although the effect is often implicit, the affectedness of the Target
referent (expected in the typical transitive event) is never specified.

(3) män-n@ tSimläli tSi-näm joGo-s-im, tSut-na-pa


1sg-loc a.little det-lat shoot-pst2-1sg det-com-top
@nt-im-äki
neg-pp-prd
‘I shot there a little, nothing happens’

Thus, the Loc-marked ergative S/A referents, although mainly inherently
agentive (definite human/animate), are deprived, at least in part, of some of the
subjecthood properties, namely control and volition, which correlates with their
increased morphological complexity: oblique case marking of the Agent. This is
yet another departure from the canonical Topic-Agent-S/A arrangement.
Pragmatically, the Eastern Khanty ergative clause’s preferred center Cp(n) appears
realized implicitly, whereas the current foregrounded center Cf(n) does not normally
correspond to either C(n−1) or C(n+1). The fact that for a typical ergative
clause U(n) the subsequently salient entity C(n+1) is typically the same as the
previously salient entity C(n−1) shows that “ergative” clauses are largely a
center-retaining relation, which complies with the cross-linguistic preference (ergative
clauses do not sequence), that is, Center[foregrnd] = Prn(typical) = Cf(n) ≠ C(n+1) =
Subjecthood(−/+).
A more complete list of properties of this construction is as follows:
– Agent is typically clause-initial, marked for Loc case and mapped to the A
grammatical relation controlling co-referential verbal agreement;
– the matrix predicate is a transitive verb in active morphological form, normally
expressing a perfective action; however, the affectedness of the Target is
uncertain and the underlying resultant transitivity of the event is low;
– the argument with the semantic role of Target is expressed by a full
Ø-marked NP or the Acc-marked pronoun;
– prosodically, the Loc-marked ergative Agent, particularly pronominal, does
not carry the sentence stress, whereas the active-direct A arguments have
sentence accent of some kind;
– ergative constructions mark temporary alteration of the discourse center,
foregrounding a Loc-marked Agent other than current discourse topic;
– the primary discourse topic preceding the ergative clause, reappears in pre-
ferred topic expression (elision and verbal inflection), thus maintaining its
overall discourse topicality (salience);
– ergatives code events in reactive semantic relation to preceding clause;
– overall type frequency: average 12%.

3.2. Agented Passive Clauses


One of the most typical Eastern Khanty passive constructions has an overt Agent
referent marked with Loc case and mapped onto a non-S/A grammatical
relation. The referent mapped to the S grammatical relation is instead the one in the
semantic role of Target.
(4) min-n@ tü taG1 jöG-ä ert@l-s-i
1du-loc det place 3sg-ill tell-pst2-ps.3sg
‘We (two) told him all about this’
In passive (4), the ‘message’ is essentially equated to Target and promoted to
the S relation, whereas the Agent-speaker is oblique case marked.
The analysis of the sequence (5a–b) reveals some of the structural features
and discourse functional patterns of the Eastern Khanty passive constructions.
In the active-direct (5a), the topical agentive referent 1sg S/A, is expressed pre-
dictably by elision and the predicate agreement inflection.

In the adjoined passive (5b), the Target referent ‘small dog’ is expressed by the
full NP in the S relation controlling 3sg verbal agreement, whereas the Agent
referent is expressed by the free Loc-marked pronoun in the non-S relation. The
topic in (5b) is the S argument ‘small dog’, whereas the 1sg agent is apparently
temporarily demoted to an oblique-like role. The relevance of the agent in the
proposition (5b) is still manifested through its overt presence, to minimize po-
tential ambiguity.
However, in the active-direct (6), the demoted 1sg agent reappears in the S/A
relation, remaining topical, coded appropriately by elision and predicate agree-
ment inflection, and controlling the S/A relation over the non-finite modifiers
(inflected for 1sg), thus unaffected discoursively by the passive demotion.

With regard to information structure, the correlation of [pragmatic function –
to semantic role – to grammatical relation] (TOPIC = Agent = S/A) translates
in passive into (TOPIC = Target = S). However, the referent with the role of
Agent, demoted in passive from A to O grammatical relation, maintains some
pragmatic properties testifying to its retained discourse centrality that allows it
to emerge as topic in the immediately subsequent discourse without any special
topic promotion means, i.e. expressed by elision and co-referential agreement
on the predicate.
Pragmatically, Eastern Khanty agented passives typically have referents
with competing discourse centrality. One, a primary topic, a backgrounded
center Cb (often corresponding to C(n−1)), has the semantic role of Agent realized
by the Loc-marked grammatical relation O, typically a free pronoun, which appears
to have some of the subjecthood properties and which typically corresponds
to the subsequently salient entity C[n+1]. That is, O = Prn(typical) = Agent =
Subjecthood(+/−) = Cb = C[n+1] = Cp(n). Another referent with competing
discourse centrality, a secondary topic, a foregrounded center Cf with the role of
Target, is realized by the Nom-marked grammatical relation S, typically a full NP
or a free pronoun, clause-initial, also having some subjecthood properties, such
as predicate agreement control, and is highly improbable as C(n+1), i.e.
S = NPfull = Target = Cf ≠ C[n+1] = Subjecthood(−/+). This manifests the
center-retaining relation of agented passive clauses, adhering to the cross-linguistic
center-continuation sequencing preference constraint (agented passives do not
normally sequence over 2 clauses in Eastern Khanty).
A summary of the key structural and discourse-pragmatic features of Eastern
Khanty passive constructions is as follows (Filchenko 2006):
– Target is typically a full NP unmarked for case, mapped onto the S gram-
matical relation, controlling the S-V agreement on the predicate;
– Agent is typically a free pronoun or full NP, mapped onto a non-S grammat-
ical relation, marked by oblique Loc case;
– passive marks a change in the pragmatic status of the referents, temporar-
ily foregrounding the non-Agent, promoting it to the S relation; and back-
grounding the Agent, demoting it to the non-S relation (oblique);
– however, while at the clausal level the pragmatic status of the referents is al-
tered in course of passive clauses, at the level of overall discourse the agent
referent maintains high pragmatic status – discourse salience, which follows
from its canonical preferred topic expression by elision and agreement in-
flection in the subsequent active-direct clauses;
– passive predicates are prototypically transitive verbs, implying two core
arguments, one of which, the demoted O, is high in agentivity status, while
the promoted S is reduced in agentivity, being affected in the event;
– passive-active sequences testify that passive is a marked construction type,
requiring a special arrangement of the referents, which is outside the
canonical pattern of mapping pragmatic functions to semantic roles to
grammatical relations;
– passive manifests the Eastern Khanty tendency for Topic initiality, i.e. the
strongest alignment appears to be <pragmatic function = grammatical relation>,
overriding that of <pragmatic function = semantic role> or <semantic role =
grammatical relation>;
– type frequency in the narratives: ∼13%.
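The contrast between the three clause types described in sections 2 and 3 can be condensed into a small sketch. It is an illustration only, not the author's procedure: it assumes hand-annotated clauses and uses just three of the diagnostic features listed above (case of the Agent, voice morphology of the verb, and which argument controls agreement). The class ClauseAnnotation, its field names and the function classify_construction are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClauseAnnotation:
    agent_case: Optional[str]   # "nom", "loc", or None if the Agent is elided
    verb_voice: str             # "active" or "passive"
    agreement_controller: str   # "agent" or "target"


def classify_construction(c: ClauseAnnotation) -> str:
    """Assign a clause to one of the three construction types, if possible."""
    if c.agent_case == "loc":
        # Loc-marked Agent with active verb and Agent-controlled agreement.
        if c.verb_voice == "active" and c.agreement_controller == "agent":
            return "ergative"
        # Loc-marked Agent with passive verb and Target-controlled agreement.
        if c.verb_voice == "passive" and c.agreement_controller == "target":
            return "agented passive"
        return "unclassified"
    # Nom-marked or elided Agent controlling agreement on an active verb.
    if c.verb_voice == "active" and c.agreement_controller == "agent":
        return "active-direct"
    return "unclassified"


# Annotations modelled on the descriptions of (1a), (2b) and (4).
print(classify_construction(ClauseAnnotation("nom", "active", "agent")))   # active-direct
print(classify_construction(ClauseAnnotation("loc", "active", "agent")))   # ergative
print(classify_construction(ClauseAnnotation("loc", "passive", "target"))) # agented passive
```

Prosodic, semantic and discourse-level criteria from the two property lists are left out here, so the sketch separates the constructions by form only.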
The Eastern Khanty passives with overt Agent appear to resonate with the
general fundamental function of passives “having to do with defocusing of agents”
(Shibatani, 1985).
The structural properties of the Eastern Khanty voice constructions listed
above are not typologically unique, nor are they new to the description of the
language. However, the exact motivation for their usage, i.e.
the functional explanation of these grammaticalized form-function
correspondences, has not been accounted for. The explanation of the speakers’ choice
of strategies for mapping propositional-semantic content to
structural features and discourse functions in the passive constructions may be
aided by insights into the specifics of the cultural context that typically
correlates with frequent use of these constructions: passive voice talk.

4. Cultural Context and Pragmatic Explanation

4.1. Passive voice talk


It can be posited that the use of the passive voice talk in Eastern Khanty cor-
relates most typically to the cultural context of marriage and appears illus-
trative of the conventionalized cultural frame. The choice of grammatical re-
sources appears in correlation with the conventionalized cultural practice (pa-
triarchal/patrilocal residence and strict exogamous marriage).
Thus, within the cultural context of marriage/family, the woman is “given” by
her family to go and live with the husband, and acts as a Target of ‘giving’,
‘taking’ and ‘keeping’. The man, on the other hand, is the one who is ‘taking’ and
‘keeping’ the woman, and acts as a volitional agent.

In the Eastern Khanty conventionalized cultural frame of marriage, the social
role of wife towards the man typically correlates in linguistic discourse to the
semantic role of the Target, while the social role of husband typically correlates
to the semantic role of the Agent.
The only consistent discrepancy from the above correlation is ‘wife =
agent’ in motion constructions, where ‘bride’3 is the Agent of motion events.
This motion implies both the literal sense (physical relocation) and a more
abstract sense of transferring oneself from the general domain of the Father to the
general domain of the Husband: ikija m@nta ‘to marry (Lit. towards husband
go)’, i.e. becoming associated with the general domain of Husband, as implied
by the frame:4

Figure 2. Marriage - ‘social role’ transition = ‘space’ transition.

3. The Eastern Khanty cultural frame implies that the status of ‘wife’ is complete only
upon arrival and observing certain rituals at the new family location (new house), the
‘husband’s’ clan residence.
4. The Eastern Khanty have a patrilineal, patrilocal setting, where the oldest son normally
resides with his Father and inherits from him.
This spatial/abstract transition (motion) is not genuinely agentive, however,
in the sense of volition and control, as the cultural frame implies that the agent
(woman) is not the one who is truly controlling and volitional in the motion
event, but complies with an external will. The Agent (woman) never acts alone,
of her own accord, being rather taken/accompanied to a new location by the
man.
Thus, it can be posited that the key linguistic features (passive voice) of the
Eastern Khanty marriage ‘talk’ are informed and conditioned by the dominant
cultural patterns and norms of social behavior. This perhaps somewhat anecdotal
exemplification of the cultural groundedness of the Eastern Khanty agented
passive constructions seeks not to identify the functional range or in any way
to predict a possible grammaticization route, but rather to illustrate possible cultural
mechanisms underlying the usage of these constructions, namely what appears to be
a tendency to demote the agent referent. The demotion of the agent here
appears to be an operation that de-emphasizes the core participant in
the event, moving it towards the periphery of the semantic frame of the event,
and is manifested by an increase in the morphological complexity of the argument,
that is, Loc case-marking of the passive Agent, a fairly consistent typological
pattern (Shibatani 1985).

4.2. “Ergative” voice talk


Eastern Khanty “ergative” constructions show similar general features of agent
demotion. That is, agent arguments here are more like adjunct clause constitu-
ents marked with oblique case, generally rendering agent and the whole event
as less volitional, controlled.
Similar “ergative” constructions in remotely related Finnic languages were
referred to in early descriptions as logically impersonal sentences, where events
were conceptualized by speakers as caused by other unapparent forces. A hu-
man here is not granted adequate agentive status, merely marking a locus of an
event, and appears in essence a semi-responsible performer of an act (Bubrix
1946; Balandin 1946).
The most frequent cultural context for the “ergative” voice talk in Eastern
Khanty appears to be that describing interactions with bears.

The bear is an extremely significant cultural agent for many of the Siberian native
cultures, and for the Eastern Khanty in particular. From the available ethnographic
accounts (Tschernetsov 1974, Kulemzin 1984, Lukina 1990) and surviving oral
folk tradition (Filchenko 2009) it is known that behaviour towards the bear is
highly ritualized, with omnipresent taboos and restrictions. Having basically
equal status with a human, the bear is the biggest and most dangerous local
predator, referred to as wont-iki ‘forest master’, or qaq1 wajaG ‘brother animal’, or
just wajaG ‘animal’, and almost never by the proper, tabooed term jiG ‘bear’. The bear
is the only animal that has a complete set of taboo terms for body parts
unrelated to the proper somatic nomenclature of humans and other animals (cf. kil
‘bear’s stomach’ vs. qon ‘human/animal stomach’; laGl’ip ‘bear’s tooth’ vs.
pöNk ‘human/animal tooth’, etc.). The bear’s bones are never broken and are
specially disposed of out of the reach of dogs and other animals, while the skulls
are kept by the hunters on house roofs (cf. photo below). The events of hunting
the bear and feasting over it are of extreme significance and ritual value,
characterized importantly by concealment of the identities of the hunters (masks (cf.
photo below) and nicknames), interpreted as a need to avoid possible retaliation
from the bear’s spirit (Kulemzin 1984, Filchenko 2007). On the other hand, the
ability to hunt bear successfully and thus provide for the community translates
into a prominent social status for the hunter (cf. photo, Figure 3).
Thus, within the Eastern Khanty cultural convention, on the one hand, there
is an apparent tendency towards considerable caution and avoidance of personal
association with the bear, and particularly with affecting the bear. On the other hand,
there is an apparent social significance of the status of a successful bear hunter.
This special cultural status and the behaviour conventions towards the bear
correlate with its special linguistic treatment, in that special pragmatically
motivated morphosyntactic operations are implemented, aimed at demoting the
status of the agent in the otherwise typical transitive event. In other words, the
culturally conditioned tendency towards avoidance of association with
affecting the bear is manifested in the linguistic avoidance of discourse salience of
the agent in the propositions describing this type of events (cf. also Claus, this
vol., outlining experiential effects on language processing).

Figure 3. Cultural significance of the bear for Eastern Khanty.
The above, however, should not be seen as a statement that this construction
is restricted exclusively to the contexts involving the bear. In (15), the context
involves hunting for the biggest local fish, pike.
(15) a. mä sart wel-s-@m, ¨@ll¨@
1sg pike kill-pst2-1sg big
‘I caught a pike-fish, a big one’
b. Öll¨@ sart män-n@ löGöli-s-im
big pike 1sg-loc cut-pst2-1sg/sg
‘I prepared the big pike’
c. terkä-s-im iwes-n@
fry-pst2-1sg/sg stick-loc
‘I fried it on sticks’
In (15a), the human Agent appears clause-initial, controlling predicate
agreement. The established pattern would predict further maintenance of this referent
as topical by elision and agreement inflection. However, counter to this
expectation, in (15b) this referent is expressed by the free 1sg Loc-marked
pronoun and 1sg predicate inflection. The referent “pike” has become identifiable
and textually accessible in (15b), which is also evident from the marked O-V
agreement. After the temporary alteration by the “ergative” (15b), the narrative
discourse resumes in the expected canonical way in the immediately following
active-direct (15c), where the topicality of the 1sg Agent referent is canonically
expressed by elision and 1sg agreement on the predicate. Thus, the
pragmatic, semantic and structural features of (15) comply with the established pattern
for the “ergative” agent-demoting construction, which is used contextually at
exactly the point in the event structure where an apparent agentive affecting of the
Target by the Agent is coded at the formal level in an agentivity- and
salience-avoiding manner.
Reflexive or middle events may also be coded in Eastern Khanty by the
“ergative” constructions consistently with the lower control/volition context,
where in (16), adverbials specifying the degree of intentionality appear optional
if not redundant.

(16) a. män-n@ köt-äm (mil-näm / %toGoj) öGö-käs-@m


1sg-loc hand-1sg (touch-rfl / away) cut-pst3-1sg
kötSäG-nä
knife-com
‘I cut my hand with a knife (incidentally / %on purpose)’
b. mä köt-äm kötSäG-nä (mil-näm / toGoj) öGö-käs-@m
1sg hand-1sg knife-com (touch-rfl/away) cut-pst3-1sg
‘I cut my hand with a knife (incidentally/on purpose)’

Notice that in the Agent-demoted (16a), the ‘incidental’ interpretation and the
respective adverbial use is preferred to the ‘intentional’. However, in the reflexive
event with the active-direct transitive clause and canonical Agent coding (16b),
both the ‘incidental’ and the ‘intentional’ interpretations and respective adverbial
uses are acceptable. Unlike (16b), which allows for a volitional, purposeful event
of acting on oneself, the “ergative” voice in (16a) codes a less intentional,
defocused Agent and typically has a ‘reading’ of a less volitional, unintentional
Event/Action.
The “ergative” agent-demotion construction is also attested with modal ver-
bal predicates, particularly of cognition (17), which are in many features similar
to the already exemplified perception predicates (‘look / aim at’).

(17) män-n@ onql-l-@m tom qu ju-w@l


1sg-loc know-prst-1sg det man walk-prst.3sg
‘I know the man, who is walking there’

In their features, these examples are consistent with the established pattern of
low Target affectedness and reduced/unapparent agentivity and control.
In referential terms, the preference for the contextual use of the construction
holds for non-SAP referents.

(18) Igorenka SaSka-n@ sam-a tSi-näm joGo-w@l


Igorenko Sashka-loc mug-ill det-lat shoot-prst.3sg
‘Sashka Igorenko shot at the (bear’s) mug’

In (18), the 3sg non-SAP agentive referent is demoted in the context of bear
hunting, implying agent’s affecting the bear, but the agent’s control and volition
in the proposition is backgrounded by Loc case-marking.
In less culturally significant contexts, the “ergative” agent-demoting con-
structions are also attested, however, the general pragmatic tendency of agen-
tivity and salience avoidance holds consistently (19):

(19) a. Matrena Jakowlewna, temi nuN rabota-n? muGuli t@m


Matrena Jakovlevna det 2sg work-2sg what here
w@r-s-@n?
do-pst2-2sg
‘Matrena Jakovlevna, is this your job? What did you do here?’
b. (temi) @nt@ män-n@, metali-p @ntu-s-@m
(det) neg 1sg-loc some-top neg-pst2-1SG
‘(That’s) not me, I did not do anything (nothing happened)’

Notice that in the reply utterance (19b) the Agent is coded as the clause-initial
argument marked by the Loc case, with the main pragmatic function being agent
demotion, making it less volitional, controlling, affecting and, importantly, less
topical for the given stretch of the discourse. The non-topicality, and rather
the focus pragmatic relation of the Agent argument towards the whole of the
proposition (19b), is an important feature, the one by which the presupposition
of (19a) is falsified, or made absent. Thus, it is not incidental or random that
the morphosyntactic properties of the Agent argument in (19b) are consistent
with those of the demoted Agent voice construction. On the other hand, the
non-topicality of the role of Agent here and its coding by the Loc-marked NP align it
with another marked Agent-demotion construction, the agented passive.
In some of the examples (20), the Target referent may be elided from
explicit coding, being expressed only by predicate agreement inflection, whereas
the Agent is overt and marked formally by the Loc case. These examples testify
additionally to the Agent’s demotion from the core of the proposition towards
the periphery, while the Target, being elided, appears more topical,
pragmatically foregrounded.

(20) män-n@ tSäs qötS@G-näti tuG1 tSoG-l-uj-@n


1sg-loc now knife-instr away cut-prst-ps-2sg
‘I’ll cut you up with a knife now’

This could be seen as a further pragmatic and overall semantic convergence of
the two reviewed Agent-demoting voice constructions, the “ergative” and the
agented passive.

5. General Characteristics of the Agent-demoting Constructions

The Eastern Khanty Agent-demoting constructions (Loc-marked agents) demonstrate
a mixture of features typical of both subject and non-subject arguments.
Though these Agents show regular S/A-V agreement control and agency-subjecthood
features, their Locative case marking aligns them with non-agentive
locative arguments of motion/posture/state propositions, which are essentially
intransitive in their nature.
These features correlate consistently with the cross-linguistic observations
on constructions with non-canonically marked Agent arguments, particularly
with the fact that among the predicates requiring the non-canonical marking of
the Agents, those expressing uncontrollable activities are typical to the extent
that control vs. non-control may be “a generally applicable semantic feature”
(Onishi 2001). It is also observed cross-linguistically that in general, oblique
case marking of the core arguments “reflects decreased transitivity status of the
whole clause” (Onishi 2001).
This tendency could be represented as in Figure 4 below in an adaptation of
the Onishi continuum (Aikhenvald et al. 2001).
Typologically, the well-documented variety of manifestations of ergativity
makes the category less discrete, allowing for “ergative-like” behaviour in
otherwise prototypically nominative languages. Ergativity thus appears less a
category than a scalar pattern of organisation of grammatical relations, which
can be present to a varying extent at different levels of a language system
(Comrie 1978; Dixon 1994).

(+) agent’s subjecthood            (-) agent’s subjecthood
(+) control/volition               (-) control/volition
(+) clause/event transitivity      (-) clause/event transitivity
(+) agent’s pragmatic salience     (-) agent’s pragmatic salience
Nominative case                    Locative case
canonical                          agent-demotion

Figure 4. Continuum of pragmatic, semantic and morphosyntactic features vs.
canonical/noncanonical Agent coding.

Demotion of the Agent referent in both of the reviewed Eastern Khanty
constructions appears to be an operation that defocuses the core participant in
the event, moving it towards the periphery of the semantic frame of the event.
It is typically manifested by an increase in the morphological complexity of
the Agent referent; cf. cross-linguistic examples of passive constructions
(Shibatani 1985).
Among the essential factors that underlie attested ergative, nominative or
split systems, it is often suspected that larger discourse pragmatic and/or se-
mantic considerations might be the key conditioning factors. Particularly the
degree of topicality/referentiality, as well as volition/control of referents may
affect the inclination in a language’s grammatical relations either towards erga-
tivity or nominativity. Morphological complexity of the Agent argument of the
Eastern Khanty “ergative” construction quite consistently correlates with pro-
totypical ergative continuity between S and O at the deeper level, i.e. A of the
“ergative” approximates the O of the active in its pragmatic and semantic fea-
tures: decreased topicality, low agentivity/control, approaching the semantics
of experiencer/undergoer.
In “ergative” clauses, the overt Loc-marked human/animate Agent, the low
transitivity of the morphologically active verbal predicate, predicate agreement
control, and its parenthetical character (one clause length followed by canonical
active-direct clauses with the continuing topic expressed by elision) signal tem-
porary pragmatic demotion of the low control/volition Agent in the consequen-
tial event, where the agentive nature of the Agent is de-emphasised, consistent
with specific cultural conventions and practices.
In passive clauses, such features as the demoted overt Loc-marked Agent, the
high semantic transitivity of the morphologically passive verb, the promoted
case-unmarked Target controlling predicate agreement, and the parenthetical
character (1–2 clause length followed by a canonical active-direct clause with
the continuing topic expressed by elision) communicate temporary pragmatic
prominence of the Target in the spontaneous/consequential event, where the
pragmatic topicality (salience) and causer nature of the Agent are de-emphasized.
The Eastern Khanty Agent-demoting constructions, the passive and “erga-
tive” are similar as they manifest parenthetical establishment of an alterna-
tive, secondary topical discourse referent, whose pragmatic status (topicality)
is briefly competing with that of the primary topical agentive referent, which is
expressed by the temporary demotion (backgrounding) of the Agent referent.
The secondary discourse topic is typically coded by a full NP or a free pro-
noun, which, as shown at the onset, is not a preferred primary topic expression
in Eastern Khanty.
What appears to differ between these Agent-demoting constructions, motivating
their co-existence in Eastern Khanty, is the pragmatic status of the two core
roles in the proposition within the discourse context. That is, the
Agent-demoting constructions temporarily demote (background) the current topical
Agent, rendering it less controlling/volitional, and possibly parenthetically
promote (foreground) another referent for the length of the utterance. The
passive construction, while also demoting (backgrounding) the Agent, is not
primarily concerned with its agentivity features (as in the case of Loc-Agent
construction), but rather aims to promote (foreground) the non-Agent, Target
role to the discourse fore.
More broadly, within the Eastern Khanty system, the contrast between
Agent-demoting constructions and the canonical active-direct indicates a general
consistency of the Loc marking of the Agent with particular pragmatic and
semantic environments. The identification of the two Agent-demoting
constructions based on discourse-pragmatic parameters is also supported by what
appears to be their complementary distribution in narrative discourse. That is,
these constructions have comparable type frequencies in the narratives
(12%–13%); however, they appear to show counter-proportional or mutually
exclusive frequencies within the same narratives, as is evident in the original
corpus and from prior studies (Kulonen 1989).

Conclusion

This case study of two agent-demoting constructions in an indigenous language
of Siberia offers an empirical contribution to the study of the wider issue of
discourse salience. Contrasting the structural features, discourse-pragmatic
functions and propositional-semantic content of two types of referring expressions
in their narrative environment, I attend to the issues of information structure and
the dominant cultural contexts of the expressions. Based on the analysis, I posit

that a wide cognitive faculty, facilitating discourse coherence in the given
cultural context, is at play in structuring the information and ultimately
affecting the linguistic form, governing the choice of referring expressions for
the arguments of propositions. The system’s specific grammatical resources
consistently correlate with identifiable and predictable pragmatic and semantic
properties, manifesting the speaker’s choices in representing the salient
entities across a stretch of discourse.
The Eastern Khanty voice constructions typically code ultimately de-transi-
tive events with multiple referents potentially competing for discourse salience.
The “agented passive” and “ergative” constructions manifest parenthetical
shifts in salience of the agentive discourse referents, allowing for gradience in
discourse prominence, primary and secondary topicality. This is expressed by
promotion/demotion voice operations, oblique case marking and grammatical
relation shifts of agents.
The implications of the above for the issue of discourse salience are vari-
ous. What is salient in the discourse appears to be a matter of gradience and
underlyingly culturally motivated. The salient entity in a stretch of discourse is
a persistent center in a sequence of utterances. Centering is a multifactorial
discourse phenomenon controlled by an interaction of cultural convention and
pragmatic, lexical-semantic and syntactic features. “Ergative” and passive
constructions show that: i) center continuation sequencing is indeed preferred
over center retention sequencing; ii) Cp(n) does strongly associate with
clause-initiality and pronominalization, which confirms the correlation of low
morphological complexity with high pragmatic salience; iii) the association of
Cp(n) with subjecthood or the grammatical relation of S, though strong, is not
impenetrable to such factors as cultural preferences, speaker intentions and
pragmatic pressures; iv) in the non-canonical constructions, the S relation is
indeed firmly tied to a relatively high degree of discourse salience, albeit
occasionally temporarily foregrounded compared to the primary persisting
discourse center; v) in a stretch of utterances, more than one simultaneous
discourse center is possible, in which case vi) the choice of referring
expression is determined by the speaker’s intentions interacting with prevailing
patterns in the linguistic system, such as word order, clause-initiality of the
center, and the association of semantic role (Agent), grammatical relation (S/A)
and grammatical role/function (Subject).

List of abbreviations (glosses)

acc – Accusative case
Ag – Agent
atten – Attenuative affix
com – Comitative case
cond – Conditional affix
det – Determiner
dim – Diminutive affix
du – Dual number
el – Elative case
Ep – Epenthetic vowel/consonant
ill – Illative case
impp – Imperfective participle
impr – Imperative affix
inf – Infinitive affix
instr – Instrumental case
intn – Intensive
lat – Lative case
loc – Locative case
neg – Negative particle
np – Noun phrase
mmnt – Momentative affix
pl – Plural number
1sg – 1 person singular
1sg/sg – verbal agreement or possessive affix (1sg – Agent/Possessor, sg – Object/Possessed)
3pl – 3 person plural
3pl/sg – verbal agreement or possessive affix (3pl – Agent/Possessor, sg – Object/Possessed)
3sg/pl – verbal agreement or possessive affix (3sg – Agent/Possessor, pl – Object/Possessed)
pp – Perfective participle
prd – Predicator affix
prst – Present-Future tense
ps – Passive voice affix
pst1 – Past tense affix #1
pst2 – Past tense affix #2
pst3 – Past tense affix #3
pst0 – Suffixless past tense affix
rfl – Reflexive particle/affix
sap – Speech act participant
sg – Singular number
top – Topicality marker
tr – Trajector
tr – Transitivizer affix

References

Aikhenvald, A.Y., Dixon, R.M.W., Onishi, M.
2001 Non-canonical marking of subjects and objects. Amsterdam/Philadelphia:
Benjamins.
Comrie, B.
1978 Ergativity. In “Syntactic Typology” ed. W.P.Lehmann. UT Austin
Press.
Dixon, R.M.W.
1994 Ergativity. CUP.
Du Bois, J.
1987 Discourse basis of ergativity. Language 63.
Filchenko, A.
2006 The Eastern Khanty Loc-Agent Constructions. Functional Discourse-
Pragmatic Perspective. In: Demoting the Agent, Ed. Torgrim Solstad
and Benjamin Lyngfelt. John Benjamins. Amsterdam-New York.

Filchenko, A.Y.
2009 Landscape Perception and Sacred Places amongst the Vasyugan
Khanty. In: ed. P.Jordan: Landscape and Culture in the Siberian
North. UCL Press.
Filtchenko, A.Y.
Field notes from ethno-linguistic research of Eastern Khanty. The
Field Archive of the Laboratory of Siberian Indigenous Languages
at TSPU. Tomsk.
Givon, T.
2001 Syntax. An Introduction. Amsterdam/Philadelphia: John Benjamins.
Grosz, B., A. Joshi and Sc. Weinstein
1995 Centering: A framework for modeling the local coherence of dis-
course. Computational Linguistics 21. pp. 203–225.
Gundel, J.K.
1976 Topic-comment structure and the use of tože and takže. Slavic and
East European Journal 19. pp. 174–176.
Jordan, P. and Filtchenko A.
2005 Continuity and Change in Eastern Khanty Language and World-
view. In: “Rebuilding Identities: Pathways to Reform in Post-Soviet
Siberia” edit. Erich Kasten. Dietrich Reimer Verlag.
Karjalainen, K.F.
1927 Die Religion der Jugra-Völker. Porvoo.
Klimov, G.G.
1984 Nominativnoe i ergativnoe predlozhenia. Moscow.
Kulemzin, V.M.
1984 Celovek i priroda v verovanijakh Khantov. Tomsk.
Kulemzin, V.M.
1995 Mirovozzrencheskie aspekty ohoty i rybolovstva. In. V.I. Molodin,
N.V. Lukina, V.M. Kulemzin, E.P. Martinova, E. Schmidt, N.N. Fe-
dorova (eds.) Istoria i kultura Khantov. Tomsk. pp. 45–64.
Kulonen, U.-M.
1989 The Passive in Ob-Ugrian. Helsinki.
Lambrecht, K.
1994 Information Structure and Sentence Form. Cambridge: CUP.
Li, C. (Ed.)
1977 Subject and Topic. London: Academic Press.
Lukina, N.V.
1990 Obshee i osobennoe v kulte medvedja u obskix ugrov. Obrjady naro-
dov severo-zapadnoj Sibiri. Tomsk.

Sarkany M.
1989 Female and Male in Myth and Reality. Uralic Mythology and Folk-
lore, Bp., Helsinki.
Shibatani, M.
1985 Passives and Related Constructions. Language 61(4). pp. 821–848.
Trask, R.L.
1979 On the origins of ergativity. In: Plank F. (ed), Ergativity. Towards a
theory of grammatical relations. London – New-York.
Tschernetsov, W.N.
1974 Bärenfest bei den Ob-Ugriern. Acta Ethnographica Academiae Scientiarum
Hungaricae, t.23 (3–4). Budapest.
Van Valin, R. and Lapolla, R.J.
1997 Syntax. Structure, meaning, and function. CUP.
Joint information value of syntactic and semantic
prominence for subsequent pronominal reference

Ralph L. Rose

1. Introduction

Many studies of discourse production and perception have observed that entities
evoked in subject position are treated somewhat differently than those evoked
in other positions when those entities are referred to subsequently. For instance,
consider the short discourse in (1).
(1) a. Luke_i hit Max_j.
b. Then he_i/#j ran home.
b’. Then #Luke/Max ran home.
While the pronoun in (1b) is ambiguous and could be interpreted as referring
to either LUKE or MAX, the preferred interpretation is LUKE, the subject of the
preceding sentence (cf. Hudson-D’Zmura and Tanenhaus 1997; Mathews and
Chodorow 1988). Similarly, repeated reference to LUKE by name as in (1b’)
is more marked than repeated reference to MAX by name (Gordon, Grosz, and
Gilliom 1993; Almor 1999; Almor and Eimas 2008). These observations are
from the hearer’s perspective, but even from the speaker’s perspective, sim-
ilar preferences have been observed. Brown (1983) observed that entities in-
troduced as subjects persisted longer than those introduced in other syntactic
positions: That is, there were more contiguous utterances in which the entity
was referred to again.
Many models of discourse production and processing capture these observations
through two assumptions. First, the salience of entities evoked in a discourse
determines how subsequent reference to those entities should be performed or
interpreted (Ariel 1988; Gundel, Hedberg, and Zacharski 1993; Gordon and
Hendrick 1997, 1998). Second, syntactic information1 is a primary or even the
sole factor which determines salience (Grosz, Joshi, and Weinstein 1995; Lappin
and Leass 1994). Thus, according to this kind of model, the first sentence in
(1) introduces two entities into the discourse representation, LUKE and MAX.
With respect to the syntactic prominence hierarchy shown in (2), in this
representation, LUKE is more salient because it was realized in subject position
while MAX is less salient, having been realized in object position.
(2) subject > object > oblique > none
One problem with this account is that in such languages as English, syntactic
information is often conflated with semantic information. That is, syntactic
subjects are often semantic agents and carry more Proto-Agent entailments (e.g.,
sentience, volition; Dowty 1991), while syntactic objects are often semantic
patients and carry more Proto-Patient entailments (e.g., undergo change-of-state,
causally affected). Thus, assuming a semantic prominence hierarchy as in (3)
(cf. thematic hierarchies in Dorr, Habash, and Traum 1998; Jackendoff 1972,
1990; Speas 1990), it could be the case that LUKE is more salient than MAX in
(1a) not because it is realized in subject position, but rather because it is
realized as an agent (of the hitting event).
(3) agent > patient > others
Along these lines, Stevenson et al. (2000) argue that semantic focusing is a
crucial factor helping to explain such inter-utterance coherence effects.
However, Miltsakaki (2007) shows that semantic focusing alone is insufficient to
explain related phenomena in Greek and suggests that things may be more
complicated.
The purpose of this paper is to investigate syntactic and semantic prominence.
Specifically, I am seeking to answer the question, “What is the relative
contribution of syntactic prominence and semantic prominence to the salience
of entities evoked in a discourse?” I investigate this question with a corpus
study which examines coreference across adjacent utterances and the form
of referring expression (pronoun or description) used in subsequent reference.
The results are presented in terms of Information Theory (Shannon 1948) and
suggest that while syntactic and semantic prominence are comparably
informative about the form of subsequent reference, taken together, syntactic and
semantic prominence are more informative than either is alone.

1. An alternative to syntactic prominence is word order, which for simplex
clauses at least, often results in the same ordering. This is the approach taken
in Gernsbacher and Hargreaves (1988). Further, Hinterhölzl and Petrova (this
volume) show that in Old High German, word order involving verb placement
indicates the discourse status of referents.

In the next section, I describe the basic discourse model I assume in this
paper and then in Section 3, I describe the corpus used in this study. Section 4
contains an overview of Information Theory and particularly the concept of the
value of information. I report the results of the study in Section 5 along with
interleaved discussion.

2. Discourse model

In this paper, I assume a model of discourse processing in which the current ut-
terance is processed with respect to the context; that is, the representation of the
discourse so far (Kamp and Reyle 1993; Kehler 2002). I assume that the con-
text contains representations of the entities evoked in the discourse. Following
Karttunen (1976) and Heim (1982, 1983), I call them discourse referents (or
just referents for short). The set of referents is a partially-ordered list, the or-
der of which is determined by salience – “the degree of relative prominence of
a unit of information, at a specific point in time, compared to the other units
of information” (cf. the Introduction, this volume). A number of factors con-
tribute to salience including syntactic role and recency of linguistic expressions
in a discourse (see Hirst 1981 and Mitkov 2002 for an overview of these and
many other factors). Non-linguistic factors may also contribute to the salience
of referents including visual salience (e.g., Kelleher, this volume) and possibly
prominence within a mental simulation (e.g., Claus, this volume). However, in
this paper I will focus only on linguistic factors.
I take the highest ranking referent to be the most salient referent in the
current context. As such, if this referent is evoked in the current utterance,
it should be realized pronominally (cf. Rule 2 of the Centering Framework of
Grosz, Joshi, and Weinstein 1995). Pronominalization is thus a useful metric for
determining which referents in the context are more salient than others (see
Krasavina in this volume for more extensive discussion of referential choice and
how this works in Russian).
This is the approach I use in the corpus analysis in order to examine which
referents are most salient and subsequently which syntactic and semantic features
are most informative for determining their salience. However, one simplification
I make is to assume that recency determines that all referents evoked
in the most recent utterance are more salient than those evoked in earlier ut-
terances. Thus, while inter-utterance coreference could conceivably span mul-
tiple utterances, the present study only considers coreference in adjacent utter-
ances.

The theoretical approach which I take in this study embodies the speaker’s
point of view in discourse processing. In other words, I am investigating what
the speaker takes as salient in the discourse and the encoding decisions made
as a result of that. However, I take salience to be a feature of discourse rep-
resentation which is ultimately used by both hearer and speaker in their re-
spective tasks. The precise way in which each uses salience may be different,
but I assume that they rely on the same core notion of salience in the process
of discourse production or perception (cf., Prince 1986; Blutner 1998, 2000;
though see Chiarcos in this volume for a detailed view of how speaker and
hearer salience may be distinguished).

3. Corpus design

The corpus is composed of texts selected from an on-line, refereed magazine
of fiction called InterText (http://www.intertext.com/magazine). At present the
corpus contains five complete texts of varying length comprising a total of 5,480
words. The selected texts are third-person narratives with minimal quoted pas-
sages. These texts were manually marked-up using XML. In this section, I de-
scribe the relevant mark-up elements and how the corpus was analyzed in order
to answer the main research question. It is important to note here that the corpus
mark-up was performed entirely by me. Thus, at present there is no inter-rater
validation. However, my numerous passes over the corpus have likely
ensured a high degree of intra-rater consistency.

3.1. Utterances
The texts were first parsed into sentence nodes, <s>, based on their appearance
in the text: word strings terminated by a period (except of course for periods
marking an abbreviation). The <s> nodes were further marked with a relatively
shallow parse based on clause relations. Each clause, <c>, contained at most
one <verb> child. The noun-phrase, <np>, and clausal arguments of a verb were
marked as siblings of the <verb>. The text shown in (4) was thus tagged as in
(5) (leaving out currently irrelevant details).
(4) John hit Matt. He told his teacher that John did so.
(5) <s>
<c>John hit Matt</c>
<punc>.</punc>
</s>
<s>
<c>He told his teacher that
<c>John did so</c>
</c>
<punc>.</punc>
</s>

In the analyses which follow, I will be investigating instances of
inter-utterance coreference. In terms of the corpus, I define an utterance as a
<c> node which is the immediate child of a <s> node. Thus, the embedded clause
in (4), John did so, is not an utterance. On the other hand, conjoined clauses
(e.g., [S [C The building is tall] and [C it is old.]]) are treated as separate
utterances.
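
The utterance definition can be illustrated with a minimal Python sketch. It is not the tool used for the study; the wrapping <text> root element, the sample string and the function name `utterances` are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

# Toy mark-up modelled on example (5); the <text> root is an assumption.
SAMPLE = """
<text>
  <s><c>John hit Matt</c><punc>.</punc></s>
  <s><c>He told his teacher that <c>John did so</c></c><punc>.</punc></s>
  <s><c>The building is tall</c> and <c>it is old</c><punc>.</punc></s>
</text>
"""

def utterances(root):
    """Return the text of every <c> that is an immediate child of an <s>."""
    result = []
    for sentence in root.iter("s"):
        # findall("c") matches direct children only, so embedded clauses
        # are never returned as utterances of their own.
        for clause in sentence.findall("c"):
            # itertext() still includes the words of embedded clauses.
            result.append(" ".join("".join(clause.itertext()).split()))
    return result

utts = utterances(ET.fromstring(SAMPLE))
for u in utts:
    print(u)
```

Under these assumptions the embedded clause stays inside its matrix utterance, while the two conjoined clauses are listed separately.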
One final note here is that this study looks only at coreference between noun-
phrases (see below for discussion of coreference mark-up in the corpus). Thus
such things as event references as in John secretly pinched Matt but the teacher
saw it are not included. It is doubtful that this exclusion has much effect on the
overall results since there were only a handful of such cases in the corpus.

3.2. Syntactic information


The syntactic role of each argument <np> was marked as “subject”, “object”,
or “oblique”. Any other <np> nodes which were not arguments of a verb were
marked as “none” (i.e., not subject, object, or oblique). In each clause, the near-
est <np> node preceding the <verb> was marked as the subject; the nearest <np>
node following the <verb> but not immediately preceded by a preposition was
marked as the object (so-called double-object constructions like give Mark the
pen were marked with two objects); and any <np> node immediately preceded
by a preposition was marked as an oblique. Thus, (6) was tagged as in (7).
(6) Ken threw the frisbee to Jaime.
(7) <s>
<c>
<np synrole="subject">Ken</np>
<verb>threw</verb>
<np synrole="object">the frisbee</np>
to
<np synrole="oblique">Jaime</np>
</c>
<punc>.</punc>
</s>
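
The positional rules described above can be sketched in Python. The corpus mark-up itself was done by hand, so this is only an illustration of the rules, not the procedure actually used; the token representation, the `PREPOSITIONS` set and the function name `assign_synroles` are assumptions, and each clause is assumed to contain exactly one verb.

```python
# Illustrative, non-exhaustive preposition list (an assumption).
PREPOSITIONS = {"to", "at", "in", "on", "with", "from", "for", "by"}

def assign_synroles(clause):
    """clause: list of (kind, text) pairs, kind in {"np", "verb", "word"}.
    Returns (text, synrole) for every np, following the rules in the text."""
    verb_at = next(i for i, (kind, _) in enumerate(clause) if kind == "verb")
    # Index of the nearest np preceding the verb, if any.
    subject_at = max(
        (i for i, (kind, _) in enumerate(clause[:verb_at]) if kind == "np"),
        default=None,
    )
    roles = []
    for i, (kind, text) in enumerate(clause):
        if kind != "np":
            continue
        after_prep = (
            i > 0
            and clause[i - 1][0] == "word"
            and clause[i - 1][1].lower() in PREPOSITIONS
        )
        if after_prep:
            role = "oblique"      # np immediately preceded by a preposition
        elif i == subject_at:
            role = "subject"      # nearest np before the verb
        elif i > verb_at:
            role = "object"       # bare np after the verb (both in double-object cases)
        else:
            role = "none"         # any other np
        roles.append((text, role))
    return roles

example_6 = [("np", "Ken"), ("verb", "threw"), ("np", "the frisbee"),
             ("word", "to"), ("np", "Jaime")]
roles_6 = assign_synroles(example_6)

double_object = [("np", "Ann"), ("verb", "gave"), ("np", "Mark"), ("np", "the pen")]
roles_double = assign_synroles(double_object)
print(roles_6)
print(roles_double)
```

The second input is a made-up double-object clause, included only to show that both post-verbal noun phrases receive the object label under these rules.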

3.3. Semantic information


The semantic role of each <np> argument was marked with respect to two se-
mantic systems: the FrameNet (Baker, Fillmore, and Lowe 1998) system of
frames and elements and the Proto-role entailments of Dowty (1991). Here,
I briefly explain each of these.

3.3.1. FrameNet
Based on the Frame Semantics of Fillmore (1968, 1976), the FrameNet system
defines a large number of conceptual frames (e.g., intentionally affect, transitive
action), each of which incorporates a set of frame elements (i.e., thematic roles:
agent, patient, etc.) which participate in that frame. Each frame encompasses
a number of lexical items which invoke that frame and therefore define the
particular roles that the arguments of each item play. For instance, the verb
throw invokes the cause_motion frame and therefore takes several participants
including an agent, a theme, and a goal.
In the present study, for each <verb>, the semantic role of each <np> argu-
ment of that verb was determined by consulting the FrameNet database for the
frame which encompassed that verb and then assigning the respective frame
element labels to the <np> nodes. If a verb was not in the FrameNet database,
then the database was searched for a suitable alternative (e.g., via synonym or
hypernym relations). Thus, the sentence in (6) was tagged as in (8).
(8) <s>
<c>
<np semrole="agent">Ken</np>
<verb>threw</verb>
<np semrole="theme">the frisbee</np>
to
<np semrole="goal">Jaime</np>
</c>
<punc>.</punc>
</s>

3.3.2. Proto-roles
Dowty (1991) proposes an alternative view of the linking between lexical con-
ceptual structure and syntax through semantic entailments placed on arguments
by a verb. He posits two sets of Proto-role entailments as in (9).
(9) Proto-Agent entailments
– sentience
– volition
– cause event or change-of-state
– undergo movement
Proto-Patient entailments
– undergo change-of-state
– causally affected
– incremental theme
– stationary
Under Dowty’s theory, arguments of a verb may carry any number of these en-
tailments. A selection principle then determines that the argument which carries
the most Proto-Agent entailments becomes the surface subject. The remain-
ing argument with the most Proto-Patient entailments becomes the object. Any
other arguments become obliques. It is important to notice then that under this
system, arguments may take on the Proto-Agent or Proto-Patient roles in vary-
ing degrees. With one verb, the argument realized as subject may carry all four
Proto-Agent entailments while with another verb, the argument realized as sub-
ject may carry only one or two. Furthermore, some crossover between the roles
is possible: An argument realized as a subject may carry some Proto-Patient en-
tailments while an argument realized as an object may carry some Proto-Agent
entailments.
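
The selection principle summarized above can be sketched as a simple counting procedure. This is a simplified reading for illustration only: the entailment labels, the tie-breaking by input order and the example entailment sets are assumptions, not annotations from the corpus.

```python
# Label names are hypothetical shorthand for the entailments listed in (9).
PROTO_AGENT = {"sentience", "volition", "cause", "movement"}
PROTO_PATIENT = {"change_of_state", "causally_affected",
                 "incremental_theme", "stationary"}

def select_arguments(args):
    """args: dict mapping argument name -> set of entailment labels.
    Returns a dict mapping each name to "subject", "object" or "oblique"."""
    names = list(args)
    # The argument with the most Proto-Agent entailments becomes the subject
    # (ties go to the first argument listed, an arbitrary choice).
    subject = max(names, key=lambda n: len(args[n] & PROTO_AGENT))
    roles = {subject: "subject"}
    rest = [n for n in names if n != subject]
    if rest:
        # Of the remaining arguments, the one with the most Proto-Patient
        # entailments becomes the object; all others become obliques.
        obj = max(rest, key=lambda n: len(args[n] & PROTO_PATIENT))
        roles[obj] = "object"
        for n in rest:
            roles.setdefault(n, "oblique")
    return roles

# Hypothetical entailments for a breaking event with three participants.
breaking = {
    "Ken": {"sentience", "volition", "cause"},
    "the vase": {"change_of_state", "causally_affected"},
    "a hammer": {"movement"},
}
selected = select_arguments(breaking)
print(selected)
```

Because the roles fall out of entailment counts, an argument with only one or two Proto-Agent entailments can still be selected as subject when no competitor has more.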
In the corpus, Proto-role entailments for every <np> argument were marked.
The entailments associated with any particular verb were determined using a
series of linguistic tests described in Rose (2005). Thus, (6) was marked as
in (10).
(10) <s>
<c>
<np sentience="yes" volition="yes" stationary="yes">
Ken</np>
<verb>threw</verb>
<np movement="yes">the frisbee</np>
to
<np stationary="yes">Jaime</np>
</c>
<punc>.</punc>
</s>

3.3.3. FrameNet vs. Proto-roles


The two different semantic systems used in this study provide an interesting
contrast. In Frame Semantics, upon which FrameNet is based, case roles are
seen as derived from primitive, psychologically real semantic concepts (Fillmore
1968). Proto-roles, on the other hand, are seen merely as labels for flexible
configurations of semantic entailments (Dowty 1991). If one or the other
of these two views could be shown as more closely linked to salience, this
may suggest different things about the nature of salience. For instance, if the
FrameNet approach can be shown to be better, this may suggest an interesting
link between salience and semantic primitives via the roles that entities are seen
to play in conceptual frames.

3.4. Coreference information


In order to be able to examine coreference relationships across adjacent utter-
ances, every referential noun phrase (i.e., excluding such things as expletive it)
was marked with an identifier string. Within any given text, all noun-phrases
which were interpreted as referring to the same real-world referent were given
the same identifier. Thus, (11) was marked as shown in (12).
(11) Louis watched a ballerina. She was graceful.
(12) <s>
<c>
<np id="LOUIS">Louis</np>
<verb>watched</verb>
<np id="BALLERINA">
a ballerina</np>
</c>
<punc>.</punc>
</s>
<s>
<c>
<np id="BALLERINA">She</np>
<verb>was</verb>
graceful
</c>
<punc>.</punc>
</s>
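
Given identifiers of this kind, coreference across adjacent utterances can be read off mechanically. The following Python sketch assumes a wrapping <text> root element and uses hypothetical function names; it illustrates the idea and is not the analysis script used for the study.

```python
import xml.etree.ElementTree as ET

# Mark-up modelled on example (12); the <text> root is an assumption.
SAMPLE = """
<text>
  <s>
    <c><np id="LOUIS">Louis</np> <verb>watched</verb>
       <np id="BALLERINA">a ballerina</np></c>
    <punc>.</punc>
  </s>
  <s>
    <c><np id="BALLERINA">She</np> <verb>was</verb> graceful</c>
    <punc>.</punc>
  </s>
</text>
"""

def referents(utterance):
    """Ordered list of the unique referent ids realized inside an utterance."""
    seen = []
    for np in utterance.iter("np"):
        ref = np.get("id")
        if ref is not None and ref not in seen:
            seen.append(ref)
    return seen

def adjacent_coreference(root):
    """For each pair of adjacent utterances, list the ids realized in both."""
    # An utterance is a <c> that is an immediate child of an <s>.
    utts = [c for s in root.iter("s") for c in s.findall("c")]
    pairs = []
    for prev, curr in zip(utts, utts[1:]):
        prev_ids = referents(prev)
        pairs.append([r for r in referents(curr) if r in prev_ids])
    return pairs

links = adjacent_coreference(ET.fromstring(SAMPLE))
print(links)
```

For the two utterances of (11), the only referent shared across the utterance boundary is the one realized by a ballerina and She.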

Coreference was determined as cases where (in the coder’s opinion) the author
intended for two noun phrases to have the same extensional meaning (and in-
tended the reader to make the same interpretation). Anaphoric dependence was
not a sufficient cause to determine coreference. Hence, in Although John saw
the students raise their hands, his remained down, although his is anaphorically
dependent on their hands, these phrases were not taken as coreferent because
they have different extensional meanings. Another potential difficulty in
marking coreference is the possibility of ambiguous coreference. In this corpus, there
were surprisingly few cases of ambiguous coreference. As a result, this was one
complication that did not need to be dealt with.

3.5. Notes on analysis


The results given in Section 5 are based on an analysis which takes each utter-
ance as a whole, intact unit consisting of a list of all the unique discourse refer-
ents realized within the boundary of that utterance. Furthermore, the syntactic
and semantic information attached to each discourse referent is the cumulative
information for that referent within that utterance. This procedure has the ad-
vantage of enriching the data set by allowing the syntactic and semantic roles
of referents in embedded clauses to be included (rather than only those in the
matrix clause). However, it does lead to some other difficulties in the analysis
which will be discussed in greater detail in Section 5.

4. Information Theory

The corpus analysis which follows makes use of one fundamental concept in
Information Theory (Shannon 1948): the value of information (hereafter, EIV).
EIV is based on the entropy, H – an estimate of the uncertainty of the outcome –
of a given probability space. H for a probability space with N possible outcomes
can be calculated as shown in (13) where P(n) is the probability of the n-th
outcome.
(13) $H = -\sum_{n=1}^{N} P(n) \cdot \log_2 P(n)$

For a given question in which all possible outcomes are equally likely (e.g.,
the flip of a fair coin), the entropy is at its maximum. However, if we learn some
information, x, that causes one outcome to be far more likely to occur, then our
uncertainty will decrease: H will be reduced. The amount of entropy reduction
as a result of learning x, Hr (x), is thus calculated as the difference between the
initial entropy, H, and the conditional entropy H (x) (i.e., H given x).2
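
As a worked illustration of entropy and entropy reduction, the following Python sketch computes H for a fair coin and for a skewed distribution. The skewed probabilities (0.9 and 0.1) are invented for the example and stand in for the situation after learning some information x.

```python
import math

def entropy(probs):
    """Shannon entropy in bits, as in (13); outcomes with P = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

h_fair = entropy([0.5, 0.5])     # all outcomes equally likely
h_biased = entropy([0.9, 0.1])   # hypothetical distribution after learning x
reduction = h_fair - h_biased    # Hr(x) = H - H(x)

print(round(h_fair, 3), round(h_biased, 3), round(reduction, 3))
```

The fair coin yields one bit of entropy, and the reduction is simply the difference between the two values.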
To illustrate, consider the following problem: If I open a novel to a random
page and point to a random letter on the page, what is the probability, P, that the
letter is “u”? Without any other information, P is simply the prior probability of
the occurrence of “u” in the language as a whole. Using this prior probability we
could calculate the entropy, H, of the problem. However, imagine we learn that
the preceding letter is “q”. Then we can be much more certain that the letter in
question is “u”. Thus, the conditional entropy, H(“q”), will be less than H: a
reduction in entropy.

2. The value amounting to the reduction in entropy has also been referred to as entropy
value (van Rooy 2004).

Entropy reduction may be either positive or negative: learning that x is true
may make us more certain while learning that x is false may make us less certain
about some outcome. It is therefore useful to calculate the value of learning
whether or not x is true. In other words, it is useful to know what the overall
value of asking the question of whether x is true or false is. In Information
Theory, this value is estimated as the weighted sum of the entropy reductions
for all possible outcomes of x (here, true or false). This value is known as the
estimated information value, EIV. Formally, the EIV of learning whether or
not x is calculated using the formula shown in (14), where P(x) is the prior
probability of the occurrence of x.

(14) EIV(x) = P(x) · Hr (x) + P(¬ x) · Hr (¬x)
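The formula in (14) can be sketched in Python as follows. The function `eiv` and all of the probabilities below are hypothetical values chosen for illustration only; they are picked so that learning that x is false makes the outcome less certain, which yields a negative entropy reduction for that term.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def eiv(prior, p_x, given_x, given_not_x):
    """Estimated information value of learning whether x holds, as in (14)."""
    h = entropy(prior)
    hr_x = h - entropy(given_x)          # entropy reduction if x is true
    hr_not_x = h - entropy(given_not_x)  # entropy reduction if x is false
    return p_x * hr_x + (1 - p_x) * hr_not_x

# Hypothetical distributions over a binary outcome (illustration only).
# They are mutually consistent: 0.5 * 0.9 + 0.5 * 0.5 = 0.7.
prior = [0.7, 0.3]
p_x = 0.5
given_x = [0.9, 0.1]       # x true: more certain than before
given_not_x = [0.5, 0.5]   # x false: less certain than before

print(round(eiv(prior, p_x, given_x, given_not_x), 3))  # 0.147
```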

A good illustration of information value comes from the game “Who am I?”
in which one person pretends to be some famous person and others must ask
yes/no questions to find the identity of the person. In this scenario, what is an
informative (i.e., having a large information value) first question assuming that
there is no bias in the choice of famous person? One candidate would be Are you
a male/female? In this case, both terms in the sum of (14) will be at a maximum
and thus EIV will be large. However, a question like Are you Albert Einstein?
will be much less informative: While the entropy reduction if the answer is yes,
Hr (x), is large, the probability the answer is yes, P(x), is very small. If the
answer is no then the converse is true. Thus both terms in the sum of (14) will
be small and EIV will be small. Of course, if after several questions we have
learned that the mystery person is male, is a scientist, lived in the 20th century,
and won a Nobel Prize, then the EIV of the same question would be much larger.
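The “Who am I?” scenario can be worked through numerically. The sketch below assumes, purely for illustration, a pool of 1024 equally likely famous people, half of them male; the function name `question_eiv` and the pool sizes are hypothetical.

```python
import math

def uniform_entropy(n):
    """Entropy in bits of n equally likely candidates."""
    return math.log2(n) if n > 0 else 0.0

def question_eiv(n_total, n_yes):
    """EIV of a yes/no question that is true of n_yes out of n_total candidates."""
    n_no = n_total - n_yes
    h = uniform_entropy(n_total)
    p_yes = n_yes / n_total
    value = 0.0
    if n_yes:
        value += p_yes * (h - uniform_entropy(n_yes))
    if n_no:
        value += (1 - p_yes) * (h - uniform_entropy(n_no))
    return value

N = 1024  # assumed size of the candidate pool
print(question_eiv(N, N // 2))        # "Are you male?"           -> 1.0
print(round(question_eiv(N, 1), 4))   # "Are you Einstein?"       -> 0.0112
print(round(question_eiv(8, 1), 3))   # same question, 8 left     -> 0.544
```

Under these assumptions the first question is worth a full bit, the Einstein question is worth about a hundredth of a bit when asked first, and its value rises sharply once earlier answers have narrowed the pool.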
In the present study, I am investigating the information value of syntactic
and semantic prominence toward determining the salience of discourse referents.
This is done by asking, for example, the following question: What is the
information value of learning whether or not a particular discourse referent was
a subject for predicting whether it will be pronominalized in subsequent reference?
This information value, EIV(subject), can be calculated using the formulas
above. Likewise, the information values for the other syntactic and semantic
features can be calculated. Finally, I will calculate the net information value,
EIVtot, for syntactic prominence as the total of the EIVs for the various syntactic
features (i.e., EIV(subject), EIV(object), etc.). Similarly, I will calculate the
EIVtot for semantic prominence as the total of the EIVs for the various semantic
features. Therefore, the central question becomes whether either information
about syntactic prominence or semantic prominence is more informative (i.e.,
larger EIVtot) than the other or if they are equally informative. A second question
is whether syntactic and semantic information together is more informative
than either is alone. These two questions are formally summarized in (15)–(16).
(15) Is the syntactic prominence EIVtot greater than, equal to, or less than
the semantic prominence EIVtot ?
(16) Is the joint syntactic and semantic prominence EIVtot greater than ei-
ther the syntactic or semantic prominence EIVtot ?
With respect to (15), if results show that syntactic and semantic prominence
are equally informative, then another question may be posed: Are syntactic and
semantic prominence redundant with each other or are they at least somewhat
independent but equally informative? An answer to this question may be found
by looking at the answer to (16). If the joint information value is higher than either
is alone, then they cannot be fully redundant and must therefore be at least partly
independent.

5. Results and Discussion

In the corpus there are 291 cases of inter-utterance coreference. In 224 (77%)
of these coreference cases, the coreferent noun phrase in the latter utterance is
pronominalized. Thus, the entropy of pronominalization is calculated as shown
in (17) where P(pro) is the probability of pronominalization.
(17) H = −[P(pro) · log2 P(pro) + P(¬pro) · log2 P(¬pro)]
     H = −[224/291 · log2 (224/291) + 67/291 · log2 (67/291)]
     H = 0.778
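The calculation in (17) can be checked numerically with a few lines of Python. The counts are those reported above; the variable names are assumptions made for this sketch.

```python
import math

n_coref = 291   # cases of inter-utterance coreference in the corpus
n_pro = 224     # cases in which the later noun phrase is pronominalized

p_pro = n_pro / n_coref
p_not_pro = 1 - p_pro

# Binary entropy of pronominalization, following (17).
h = -(p_pro * math.log2(p_pro) + p_not_pro * math.log2(p_not_pro))

print(round(p_pro, 2))  # 0.77
print(round(h, 3))      # 0.778
```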
This value serves as the baseline for entropy reduction: How much is entropy
reduced from H = 0.778 by learning some information about syntactic or se-
mantic prominence? In this section, I will present these results along with some
interleaved discussion. However, before presenting the results, it is necessary
to deal with one complication. The referents in the current context may have
been realized in multiple syntactic positions and semantic roles. For instance,
in (18), as a verbal argument, john has been realized as a subject and an object,
an experiencer and a recipient, and carries the entailments sentience, volition,
and stationary.

(18) <s>
       <c>
         <np id="JOHN" synrole="subject" semrole="experiencer"
             sentience="yes" volition="yes">
           John</np>
         <verb>wants</verb>
         <c>
           <np>his father</np>
           to
           <verb>give</verb>
           <np id="JOHN" synrole="object" semrole="recipient"
               stationary="yes">
             him</np>
           <np>a bicycle</np>
         </c>
       </c>
       <punc>.</punc>
     </s>

In short, there is an overlap of information caused by such co-occurrences within
an utterance. This raises the question of how, or whether, these co-occurrences
should be handled in the analysis (e.g., should a doubly-realized referent be
treated differently from a singly-realized referent?). Accounting for this fully would
require a sophisticated mathematical model. For the present research, I will instead
make certain simplifying assumptions about syntactic prominence and the two
semantic prominence approaches. These assumptions will be clarified in greater
detail in the respective sections below.
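To make the overlap concrete, the following sketch collects the cumulative information for each referent in an utterance annotated as in (18). It is not the tool used in the study: the function name, the dictionary layout, and the rule that every attribute other than id, synrole, and semrole counts as an entailment are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Annotation of example (18); attribute names follow the example itself.
UTTERANCE = """
<s>
  <c>
    <np id="JOHN" synrole="subject" semrole="experiencer"
        sentience="yes" volition="yes">John</np>
    <verb>wants</verb>
    <c>
      <np>his father</np>
      to
      <verb>give</verb>
      <np id="JOHN" synrole="object" semrole="recipient"
          stationary="yes">him</np>
      <np>a bicycle</np>
    </c>
  </c>
  <punc>.</punc>
</s>
"""

# Attributes that are not treated as Proto-role entailments (assumption).
NON_ENTAILMENT_ATTRS = {"id", "synrole", "semrole"}

def cumulative_features(xml_text):
    """Collect, per referent id, every role and entailment realized in the utterance."""
    referents = {}
    for np in ET.fromstring(xml_text.strip()).iter("np"):
        ref_id = np.get("id")
        if ref_id is None:  # noun phrase without a referent id in this example
            continue
        info = referents.setdefault(
            ref_id, {"synroles": set(), "semroles": set(), "entailments": set()})
        if "synrole" in np.attrib:
            info["synroles"].add(np.get("synrole"))
        if "semrole" in np.attrib:
            info["semroles"].add(np.get("semrole"))
        info["entailments"].update(
            name for name, value in np.attrib.items()
            if name not in NON_ENTAILMENT_ATTRS and value == "yes")
    return referents

john = cumulative_features(UTTERANCE)["JOHN"]
print(sorted(john["synroles"]))     # ['object', 'subject']
print(sorted(john["semroles"]))     # ['experiencer', 'recipient']
print(sorted(john["entailments"]))  # ['sentience', 'stationary', 'volition']
```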

5.1. Syntactic Prominence


For syntactic prominence information, I assume that for any given referent, the
role highest on the syntactic hierarchy shown in (2) determines that referent’s
salience. Thus, a referent realized as both a subject and an object within an
utterance would be regarded as having its salience determined by its status as
a subject for that utterance. Given this, the results shown in Table 1 indicate
that, for predicting whether subsequent reference to a referent will be
pronominalized, learning that the referent was realized as a subject is much
more informative than learning that it was realized in any other role.
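The simplifying assumption for syntactic prominence can be sketched as a small selection function. The ordering subject > object > oblique > none is assumed here from the roles listed in Table 1; the names `SYNTACTIC_HIERARCHY` and `most_prominent_role` are hypothetical.

```python
# Assumed ordering of the syntactic hierarchy, highest first.
SYNTACTIC_HIERARCHY = ["subject", "object", "oblique", "none"]

def most_prominent_role(roles, hierarchy=SYNTACTIC_HIERARCHY):
    """Return the highest-ranked role among those a referent realizes in an utterance."""
    for role in hierarchy:
        if role in roles:
            return role
    return hierarchy[-1]  # a referent with no listed role counts as "none"

# A referent realized as both object and subject is treated as a subject.
print(most_prominent_role({"object", "subject"}))  # subject
print(most_prominent_role({"oblique"}))            # oblique
```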

Table 1. Information Value of Syntactic Prominence

x          EIV(x)
subject    0.059
object     0.021
oblique    0.010
none       0.011
EIVtot     0.101

The result that the information value of subject-hood is much higher than
that of other syntactic roles is especially interesting in that it resembles the binary
nature of many information-packaging theories (e.g., topic-comment in
Gundel 1974; topic-focus in Sgall 1967; focus-ground in Vallduví 1990). The
concept of the value of information may provide a useful method for quantifying
these theories.
It should be noted here that there is some evidence (Miltsakaki 2003) that referents
introduced in main clauses are more salient than those introduced in subordinate
clauses regardless of grammatical role. In the present study, the syntactic
information was also analyzed with respect to a hierarchical approach that
distinguishes the prominence of referents according to level of embedding. However,
the information value of syntactic prominence under hierarchical marking was
only EIVtot = 0.060. Thus, for expository reasons, the details of this approach
were excluded from the present paper, but can be seen in Rose (2005).

5.2. Semantic Prominence


5.2.1. FrameNet Roles
In the corpus, 158 different frame elements occur. An exhaustive treatment of
these elements is beyond the scope of this paper and is also unwarranted because
many elements have only one or two occurrences. Therefore, I collapsed these
elements into seven groups as shown in (19). Each group is shown with a word
that briefly describes the central property of the elements in that group as well
as some examples of elements in that group.
(19) 1. agentivity: agent, deformer, driver
2. perception: cognizer, experiencer
3. movement: theme, impactor, message
4. affected: created entity, victim
5. movement parameters: direction, ground
6. events: activity, event
7. other: specifier, none
While groups 1–6 include elements defined in the FrameNet system, group 7
includes ad hoc labels assigned to noun phrases in roles not defined under
FrameNet. This included genitive noun phrases (e.g., a teacher in a teacher’s
chair) and arguments in copular constructions (e.g., John is a teacher).
The ordering of the groups shown in (19) parallels orderings given in the-
matic hierarchies proposed in the literature on syntactic linking theories (cf.,
Dorr, Habash, and Traum 1998; Jackendoff 1972, 1990; Speas 1990). Similar
to the simplifying technique for syntactic information above, for a given refer-
ent, the one of its semantic roles which is highest on this hierarchy is regarded
as the role which determines the salience of that referent. Thus, a referent real-
ized as both a cognizer and a victim in the same utterance would be regarded
as having its salience determined by its role as a cognizer, a group 2 element.
Based on this simplification, the results are shown in Table 2.
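The grouping in (19) and the highest-role simplification can be sketched as a lookup followed by a minimum. Only the example elements listed in (19) are included; the full mapping of all 158 frame elements is not reproduced here, and treating any unlisted element as group 7 is an assumption of this sketch.

```python
# Example frame elements from (19), mapped to their group number.
FRAME_GROUPS = {
    "agent": 1, "deformer": 1, "driver": 1,      # agentivity
    "cognizer": 2, "experiencer": 2,             # perception
    "theme": 3, "impactor": 3, "message": 3,     # movement
    "created entity": 4, "victim": 4,            # affected
    "direction": 5, "ground": 5,                 # movement parameters
    "activity": 6, "event": 6,                   # events
    "specifier": 7, "none": 7,                   # other
}
OTHER_GROUP = 7  # fallback for unlisted elements (assumption)

def salience_group(frame_elements):
    """Return the highest-ranked (lowest-numbered) group among a referent's roles."""
    groups = [FRAME_GROUPS.get(fe, OTHER_GROUP) for fe in frame_elements]
    return min(groups, default=OTHER_GROUP)

# A referent realized as both cognizer and victim is treated as group 2.
print(salience_group(["cognizer", "victim"]))  # 2
```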

Table 2. Information Value of Semantic Prominence via FrameNet Roles

group      EIV(group)
1          0.013
2          0.045
3          0.012
4          0.002
5          0.005
6          0.004
7          0.019
EIVtot     0.101

Three results are notable. First, it is interesting that the perception roles in
group 2 are more informative than the agentive roles in group 1, in spite of
the fact that agentive roles are usually posited to be highest on many thematic
hierarchies. This suggests that sentience is more important to the salience of en-
tities evoked in a discourse than agentivity. This would seem to parallel other
results showing the importance of animacy to the salience of discourse entities
(Prat-Sala and Branigan 1999).
The second interesting result is that the total information value of semantic
prominence under FrameNet is equal to that of syntactic prominence. I will
discuss the implications of this below. Before that, I present the results for the
other semantic prominence approach used in this study.
The third result that requires some discussion is the moderately high EIV for
the group 7 roles. As noted above, these are roles that fall outside the scope of
the FrameNet system. A cursory examination of these cases in the corpus shows
that many are instances where a particular referent is being retained across a
sequence of utterances (in a manner not unlike the retain transition in Centering
Theory of Grosz, Joshi, and Weinstein 1995) by being placed in a genitive
construction (e.g., John likes apples. His mother does too. So, he bought her a
whole bushel.). While interesting, these cases are not particularly relevant here
to the question of how semantic information influences salience in terms of the
FrameNet system and are therefore not discussed further.

5.2.2. Proto-roles
A particular discourse referent may carry more than one Proto-role entailment.
In order to avoid the overlap problems that this generates in the present anal-
ysis, I use a simple transformation. For each referent, I calculate a parameter
I call Proto-Agency as the total number of (unique) Proto-Agent entailments on
that referent minus the total number of Proto-Patient entailments. Thus, Proto-
Agency ranges in integer values from +4 to −4 (although in this corpus, there
were no instances of −4). For example, a referent which carries, say, three
Proto-Agent entailments and one Proto-Patient entailment within a single ut-
terance would be regarded as having a Proto-Agency value of 3 − 1 = 2. In
short, then, Proto-Agency might be regarded as a measure of how agent-like
(in Dowtian terms) a particular referent is entailed to be: The higher the value,
the more agent-like the referent is. Under this transformation, the results are as
shown in Table 3.
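The Proto-Agency transformation can be sketched as a difference of set sizes. The text above mentions only sentience, volition, and stationary by name; the remaining entailment labels below are taken from Dowty (1991) and are an assumption about which four entailments of each kind were annotated.

```python
# Assumed entailment inventories (four of each, giving a range of -4 to +4).
PROTO_AGENT = frozenset({"volition", "sentience", "causation", "movement"})
PROTO_PATIENT = frozenset({"change_of_state", "incremental_theme",
                           "causally_affected", "stationary"})

def proto_agency(entailments):
    """Number of unique Proto-Agent entailments minus Proto-Patient entailments."""
    unique = set(entailments)  # repeated entailments within an utterance count once
    return len(unique & PROTO_AGENT) - len(unique & PROTO_PATIENT)

# Three Proto-Agent entailments and one Proto-Patient entailment: 3 - 1 = 2.
print(proto_agency(["volition", "sentience", "causation", "stationary"]))  # 2
# The referent JOHN in (18): sentience and volition versus stationary.
print(proto_agency(["sentience", "volition", "stationary"]))               # 1
```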

Table 3. Information Value of Semantic Prominence via Proto-role Entailments

Proto-Agency   EIV(Proto-Agency)
+4             0.000
+3             0.001
+2             0.045
+1             0.005
 0             0.045
−1             0.000
−2             0.001
−3             0.002
−4             ***
EIVtot         0.098

Results here show that semantic prominence with respect to Proto-roles is com-
parably informative to semantic prominence with respect to FrameNet as well
as to syntactic prominence.

5.3. Joint Information Value


While the above results have looked at the information value of learning about
the syntactic or semantic prominence of a referent (e.g., learning that it was re-
alized as a subject or as a group 1 FrameNet element or with a Proto-Agency of
+4 or so on), in this section, I look at the value of learning some joint informa-
tion. That is, what is the value of learning that a referent was realized as, say, a
subject and a group 1 role?
The fact that the EIVtot values of syntactic and semantic prominence are
essentially equal suggests that either they are essentially redundant or that they
are at least somewhat independent, but comparably informative. If the former
is the case, then the joint information value should be no different than that of
each alone. However, if there is some independence between the two pieces of
information, then the joint information value may increase.
The joint information value of syntactic and semantic prominence was cal-
culated by crossing the four syntactic roles against the seven FrameNet groups
or the nine levels of Proto-Agency, and then calculating the EIV for each of
the pairings (e.g., subject/group 1, subject/group 2, etc.). The total information
value, EIVtot was then calculated as the total of these individual EIVs. The final
results are shown in Table 4.
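The crossing procedure can be sketched as follows. The counts in `COUNTS` are made up for illustration and are not the corpus data; the function names are likewise hypothetical. Each EIV is computed for a one-versus-rest question (e.g., "is this referent a subject in group 1 or not?"), and EIVtot is the sum over all values that occur.

```python
import math

def binary_entropy(n_true, n_total):
    """Entropy in bits of a yes/no outcome observed n_true times out of n_total."""
    h = 0.0
    for k in (n_true, n_total - n_true):
        if k:
            p = k / n_total
            h -= p * math.log2(p)
    return h

def eiv_tot(sample, key):
    """Sum of one-vs-rest EIVs over every value of key(obs); obs[-1] is the outcome."""
    n = len(sample)
    n_pro = sum(obs[-1] for obs in sample)
    base = binary_entropy(n_pro, n)
    counts = {}
    for obs in sample:
        cell = counts.setdefault(key(obs), [0, 0])  # [observations, pronominalized]
        cell[0] += 1
        cell[1] += obs[-1]
    total = 0.0
    for n_x, pro_x in counts.values():
        n_rest, pro_rest = n - n_x, n_pro - pro_x
        total += n_x / n * (base - binary_entropy(pro_x, n_x))
        if n_rest:
            total += n_rest / n * (base - binary_entropy(pro_rest, n_rest))
    return total

# Hypothetical counts: (syntactic role, FrameNet group, n pronominalized, n not).
COUNTS = [("subject", 1, 40, 5), ("subject", 2, 35, 3), ("object", 2, 15, 5),
          ("object", 3, 20, 20), ("oblique", 3, 10, 15)]
sample = []
for syn, grp, n_yes, n_no in COUNTS:
    sample += [(syn, grp, True)] * n_yes + [(syn, grp, False)] * n_no

print(round(eiv_tot(sample, key=lambda o: o[0]), 3))          # syntactic role alone
print(round(eiv_tot(sample, key=lambda o: o[1]), 3))          # FrameNet group alone
print(round(eiv_tot(sample, key=lambda o: (o[0], o[1])), 3))  # crossed pairings
```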

Table 4. Joint Information Value of Syntactic and Semantic Prominence

EIVtot
syntactic role × FrameNet group 0.165
syntactic role × Proto-Agency 0.141

The joint information value of syntactic and semantic prominence is higher than
that of either factor alone. Thus, the results suggest that syntactic and semantic
prominence are not redundant with each other and that each provides at least
some unique information with respect to the pronominalization of subsequent
reference. This conclusion should be regarded as tentative, however, because
the differences noted above are not statistically confirmed. One attempt to eval-
uate the strength of these findings used a boot-strapping procedure in which the
original sample of inter-utterance coreference instances was resampled with re-
placement 10,000 times. Under this procedure, the differences between the joint
EIVs and the individual EIVs were not shown to be significant. However, this
procedure is suspect because measures of skewness and kurtosis show that the
bootstrap distribution is non-normal. It is clear that a larger data set will be
required to confirm this conclusion.
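A resampling procedure of this general kind can be sketched as below. This is not the procedure actually run for the study: the counts are made up, the percentile interval is one possible way of summarizing the bootstrap distribution, and the demonstration uses 2,000 resamples rather than 10,000 to keep it quick. All names are hypothetical.

```python
import math
import random

def binary_entropy(n_true, n_total):
    """Entropy in bits of a yes/no outcome observed n_true times out of n_total."""
    h = 0.0
    for k in (n_true, n_total - n_true):
        if k:
            p = k / n_total
            h -= p * math.log2(p)
    return h

def eiv_tot(sample, key):
    """Sum of one-vs-rest EIVs over every value of key(obs); obs[-1] is the outcome."""
    n = len(sample)
    n_pro = sum(obs[-1] for obs in sample)
    base = binary_entropy(n_pro, n)
    counts = {}
    for obs in sample:
        cell = counts.setdefault(key(obs), [0, 0])
        cell[0] += 1
        cell[1] += obs[-1]
    total = 0.0
    for n_x, pro_x in counts.values():
        total += n_x / n * (base - binary_entropy(pro_x, n_x))
        if n - n_x:
            total += (n - n_x) / n * (base - binary_entropy(n_pro - pro_x, n - n_x))
    return total

def joint_gain(sample):
    """Joint EIVtot minus syntactic EIVtot for one (re)sample."""
    return (eiv_tot(sample, key=lambda o: (o[0], o[1]))
            - eiv_tot(sample, key=lambda o: o[0]))

def bootstrap(sample, statistic, n_resamples=10_000, seed=0):
    """Recompute the statistic on samples drawn with replacement."""
    rng = random.Random(seed)
    return [statistic(rng.choices(sample, k=len(sample)))
            for _ in range(n_resamples)]

def percentile_interval(values, alpha=0.05):
    """Simple percentile interval; makes no normality assumption."""
    ordered = sorted(values)
    lower = ordered[int(alpha / 2 * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper

# Hypothetical counts: (syntactic role, FrameNet group, n pronominalized, n not).
COUNTS = [("subject", 1, 40, 5), ("subject", 2, 35, 3), ("object", 2, 15, 5),
          ("object", 3, 20, 20), ("oblique", 3, 10, 15)]
sample = []
for syn, grp, n_yes, n_no in COUNTS:
    sample += [(syn, grp, True)] * n_yes + [(syn, grp, False)] * n_no

diffs = bootstrap(sample, joint_gain, n_resamples=2000)
print(percentile_interval(diffs))
```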

One interesting result here, though, is the fact that the Proto-role informa-
tion is not quite as informative as the FrameNet group information when taken
together with syntactic role. This, however, could be a by-product of the trans-
formation on the Proto-role information described in Section 5.2.2. This trans-
formation is a mathematical convenience and glosses over semantic distinctions
between the various entailments. Perhaps a more sophisticated transformation
would result in a greater information value.

6. General Discussion

Under the discourse model presented above, the results presented in this cor-
pus analysis suggest that the salience of referents in a discourse is influenced
by both syntactic and semantic information: Taking both into account results
in greater predictive ability for the form of subsequent reference. These results
are thus in line with a view of discourse processing in which salience represents
information about discourse structure: the more salient a referent is in the cur-
rent context, the greater the information value about the structure of subsequent
discourse, particularly the form of referring expressions. Information Theory
thus potentially offers another view of the relative value of the different factors
known to affect discourse salience and may provide another means by which to
narrow down which factors are most crucial.
The idea that syntactic and semantic information seem to be at least partly
independent in their influence on salience suggests that models of discourse
salience may benefit by including some account of semantic information as
distinct from syntactic information. This is especially relevant to modular ap-
proaches in which one module is responsible for structure while an indepen-
dent module is responsible for interpretation. The results here may be rele-
vant for determining how these modules interact for the purpose of determining
salience.
The improvement in the joint information value suggests that computational
implementations of discourse salience models (e.g., parameterized systems,
such as the Mental Salience Framework; Chiarcos, this volume) might see some
performance improvement by the inclusion of semantic prominence informa-
tion. If the assumption that salience is a core notion common to both speaker
and hearer is correct, then the present results would indicate that pronoun reso-
lution algorithms might also benefit from the inclusion of semantic prominence
as a contributing factor.
In this study, two different semantic systems were employed to evaluate
semantic prominence. The joint information values suggest that the FrameNet
system may be more informative than the Proto-role system. However, as noted
above, this difference may not be real. If it is real, then an interesting line of
future investigation would be to look more closely at the relationship between
salience and the notion of primitive semantic roles as assumed in Frame Seman-
tics. On the other hand, if the difference between the two systems turns out not to
be real, then there is a practical conclusion to make: Technologically speaking,
the Proto-role system is less cumbersome than the vast network of frames and
roles in FrameNet and therefore may be more efficient in the implementation
of mechanisms for discourse processing and salience.
Throughout this study, I have compared the semantic prominence hierar-
chy to thematic hierarchies used in syntactic-semantic linking theories. While
it would be very interesting if it were to turn out that these hierarchies are paral-
lel, there is no theoretical reason to presume that this is so. These two different
research areas use these hierarchies for completely different reasons so it would
not be problematic for the present study if the semantic prominence hierarchy
is different. In fact, the evidence from corpus analysis suggests that this might
be the case: In particular, it seems that roles entailing sentience (e.g., cognizer,
experiencer) are higher on the hierarchy than roles entailing agency. Hence,
validation of the semantic prominence hierarchy is required.3
On the basis of the results presented in this corpus analysis, it is perhaps somewhat
premature to conclude firmly that syntactic and semantic information contribute
independently to salience (though see Rose 2005, 2006 for converging evidence
from a series of psycholinguistic experiments using different experimental
paradigms). One possibility that needs to be investigated is whether or not
semantic prominence has a broad effect across the whole range of linguistic
items or whether only certain verbs (or certain classes of verbs) influence the
salience of their arguments via semantic information.
Another angle for further investigation of the role of syntactic and semantic
prominence along the lines presented here might include looking at different
languages. In English, the language used in this study, syntactic and semantic
roles are often conflated, as noted early in this paper. However, in languages where
word order is freer, such as Spanish or Japanese, the distinction between
syntactic and semantic prominence may be easier to observe.4 Such work may
provide a clearer view of the degree to which syntactic and semantic prominence
each determine the salience of discourse referents.

3. An interesting side note is a consideration of the possible connection between the
semantic prominence hierarchy and the relative prominence of referents that results
as readers construct a mental simulation (see Claus, this volume) of the ongoing
discourse. In an extreme view, these could be seen as one and the same thing. Further
investigation is warranted to verify whether they are the same and if not, how they
differ.

7. Conclusion

In this paper, I have sought to compare the relative information value of syn-
tactic and semantic prominence to the salience of discourse referents. Results
show that both contribute to salience. On its face, the fact that semantic promi-
nence contributes to salience is perhaps unsurprising. However, this research
has also shown ways that semantic prominence can be computed via traditional
thematic role labels or via Dowtian Proto-role entailment configurations. The
results further suggest how semantic prominence information may be combined
with syntactic prominence information to yield more informative measures of
salience.

Acknowledgments
The work presented here is based on my Ph.D. dissertation. I am indebted to
Stefan Kaufmann, my adviser, and Michael Dickey for discussions about the
theoretical background and data analysis. I am also grateful to two anonymous
reviewers for helpful comments on earlier drafts of this paper.

References

Almor, Amit
1999 Noun-phrase anaphora and focus: The informational load hypothesis.
Psychological Review 106: 748–765.
Almor, Amit and Eimas, Peter
2008 Focus and noun phrase anaphors in spoken language comprehension.
Language and Cognitive Processes, 23: 201–225.
Ariel, Mira
1988 Referring and accessibility. Journal of Linguistics 24: 65–87.
Baker, Collin, Fillmore, Charles, and Lowe, John
1998 The Berkeley FrameNet project. Proceedings of the 17th International
Conference on Computational Linguistics, 86–90.

4. Another language in which syntactic and semantic prominence effects may be
more easily delineated is Eastern Khanty (Filchenko, this volume), in which
agent-demoting constructions differ minimally from canonical agent constructions.

Blutner, Reinhard
1998 Lexical pragmatics. Journal of Semantics 15: 115–162.
Blutner, Reinhard
2000 Some aspects of optimality in natural language interpretation. Journal
of Semantics 17: 189–216.
Brown, Cheryl
1983 Topic continuity in written English narrative. In: Givón, Talmy
(ed.), Topic Continuity in Discourse: A Quantitative Cross-Language
Study, 313–342. Amsterdam: John Benjamins.
Dorr, Bonnie, Habash, Nizar, and Traum, David
1998 A thematic hierarchy for efficient generation from lexical-conceptual
structure. Proceedings of the Third Conference of the Association for
Machine Translation in the Americas, 333–343.
Dowty, David
1991 Thematic proto-roles and argument selection. Language, 67: 547–
619.
Fillmore, Charles
1968 The case for case. In: Bach, Emmon and Harms, Robert (eds.), Universals
in Linguistic Theory, 1–90. New York: Holt, Rinehart and
Winston.
Fillmore, Charles
1976 Frame semantics and the nature of language. Annals of the New York
Academy of Sciences: Conference on the Origin and Development of
Language and Speech, 280: 20–32.
Gernsbacher, Morton Ann and Hargreaves, David
1988 Accessing sentence participants: The advantage of first mention. Jour-
nal of Memory and Language, 27: 699–717.
Gordon, Peter, Grosz, Barbara, and Gilliom, Laura
1993 Pronouns, names, and the centering of attention in discourse. Cogni-
tive Science, 17: 311–347.
Gordon, Peter and Hendrick, R.
1997 Intuitive knowledge of linguistic co-reference. Cognition, 62: 325–
370.
Gordon, Peter and Hendrick, R.
1998 The representation and processing of coreference in discourse. Cog-
nitive Science, 22: 389–424.
Grosz, Barbara, Joshi, Aravind, and Weinstein, Scott
1995 Centering: A framework for modeling the local coherence of dis-
course. Computational Linguistics, 21: 203–225.

Gundel, Jeanette
1974 The role of topic and comment in linguistic theory. Ph.D. dissertation,
University of Texas, Austin.
Gundel, Jeanette, Hedberg, Nancy, and Zacharski, Ron
1993 Cognitive status and the form of referring expressions. Language, 69:
274–307.
Heim, Irene
1982 The semantics of definite and indefinite noun phrases. Ph.D. disserta-
tion, University of Massachusetts, Amherst.
Heim, Irene
1983 File change semantics and the familiarity theory of definiteness. In:
Bäuerle, Rainer, Schwarze, Christoph and von Stechow, Arnim (eds.),
Meaning, Use, and Interpretation of Language, 164–189. Berlin: W.
DeGruyter.
Hirst, Graeme
1981 Anaphora in Natural Language Understanding: A Survey. Berlin:
Springer-Verlag.
Hudson-D’Zmura, Susan and Tanenhaus, Michael
1997 Assigning antecedents to ambiguous pronouns: The role of the center
of attention as the default assignment. In: Walker, Marilyn, Joshi, Ar-
avind, and Prince, Ellen (eds.), Centering Theory in Discourse, 199–
226. Oxford, UK: Clarendon Press.
Jackendoff, Ray
1972 Semantic Interpretation in Generative Grammar. Cambridge, Massachusetts:
MIT Press.
Jackendoff, Ray
1990 Semantic Structures. Cambridge, Massachusetts: MIT Press.
Kamp, Hans and Reyle, Uwe
1993 From Discourse to Logic. Dordrecht: Kluwer Academic.
Karttunen, Lauri
1976 Discourse referents. In: McCawley, James, (ed.), Syntax and Seman-
tics, Vol. 7: Notes from the Linguistic Underground, 363–385. New
York: Academic Press.
Kehler, Andrew
2002 Coherence, Reference, and the Theory of Grammar. Stanford Univer-
sity, CA: CSLI Publications.
Lappin, Shalom and Leass, Herbert
1994 An algorithm for pronominal anaphora resolution. Computational
Linguistics, 20: 535–561.
Mathews, Alison and Chodorow, Martin
1988 Pronoun resolution in two-clause sentences: Effects of ambiguity, an-
tecedent location, and depth of embedding. Journal of Memory and
Language, 27: 245–260.
Miltsakaki, Eleni
2003 The syntax-discourse interface: Effects of the main-subordinate dis-
tinction on attention structure. Ph.D. dissertation, University of Penn-
sylvania.
Miltsakaki, Eleni
2007 A rethink of the relationship between salience and anaphora resolu-
tion. Proceedings of the 6th Discourse Anaphora and Anaphor Reso-
lution Colloquium, 91–96.
Prat-Sala, Mercè and Branigan, Holly
1999 Discourse constraints on syntactic processing in language production:
A cross-linguistic study in English and Spanish. Journal of Memory
and Language, 42: 168–182.
Prince, Ellen
1986 On the syntactic marking of presupposed open propositions. Papers
from the Parasession on Pragmatics and Grammatical Theory, 22nd
Regional Meeting, Chicago Linguistic Society, 208–222, Chicago,
Illinois: University of Chicago.
Rose, Ralph
2005 The relative contribution of syntactic and semantic prominence to the
salience of discourse entities. Ph.D. dissertation, Department of Lin-
guistics, Northwestern University.
Rose, Ralph
2006 Evidence for gradient salience: What happens with competing non-
salient referents during pronoun resolution. Proceedings of the Aus-
tralian Language Technology Workshop, 91–98.
Sgall, Petr
1967 Functional sentence perspective in a generative description. Prague
Studies in Mathematical Linguistics, 2: 203–225.
Shannon, Claude
1948 A mathematical theory of communication. The Bell System Technical
Journal, 27: 379–423, 623–656.
Speas, Margaret
1990 Phrase Structure in Natural Language. Dordrecht, Netherlands:
Kluwer Academic Publishers.
Stevenson, Rosemary, Knott, Alistair, Oberlander, Jon and McDonald, Sharon
2000 Interpreting pronouns and connectives: Interactions among focusing,
thematic roles and coherence relations. Language and Cognitive Pro-
cesses, 15: 225–262.
Vallduví, Enric
1990 The informational component. Ph.D. dissertation, University of Penn-
sylvania.
van Rooy, Robert
2004 Relevance and bidirectional optimality theory. In: Blutner, Reinhard
and Zeevat, Henk (eds.), Optimality Theory and Pragmatics, 173–
210. Oxford, UK: Palgrave Macmillan.
The Mental Salience Framework:
Context-adequate generation of referring
expressions

Christian Chiarcos

Abstract. Here, a general architecture for mechanisms of attention control in discourse
is suggested, based on a meta-theoretic notion of salience. I consider two dimensions
of salience, hearer salience indicating the status quo or “givenness” of an entity, and
speaker salience underlying the attempts of the speaker to manipulate this status. This
framework is applied to three information-packaging phenomena, choice of referring
expressions, word-order preferences, and assignment of grammatical roles. The ade-
quacy of this proposal is illustrated by providing reconstructions of two theories, Givón’s
topicality approach and different instantiations of Centering. The proof for Centering-
adequacy is sketched, and the framework is compared to related proposals.
As a result, a parameterized architecture for modeling linguistic variability in dis-
course is presented. It provides a powerful, simple and intuitive mechanism to integrate
cognitive-pragmatic aspects of coding preferences in the field of natural language gen-
eration (NLG).

1. Mechanisms of attention control

It was noticed very early that the choice among syntactically well-formed
expressions is by no means determined by semantic constraints alone. Consider
the following classical text (Grosz et al. 1995, ex. (5), shortened) with possible
truth-semantically equivalent variations as illustrated in (4’), (1’) and (5’):
(1) Terry_te really goofs sometimes.
(2) He_te wanted Tony_to to join him on a sailing expedition.
(3) He_te called him_to at 6AM.
(4) Tony_to was sick and furious at being woken up so early.
(5) He_to told Terry_te to get lost and hung up.
(6) Of course, Terry_te hadn’t intended to upset Tony_to.

Here, I concentrate on referring expressions, in particular on three interacting
levels of variability. Note that for this example text, the textual function of the
alternatives is roughly identical to that in the original examples.

choice of referring expressions (REF)
(4’) This guy_to was sick and furious at being woken up so early.
assignment of word-order preferences (WO), e.g., topicalization
(1’) Sometimes, Terry_te really goofs.
assignment of grammatical roles (GR), e.g., passive vs. active clauses
(5’) Terry_te was told to get lost and Tony_to hung up.

Taking up popular assumptions among functional linguists, I regard these grammatical
devices as serving (at least partly) as means a speaker can use to guide the
hearer’s flow of attention in discourse (Chafe, 1976; Tomlin, 1995; inter alia).
The flow of attention is a key mechanism controlling any kind of mental
activity. It is motivated by a “bottle-neck effect”: The world surrounding us (and
even our internal world) is far too rich to be realized, understood, or described
as a whole. Rather, just relevant or especially significant elements are chosen
to build up a finite symbolic representation describing the situation sufficiently
but sparsely enough to be held in mind or to be communicated.
In this view, attention selects only a small subset of the information to be pro-
cessed (Chafe 1976, “center of attention”), but shifts rapidly across the scene.
Thus, complex representations arise not from the current center of attention
alone, but from the sequence of attention shifts as well. Applied to text pro-
duction, a speaker needs to make sure that the hearer’s center (or “focus”) of
attention moves along the lines he had in mind. Otherwise, the hearer cannot
obtain the mental representation the speaker wants him to construct. To prevent
such a failure of communication, the speaker has to be aware of the hearer’s
state of mind and of the effects a given utterance might have on the hearer’s
model of discourse.
In Cognitive Linguistics, iconic form-function mappings between mental
states and grammatical devices are assumed as a basis of a general framework
for mechanisms of attention control in discourse. However, these correlations
have remained notoriously vague, which prohibits their practical application
in the field of natural language generation (NLG). To overcome this problem,
I introduce salience as a cover term for properties of mental states such as
“discourse prominence” (Pustet, 1997), “activation” (Chafe, 1976), “topicality”
(Givón, 1983), etc., that have been defined in an abstract manner only.
I adopt a definition from the field of visual attention control (Koch and Itti,
2000): Salience is a situation-bound, dynamic property of entities within a mental
model. In contrast, attention is a binary property of a selected subset of
entities, but it tends to be attracted by a high degree of salience. Thus, attention
is an epiphenomenon of salience.1 Depending on the salience-induced topo-
logical structure (ranking, order) over the entities within a (discourse) model,
packaging preferences are assigned.

2. Salience in discourse

2.1. A generalized conception of salience


In linguistics, psychology, artificial intelligence and neighboring fields, different
(and partly contradictory) traditions using the notion of salience have evolved
during the last 30 years. Two extremes in the usage of this term can be
seen in the discussion of focal prosody (e.g. Davis and Hirschberg, 1988) and
in the discussion of referential accessibility in discourse (e.g. Sgall et al., 1986).
Pitch accents mark items as intonationally prominent and convey the relative
‘newness’ or ‘salience’ of items in the discourse. (Davis and Hirschberg, 1988)
… salience, [i.e.] foregrounding, or relative activation (in the sense of being
immediately ‘given’, i.e. accessible in memory). (Sgall et al., 1986, p.54f.)
I take these examples to be prototypes of two different dimensions of salience
corresponding to the two most elementary perspectives on information intended
to be uttered:
speaker salience (importance/newsworthiness) Speaker salient information
is speaker-private and relevant, e.g. new for the hearer, not predictable or
something the speaker wants to put special emphasis on.

1. The original definition as used by Koch and Itti is based on visual fields of neurons,
i.e. on areas within a scene, not entities. Though it seems that a similar dynamic
notion of salience is appropriate on a higher level of abstraction as well, researchers
on the interface of visual and linguistic salience modeled it in terms of size and ab-
solute position (Kelleher, this volume), denying the possibility of shifts of attention
at all. However, it seems to be generally accepted that this static notion of salience
is a heuristic approximation only, a generalization over a longer period of time de-
scribing the likelihood of an area to be salient.
hearer salience (accessibility/givenness) Hearer salient information is known
and easily retrievable for the hearer.
Successful communication crucially depends on the availability of both per-
spectives to a speaker. The aspect of speaker salience or importance is a nec-
essary pre-condition for any conversation, as it covers a speaker’s motivations
to produce an utterance. Besides this, if a speaker aims to produce text that is
directed to the hearer, s/he must have some ideas about the hearer’s current
attentional state, i.e., hearer salience.
Assume an idealized scenario where the speaker has no specific information
about the hearer’s state of mind. Then, we can characterize both dimensions of
salience as follows:
attention control Hearer salience reflects the current status quo, e.g. the atten-
tional state assigned to a discourse referent. Speaker salience arises from
intentions to modify this state.
intentionality Speaker salience is induced by the intentions of the speaker.
As no specific assumptions on the intentions of the hearers are avail-
able, hearer salience depends on contextual information available to both
speaker and hearer, thus, it is a property of the common ground between
them.
temporal scope Due to the lack of additional information, hearer salience
must be approximated from situational factors, world knowledge and the
previous discourse, thus, it is “backward-looking”. As opposed to this,
speaker salience or the underlying intentions affects the planning of the
further discourse; so, it can be estimated heuristically from properties of
the forthcoming discourse. In this respect, speaker salience is “forward-
looking”.
stimulus-dependence As speaker salience arises from intentional states, it can
be independent of the current situational context, whereas hearer salience
is stimulus-induced.
Previously, similar classifications have been proposed:
cognitive vs. surface-based Pattabhiraman (1992) introduced a distinction be-
tween “canonical salience” as a property of surface forms and “instantial
salience” of cognitive concepts. He presented an algorithm for the as-
signment of grammatical devices such that the canonical salience of the
resulting expression corresponds to the instantial salience of the underly-
ing concept as much as possible. As a consequence, canonical salience of
surface forms uttered before (i.e. encoded instantial salience) and instantial
salience of cognitive concepts to be uttered in the forthcoming discourse
can be distinguished, where the former is available to both hearer
and speaker but the latter is private to the speaker alone. However, he
did not explore the implications of this distinction for communication in
general but concentrated on the production perspective only.
perspective In their application of the Prague model of salience to dialogue,
Hajičová et al. (1998) proposed a distinction between two different
knowledge stocks (discourse models): The individual stock of dialogue
knowledge (ISDK) and the shared stock of dialogue knowledge (SSDK).
Accordingly, the activation or salience of entities in the SSDK is based
on the common ground between the discourse participants, whereas the
“activation degrees of entities in the ISDK depend on the participant’s
own attention, dialogue intentions, etc.” (p.386). Thus, it is possible to
account for different uses of referring expressions according to perspec-
tives of different discourse participants.
salience indication vs. foregrounding Navaretta (2002) identified two com-
ponents of salience affecting the interpretation of pronominal anaphors in
Danish: givenness and explicit salience marking of the antecedent (cf. the
dichotomy of “inherent salience” and “imposed salience” by Mulkern
2003). While givenness (or accessibility) derives from discourse factors
such as frequency and previous mention, explicit salience marking can
be used to boost the salience of a referent that is not sufficiently given.
Accordingly, two functions of grammatical devices can be distinguished:
indication (by using canonical constructions when referring to a referent)
and modification (by using special marked constructions) of the inherent
salience, i.e. the attentional state a referent has for the hearer.

Following a general psychological conception of salience, I consider it to be
not necessarily a property of linguistic expressions and the perceived environment,
but a general cognitive conception, thus a matter of the perception
of situational factors, of the interpretation of linguistic cues, and of the mental
representation of intentional and emotional states. In particular, salience is a
necessary condition for shifts of attention as described above.

2.2. Phenomenology
Here, I focus on three phenomena: Choice of referring expressions (REF), as-
signment of grammatical roles (GR) and word order effects (WO).
Where salience is defined in terms of givenness or accessibility, it has frequently
been remarked that the more salient a discourse referent is, the less complex,
the less semantically rich and the less emphasized the referring expressions
are expected to be. In particular, pronouns are expected to denote more salient referents
than full descriptions. Similarly, salience rankings of grammatical roles (Grosz
et al., 1995; Givón, 2001) follow a conventional hierarchy of markedness with
subject (nominative case) being more frequent and less phonologically com-
plex than direct object, etc. Following Givón (1995, p.25-69), markedness is
defined in terms of complexity and non-conventionality (inverse frequency).
So, for REF and GR, an iconic mapping can be assumed correlating surface
or empirical measures of markedness with the underlying degree of relative
salience:

The more salient a discourse referent is,
the less marked it is expected to be encoded.

The third dimension under consideration is word order. Both assignment of
grammatical roles and choice of referring expressions are entangled with word
order preferences. Generally, less complex forms (e.g. pronouns) tend to pre-
cede more complex forms (Hawkins, 1992), and subjects tend to precede other
grammatical roles (Greenberg, 1963). To integrate these tendencies into the
iconicity principle, a gradient increase of markedness (and a decrease of under-
lying salience) is assumed along with the sequential order of elements within
a clause from left to right (Sgall et al., 1986). With this hypothesis, a unified
model for the planning of the choice of referring expressions, grammatical roles
and word-order preferences can be developed with iconic mappings as illus-
trated in Fig. 1.

markedness    −                  +
REF           pronoun            full description
WO            left-peripheral    right-peripheral
GR            subject            non-subject
salience      +                  −
Figure 1. Simplified markedness hierarchies and salience.

The notion of salience as defined here involves two different temporal or partici-
pant perspectives, and indeed, similar claims on multi-dimensionality have been
made for the levels of referring expressions (cf. deviations from iconic map-
ping; Ariel, 1990, p. 191ff.) and grammatical roles (cf. the aspects of “discourse
prominence”: indicating a horizon and fore-/backgrounding; Pustet, 1997). With
respect to word-order preferences, multiple factors have been considered related
to both dimensions of salience. On the one hand, it is claimed that given elements
tend to precede new elements (Sgall et al., 1986); on the other, it has
been shown that – at least for German – this tendency is not absolute (Weber
and Müller, 2004). Instead, Strube and Hahn (1999) suggested that relative ordering
has an effect on the accessibility in forthcoming discourse; it is thus a
device of foregrounding similar to grammatical roles.

2.3. A general framework


I suggest that from the interaction of two dimensions of salience, coding pref-
erences can be predicted. A generalized framework is sketched that allows for
the reconstruction of two major theories of referential coherence in discourse:
Givón’s (1983; 2001) topicality approach and Centering (Grosz et al., 1995).
Both approaches distinguish two perspectives on discourse, a backward-looking/
anaphoric aspect on the one hand, and a forward-looking/cataphoric aspect on
the other hand, that can be related to hearer salience and speaker salience re-
spectively.
Generalizing over the observations made in the last two sections, I propose
the following characteristics for an operationalizable framework of attention
control and referential coherence in discourse:
– Salience induces a ranking over entities within a mental model, e.g., a discourse
model. Here, I distinguish
• hearer salience, i.e., the degree of attention/prominence a speaker as-
sumes that a hearer assigns to a given discourse entity, and
• speaker salience, i.e., the degree of attention/prominence/emphasis a
speaker puts on an entity.
– The prototypical function of hearer salience is the indication of the (as-
sumed) degree of attention that a referent is assigned according to situation
and previous discourse.
– The prototypical function of speaker salience is to announce shifts of atten-
tion, it is thus sensitive to speaker-private knowledge and properties of the
subsequent discourse.
– Hearer salience and speaker salience interact and are mapped iconically onto
grammatical devices according to underlying markedness hierarchies, cf.
Pattabhiraman’s (1992) mapping from instantial onto canonical salience.
– The parameters, hearer salience, speaker salience, and packaging prefer-
ences can be represented by numerical scores, with salience scores and pack-
aging preferences defined as normalization of the weighted sum (linear com-
bination) of parameter values of hearer and speaker salience respectively.
Then, different configurations can be implemented by the assignment of dif-
ferent weights.
– Both grammar-dependent parameter values and preference deduction are
based on markedness hierarchies.
For presentational purposes, salience scores are modeled as real numbers on
the scale [0, 1], with 0 representing the lowest degree of salience and 1 the highest
degree. As a general form for salience scores, the following standard representation
is proposed:

\[ sal(r) = \frac{1}{1 + \sum_i w_i \, x_i(r)} \tag{1} \]

In eq. (1), $x_i$ denotes the value of the i-th salience factor for the referring expression
r in the actual utterance and $w_i$ the corresponding weight.
This set of assumptions constitutes the Mental Salience Framework.
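
As a minimal sketch of eq. (1), the following Python function computes a salience score from a weighted sum of factor values. The function name `salience`, the dictionary-based interface and the factor labels are illustrative assumptions rather than part of the framework's definition. Factors with weight 0 are skipped, so that an infinite factor value combined with a zero weight does not produce an undefined product.

```python
import math
from typing import Mapping


def salience(weights: Mapping[str, float], values: Mapping[str, float]) -> float:
    """Eq. (1): sal(r) = 1 / (1 + sum_i w_i * x_i(r)).

    `weights` maps factor names to w_i and `values` maps the same names to
    x_i(r). Names and interface are hypothetical, chosen for illustration.
    """
    total = 0.0
    for name, weight in weights.items():
        if weight == 0:
            continue  # a zero-weighted factor contributes nothing (avoids 0 * inf)
        total += weight * values[name]
    return 1.0 / (1.0 + total)  # 1 / inf evaluates to 0.0 in Python


if __name__ == "__main__":
    w = {"RD_ante": 1.0, "1-GR_ante": 1.0}
    print(salience(w, {"RD_ante": 0.0, "1-GR_ante": 0.0}))       # highest possible score
    print(salience(w, {"RD_ante": math.inf, "1-GR_ante": 1.0}))  # referent without antecedent
```

Different theories then correspond to different `weights` dictionaries applied to the same factor values.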

Figure 2. A minimal parameterized framework, schematically.

3. Adequacy

The proposal sketched in Section 2.3 is a meta-theoretical abstraction. Thus,
I do not claim that the framework is cognitively valid, but only that it is adequate
with respect to existing theories that provide independent empirical evidence.
This adequacy claim is justified by the reconstruction of two leading theories.

3.1. Reconstructing Centering Theory


Centering Theory is a model of local discourse coherence that defines relation-
ships between centers (referring expressions) in subsequent utterances, often
applied with special emphasis on its effect on pronominalization and anaphor
resolution.
In Canonical Centering Theory (CCT) (Grosz et al., 1995), the entities of
an utterance $U_n$ constitute the set of forward-looking centers $C_f(U_n)$ that subsumes
possible antecedents for anaphoric references in the forthcoming discourse.
The forward-looking centers are ordered according to their relative salience
as represented by their grammatical roles (SBJ > DIR-OBJ > INDIR-OBJ
> OTHER). The backward-looking center $C_b(U_{n+1})$ of the following utterance
$U_{n+1}$ is then defined as the highest-ranked entity from $C_f(U_n)$ that is
realized in $U_{n+1}$. Centering Theory posits a weak constraint on the usability
of pronouns: If a pronoun occurs in $U_{n+1}$, then $C_b(U_{n+1})$ must be
pronominalized, too (“Rule 1”). Further, it states
preferences for transitions between utterances based upon salience characteristics
in $U_n$ and $U_{n+1}$ (“Rule 2”): keep the same entity as backward-looking center
in both $U_n$ and $U_{n+1}$, interpret the subject of $U_n$ as the backward-looking
center of $U_n$, and keep the subject (“preferred center”) of $U_n$ as the backward-looking
center of the following utterance $U_{n+1}$ (Kibble, 2003).
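
The definitions above can be illustrated with a small Python sketch. It assumes, following Grosz et al. (1995), that the backward-looking center of an utterance is the highest-ranked forward-looking center of the previous utterance that is realized again. The `Mention` record, the role labels and all function names are hypothetical, and Rule 2 (transition preferences) is not modeled.

```python
from typing import List, NamedTuple, Optional

# Ranking of grammatical roles as given in the text (lower number = more salient).
ROLE_RANK = {"SBJ": 0, "DIR-OBJ": 1, "INDIR-OBJ": 2, "OTHER": 3}


class Mention(NamedTuple):
    entity: str    # discourse referent, e.g. "te" for Terry
    role: str      # one of the keys of ROLE_RANK
    pronoun: bool  # True if realized as a pronoun


def forward_centers(utterance: List[Mention]) -> List[str]:
    """C_f(U): entities of the utterance, ordered by grammatical role."""
    ordered: List[str] = []
    for mention in sorted(utterance, key=lambda m: ROLE_RANK[m.role]):
        if mention.entity not in ordered:
            ordered.append(mention.entity)
    return ordered


def backward_center(prev: List[Mention], cur: List[Mention]) -> Optional[str]:
    """C_b(U_{n+1}): highest-ranked element of C_f(U_n) realized in U_{n+1}."""
    realized = {m.entity for m in cur}
    for entity in forward_centers(prev):
        if entity in realized:
            return entity
    return None


def satisfies_rule1(prev: List[Mention], cur: List[Mention]) -> bool:
    """Rule 1: if any pronoun occurs in U_{n+1}, C_b(U_{n+1}) is a pronoun too."""
    cb = backward_center(prev, cur)
    if cb is None or not any(m.pronoun for m in cur):
        return True
    return any(m.pronoun for m in cur if m.entity == cb)


if __name__ == "__main__":
    u3 = [Mention("te", "SBJ", True), Mention("to", "DIR-OBJ", True)]  # "He called him"
    u4 = [Mention("to", "SBJ", False)]                                 # "Tony was sick"
    print(backward_center(u3, u4), satisfies_rule1(u3, u4))
```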
In Centering, two aspects of salience are distinguished: Salience of potential
backward-looking centers resulting from the assignment of grammatical roles
in the preceding utterance, and salience of forward-looking centers as expressed
by the assignment of grammatical roles in the actual utterance. The dichotomy
of two types of “centers” follows a similar criterion of temporal scope as the
distinction between hearer salience and speaker salience as introduced above.
Hearer salience $hsal_{CCT}(r)$ of a referring expression r in utterance $U_n$ (i.e.
salience of r as a potential backward-looking center) can be modeled as the
relative grade of the grammatical role of the antecedent of r ($GR_{ante}$, cf. Fig. 3)
if it occurred in the directly preceding utterance $U_{n-1}$. Accordingly, speaker
salience $ssal_{CCT}(r)$ should be predictable from pronominal references to r in
the directly following utterance $U_{n+1}$ ($REF_{ana}$).
The restriction of the canonical model to relations between directly neigh-
boring sentences seems to be unnatural, so it was suggested that more distant ut-
terances contribute to the salience of a discourse referent, but to a lower degree
than the last utterance. This assumption yields Left-Right-Centering (LRCT;
Tetreault, 1999).
To model hearer salience and speaker salience respectively, an explicit measurement
of distance must be integrated. Referential distance is defined as the
number of clauses between an antecedent and an anaphor (Givón, 1983): $RD_{ante}$
is the referential distance of an anaphoric link with the anaphor in $U_n$, $RD_{ana}$ is
the distance of a link whose antecedent is in $U_n$.
Using this definition, hearer salience and speaker salience for LRCT can be
modeled as follows:2
\[ hsal_{LRCT}(r) = \frac{1}{1 + RD_{ante}(r) + (1 - GR_{ante}(r))} \tag{2} \]

\[ ssal_{LRCT}(r) = \frac{1}{1 + RD_{ana}(r) + (1 - REF_{ana}(r))} \tag{3} \]
As demanded in the preceding section, the outcome is normalized so that 1 denotes
the highest possible salience score. If no antecedent (anaphor) exists,
$RD_{ante}$ ($RD_{ana}$) is infinite, thus $hsal_{LRCT}$ ($ssal_{LRCT}$) converges to 0. For
Canonical Centering, the locality constraint can be implemented by replacing
$RD_{ante}$ with 1/1/$RD_{ante}$.
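
Equations (2) and (3) translate directly into code. The sketch below is one possible rendering in Python; the function names are assumptions, and a missing antecedent or anaphor is represented by `math.inf`, which makes the score evaluate to 0. The locality variant for Canonical Centering is not implemented here.

```python
import math


def hsal_lrct(rd_ante: float, gr_ante: float) -> float:
    """Eq. (2): hearer salience in the Left-Right-Centering reconstruction."""
    return 1.0 / (1.0 + rd_ante + (1.0 - gr_ante))


def ssal_lrct(rd_ana: float, ref_ana: float) -> float:
    """Eq. (3): speaker salience in the Left-Right-Centering reconstruction."""
    return 1.0 / (1.0 + rd_ana + (1.0 - ref_ana))


if __name__ == "__main__":
    print(hsal_lrct(0, 1.0))         # antecedent is subject of the preceding utterance
    print(hsal_lrct(1, 1.0))         # one intervening utterance
    print(hsal_lrct(math.inf, 0.0))  # no antecedent at all
    print(ssal_lrct(0, 0.5))         # full description in the next utterance
```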
As an alternative to the canonical salience ranking, in Functional Centering
(FCT, Strube and Hahn, 1999), the ordering of potential backward-looking
centers is replaced by a ranking based on information status, embedding depth
and relative word order. Following Rambow’s (1993) account of Centering and
word order in German, I concentrate on word order as a determinant of the ranking
of forward-looking centers ($WO_{ante}$), further abbreviated as WOCT.3

\[ hsal_{WO}(r) = \frac{1}{1 + RD_{ante}(r) + WO_{ante}(r)} \tag{4} \]

Again, speaker salience (i.e. salience of forward-looking centers) can be approximated
by the choice of referring expressions for an anaphor in the following
utterance ($REF_{ana}$). Then, grammatical roles and word order are predicted
from the relative ranking of discourse referents according to their speaker
salience, whereas referring expressions are predicted from hearer salience directly.

2. In this formalization, referential distance is the most influential factor on salience
(step-width is 1), with $GR_{ante}$ and $REF_{ana}$ providing minor distinctions among cases
with equal distance ($(1 - GR_{ante}) < 1$, $(1 - REF_{ana}) < 1$).
3. WOCT covers only one of Strube and Hahn’s (1999) original salience ranking determinants.
However, it is generally assumed that word order in German (with the possible
exception of the vorfeld) also reflects information status (Kruijff et al., 2001),
which is consistent with the simplification of Functional Centering proposed here.
For the sake of clarity, however, it should be noted that WOCT is not to be confused
with Strube and Hahn’s (1999) original proposal.

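Consider an utterance $U_n$ and a referring expression $r$ with antecedent $q$ in a preceding utterance $U_k$ and
anaphor $s$ in a subsequent utterance $U_l$ ($k < n < l$).

properties of antecedent

\[ RD_{ante}(r) = \begin{cases} \infty & \text{iff } r \text{ has no antecedent} \\ n - k - 1 & \text{else} \end{cases} \]

\[ GR_{ante}(r) = \begin{cases} 0 & \text{iff } r \text{ has no antecedent} \\ 1 & \text{iff } q \text{ is subject} \\ 0.9 & \text{iff } q \text{ is direct object} \\ 0.8 & \text{iff } q \text{ is indirect object} \\ 0.7 & \text{else} \end{cases} \]

\[ WO_{ante}(r) = \begin{cases} 0 & \text{iff no antecedent} \\ \dfrac{\#\text{words in } U_k \text{ before } q}{\#\text{words in } U_k - \#\text{words in } q} & \text{else} \end{cases} \]

properties of anaphor(s)

\[ RD_{ana}(r) = \begin{cases} \infty & \text{iff no anaphor to } r \text{ exists} \\ l - n - 1 & \text{else} \end{cases} \]

\[ REF_{ana}(r) = \begin{cases} 0 & \text{iff } r \text{ has no anaphor} \\ 1.0 & \text{iff } s \text{ is pronominal} \\ 0.5 & \text{else (i.e. } s \text{ is a full description)} \end{cases} \]

\[ TP(r) = \frac{\#\text{mentions of } r \text{ within the next 20 utterances}}{20} \]

Figure 3. Parameters considered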
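
The parameter definitions of Fig. 3 can be sketched as small Python helpers. All function names, the string labels for grammatical roles and referring forms, and the use of `None` for a missing antecedent or anaphor are assumptions made for illustration. The guard against a zero denominator in `wo_ante` is likewise an added assumption, since Fig. 3 does not specify the case in which the antecedent spans the whole utterance.

```python
import math
from typing import Optional

# Scores for the grammatical role of the antecedent q, as listed in Fig. 3.
GR_SCORES = {"subject": 1.0, "direct object": 0.9, "indirect object": 0.8}


def rd_ante(n: int, k: Optional[int]) -> float:
    """Referential distance to the antecedent in U_k (k < n); inf if there is none."""
    return math.inf if k is None else float(n - k - 1)


def rd_ana(n: int, l: Optional[int]) -> float:
    """Referential distance to the anaphor in U_l (n < l); inf if there is none."""
    return math.inf if l is None else float(l - n - 1)


def gr_ante(role: Optional[str]) -> float:
    """Grade of the antecedent's grammatical role; 0 without antecedent."""
    if role is None:
        return 0.0
    return GR_SCORES.get(role, 0.7)


def ref_ana(form: Optional[str]) -> float:
    """1.0 for a pronominal anaphor, 0.5 for a full description, 0 without anaphor."""
    if form is None:
        return 0.0
    return 1.0 if form == "pronoun" else 0.5


def topic_persistence(mentions_in_next_20: int) -> float:
    """TP(r): mentions of r within the next 20 utterances, divided by 20."""
    return mentions_in_next_20 / 20


def wo_ante(words_before_q: Optional[int], words_in_uk: int = 0, words_in_q: int = 0) -> float:
    """Relative position of the antecedent q in U_k; 0 without antecedent."""
    if words_before_q is None:
        return 0.0
    remaining = words_in_uk - words_in_q
    if remaining == 0:
        return 0.0  # assumption: q is the whole utterance, treated as initial
    return words_before_q / remaining
```

For instance, `rd_ante(5, 3)` gives the referential distance of Terry in sentence (5), whose last mention was in sentence (3).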

However, as the pronominalization rule (Rule 1) of Centering is underspecified
with respect to coding decisions, a stronger formulation is needed for practical
application in NLG (Kibble and Power, 2000). As a referential distance lower
than 1 is a necessary (but not sufficient) condition for the use of pronouns in
CCT, a pronominalization threshold of 0.5 is suggested as a first approximation,
i.e., if hsal(r) > 0.5, use a pronoun, unless this is prohibited by ambiguity of
reference or a higher-ranked referent has already been encoded as a full description;
otherwise, use a full description.
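
The threshold rule can be written as a short decision function. The sketch below is only one reading of the rule: how ambiguity of reference is detected is not specified in the text, so it is passed in as a boolean flag, and the function and parameter names are hypothetical.

```python
def choose_referring_expression(
    hsal: float,
    ambiguous: bool = False,
    higher_ranked_is_full: bool = False,
    threshold: float = 0.5,
) -> str:
    """Return "pronoun" or "full description" for a referent.

    hsal                  -- hearer salience score of the referent
    ambiguous             -- True if a pronoun would be referentially ambiguous
    higher_ranked_is_full -- True if a higher-ranked referent has already been
                             encoded as a full description
    threshold             -- pronominalization threshold (0.5 as a first approximation)
    """
    if hsal > threshold and not ambiguous and not higher_ranked_is_full:
        return "pronoun"
    return "full description"


if __name__ == "__main__":
    print(choose_referring_expression(1.0))  # above the threshold
    print(choose_referring_expression(0.5))  # threshold not exceeded
```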

3.2. Reconstructing Topicality


Following Givón (2001), topicality is a cognitive dimension that has to do with
attention control mechanisms and discourse prominence. The two functional
dimensions underlying topicality are anaphoric (“givenness”) and cataphoric
topicality (“importance”). Heuristically, anaphoric topicality is approximated
by referential distance (RD), whereas cataphoric topicality can be approximated
by topic persistence (TP), i.e. the number of mentions of the referent in the
following (up to) 20 clauses.
In the reconstruction, hearer salience is equated with anaphoric topicality,
with referential distance as its only factor, whereas speaker salience is equated
with cataphoric topicality, with topic persistence as its only factor. Topic per-
sistence is normalized.
\[ hsal_{TOP}(r) = \frac{1}{1 + RD_{ante}(r)} \tag{5} \]

\[ ssal_{TOP}(r) = \frac{1}{1 + (1 - TP(r))} \tag{6} \]
The interaction of both dimensions on the preference deduction layer seems to
be complex but is not explicitly described. Instead, Givón argues that both di-
mensions of topicality form one single and homogeneous dimension of topical-
ity, illustrating effects by revealing correlations between grammatical devices
and topicality measures directly. Here, hearer salience is taken as the main de-
terminant of REF (a strong correlation of pronominalization with referential
distance has been proven), speaker salience is taken as the only determinant of
WO (Givón claims that the impact of cataphoric topicality is greater than the
impact of anaphoric topicality), but GR preferences are calculated by the inter-
action of both dimensions (according to Givón 2001, both factors contribute).
For this combination, addition of hearer salience and speaker salience is sug-
gested.
As pronominalization threshold, assume 0.5 again.
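
The topicality reconstruction can be sketched in the same way. The functions below implement equations (5) and (6) and the suggested additive combination for GR preferences; the function names are illustrative assumptions, and addition is only the combination suggested in the text, not a claim about Givón's own model.

```python
import math


def hsal_top(rd_ante: float) -> float:
    """Eq. (5): anaphoric topicality, based on referential distance only."""
    return 1.0 / (1.0 + rd_ante)


def ssal_top(tp: float) -> float:
    """Eq. (6): cataphoric topicality, based on normalized topic persistence."""
    return 1.0 / (1.0 + (1.0 - tp))


def gr_score_top(rd_ante: float, tp: float) -> float:
    """GR preference score: sum of hearer and speaker salience."""
    return hsal_top(rd_ante) + ssal_top(tp)


if __name__ == "__main__":
    print(hsal_top(0), hsal_top(1), hsal_top(math.inf))
    print(ssal_top(1 / 20))
    print(gr_score_top(0, 1 / 20) > gr_score_top(1, 1 / 20))
```

Under these definitions, REF would be derived from `hsal_top`, WO from `ssal_top`, and GR from `gr_score_top`.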

3.3. A minimal instantiation


For a minimal instantiation of the framework described above, the set of param-
eters as shown in Fig. 3 is considered. Topicality and the instantiations of Cen-
tering Theory can be represented by the choice of weights using these factors or
parameters. Then, hearer and speaker salience are calculated as reciprocal of the
weighted sum of parameter values. As result values, GR and WO preferences
are assigned according to relative differences in salience scores, whereas de-
rived REF preferences depend on absolute values (and ambiguity interference)
as described above.
For the purpose of illustration, consider example sentence (5). Using the
parameter weights as summarized in Fig. 4, hearer and speaker salience are
calculated respectively, and preferences can be derived.

parameters        weights for hearer salience      weights for speaker salience
                  LRCT      WOCT      TOP          LRCT/WOCT      TOP
RD_ante           1         1         1
1 − GR_ante       1         0         0
WO_ante           0         1         0
RD_ana                                             1              0
1 − REF_ana                                        1              0
TP                                                 0              1
Figure 4. Reconstructing Left-Right-Centering (LRCT), Centering with word-order-based
salience ranking of forward-looking centers (WOCT), and topicality (TOP).
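
The weight configurations of Fig. 4 can be expressed as plain data and fed into eq. (1). The following sketch is an illustration under stated assumptions: the dictionary layout and all names are hypothetical, and the last speaker-salience factor is taken to be the term 1 − TP, as in eq. (6), although Fig. 4 labels the row TP.

```python
from typing import Mapping

# Weights for hearer salience, one configuration per reconstruction (Fig. 4).
HEARER_WEIGHTS = {
    "LRCT": {"RD_ante": 1, "1-GR_ante": 1, "WO_ante": 0},
    "WOCT": {"RD_ante": 1, "1-GR_ante": 0, "WO_ante": 1},
    "TOP":  {"RD_ante": 1, "1-GR_ante": 0, "WO_ante": 0},
}

# Weights for speaker salience. Assumption: the factor "1-TP" follows eq. (6).
SPEAKER_WEIGHTS = {
    "LRCT/WOCT": {"RD_ana": 1, "1-REF_ana": 1, "1-TP": 0},
    "TOP":       {"RD_ana": 0, "1-REF_ana": 0, "1-TP": 1},
}


def salience(weights: Mapping[str, float], values: Mapping[str, float]) -> float:
    """Eq. (1): reciprocal of one plus the weighted sum of factor values."""
    total = sum(w * values[name] for name, w in weights.items() if w != 0)
    return 1.0 / (1.0 + total)


if __name__ == "__main__":
    # Factor values for Terry in sentence (5): last mention in (3), as subject,
    # sentence-initial.
    terry = {"RD_ante": 1.0, "1-GR_ante": 0.0, "WO_ante": 0.0}
    for theory, weights in HEARER_WEIGHTS.items():
        print(theory, salience(weights, terry))
```

Switching between reconstructions then amounts to selecting a different entry of the weight tables while the factor values stay the same.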
Considering Terry (te), we find that $RD_{ante}(te) = 1$ (his last mention was
in sentence (3)) and $GR_{ante}(te) = 1.0$ (subject in (3)). Inserting these values in
equation (1) using the parameter weights for hearer salience (hsal) in the LRCT
reconstruction as summarized in Fig. 4, we obtain a formula identical to equation (2).
Thus, $hsal_{LRCT}(te)$ can be calculated as 0.5. As the proposed pronominalization
threshold is not met, we predict a nominal description. Accordingly,
$hsal_{CCT}(te)$ converges to 0, thus the coding preferences in Canonical Centering
would be identical. The antecedent of Terry is sentence-initial in (3), so
$hsal_{WO}(te)$ is 0.5, too. In the topicality reconstruction, where referential distance
is the only parameter of hearer salience, the same prediction is obtained.
The corresponding parameter values for Tony (to) in (5) are: $RD_{ante}(to) = 0$
(last mention in (4)), $GR_{ante}(to) = 1.0$ (subject) and $WO_{ante}(to) = 0$ (sentence-initial).
So, hearer salience of Tony is calculated as 1 for LRCT, CCT, WOCT
and TOP equally. This exceeds the respective pronominalization threshold. As
the only possible interfering referent Terry has a sufficiently lower degree of
hearer salience, no restrictions arise from ambiguity avoidance strategies. So,
we can safely refer to Tony with a pronoun, just as in Grosz et al.’s original
example.
For speaker salience (ssal), we find that anaphoric references to Terry and Tony
occur in the directly following utterance ($RD_{ana} = 0$), both with full descriptions
($REF_{ana} = 0.5$), but only once in the forthcoming discourse (TP = 1/20). Thus,
speaker salience is identical for both Terry and Tony: 1/1.5 in the Centering
reconstructions (eq. (3)) and 20/39 in the topicality reconstruction (eq. (6)).
As grammatical role and word order preferences are determined in the Centering
reconstruction by relative differences between speaker salience scores,
no preferences for GR or WO can be derived here. The same is true for WO
preferences in the topicality reconstruction. However, GR preferences in the
topicality reconstruction are calculated from the interaction (e.g. addition) of
hearer and speaker salience scores, but not from speaker salience alone. Therefore,
Tony’s score ($hsal_{TOP}(to) = 1$) exceeds Terry’s ($hsal_{TOP}(te) = 0.5$), and
we predict Tony to be the preferred subject and Terry to be non-subject. In fact, the
opposite decision was taken in Grosz et al.’s constructed example. However,
this is very likely to be due to constraints from verbal semantics, as a more
agentive realization of Tony in a sentence semantically roughly equivalent to
ex. (6) would be rather odd (cf. ex. (6’)).
(6’) #Of course, Tony_to has not been intended to get upset by Terry_te.
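
The figures of this worked example can be re-derived with a few lines of Python. The sketch below restates equations (2), (3), (5) and (6) with hypothetical function names and plugs in the parameter values given in the text for Terry and Tony in sentence (5); the GR preference in the topicality reconstruction is computed by addition, as suggested above.

```python
def hsal_lrct(rd_ante: float, gr_ante: float) -> float:
    return 1.0 / (1.0 + rd_ante + (1.0 - gr_ante))  # eq. (2)


def ssal_lrct(rd_ana: float, ref_ana: float) -> float:
    return 1.0 / (1.0 + rd_ana + (1.0 - ref_ana))   # eq. (3)


def hsal_top(rd_ante: float) -> float:
    return 1.0 / (1.0 + rd_ante)                    # eq. (5)


def ssal_top(tp: float) -> float:
    return 1.0 / (1.0 + (1.0 - tp))                 # eq. (6)


# Parameter values for sentence (5) as stated in the text.
REFERENTS = {
    "Terry": {"rd_ante": 1, "gr_ante": 1.0, "rd_ana": 0, "ref_ana": 0.5, "tp": 1 / 20},
    "Tony":  {"rd_ante": 0, "gr_ante": 1.0, "rd_ana": 0, "ref_ana": 0.5, "tp": 1 / 20},
}


def scores(p: dict) -> dict:
    """All four salience scores for one referent."""
    return {
        "hsal_LRCT": hsal_lrct(p["rd_ante"], p["gr_ante"]),
        "ssal_LRCT": ssal_lrct(p["rd_ana"], p["ref_ana"]),
        "hsal_TOP": hsal_top(p["rd_ante"]),
        "ssal_TOP": ssal_top(p["tp"]),
    }


RESULTS = {name: scores(p) for name, p in REFERENTS.items()}

# GR preference in the topicality reconstruction: hearer plus speaker salience.
GR_TOP = {name: r["hsal_TOP"] + r["ssal_TOP"] for name, r in RESULTS.items()}
PREFERRED_SUBJECT = max(GR_TOP, key=GR_TOP.get)

if __name__ == "__main__":
    for name, r in RESULTS.items():
        print(name, r)
    print("preferred subject (TOP):", PREFERRED_SUBJECT)
```

Since both referents receive the same speaker salience, only the additive GR score of the topicality reconstruction separates them.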

Figure 5 summarizes the preferences for the whole text. Besides the
effects of a heuristic ambiguity resolution rule4 and partial indistinguishability,
few crucial deviations from the original coding decisions have been found. Here,
(3) seems to be the most critical instance, where actual word order and grammatical
role assignment deviate from both Centering and topicality preferences.
However, the interpretation of the pronouns in (3) depends on parallelism with
the previous utterance. A sentence like He_to was called (by him_te) at 6 AM. is
nearly incomprehensible. It would be necessary to use a nominal such as the
name for either Tony or both referents (as suggested by the theories).

Figure 5. Example: Parameter values, hearer salience (hsal), speaker salience (ssal) and
coding preferences.

This short example already reveals some limitations of approaches of this
kind. First, pragmatic preferences for word order, the assignment of grammatical
roles, and possibly the choice of referring expressions, too, are by no means
unrivaled. Rather, their application is most likely in cases where no other constraints
arising from syntax (e.g. binding restrictions), semantics (e.g. valency
frames of applicable verbs) or higher communicative goals (e.g. to add further
hearer-new information about a referent within a noun phrase) interfere.
Second, the theories and the corresponding reconstructions rely on surface-oriented
heuristics that are often too coarse-grained to generate clear distinctions,
as shown for word-order preferences in ex. (5). Third, other factors might
contribute to salience, too, such as parallelism effects.

4. In Fig. 5, full* means to use a non-pronominal form to avoid ambiguity if the absolute
salience score is sufficient for pronominalization; subject* and unspec indicate
that Tony and Terry are ranked equally; deviations from the original are marked bold.

4. The Mental Salience Framework: A summary

4.1. General characteristics


The Mental Salience Framework described in this paper consists of essentially
three components:
– differentiation between hearer salience (reference to attentional states of the
hearer) and speaker salience (modification of attentional states of the hearer),
– hearer salience and speaker salience are modeled as normalized linear com-
bination of different contextual factors, and
– coding preferences are traced back to the weighted sum of hearer salience
and speaker salience scores.

4.2. Adequacy with respect to existing theories


As the linear combination of contextual factors and the interaction between
hearer salience and speaker salience involves a number of parameters, differ-
ent parameter configurations can be considered, and as argued above, different
variants of Centering and Givón’s approach can be reconstructed by choosing
appropriate parameter values. The idea of an adequacy proof for these recon-
structions (for a full proof see Chiarcos, 2009) is as follows:
– Provide a definition of adequacy, i.e. all predictions of the original formu-
lation of the theory are predicted by the reconstruction (completeness), and
no prediction of the reconstruction is incompatible with the predictions of
the original formulation of the theory (compatibility).5

5. This definition of adequacy is inspired by the formal definition of equivalence. However,
equivalence differs from adequacy in that it requires soundness rather than
compatibility. A reconstruction is sound if “all predictions of the reconstruction are
predictions of the original formulation as well”. However, it has been recognized be-
fore that neither Centering Theory nor Givón’s approach are fully specified models
of discourse processing:
As such, Centering does not provide a model describing the cognitive underpin-
nings of the assignment of grammatical roles or other grammatical devices which
– Prove completeness of the reconstruction: Identify the set of empirically verifiable
assumptions and predictions made in the original theory, and prove
that these are predicted by the reconstruction as well. For Centering, we have
to show
• any difference in the ordering of two potential backward-looking centers
entails a difference between the hearer salience scores of the correspond-
ing referents (for Canonical Centering by definition, for Left-Right Cen-
tering by induction),
• if one element is pronominalized, then the backward-looking center is
pronominalized (proof by contradiction, assume that the backward-look-
ing center, i.e. the most hearer-salient referent, is not pronominalized. As
it is more salient than the other pronominalized element, it must exceed

indicate the ranking of forward-looking centers, and thus, it explains only the effect
of these grammatical devices, but not their assignment in discourse. Nevertheless, the
preference to keep the backward-looking center over a sequence of utterances (cf.
the notion of “preferred center”; Strube and Hahn, 1999) can be exploited to predict
the assignment of grammatical roles (Kibble and Power, 2004). It should be noted,
however, that these preference are deducted from preferences of transitions between
utterances only within the same discourse segment (Grosz and Sidner, 1986), and
that it is not clear to what degree these preferences extend towards the discourse as
a whole.
Similarly, Givón’s approach involves a differentiation between anaphoric and
cataphoric aspects of topicality, but he does not describe how these both dimensions
are interacting in the deduction of concrete coding decisions.
Therefore, like any practical application of a theoretical construct, a reconstruc-
tion within a formal framework relies on an interpretation which is maximally pre-
dictive in order to achieve concrete predictions, and thus, researchers are usually not
interested in equivalent reconstructions, but in reconstructions which involve a gain
in predictive power. However, such a reconstruction cannot be equivalent as it sys-
tematically violates the soundness criterion.
As an example, Beaver’s (2004) equivalence proof between Centering (as for-
mulated by Brennan et al. (1987)) and his reconstruction of Centering in Optimality
Theory represents in fact a proof of adequacy, as he claims that the reconstruction
Centering entails additional predictions that were not entailed from the original for-
mulation: “This declarativity means that COT is equally suited for generation or in-
terpretation. In contrast, the BFP algorithm is suited for interpretation only. It could
not be used to generate texts directly …”. As an alternative to partial equivalence
proofs as provided by Bearer, I suggest to distangle equivalence and adequacy and
focus on the adequacy between original formulation and the reconstruction rather
than on equivalence.
The Mental Salience Framework 121

the pronominalization threshold. As it is the most salient element, ambi-


guity does not require nominal realization),6 and
• if one element in the following utterance is pronominalized, its gram-
matical role should be higher in the current utterance than those of non-
pronominalized elements appearing in both clauses (proof by contradic-
tion: in the reconstruction, grammatical roles are assigned depending on
the speaker salience scores. The only factor of speaker salience in the
Centering reconstruction is pronominalization in the following utterance.
In the reconstruction, a violation of this preference is possible only if the
semantics in the current utterance do not permit this relative ranking of
grammatical roles. This can be easily contradicted by enumerating the
range of grammatical devices which allow the pragmatically adequate
generation of grammatical roles.)
– Prove compatibility of the reconstruction with the original formulation: For
Centering, we have to show
• if two elements differ in their hearer salience scores, then the lower-ranked
one must not have been more salient according to Centering. (proof by
contradiction),
• if a non-pronominal description is predicted by the reconstruction, then
Centering must not predict pronominalization (proof by contradiction,
analogous to completeness proof above), and
• if the reconstruction predicts the highest possible grammatical role, Centering
predicts the preferred center (proof by contradiction: assume that Centering
unambiguously predicts another element to be the preferred center;
then, it must have been the only pronoun in the following utterance, but
then, it must have been the backward-looking center of the following
utterance, and then, it must preferably have been the preferred center of the
last utterance.)
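The distinction between adequacy and equivalence drawn above can be made concrete with a small sketch. The encoding is an assumption made purely for illustration and is not part of the framework: a theory is represented as a mapping from a context label to the single outcome it predicts, and contexts about which a theory is silent are simply absent. The function names are hypothetical.

```python
from typing import Dict, Hashable

# Hypothetical encoding (not from the chapter): a theory maps a context label
# to the single outcome it predicts; contexts it is silent about are absent.
Predictions = Dict[Hashable, Hashable]


def is_complete(original: Predictions, reconstruction: Predictions) -> bool:
    """Every prediction of the original is also made by the reconstruction."""
    return all(ctx in reconstruction and reconstruction[ctx] == out
               for ctx, out in original.items())


def is_compatible(original: Predictions, reconstruction: Predictions) -> bool:
    """No prediction of the reconstruction contradicts one of the original."""
    return all(ctx not in original or original[ctx] == out
               for ctx, out in reconstruction.items())


def is_sound(original: Predictions, reconstruction: Predictions) -> bool:
    """Every prediction of the reconstruction is also made by the original."""
    return all(ctx in original and original[ctx] == out
               for ctx, out in reconstruction.items())


def is_adequate(original: Predictions, reconstruction: Predictions) -> bool:
    """Adequacy = completeness + compatibility."""
    return (is_complete(original, reconstruction)
            and is_compatible(original, reconstruction))


def is_equivalent(original: Predictions, reconstruction: Predictions) -> bool:
    """Equivalence = completeness + soundness."""
    return (is_complete(original, reconstruction)
            and is_sound(original, reconstruction))


# Toy data: the reconstruction adds one prediction the original does not make.
original = {"u1": "pronoun"}
reconstruction = {"u1": "pronoun", "u2": "subject"}

print(is_adequate(original, reconstruction))    # True
print(is_equivalent(original, reconstruction))  # False
```

In this toy setting, a reconstruction that gains predictive power remains adequate but necessarily fails the soundness check, which mirrors the argument made in footnote 5.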
For reasons of brevity, I restrict myself to this short sketch of the ideas behind
the proof. Based on these considerations, however, I conclude that the recon-
structions described above are adequate with respect to Centering. A similar
proof for Givón’s approach can be made,7 thus proving that the respective
reconstructions within the Mental Salience Framework are adequate, and thus, the
framework is capable of reconstructing two classical approaches.

6. However, this argument depends on the concrete definition of ambiguity. If
ambiguity is defined on morphological agreement only, then violations of Centering
predictions are possible. However, ambiguity is often resolved from verbal semantics,
and thus, these factors have to be considered as well.
7. For Givón, only compatibility can be proven, as Givón’s model is concerned with
the analysis of empirical preferences, without specifying concrete predictions.

4.3. Fields of application


The description of the Mental Salience Framework provided here focused on
methodological and theoretical aspects. Accordingly, only a minimal set of parameters
was considered, sufficient to reconstruct two classical approaches. In
particular, the current speaker salience metrics are incomplete: it cannot be ex-
pected that the speaker’s intentions can always be recovered from frequency
measures such as topic persistence.
Nevertheless, important results can be achieved, as partially elaborated by
Chiarcos (2009). Whereas speaker salience and hearer salience can be plausibly
extrapolated from the original formulation of the theories, the derivation
of concrete coding preferences is underspecified (especially for Centering), the
interaction of hearer and speaker salience is not fully clear (TOP) or controversial
(derivation of word-order preferences in WOCT and TOP), and the sets of
factors considered are not compatible across theories. An integrated framework
as suggested here can be used
– to perform a comparative empirical evaluation of different theories or
their reconstructions,
– to identify elementary factors considered in different theories and investigate
their respective effect on salience scores,
– to evaluate hybrid or modified models by introduction or re-weighting of
parameter values,
– to provide further insights into the interaction between speaker salience and
hearer salience based on empirical results, and finally
– for practical application in natural language generation (NLG).
With respect to the last point, it seems reasonable to implement speaker salience
in NLG systems as an external parameter providing an interface to integrate ex-
ternal “importance” assignments. Such importance assignments can be used by
a system designer to guide the attention of a user in a goal-directed way. Besides
this, hearer salience provides a mechanism for cohesive coding decisions based
on text-oriented measures. One of the most important results to be achieved in
empirical research is the clarification of the interaction of hearer and speaker
salience and their respective influence on the choice of different grammatical
devices.
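The division of labour described for NLG can be sketched roughly as follows. Everything specific in this sketch is an assumption made for illustration: the recency-based hearer salience measure, the threshold value, the fixed role inventory, and the function names are hypothetical and not prescribed by the framework. The only ideas taken from the text are that speaker salience enters as an external importance assignment that drives grammatical role assignment, and that hearer salience is computed from the text and compared against a pronominalization threshold.

```python
from typing import Dict, List, Tuple

PRONOUN_THRESHOLD = 0.75  # hypothetical value, chosen only for this example
ROLES = ["subject", "object", "oblique"]  # simplified role inventory


def hearer_salience(referent: str, previous_mentions: List[str]) -> float:
    """Text-oriented score (assumed here to be pure recency):
    1.0 for the most recent mention, decaying with distance, 0.0 if new."""
    for distance, mentioned in enumerate(reversed(previous_mentions)):
        if mentioned == referent:
            return 1.0 / (1 + distance)
    return 0.0


def plan_clause(referents: List[str],
                previous_mentions: List[str],
                importance: Dict[str, float]) -> List[Tuple[str, str, str]]:
    """Return (referent, grammatical role, form) triples.

    Roles follow the externally supplied importance (speaker salience);
    the form follows the text-based hearer salience."""
    ranked = sorted(referents, key=lambda r: importance.get(r, 0.0), reverse=True)
    plan = []
    for position, referent in enumerate(ranked):
        role = ROLES[min(position, len(ROLES) - 1)]
        salient = hearer_salience(referent, previous_mentions) >= PRONOUN_THRESHOLD
        plan.append((referent, role, "pronoun" if salient else "full NP"))
    return plan


# The system designer marks "user" as important; "printer" was mentioned last.
plan = plan_clause(["printer", "user"],
                   previous_mentions=["user", "printer"],
                   importance={"user": 0.9, "printer": 0.2})
print(plan)
```

Keeping the importance assignment outside the generator means that a designer can redirect the user's attention without touching the text-based salience computation.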

4.4. Extensions and challenges


The original motivation underlying the Mental Salience Framework was the in-
sight that a model of the attentional states of the hearer does not sufficiently con-
strain the choice of referring expressions, but that at least one additional dimen-
sion interfering with “givenness” must be considered as well (Chiarcos, 2011).
This observation has been made before; however, different candidates
for this alternative, interfering dimension affecting the use of referring expres-
sions have been proposed, e.g. contrastiveness, emphasis, importance (Givón,
1983; Levelt, 1989; Chafe, 1994), etc.
While the differentiation between common ground (as reflected in hearer
salience) and speaker-private knowledge (from which speaker salience arises) is
well justified and probably uncontroversial, the question remains whether these
aspects of salience, hearer salience and speaker salience, form by themselves
uniform dimensions of attentional states. Instead of defending this specific hy-
pothesis, I motivate this assumption from theoretical minimalism, i.e., method-
ological considerations. The postulation of another distinction between, say,
two kinds of speaker salience, must be justified from empirical findings which
cannot be covered by the existing model. Additional dimensions of salience
arising from other modalities, e.g. visual salience, do exist, but with respect
to salience in discourse, I am currently not aware of any empirical data that
makes a differentiation between more than two dimensions of hearer salience
or speaker salience for discourse referents necessary.
A challenging question, however, is whether the grammatical devices of one
type, say referring expressions, can be characterized by only one linear combination
of salience scores or whether hearer salience and speaker salience differ
in their impact on different grammatical devices. In fact, it has been suggested
for demonstratives as compared to personal pronouns that the conditions
licensing the use of demonstratives are more specific than those of personal
pronouns. For Finnish, Kaiser and Trueswell (to appear 2011) found a preference
for personal pronouns to co-refer with the subject of the preceding utterance,
whereas demonstratives preferred the last mentioned possible antecedent. They
explained this difference with different interacting dimensions, i.e. linguistic
structure (as indicated by grammatical roles) and information structure (as
indicated by word order in Finnish), that differ in their relevance to the choice of
pronouns as compared to the choice of a demonstrative. In the Mental Salience
Framework, this configuration could be modeled by defining hearer salience in
terms of grammatical roles, but assuming a greater influence of word order
(indicating non-salience, i.e. non-givenness) on speaker salience scores. Then, the
observed pattern can be achieved by defining that personal pronouns depend on
hearer salience alone, whereas demonstrative pronouns are sensitive to speaker
salience besides hearer salience.
This specific model of demonstratives, however, requires that not one cu-
mulated salience score for the generation of referring expressions is generated,
but that for certain smaller classes of referring expressions, individual scores
are calculated and then interpreted as the probability of using a specific kind of
referring expression. Thus, the association between salience score and a certain
grammatical device is no longer a direct one, say, a mapping from a certain score
on a scale to a preference for a certain form, but it is a mapping from a two-
dimensional space onto the preference for a given form, guided by the proximity
between the canonical salience of that form and the scores currently achieved.
The Mental Salience Framework permits this kind of extension, though it is
currently concentrating on the most elementary classes of grammatical devices,
abstracting from more fine-grained differentiations such as the differentiation
between pronouns and demonstratives.
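The proximity-based mapping from the two-dimensional salience space onto form preferences might be sketched as follows. The canonical salience points and the conversion of distances into probabilities (a softmax over negative Euclidean distance) are assumptions made for this illustration only; the text does not commit to particular values or to a particular distance measure.

```python
import math
from typing import Dict, Tuple

# Hypothetical canonical salience of each form as a point
# (hearer salience, speaker salience); the values are invented.
CANONICAL: Dict[str, Tuple[float, float]] = {
    "personal pronoun": (0.9, 0.1),
    "demonstrative": (0.6, 0.8),
    "full NP": (0.1, 0.3),
}


def form_preferences(hearer: float, speaker: float,
                     canonical: Dict[str, Tuple[float, float]] = CANONICAL
                     ) -> Dict[str, float]:
    """Map a point in the two-dimensional salience space to a probability
    for each form: the closer the canonical point, the higher the preference."""
    weights = {form: math.exp(-math.dist((hearer, speaker), point))
               for form, point in canonical.items()}
    total = sum(weights.values())
    return {form: weight / total for form, weight in weights.items()}


prefs = form_preferences(hearer=0.9, speaker=0.1)
print(max(prefs, key=prefs.get))  # personal pronoun
```

A variant in which personal pronouns are insensitive to speaker salience, as suggested for the Finnish data, could be obtained by weighting the two dimensions differently per form; this is left out here to keep the sketch short.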

Another possible extension is the combination of the Mental Salience Framework
with learning algorithms. As factors, salience scores and coding preferences
are specified by numerical scores, which are obtained from linear combinations,
this network can be interpreted as a multi-layer perceptron whose
weights (parameters) can be set by backpropagation.
As a result, the Mental Salience Framework allows not only for the compar-
ative representation and evaluation of different theories, but also for data-driven
parameter weighting.
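As a minimal sketch of such data-driven parameter weighting, the following example fits the weights of a single linear combination by gradient descent. This is a deliberate simplification: it uses one layer rather than a multi-layer perceptron, so backpropagation reduces to a single gradient step per sample. The two factors and the toy training data are invented for illustration and are not taken from any corpus.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (factor values, 1 = pronoun, 0 = full NP)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights: Sequence[float], bias: float,
            features: Sequence[float]) -> float:
    """Probability of pronominalization from a linear combination of factors."""
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


def train(samples: List[Sample], epochs: int = 2000,
          learning_rate: float = 0.5) -> Tuple[List[float], float]:
    """Fit the weights of the linear combination by stochastic gradient descent."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            # Gradient of the log loss with respect to the linear score.
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * f
                       for w, f in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Invented factors: (was subject of previous utterance, was mentioned there).
samples: List[Sample] = [
    ((1.0, 1.0), 1),
    ((0.0, 1.0), 0),
    ((0.0, 0.0), 0),
    ((1.0, 1.0), 1),
]
weights, bias = train(samples)
print(predict(weights, bias, (1.0, 1.0)) > 0.5)  # True
```

With real annotated data, the learned weights could be compared directly against the parameter values used in the theory-driven reconstructions.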

5. Related research

The Mental Salience Framework represents a model of a specific insight into the
nature of attention control in discourse, that is, the distinction between different
dimensions of the salience of discourse referents, associated with different
functions in the flow of discourse: hearer salience (part of the speaker’s hearer
model, exploited by the speaker to generate expressions in a way that a hearer
can relate them to elements introduced in the discourse before) and speaker
salience (correlate of the speaker’s intention to guide the hearer’s attention to
certain referents, e.g. their role for the further development of the discourse).

5.1. Multidimensional models of salience in the generation of referring expressions

While proposals for multidimensional models of salience have been made before
(e.g. Givón, 1983; Clamons et al., 1993; Mulkern, 2003), these often remained
merely theoretical, and, to my knowledge, have not been formalized
within a model for the prediction of the choice of referring expressions, the
assignment of grammatical roles, and the deduction of word order preferences.
The differentiation between two types of salience in NLG contexts as proposed
by Pattabhiraman (1992) concerns another distinction, that is, the relationship of
the degree of (instantial) salience a cognitive representation has, and the degree
of (canonical) salience a grammatical device, or a given lexeme, is capable of
expressing. In his terminology, hearer salience and speaker salience are both
different aspects of instantial salience, whereas canonical salience is concerned
with the mapping between grammatical devices and salience scores. In fact,
Pattabhiraman’s model of salience in NLG can be used as an alternative to the
deductive linear combination approach presented here.
Pattabhiraman’s canonical salience is related to the notion of salience as
developed in the field of semantics of comparisons and metaphors. In her in-
vestigation of metaphorical and literal readings of potentially metaphorically
interpretable expressions, Giora (1999) introduced the notion of salience as
an assessment for the likelihood of a semantic meaning a given sequence of
words can be assigned. Similarly, in his classical work on comparisons, Tver-
sky (1977) postulated that semantic features differ in their relative salience for
different elements, and that these differences have an effect on the ordering of
elements in a comparison. In a later extension of Tversky’s work, Ortony (1979)
found that feature salience has an effect on the well-formedness of metaphoric
expressions. Horacek’s algorithm for the generation of referential descriptions
(Horacek, 1997) broadened Tversky’s and Ortony’s understanding of salience
by identifying the role of property salience in the generation of referring expres-
sions in general, that is, to account for the observation that referring expressions
often involve attributes that are not primarily motivated by their capability to
distinguish the given referent from a set of semantically compatible distractors,
but from independent considerations. Property salience is, however, a feature
of attributes, not discourse referents (as hearer salience and speaker salience).
Property salience and object salience are independent from each other, and, as
suggested by van der Sluis and Krahmer (2001), it can be assumed that both
dimensions co-operate with other dimensions of salience in the production of
the form of referring expressions. The third dimension of salience considered
by van der Sluis and Krahmer comes from environmental factors, especially the
visual surrounding. Effects of the situational context on the choice of referring
expressions have been observed frequently before. Similar to Bühler’s (1934)
interpretation of deixis as an extension of anaphora, Prince (1981) considers
“situationally evoked” and “textually evoked entities” to form a homogeneous
group of highly activated (evoked) referents. Also in the context of multi-modal
generation of referring expressions, the interaction between visual salience and
linguistic salience has been investigated (cf. Kelleher, this volume).
With respect to other existing multi-dimensional models of salience, we may
conclude that besides hearer salience and speaker salience, additional dimen-
sions of salience can be assumed which differ from salience as understood here
(entity-based salience in discourse) in their domain (canonical salience/feature
salience/property salience) or their modality (visual salience), and that are thus
independent from the dimensions of salience discussed here, which are more
strictly concerned with the flow of discourse. Due to this independence, how-
ever, these are compatible with the differentiation between hearer salience and
speaker salience and thus, they can be regarded as potential augmentations of
the Mental Salience Framework.
The differentiation between hearer salience and speaker salience, however,
is theoretically well justified (Clamons et al., 1993; Mulkern, 2003), but has not
been formalized before, and accordingly, the Mental Salience Framework
can also be regarded as providing a more precise model of linguistic salience as
compared to older mono-dimensional accounts of linguistic salience as currently
employed by existing models for the generation of referring expressions, also
in multi-modal contexts, which concentrate on hearer salience, e.g., van der
Sluis and Krahmer (2001).

5.2. Centering in Optimality Theory


It has been shown above that the Mental Salience Framework is capable of
adequately representing existing theories such as classical variants of Centering and
related theories such as Givón’s bi-dimensional account of topicality. Similar
attempts for the integration of previously independent lines of research within
one framework have been proposed before,8 but the Mental Salience Frame-
work differs from these in that its theoretical implications are fairly minimal,
that is, essentially only that speaker-private intentions and beliefs have to be
separated from the assumptions that the speaker has about attentional states of
the hearer.

8. Previous proposals include the attempts of Hajičová and Kruijff-Korbayová (1997),
Krahmer and Theune (2002) and Navaretta (2002) to bring together the Praguian
notion of salience developed by Hajičová and Vrbova (1982) and Centering (Grosz
et al., 1995).

In this theoretical minimalism, the Mental Salience Framework shares a cer-
tain resemblance with Optimality Theory, which can also be viewed as a formal
apparatus within which existing theories such as Centering (Beaver, 2004) can
be reconstructed.9
Optimality Theory relies on the observation that grammars contain con-
straints on the well-formedness of linguistic structures, and often, these con-
straints are in conflict. The rapid and systematic resolution of such conflicts,
however, entails that constraints are not equal in their violability, and thus, the
existence of a ranking. According to OT, constraints are components of the uni-
versal grammar, and language-specific grammars are instantiations of the UG
in that they represent different possible rankings of universal constraints.
Formally, constraints in OT are conditions on the relationship between an
underlying form, or input, and a set of possible surface candidates, i.e. possible
output. For the generation of referring expressions, the input is an underspec-
ified logical form of an utterance, the output is a candidate utterance. The op-
timal candidate output is selected based on the ranking of violated constraints.
Given two candidate forms A and B, A is more optimal than B if the highest-
ranking constraint which is violated by B is not violated by A, and no violations
of higher-ranked constraints occur for A.
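The selection of the optimal candidate can be sketched as a lexicographic comparison of violation profiles, which is the usual way of implementing OT evaluation and agrees with the definition given above wherever that definition ranks two candidates. The candidate encoding, the three toy constraints (named after the constraints pro-top, cohere and align discussed in this section) and their ranking are assumptions made for this example and are not claimed to reproduce any published constraint hierarchy.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Candidate = Dict[str, object]
Constraint = Callable[[Candidate], int]  # returns the number of violations


def violation_profile(candidate: Candidate,
                      ranked_constraints: Sequence[Constraint]) -> Tuple[int, ...]:
    """Violations per constraint, ordered from highest- to lowest-ranked."""
    return tuple(constraint(candidate) for constraint in ranked_constraints)


def optimal(candidates: List[Candidate],
            ranked_constraints: Sequence[Constraint]) -> Candidate:
    """Tuples compare lexicographically, so the minimum is the candidate whose
    first difference from any competitor is on a constraint it does not violate."""
    return min(candidates,
               key=lambda cand: violation_profile(cand, ranked_constraints))


# Toy constraints over a hypothetical candidate encoding.
def pro_top(c: Candidate) -> int:
    return 0 if c["topic_pronominalized"] else 1


def cohere(c: Candidate) -> int:
    return 0 if c["topic_is_previous_topic"] else 1


def align(c: Candidate) -> int:
    return 0 if c["topic_is_subject"] else 1


ranking = [pro_top, cohere, align]  # assumed ranking, for illustration only

a = {"name": "A", "topic_pronominalized": True,
     "topic_is_previous_topic": False, "topic_is_subject": True}
b = {"name": "B", "topic_pronominalized": False,
     "topic_is_previous_topic": True, "topic_is_subject": True}

print(optimal([a, b], ranking)["name"])  # A
```

Candidate A wins although it violates cohere, because B violates the higher-ranked pro-top; re-ranking the constraints changes the winner, which is how OT models cross-linguistic variation.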
In his Centering in OT (COT), Beaver proposes a set of constraints which
capture the main ideas of Centering following Brennan et al. (1987, BFP):
– pro-top The topic is pronominalized. (Rule 1)
– cohere The topic of the current sentence is the topic of the previous one.
(dis-preference of shifts, Rule 2)
– align The topic is in subject position. (dis-preference of shifts, Rule 2)
Further, Beaver provides a constraint-based definition of the backward-looking
center (“topic”):
– one-sentence-window Only discourse entities mentioned in the previous
sentence are salient. (salience definition)
– arg-salience One discourse entity is more salient than another if the first
was referred to in a less oblique argument position than the second in the
same sentence. (salience definition)

9. The concrete claims by Optimality Theory are more rigid, but only concern the na-
ture of constraints as a component of Universal Grammar.

– unique-topic With respect to any sentence, there is exactly one discourse
entity which is the topic of that sentence. (definition backward-looking center)
– salient-topic The topic of a sentence is the most salient discourse entity
referred to in that sentence, and undefined if no previously salient entities
are referred to. (definition backward-looking center)
The minimal version of COT also involves further constraints which are not
directly motivated from Centering:
– fam-def Each definite NP is familiar. This means both that the referent is
familiar, and that no new information about the referent is provided by the
definite.
Using this reconstruction of Centering, Beaver shows the equivalence between
COT and Brennan et al.’s original account with respect to pronominalization.
However, it should be noted that, like the reconstruction of Centering within
the Mental Salience Framework, the predictions made by COT are more specific
and more elaborate than the predictions of Brennan et al. (1987). The constraint
fam-def, though a reasonable assumption, is not motivated from Centering,
and pro-top differs from Rule 1 in that it does not distinguish between two
critical cases, i.e., (a) no pronominalization in the output, and (b) pronominalization
of a non-backward-looking center in the output, but not of the backward-looking
center.
For his equivalence proof, Beaver concentrated on proper names and pronouns
only (excluding fam-def), and proved equivalence between COT and
Centering with respect to three critical cases:
– Purely anaphoric resolutions breaking syntactic constraints are never COT
optimal, and never correspond to preferred BFP transitions.
– Fully anaphoric resolutions which violate Rule 1 are never COT optimal,
and never correspond to preferred BFP transitions.
– Suppose two fully anaphoric resolutions A and B of a sentence satisfy syn-
tactic constraints and Rule 1. If COT ranks candidate A above candidate B
then BFP ranks candidate A above candidate B and vice versa.
For the third case, however, Beaver’s proof relies on the assumption that “Since
Rule 1 is satisfied by A and B and there are pronouns, PRO-TOP is also satisfied
by A and B.” Earlier, he described the motivation of pro-top: “PRO-TOP
has essentially the effect of Centering’s Rule 1. … If there are pronouns,
then PRO-TOP will function comparably to Rule 1, providing a preference
for interpretations that make the topic (i.e., CB) into a pronoun.” However,
the formulation “if there are pronouns” involves a great abstraction, in that
it assumes that pronominalization is triggered only by salience (and agreement
filters). As noted in the sketch of the adequacy proof above, this assumption
predicts the same results as the original Centering rule only if the definition
of agreement filters may extend beyond strict morpho-syntactic congruency.
Further, aside from the critical cases identified above, we can construct an
example in which pro-top and Centering make different predictions about pro-
nominalization:

Mary_m watched how Sue_s crossed the street over to Harry’s_h house.
(Mary, Sue > Harry)
(7) She_m/s wondered about the low traffic today.
(8) He/Harry_h did not realize her_m/s.
(9) He_h did not realize Mary_m/Sue_s.
(10) Harry_h did not realize Mary_m/Sue_s.

The examples (7) to (10) are possible continuations of the first sentence. The
well-formedness of example (7) for both interpretations illustrates that Mary
and Sue are equally possible antecedents of a pronoun in the following sentence;
thus, a feminine pronoun would be ambiguous between Mary and Sue.
Therefore, (8) is fully ambiguous between both readings, and thus, from
cooperativity considerations, we may conclude that it is not a feasible candidate output.
As a consequence, only (9) and (10) are to be considered by COT and
Centering, respectively. However, Harry is clearly more oblique than Mary and Sue
in the first utterance, and thus, he cannot be the backward-looking center. Therefore,
(9) violates both Rule 1 and pro-top. However, (10) violates pro-top, but does
not violate Rule 1. Therefore, if (8) is not available for reasons of ambiguity,
the Centering-optimal output is (10), whereas COT does not distinguish between (9)
and (10).
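The difference between the conditional Rule 1 and the unconditional pro-top can be checked mechanically for candidates (9) and (10). The encoding below is a hypothetical simplification: each candidate is reduced to the set of referents it pronominalizes, and Mary is taken as the topic (backward-looking center); choosing Sue instead gives the same result.

```python
from typing import Set


def violates_rule1(pronominalized: Set[str], cb: str) -> bool:
    """Rule 1 (conditional): if anything is pronominalized,
    the backward-looking center must be pronominalized as well."""
    return bool(pronominalized) and cb not in pronominalized


def violates_pro_top(pronominalized: Set[str], topic: str) -> bool:
    """pro-top (unconditional): the topic is pronominalized."""
    return topic not in pronominalized


topic = "Mary"            # Harry is more oblique, so he cannot be the topic
example_9 = {"Harry"}     # "He did not realize Mary/Sue."
example_10: Set[str] = set()  # "Harry did not realize Mary/Sue."

for label, pronouns in (("(9)", example_9), ("(10)", example_10)):
    print(label,
          "Rule 1 violated:", violates_rule1(pronouns, topic),
          "pro-top violated:", violates_pro_top(pronouns, topic))
# (9) violates both; (10) violates pro-top only.
```

Since pro-top is violated by both candidates, it cannot separate them, whereas Rule 1 prefers (10).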
At this point, I would like to emphasize that because of the unconditional for-
mulation of pro-top in COT and the existence of the fam-def constraint, COT
makes predictions beyond the original Centering, and thus, must be deemed ad-
equate with respect to Centering, rather than equivalent. The equivalence proof
provided by Beaver is concerned with a subset of critical cases and only with
the differentiation between pronouns and the use of proper names. It is possible
to construct a critical example in which Centering and COT make different
pronominalization predictions.10
Thus, Centering in OT and the reconstruction of Centering within the Mental
Salience Framework are comparable with respect to their adequacy (“equiva-
lence”).
However, the theoretical implications of an OT modeling should not be
underestimated. Essentially, all possible constraints must be part of the universal
grammar. Postulating a constraint like fam-def entails the assumption that definite
NPs form a universal syntactic category, which is clearly contradicted by the
existence of languages which have no explicit definiteness markers. Further, the
OT reconstruction of Centering, like the original formulation of Centering, is
an inherently symbolic, categorial account, which is capable of predicting only a
finite and fixed set of possible categories of referring expressions. One of the central
criticisms of categorial accounts of givenness brought forward by Mira Ariel
(1990; 2001) states that the number of grammatical devices distinguished in
a specific language is theoretically unlimited. If all relevant distinctions
among referring expressions are to be captured in an extension of COT by
postulating corresponding constraints, similar to the familiarity criterion for
definite NPs, then the formulation of these categories and their salience
characterization in OT entails that these categories are also present in universal
grammar, which is probably misleading. As opposed to this, in the Mental Salience
Framework, the number of possible referring expressions is not a priori limited,
but can be justified in terms of their salience characterization.
In the OT and in the anaphor resolution communities, further instantiations
of Centering in OT have been developed (Buchwald et al., 2002; Bouma, 2003;
Byron and Gegg-Harrison, 2004; Hardt, 2004). Of these, the conceptual
motivations underlying the Recoverability Optimality Theory (ROT) model
(Buchwald et al., 2002) are very closely related to underlying insights of the
Mental Salience Framework. Both share a production perspective which leads
to the assumption of two discourse models, the model of the speaker’s private
intentions and the discourse model, the salience list or “common ground”. Both
share the assumption that cues from the following discourse must be considered
in order to generate referring expressions in a proper way. And, like the
Mental Salience Framework, ROT is a parameterized framework in the sense
that the set of constraints considered is subject to possible extensions. Indeed,
the Mental Salience Framework could be applied for the ranking of the current
and the following salience list, and thus serve as a complement to ROT with
respect to the concrete model of salience which is left unspecified so far.

10. Of course, this can be compensated by a formulation of pro-top which is closer to
the original Rule 1 and thus, it provides no counter-evidence for the reliability of a
reconstruction of Centering in OT in general.
Note that also for the Mental Salience Framework, Centering-conformant behaviour
can be achieved only by the use of specialized ambiguity filters. For this example, the
Mental Salience reconstruction of Centering predicts (9), if ambiguity is determined
by morphological agreement only. However, the Centering-optimal prediction can be
achieved if ambiguity is defined without any morphological restrictions, which
ultimately leads to the following Centering-conformant, but not very natural, strategy:
pronominalize nothing but the backward-looking center (Kibble and Power, 2004).

Nevertheless, the Mental Salience Framework is less constrained in its
theoretical implications and in its adaptive character. In particular, it supports
language-specific categories of referring expressions whose treatment in Optimality
Theory is uncertain. In the best case, the integration of additional categories
of referring expressions only requires associating them with certain salience
scores. Accordingly, the Mental Salience Framework is more oriented towards
a broad-scale practical application.

5.3. Centering as a parametric theory


Besides approaches dealing with the reconstruction of different theories within
a more general framework, also the variation of parameters within one theory
has been considered.
Owing to its impressive acceptance across different disciplines of linguistics,
Centering Theory has been widely adopted throughout a large community. However,
as a necessary consequence of this wide spread, the theory was modified in
certain contexts. As one example, OT approaches abstract from the formulation
of transitions between utterances (Beaver, 2004), and even from the concept of
backward-looking center (Hardt, 2004), thus leaving essentially nothing of the
original theory but the metaphor that attention has to be “centered” during dis-
course processing.
But also in more conservative formulations of Centering Theory, parameters
such as the definition of utterance, the definition of possible forward-looking
and backward-looking centers, the criterion for forward-looking centers to be
realized within an utterance, and different salience rankings vary throughout
the literature. Some of these parameters have been empirically investigated
by Poesio et al. (2004) who considered empirical effects of variation in the def-
inition of utterance (sentence, finite clauses, all clauses with a verb, …), real-
ization (indirect realization: consider not only anaphoric, but also bridging rela-
tions between forward-looking centers and potential backward-looking centers;
considering non-third person pronouns as forward-looking centers), and differ-
ent salience rankings.
While the empirical evaluation of different parameters of Centering is a
worthwhile and important achievement, it raises the question of what concrete claims
of the original theory really remain. From their study, Poesio et al. (2004)
motivate a re-formulation of certain aspects of Centering, which, however, is not
compatible with radical approaches such as Hardt’s Dynamic Centering (2004).
One of the most important results of the study is, however, that Centering cannot
be evaluated without considering concrete instantiations of the different
parameters it involves. As long as these parameters remain not fully specified, it
is unclear to what degree Centering can be falsified at all. Therefore, the central
criticism of Centering concerns not any of its specific claims, but only its
theoretical status. That is, essentially it must be regarded as a framework which
proposes a certain terminology and formalism, but not as a theory in the strict
sense.
On the other hand, the achievements of Centering cannot be denied. For the
first time, a common terminology for several discourse phenomena has been
established across different disciplines of linguistics. The Mental Salience
Framework, however, differs from Centering in that it does not claim to be a
theory, but merely a formalism, or framework. The crucial difference is that a
theory must be falsifiable, whereas a “parametric theory”, as long as it cannot
be evaluated independently of its parameters, is nothing but a metaphor.
A major difference between Centering and the Mental Salience Framework is that
the latter provides a numerical account of salience and models it explicitly
with respect to the choice of referring expressions, grammatical roles and word
order preferences, whereas in Centering pronominalization is seen as a
by-product of entity coherence with only very weak consequences for the choice
of referring expressions. In the Mental Salience Framework, by contrast, this
relationship is formulated very explicitly, in that numerical scores are mapped
onto specific coding preferences. Further, the framework applies beyond the
choice between pronominalization and full NPs, in that it is compatible with
the fine-grained specification of an arbitrary number of grammatical devices in
terms of the salience conditions on which their appropriate use depends.
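
The mapping from numerical scores to coding preferences can be illustrated with
a minimal Python sketch. It is not the implementation of the Mental Salience
Framework: the weights, the thresholds, the three form labels and the names
`Referent`, `combined_score` and `preferred_form` are assumptions introduced
here for illustration only. The sketch combines a hearer-salience score and a
speaker-salience score linearly and selects the most attenuated form whose
(hypothetical) salience condition is met.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Referent:
    """A discourse referent with two salience scores (assumed range 0..1)."""
    name: str
    hearer_salience: float   # accessibility/givenness for the hearer
    speaker_salience: float  # importance/newsworthiness for the speaker


# Hypothetical salience conditions, ordered from the most to the least
# attenuated form: (minimum combined score, form label).
SALIENCE_CONDITIONS = [
    (0.75, "pronoun"),
    (0.40, "demonstrative NP"),
    (0.00, "full NP"),
]


def combined_score(ref: Referent, w_hearer: float = 0.7,
                   w_speaker: float = 0.3) -> float:
    """Linear combination of the two scores; the weights are illustrative."""
    return w_hearer * ref.hearer_salience + w_speaker * ref.speaker_salience


def preferred_form(ref: Referent) -> str:
    """Return the first form whose minimum score the referent reaches."""
    score = combined_score(ref)
    for minimum, form in SALIENCE_CONDITIONS:
        if score >= minimum:
            return form
    return SALIENCE_CONDITIONS[-1][1]  # fallback for scores below zero


if __name__ == "__main__":
    for ref in (Referent("the door", 0.9, 0.8), Referent("a room", 0.1, 0.6)):
        print(ref.name, "->", preferred_form(ref))
```

Because the salience conditions are stored as data, further grammatical devices
could be added as additional threshold entries without changing the selection
logic.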
Further, Centering does not provide a model for the assignment of grammatical
roles, but only for their effect on local coherence. In functional linguistics,
this function is identified as “foregrounding”, and speaker salience can thus
be described as the need of the speaker to place entities in the foreground of
a scene, e.g. for the processing of the following discourse. As Centering
relies on surface-oriented factors indicating foregrounding, it implicitly
takes an interpretation perspective on the discourse it is applied to. By
contrast, the Mental Salience Framework clearly takes a production perspective
in that it includes an explicit model of the attentional states of the speaker,
and it is thus better suited to the needs of Natural Language Generation.

5.4. Centering Games


Building on the identification of the parameters of Centering (Poesio et al.,
2004) and on the existence of different reconstructions of instantiations of
Centering Theory in Optimality Theory, Kibble (2003) proposed a game-theoretic
reconstruction of Centering Theory that frames collaborative reference
resolution as a non-cooperative game of incomplete information. Our approach
shares with this game-theoretic reconstruction the assumption that two
perspectives, the hearer perspective and the speaker perspective, have to be
distinguished.
The relevant processing modules of the hearer perspective include:
discourse modeler: maintains a record of entities mentioned in the discourse
    that are candidates for anaphor resolution. Possible discourse models
    include (a) a Centering model, (b) the list of focal referents from the
    previous clause, or (c) a fully specified discourse model.
reference resolver: identifies the referent of a referring expression with an
    entity in the discourse model.

The relevant processing modules of the speaker perspective include:
planner/content determination: organizes input propositions into a text
    structure and plans sentences, e.g. by choosing verb forms that realize a
    preferred order of arguments. Possible strategies are to (a) promote
    arguments within a clause according to their perceptual salience, (b) plan
    consecutive clauses so as to align salience rankings, or (c) plan sequences
    of clauses so as to maximize referential continuity, in addition to
    salience alignment.
realizer: generates appropriate referring expressions to denote the arguments
    of predicates.

Some details of Kibble’s approach remain abstract, and its adequacy has not
been proven so far. However, the concepts of the Mental Salience Framework can
be interpreted in terms of Kibble’s game-theoretic framework, with the
exception of the reference resolver, which has no direct parallel in the Mental
Salience Framework. Hearer salience is clearly a part of the discourse modeler,
though a fully specified discourse model involves additional aspects beyond the
modeling of the attentional states of the hearer. The strategies enumerated in
the planner/content determination module are partly concerned with the
assignment of grammatical roles. Strategy (a) is concerned with perceptual
salience only, but is roughly parallel to the word order and grammatical role
strategies specified for speaker salience in TOP. Strategies (b) and (c)
involve an “alignment of salience rankings” with utterances from the following
discourse, and this seemingly corresponds to the extrapolation of speaker
salience from coding decisions in the following discourse according to the
Centering reconstructions. Finally, the realizer covers the determination of
coding preferences (from the linear combination of salience scores) and their
application.
Hence, the conceptions of the Mental Salience Framework seem to be closely
related to Kibble’s game-theoretic reconstruction (“elimination”) of Centering,
and the framework might be regarded as a more concrete setting for formulating
the strategies suggested by Kibble.

6. Summary and outlook

This chapter sketched a generalized, parameterized framework that provides an
architecture for mechanisms of attention control through the salience-based
assignment of coding preferences for referring expressions in discourse.
Building on the previously observed multi-dimensionality of salience, a
distinction between two dimensions of salience was suggested that is consistent
with different terminological traditions relating the notion of salience to
accessibility/givenness and importance/newsworthiness, respectively. To
illustrate its theoretical adequacy, a minimal instantiation was proposed that
is capable of representing Givón’s topicality approach and two instantiations
of Centering. Further, a proof of the adequacy of these reconstructions was
sketched.
Hence, the Mental Salience Framework provides a proper basis for the
comparative evaluation of these and related theories. Beyond this, the
numerical character of the parameters allows for the application of learning
algorithms, e.g. based upon an interpretation of the architecture illustrated
in Fig. 2 as a neural network. Thus, a supervised learning algorithm can be
applied to assign parameter weights according to empirical data.
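
How parameter weights might be fitted to annotated data can be sketched with a
single-layer perceptron, the simplest network-style reading of a weighted
combination of salience factors. The learning rule, the two factor names
(recency, subjecthood), the toy data and the function names are assumptions
made for this illustration; the chapter does not commit to a particular
learning algorithm.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (salience factor values, label)

# Hypothetical toy data: (recency, subjecthood) -> 1 if the referent was
# pronominalized in the (imagined) corpus, 0 otherwise.
TOY_DATA: List[Example] = [
    ((1.0, 1.0), 1),
    ((1.0, 0.0), 1),
    ((0.0, 1.0), 0),
    ((0.0, 0.0), 0),
]


def predict(weights: Sequence[float], bias: float,
            features: Sequence[float]) -> int:
    """Threshold unit: 1 if the weighted sum of the factors is positive."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


def train_perceptron(data: List[Example], epochs: int = 20,
                     rate: float = 0.1) -> Tuple[List[float], float]:
    """Fit weights with the classic perceptron rule (supervised learning)."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                # Move each weight towards the correct decision.
                weights = [w + rate * error * x
                           for w, x in zip(weights, features)]
                bias += rate * error
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias


if __name__ == "__main__":
    learned_weights, learned_bias = train_perceptron(TOY_DATA)
    print("weights:", learned_weights, "bias:", learned_bias)
```

On this toy data the label depends on recency alone, so the learned weight for
recency ends up larger than the one for subjecthood. Real salience factors and
annotated corpora would of course require a richer model and held-out
evaluation.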
As a result, an integrated architecture for cognitive-pragmatic aspects of
attention control in discourse has been suggested. Owing to its simplicity and
intuitiveness, an implementation in NLG systems appears feasible and is the
prospective aim of this research. In this domain, it provides key mechanisms
both for optimizing the coherence/cohesion of automatically generated texts
(through coding preferences due to hearer salience) and for assigning judgments
of emphasis, relevance or importance (speaker salience, if interpreted as
relevance, provides an interface for guiding the hearer’s attention onto
certain aspects or entities according to external parameters).

References
Mira Ariel
2001 Accessibility theory: An overview. In T. Sanders, J. Schilperoord, and
W. Spooren, editors, Text Representation. Linguistic and psycholin-
guistic aspects, volume 8 of Human Cognitive Processing, pages 29–
87. John Benjamins, Amsterdam, Philadelphia.
Mira Ariel
1994 Interpreting anaphoric expressions: A cognitive versus a pragmatic
approach. Journal of Linguistics, 30:3–42.
David I. Beaver
2004 The optimization of discourse. Linguistics and Philosophy, 27(1).
Gerlof Bouma
2003 Doing Dutch pronouns automatically in Optimality Theory. In Pro-
ceedings of the EACL 2003 Workshop on The Computational Treat-
ment of Anaphora, Budapest.
Susan E. Brennan, Marilyn W. Friedman, and Carl J. Pollard
1987 A Centering approach to pronouns. In Proc. of the 25th Annual Meet-
ing of the Association for Computational Linguistics, pages 155–163,
Stanford, Cal., July 1987.
Adam Buchwald, Oren Schwartz, Amanda Seidl, and Paul Smolensky
2002 Recoverability Optimality Theory: Discourse anaphora in a bidirec-
tional framework. In Proceedings of the 6th Ws. on the Semantics and
Pragmatics of Dialogue (EDILOG), Edinburgh, Sep. 2002.
Donna K. Byron and Whitney Gegg-Harrison
2004 Evaluating Optimality Theory for pronoun resolution algorithm spec-
ification. In Proceedings of the Discourse Anaphora and Refer-
ence Resolution Conference (DAARC2004), pages 27–32, September
2004.
Karl Bühler
1934 Sprachtheorie. Die Darstellungsfunktion der Sprache. Gustav Fi-
scher, Stuttgart.
Wallace Chafe
1976 Givenness, contrastiveness, definiteness, subjects, topics, and point of
view. In Charles W. Li (ed.), Subject and Topic. Academic Press, New
York.
Wallace Chafe
1994 Discourse, Consciousness, and Time. The Flow and Displacement of
Conscious Experience in Speaking and Writing. University of Chicago
Press, Chicago and London.
Christian Chiarcos
2009 Mental Salience and Grammatical Form. Toward a Framework for
Salience Metrics in Natural Language Generation. PhD thesis. Uni-
versität Potsdam, Germany.
Christian Chiarcos
2011 On the dimensions of discourse salience. Paper presented at the
DGFS-2011 Workshop Beyond Semantics. Corpus-based Investiga-
tions of Pragmatic and Discourse Phenomena. Göttingen, Feb. 2011.
C. Robin Clamons, Ann E. Mulkern, and Gerald Sanders
1993 Salience signaling in Oromo. Journal of Pragmatics, 19:519–536.
James Raymond Davis and Julia Hirschberg
1988 Assigning intonational features in synthesized spoken directions. In
Proc. ACL-1988, pages 187–193, Buffalo/NY.
Rachel Giora
1999 On the priority of salient meanings: Studies of literal and figurative
language. Journal of Pragmatics, 31:919–929.
Talmy Givón
2001 Syntax. John Benjamins, Amsterdam, 2nd edition (2 vols).
Talmy Givón
1983 Introduction. In Talmy Givón, editor, Topic Continuity in Discourse:
A Quantitative Cross-Language Study. John Benjamins, Amsterdam,
Philadelphia, 1983, pages 5–41.
Talmy Givón
1995 Functionalism and Grammar. John Benjamins, Amsterdam, Philadel-
phia.
Joseph H. Greenberg
1963 Some universals of grammar with particular reference to the order of
meaningful elements. In Joseph H. Greenberg, editor, Universals of
language, pages 73–113. MIT Press, Cambridge, Mass.
Barbara J. Grosz and Candace L. Sidner
1986 Attention, intentions, and the structure of discourse. Computational
Linguistics, 12: 175–204.
Barbara J. Grosz, Aravind K. Joshi, and Scott Weinstein
1995 Centering: A framework for modelling the local coherence of dis-
course. Computational Linguistics, 21(2):203–225.
Eva Hajičová, Ivana Kruijff-Korbayová, and Geert-Jan M. Kruijff
1998 Salience in dialogues. In Svetla Cmejrková, Jana Hoffmannová, Olga
Müllerová, and Jindra Svetlá, editors, Dialogue Analysis VI: Proc. of
the 5th Int. Congress of the Int. Assoc. of Dialogue Analysis, April
17–20 1996, Prague, Czech Republic, pages 381–393, Prague.
Eva Hajičová and Ivana Kruijff-Korbayová
1997 Topics and centers: A comparison of the salience-based approach and
the Centering Theory. Prague Bulletin of Mathematical Linguistics,
67: 25–50.
Eva Hajičová and Jarka Vrbova
1982 On the role of the hierarchy of activation in the process of natural
language understanding. In Jan Horecký, editor, COLING 82 – Pro-
ceedings of the Ninth International Conference of Computational Lin-
guistics, Prague, pages 107–113, Amsterdam: North Holland.
Daniel Hardt
2004 Dynamic Centering. In Proceedings of the Workshop on Reference
Resolution and its Applications: ACL 2004, pages 55–62, Barcelona.
John A. Hawkins
1992 Syntactic weight versus information structure in word order varia-
tion. In Joachim Jacobs, editor, Informationsstruktur und Grammatik,
Linguistische Berichte. Sonderheft 4/1991-92, pages 196–219. West-
deutscher Verlag, Opladen.
Helmut Horacek
1997 An algorithm for generating referential descriptions with flexible in-
terfaces. In Proc. 35th ACL/EACL, pages 206–213, Madrid.
Elsi Kaiser and John Trueswell
to appear 2011 Investigating the interpretation of pronouns and demonstratives in
Finnish: Going beyond salience. In Edwin Gibson and Neal J. Pearlmutter,
editors, The Processing and Acquisition of Reference. MIT
Press, Cambridge, Mass.
Rodger Kibble
2003 Towards the elimination of Centering Theory. In Ivana Kruijff-
Korbayová and Claudia Kosny, editors, Proc. of 7th Workshop on
the Semantics and Pragmatics of Dialogue, pages 51–58, University
of Saarbrücken, September 2003.
Rodger Kibble and Richard Power
2000 An integrated framework for text planning and pronominalisation. In
Proceedings of the International Conference on Natural Language
Generation (INLG), 2000.
Rodger Kibble and Richard Power
2004 Optimizing referential coherence in text generation. Computational
Linguistics, 30(4):401–416.
Christof Koch and Laurent Itti
2000 Computational modelling of visual attention. Nature Reviews Neuroscience,
2:194–203.
Emiel Krahmer and Mariët Theune
2002 Efficient context-sensitive generation of referring expressions. In Kees
van Deemter and Rodger Kibble, editors, Information Sharing: Reference
and Presupposition in Language Generation and Interpretation.
CSLI Publications, pages 223–264.
Geert-Jan M. Kruijff, Ivana Kruijff-Korbayová, John Bateman, and Elke Teich
2001 Linear order as higher-level decision: Information structure in strate-
gic and tactical generation. In Helmut Horacek, editor, Proceedings of
the 8th European Workshop on Natural Language Generation, pages
74–83, Toulouse, France, July 5–6 2001.
Willem J.M. Levelt
1989 Speaking: From Intention to Articulation. MIT Press, Cambridge,
Mass.
Ann E. Mulkern
2003 Cognitive Status, Discourse Salience, and Information Structure: Ev-
idence from Irish and Oromo. PhD thesis, University of Minnesota.
Based on Clamons et al. (1993).
Costanza Navaretta
2002 Combining information structure and Centering-based models of
salience for resolving intersentential pronominal anaphora. In An-
tonio Branco, Tony McEnery, and Ruslan Mitkov, editors, Proc.
DAARC 2002 – 4th Discourse Anaphora and Anaphora Resolution
Colloquium, pages 135–140, Lisbon, September 18–29 2002. Edições
Colibri.
Andrew Ortony
1979 Similarity in similes and metaphors. In Andrew Ortony, editor,
Metaphor and Thought. Cambridge University Press, Cambridge,
pages 186–201.
Thiyagarajasarma Pattabhiraman
1992 Aspects of Salience in Natural Language Generation. PhD thesis, Si-
mon Fraser University, August 1992.
Massimo Poesio, Barbara Di Eugenio, Rosemary Stevenson, and Janet Hitzeman
2004 Centering: A parametric theory and its instantiations. Computational
linguistics, 30(3):309–363.
Ellen F. Prince
1981 Toward a taxonomy of given-new information. In P. Cole, editor,
Radical Pragmatics, pages 223–256. Academic Press, New York.
Regina Pustet
1997 Diskursprominenz und Rollensemantik – Eine funktionale Typologie
von Partizipantensystemen. Lincom Europa.
Owen Rambow
1993 Pragmatic aspects of scrambling and topicalization in German. In
Workshop on Centering Theory in Naturally-Occurring Discourse.
Institute for Research in Cognitive Science, University of Pennsylva-
nia, Philadelphia, PA.
Petr Sgall, Eva Hajičová, and Jarmila Panevova
1986 The Meaning of the Sentence in its Semantic and Pragmatic Aspects.
Reidel, Dordrecht.
Michael Strube and Udo Hahn
1996 Functional Centering. In Proc. of 34th Ann. Meeting of the Associa-
tion for Computational Linguistics (ACL’96), pages 270–277, Santa
Cruz/CA, June 1996.
Michael Strube and Udo Hahn
1999 Functional Centering – Grounding referential coherence in informa-
tion structure. Computational Linguistics, 25(3):309–344.
Joel R. Tetreault
1999 Analysis of syntax based pronoun resolution methods. In Proceedings
of the 37th Annual Meeting of the Association for Computational Lin-
guistics (ACL’99), Maryland/MD.
Russell S. Tomlin
1995 Focal attention, voice, and word order. An experimental, cross-
linguistic study. In Mickey Noonan and Pamela Downing, editors,
Word Order in Discourse. John Benjamins, Amsterdam, Philadel-
phia, pages 517–554.
Amos Tversky
1977 Features of similarity. Psychological Review, 84(4):327–352.
Ielka van der Sluis and Emiel Krahmer
2001 Generating referring expressions in a multimodal context: An empi-
rical approach. In Proceedings 11th CLIN-meeting, Rodopi, Amster-
dam/Atlanta.
Andrea Weber and Karin Müller
2004 Word order variation in German main clauses: A corpus analysis. In
Proceedings of the 20th International Conference on Computational
Linguistics, Geneva.
Part II.
Beyond entities in discourse
Discourse-structural salience from a cross-linguistic
perspective: Coordination and its contribution to
discourse (structure)

Wiebke Ramm

1. Introduction

This paper approaches the topic of this volume from a cross-linguistic perspec-
tive, with Norwegian, German and English as example languages. Our aim is
to contribute to a clarification of discourse-structural concepts like the distinc-
tion between subordinating vs. coordinating discourse relations as described in
Segmented Discourse Representation Theory (SDRT, Asher and Vieu 2005)
or nucleus-satellite vs. multinuclear discourse relations in Rhetorical Structure
Theory (RST, Mann and Thompson 1988) and their relation to information-
structural (focus vs. background) and syntactic distinctions (coordination vs.
subordination) in a cross-linguistic perspective. Taking non-correspondences
regarding (clause or verb phrase) coordination in translation as an observa-
tional point of departure, we discuss the interpretation of coordinated structures
as compared to non-coordinated alternatives (sentence sequences and syntactic
subordination) with a view to the relative salience of the conjuncts in discourse.1
We are concerned with two types of translation discrepancy involving Norwegian
and German or English, namely coordinated clauses in the source language (SL)
translated as a sequence of sentences in the target language (TL) (Norwegian >
German, Section 3.1), and syntactic subordination (adjunction) rendered as
(verb phrase or clausal) coordination in the TL (German > Norwegian, Section
3.2, and English > Norwegian/German, Section 3.3). Our data are taken from
three different parallel corpora: the Oslo Multilingual Corpus (OMC)2 and two
smaller corpora of non-fictional texts.

1. To some degree, the questions addressed in this contribution are similar to
those addressed by Hinterhölzl and Petrova (this volume): both papers
investigate the relation between structural linguistic features – syntactic
coordination vs. non-coordinated structures in the present contribution, V1 vs.
V2 word order in Hinterhölzl and Petrova’s article – and how they show up in
hierarchical discourse structure, as modelled in discourse theories such as
SDRT.
There is also a certain relationship to the contribution by Krasavina (this
volume) in that questions of choice between different linguistic options and
their implications for the marking of salience are addressed: Krasavina deals
with choices between different types of referential expressions in Russian, the
present paper addresses language-specific choices (showing up as translation
mismatches) concerning the linking of discourse units.

It is a well-known fact that coordination, despite its apparent syntactic sym-
metry (the conjuncts belonging to the same syntactic category), may encode
or “explicate” an asymmetrical relation at the semantic-pragmatic level.3 What
we want to show is that coordination tends to be exploited somewhat differ-
ently in Norwegian than in English or German: Norwegian apparently uses co-
ordination more productively, as a kind of compensation for other grammatical
resources (e.g. adjunction or non-coordinated paratactic structures) used in En-
glish or German; and Norwegian also seems to be less constrained with respect
to what kind of discourse units the coordination marker can link as well as re-
garding the order of foregrounded and backgrounded information in a coordi-
nated structure.
From a theoretical viewpoint our observations raise interesting questions
about the correlation between syntactic coordination/subordination and coordi-
nating/subordinating discourse relations (cf. e.g. Asher and Vieu 2005) as well
as the status of the latter across languages. Our data suggest that either the use of
coordinating/subordinating discourse relations in Norwegian differs from their
use in German and English, or that syntactic coordination signalled by a co-
ordination marker (og/und/and) does not necessarily imply a coordinating dis-
course relation between the conjuncts, contrary to what Asher and Lascarides
(2003) and Asher and Vieu (2005), following Txurruka (2000), seem to assume.
A further – both theoretically and empirically interesting – implication of our
contrastive analyses is that they shed light on the backgrounding role/function
of (certain types of) adjuncts.
In Section 2 we give a brief overview of theoretical concepts that bear on our
topic. Section 3 presents and discusses our translational data. Our conclusions
are summarized in Section 4.

2. See http://www.hf.uio.no/iln/tjenester/sprak/korpus/flersprakligekorpus/omc/index.html
(visited 16 Sep 2010)
3. Asymmetry in coordination is taken up in several of the contributions in
Fabricius-Hansen and Ramm (eds., 2008); see the editors’ introduction
(Fabricius-Hansen and Ramm 2008: 7–11) for an overview.

2. General concepts

2.1. Views on (non-)salience


According to a working definition, salience is “the degree of relative promi-
nence of a unit of information, at a specific point in time, compared to the other
units of information” (cf. the introduction of this volume, pp. 2ff.). Applied to
the translation scenario for written texts on which the present study builds, our
concern is the weighting or relative prominence of portions of information in
syntactically coordinated constructions vs. syntactically subordinated or jux-
taposed equivalents, i.e. the interpretation of these constructions at a specific
point in the SL vs. TL text.
The scenario touches upon various concepts at syntactic, semantic and dis-
course level which either emphasise the nature of the relation between (adja-
cent) pieces of discourse – e.g. “coordination vs. subordination” or “symmetry
vs. asymmetry” – or the status of some element as being more or less promi-
nent than some other element, including the linguistic means employed to give
it this status – e.g. marking some element as “foreground(ed)”, “focus(ed)”,
“background(ed)” or “downgraded” in some way.
Section 2.2 discusses different notions of “background”: background as an
information-structural notion used to characterise the weighting of linguistic
units within a clause and as a discourse-structural notion used to assign (in a
theory-dependent way) a particular relation holding between discourse units of
(at least) clause size. Section 2.3 takes up syntactic coordination at clause and
verb phrase level and how its relation to discourse structure is modelled in dif-
ferent theoretical approaches, in particular with respect to the relative weighting
of the conjuncts. Section 2.4 deals with clause linkage, i.e. the options for com-
plex sentence formation, and the dimensions which are affected in the corpus
examples to be discussed in Section 3.

2.2. Some relevant information-structural and discourse-structural notions


In the discussion of information structure and discourse relations, “background”
is an important but fuzzy term. As part of the so-called focus-background par-
tition (Büring 1997; Rooth 1992), the notion of background concerns informa-
tion structure at sentence level. It is commonly illustrated by question-answer
sequences like (1a–b).4

4. What follows is a very much simplified description, disregarding additional
partitions like topic vs. comment (or theme vs. rheme) and the notorious
ambiguity of the term focus itself; see e.g. Vallduví and Engdahl (1996) for a
very useful survey.

(1) a. When did you arrive?
    b. I arrived yesterday evening.
    c. I arrived yesterday evening with some friends.
The part of (1b) that answers the question posed in (1a) – i.e. the adverbial ad-
junct – expresses focus information; the remaining part is background. Repre-
senting one option among a set of alternative answers, focus information is new
information whereas the background is given from the context. In the question-
answer sequence (1a–c), however, the manner adjunct with some friends –
which is post-focal according to Lambrecht (1994) – encodes information that
is new, i.e. not part of the background, but does not contribute to answering the
relevant question and thus cannot be part of the focus in the strict sense either.
The adjunct, in a way, answers a question that has not been asked. We suspect
that, typically, this type of information represents background information in
the wide discourse-structural sense of that term (see below), as seems to be the
case with the adjuncts discussed in sections 3.2 and 3.3. But to our knowledge,
the focus-background partition – and information structure at sentence level in
general – has not been thoroughly discussed with respect to sentences enriched
by optional adjuncts and occurring in real discourse.5 So we shall leave it at the
level of suspicion.
The notion of background in the focus-background partition discussed above
is quite different in nature from discourse-structural background. As we see it,
“background” or “backgrounding” can be understood in (at least) two ways
on discourse level, namely a) as the discourse relation defined in SDRT and
RST, and b) as referring to discourse subordination in general, i.e. covering all
subordinating discourse relations in the SDRT model, and all nucleus-satellite
relations in RST (see below).
The discourse relation Background, as defined in the framework of SDRT,6 is
taken to hold “whenever one constituent provides information about the
surrounding state of affairs in which the eventuality mentioned in the other
constituent occurred” (Asher and Lascarides 2003: 460). It is generally
exemplified by sentence sequences like (2) where it is the second sentence that
describes a state temporally overlapping the event introduced by the first
sentence; that is, S2 conveys background description relative to S1 –
Background(S1, S2) (cf. Asher and Lascarides 2003: 166–167, 460–461).
(2) Max opened the door. The room was pitch dark.

5. Asher (1999) discusses some aspects of the relation between sentential focus
and discourse focus. The issue of optional adjuncts, however, is not taken up
here.
6. Since most of this paper was written before July 2007, recent work on the
discourse relation Background in SDRT, in particular Asher, Prévot and Vieu
(2008), could only partially be taken into account. In the following, we will
comment (in notes) where Asher, Prévot and Vieu (2008) deviate from what is
said about the SDRT relation Background here.

At one point, in fact, Asher and Lascarides (2003: 207–208) distinguish between
two Background relations: Background1, exemplified by (2), and Background2,
which holds when it is the first segment of a sequence that provides
information about the “surrounding state of affairs” relative to the subsequent
segment: Background2(S2, S1). However, Asher and Lascarides do not give any
examples of Background2, and in practice, they seem to understand Background as
illustrated by (2), i.e. in the narrow sense of Background1.7
According to Asher and Lascarides (2003), the discourse relation Background is
a coordinating discourse relation, but it differs from the prototypical
coordinating discourse relation Narration by allowing a subsequent segment S3
(e.g. He looked cautiously around him.) to attach to S1 – which is a diagnostic
property of subordinating discourse relations (cf. Asher and Vieu 2005). Asher
and Lascarides (2003: 166–167) overcome the difficulty by assuming that the
text consisting of S1 and S2 “has a topic whose content is constructed by
repeating (rather than summarizing) the contents” of the two segments. The
topic is understood as related to the background segment (i.e. S2) by a
relation called Foreground-Background Pair – which is classified as a
subordinating discourse relation (Asher and Lascarides 2003: 462). In the end,
then, Asher and Lascarides (2003) have it both ways: S2 is related to the
preceding segment S1 by a coordinating discourse relation (Background), but
related by a subordinating discourse relation (Foreground-Background Pair) to
the topic constructed by repeating the contents of the DRSes assigned to S1 and
S2. At any rate, Asher and Lascarides (2003) concede that “[i]ntuitively, a
discourse structure containing Background(π1, π2) where Kπ1 describes a
(foregrounded) event and Kπ2 describes the (background) state, should encode
the fact that Kπ1 is the “main story line” or the foreground; Kπ1 is the thing
that “matters” in that events from subsequent utterances will be related to it”
(Asher and Lascarides 2003: 166).8 Thus understood, the foreground-background
distinction seems related to the distinction between Hauptstruktur (‘main
structure’) and Nebenstruktur (‘side structure’) made by Klein and von
Stutterheim (1987) within the so-called quaestio model.

7. Background2, “forward-looking Background”, and its relation to Background1,
“backward-looking Background”, is addressed in more detail in Asher, Prévot and
Vieu (2007). They now propose a single Background relation, but differentiate
the discourse structures in which the relation appears: whereas in
backward-looking Background situations (Background1) two constituents are
directly linked by a Background relation, in forward-looking Background
situations an additional constituent representing a so-called “Framing Topic”
is constructed in order to account for the characteristic function of setting a
(temporal, spatial etc.) frame against which the following discourse unit(s)
is/are to be interpreted.

As a cover term for forward-looking and backward-looking Background,
i.e. as a discourse relation that can attach in both directions, the SDRT rela-
tion Background would also be similar to the discourse relation Background as
defined in RST (Mann and Thompson 1988). In RST Background is an (asym-
metric) nucleus-satellite relation – roughly corresponding to what is called a
subordinating discourse relation in SDRT but not formally defined – where the
function of what is presented in the satellite is to increase the reader’s ability
to comprehend what is presented in the nucleus (see definitions on the RST
webpage).9 Although the RST definition is not restricted to a particular order
of nucleus and satellite, in typical RST examples of the Background relation
the satellite precedes the nucleus (see examples on the RST webpage.)10 This
means that, after all, Background as a discourse relation is understood quite dif-
ferently in SDRT and RST.11 In any case, the SDRT notion of Background is
a narrower concept, being defined solely by way of temporal overlap between
an event(uality) and a state – which makes it problematic for the analysis of
non-narrative texts, i.e. texts that are not primarily structured by temporal rela-
tions.12

8. The question of whether Background is coordinating or subordinating is also taken
up in Asher, Prévot and Vieu (2007). In reaction to Vieu and Prévot (2004) and
Fabricius-Hansen et al. (2005), who (independently) challenged the hypothesis that
Background is coordinating, they seem to adopt the view that Background is in fact
subordinating, rather than coordinating (Asher, Prévot and Vieu 2008: 9).
9. URL: http://www.sfu.ca/rst/01intro/definitions.html (visited 7 Oct 2009)
10. URL: http://www.sfu.ca/rst/02analyses/index.html (visited 7 Oct 2009)
11. The relation between the discourse relations Background as defined in RST vs. SDRT
is also discussed in Asher, Prévot and Vieu (2008: 5–6). Referring to the defini-
tions of RST relations given in Carlson and Marcu (2001), they view the SDRT
relation Background as covering two relations in RST, Circumstance (requiring co-
temporality of the events described) and Background (also allowing the events to
occur at distinctly different times).
12. Still another Foreground(ing)-Background(ing) pair, related to salience and atten-
tion, is found in the work of Talmy (e.g. Talmy 2000). To him a concept or a cat-
egory of concepts like Manner (of motion) is backgrounded, i.e. less salient, if it is
expressed as part of – “conflated with” – the main verb root, but foregrounded if it
Discourse-structural salience from a cross-linguistic perspective 149

2.3. Syntactic coordination in discourse representation


Having addressed different notions of background on sentence and discourse
level, the question is how syntactic coordination (clause/verb phrase) fits into
this picture. The SDRT model seems to presuppose a strong correlation between
syntactic coordination and coordinating discourse relations, see e.g. the (narra-
tive) examples containing and-coordination in Asher and Vieu (2005: 604, 605,
606). In fact, Asher and Lascarides (2003: 170) follow Txurruka (2000) in as-
suming that “and is a discourse marker for a coordinating relation; it doesn’t
correspond to a single rhetorical relation, but rather it signals a number of dif-
ferent possibilities such as Narration or Result”; Asher and Vieu (2005) appar-
ently maintain this assumption, although they do point to data suggesting that
the inference from and-coordination to discourse coordination may be defeasi-
ble (Asher and Vieu 2005: 598–599). Also in text analyses based on RST – e.g.
those published on the RST web site13 – coordinated structures (not containing
other discourse markers) are typically assigned a multinuclear discourse rela-
tion, e.g. Joint or Conjunction,14 i.e. the two conjuncts are assigned the same
discourse salience and they are not hierarchically related by a nucleus-satellite
relation. We would like to question whether this actually always holds cross-
linguistically, and whether this appropriately represents actual discourse struc-
ture. The examples in Section 3 indicate that coordination can also be used to
link elements with different salience in discourse, for example, with respect to
the continuation in discourse in the following sentence.
Interesting in this context is research on coordinated vs. non-coordinated
sentences within the framework of Relevance Theory (Blakemore 1987, 2002;
Blakemore and Carston 2005), pointing out the (possibly) non-symmetric na-
ture of coordination and showing that coordination is possible in certain cases
while blocked in others. In particular, they show that using coordination instead

is encoded as an independent constituent; cf. I flew to Hawaii last month vs. I went
by plane to Hawaii last month (Talmy 2000 II: 128).
13. URL: http://www.sfu.ca/rst/02analyses/index.html (visited 7 Oct 2009)
14. Conjunction is not defined as a discourse relation on the “official” RST web site,
but is contained in the “extMT – extended Mann/Thompson” set of RST relations
in the RST tool developed by O’Donnell (URL: http://www.wagsoft.com/RSTTool/index.html,
visited 7 Oct 2009) which is widely used for text analyses across the RST
community. Although not precisely defined there either, according to the developer
of the tool (answer to a mail request, June 04) this relation is meant to cover con-
structions with and connectives. See also the related discussion on the definitions
of Conjunction and Disjunction on the RST mailing list, September 2006: URL:
http://lloyd.emich.edu/cgi-bin/wa?A1=ind0609&L=rstlist (visited 7 Oct 2009).

of a sequence of non-coordinated (“full stop”) sentences sends two types of signal
to the reader, namely, (i) that the two conjuncts should be processed as a unit,
both conjuncts functioning together as premises in the derivation of a joint cog-
nitive effect, and (ii) that certain inferences are licensed regarding the semantic-
pragmatic relations holding between them, the first conjunct always functioning
as a background to the processing of the second. In narrative examples, for in-
stance, a temporal-causal relation (of “consequentiality”, cf. Sandström 1993)
is often inferred without any explicit mention of such a relation; and a non-
narrative use of coordination can be seen in argumentative examples, where the
conjuncts make a joint contribution as steps in an argumentation (Blakemore
and Carston 2005). Relevance Theory does not distinguish between coordinat-
ing and subordinating discourse relations – and prefers to avoid the notion of
discourse relations at all (Blakemore 2002: Sect. 5.3) – but most of the narrative
as well as the argumentative examples given in Blakemore and Carston (2005)
would probably be classified as coordinating discourse relations in the SDRT
framework.
Further relevant research on the discourse properties of sentential and-coor-
dinations vs. asyndetic sentence connections has recently been done by Jasin-
skaja (2009), who combines ideas from SDRT and Relevance Theory in order
to account for the inference of implicit discourse relations, i.e. the inference
of discourse relations which may be communicated without explicit signalling
by a connective or other linguistic means. Whereas theories such as SDRT fo-
cus on the restrictions and-coordinations impose on the semantic relations that
may hold between the conjuncts, Jasinskaja (2009: 68) argues that it is rather
asyndetic connection which is associated with some non-trivial constraints on
the inference of implicit discourse relations, namely towards interpreting the
second sentence as an Elaboration or Explanation of the first. Central chapters
of her thesis concentrate on discourse relations in oral communication where
prosody makes an important contribution to the signalling of discourse rela-
tions. But the two principles Jasinskaja argues to be guiding discourse inter-
pretation (of utterances not containing additional connectives) might also be
interesting for the interpretation of written texts: The Principle of Exhaustive
Interpretation (Jasinskaja 2009: 289) says that, by default, an utterance is inter-
preted exhaustively, i.e. as if it were a complete answer to the “question under
discussion” (QUD), which is defined similarly to the notion of “quaestio” in Klein
and v. Stutterheim’s (1987) quaestio model mentioned above. The Principle of
Topic Continuity (Jasinskaja 2009: 291) says that, by default, discourse top-
ics do not change. These default principles can only be overridden by special
linguistic mechanisms (Jasinskaja 2009: 300). In combination, the two princi-
ples imply that a new utterance should be interpreted as addressing the original

“unsettled” question unless there is a discourse marker that indicates something
else (Jasinskaja 2009: 291). In this view, Restatement would emerge as a default
discourse relation, i.e. a relation that can be inferred without explicit signalling
and fulfilling both the Exhaustivity and Topic Continuity principle. The coor-
dination marker and, however, functions as a (weak) linguistic marker that a
non-default discourse relation (e.g. Narration) holds. Although basically devel-
oped with reference to English, Jasinskaja’s ideas can be interesting for one of
the translation scenarios to be discussed in the following section, namely the
translation of sentence coordination by juxtaposed (asyndetically and syndeti-
cally connected) sentences (Section 3.1).

2.4. Coordination, subordination, and clause linkage


In his typology of clause linkage Lehmann (1988) describes the options for
complex sentence formation cross-linguistically along six syntactic-semantic
parameters. Some of these parameters seem to be useful to characterize the
structural changes found in the translation examples we are going to discuss
in the following section and may help to relate the syntactic concepts of subor-
dination/coordination15 and hypotaxis/parataxis16 to their discourse-structural
counterparts: “Hierarchical downgrading” describes the degree to which a hi-
erarchical relation between the linked segments holds (“parataxis” and “em-
bedding” forming the two poles of this continuum), “desententialisation” refers
to the degree to which the subordinate clause is expanded or reduced (with
“sententiality” and “nominality” as its extremes), and “explicitness of linking”
refers to the presence/absence and type of a connective device between two
clauses/segments (with “syndesis” and “asyndesis” at the two ends of the con-
tinuum).17
The examples presented in 3.1, sentential coordination translated as a sen-
tence sequence, show changes along the syndesis-asyndesis continuum: In those
examples where the discourse relation holding between the sentences in the

15. Lehmann (1988: 182) conceives subordination as a form of clause linkage, while
coordination is seen as a “relation of sociation combining two syntagms of the same
type and forming a syntagm which is again of the same type” and is thus not restricted
to hold on clause-level only.
16. Hypotaxis is understood by Lehmann (1988: 182) “as the subordination of a clause
in the narrow sense (which probably includes its finiteness)”, while parataxis refers
to the coordination of clauses, with no further restrictions “on the kind or structural
means of coordination. In particular, parataxis may be syndetic or asyndetic”.
17. Lehmann (1988: 210) points out that explicitness of linking has nothing to do with
parataxis vs. hypotaxis. As examples for linking devices with decreasing explicitness

translation is not explicitly signalled, e.g. by a discourse connective, the translation
is more asyndetic than the original. In the cases where the discourse relation
is explicitly signalled by a discourse connective other than the coordinator og,
however, the translation is more explicit/syndetic than the original, where the
relation holding between the conjuncts (typically a narrative/temporal-causal
one) has to be inferred from the propositional content of the conjuncts (see
Blakemore and Carston 2005, and Section 2.3 on which relations might be li-
censed in a sentential coordination without an explicit mention of it). In the
examples discussed in 3.2 and 3.3 the structural changes are more visible: In
both cases one of the linked elements is both hierarchically upgraded (i.e. less
dependent on the other) and more sentential in the translation.

3. Syntactic coordination and discourse subordination: Three contrastive perspectives

What happens at the level of discourse structure when syntactically coordinated
structures are translated as non-coordinated sequences of sentences, or subordinated
structures are translated as coordinate, and why do translators choose
these options in certain cases? In this section we present and discuss certain
types of translation mismatch that might challenge the discourse representation
approaches presented in Section 2.

3.1. From coordinated clauses to sentence sequences (Norwegian > German)


One case at hand is sentential coordination in Norwegian translated as a non-
coordinated sequence of sentences in German. The corpus contains several ex-
amples of clause coordination in the Norwegian original such as (3) and (4)
below, where coordination would sound odd in German, obviously due to lan-
guage-specific differences regarding the use of sentential coordination in the
two languages, as we will show.18

of linking he mentions the following: anaphoric subordinate clause referring back to
the preceding discourse (maximal syndesis), gerundial verb, prepositional phrase,
connective adverb, specific conjunction, universal subordinator, and nonfinite verb
form (asyndesis) (Lehmann 1988: 211).
18. In a corpus study investigating sentence boundary adjustments in translations of pop-
ular science texts between Norwegian and German, sentential coordinations turned
out to be among the most frequent causes of sentence splitting (i.e. the translation
of one SL clause or clause complex as two or more independent sentences in the
TL) for the translation direction Norwegian-German (Ramm 2010). Moreover, the
analysis has shown that 48,7 % of all sentential coordinations in the Norwegian orig-

(3) a. Legene hadde sitt eget reisemønster, som er analysert[i] . Studiereiser
til utlandet var viktige for profesjonell anseelse og autoritet[ii] ,
og totalbildet av reisemønsteret er entydig[iii] : Tyskspråklige uni-
versiteter var de viktigste reisemål for norske leger som ønsket
videreutdannelse eller spesialisering[iv] .
‘The doctors had their own travel pattern, which is analysed[i] . Ed-
ucational trips abroad were important for professional reputation
and authority[ii] , and the overall picture of the travel pattern is
clear[iii] : German-speaking universities were the most important
destinations for Norwegian doctors who wanted further education
or specialisation[iv] .’19
b. Die Ärzte hatten ihr eigenes, heute analysiertes, Reisemuster[i] .
Studienreisen ins Ausland wurden als wichtig für berufliches Anse-
hen und Autorität angesehen[ii] . Das Gesamtbild der Reisen ist ein-
deutig[iii] : Deutschsprachige Universitäten waren die wichtigsten
Reiseziele norwegischer Ärzte, die eine Weiterbildung oder Spe-
zialisierung wünschten[iv] .
‘The doctors had their own, today analysed, travel pattern[i] . Ed-
ucational trips abroad were viewed as being important for profes-
sional reputation and authority[ii] . The overall picture of the travels
is clear[iii] : German-speaking universities were the most important
destinations for Norwegian doctors who wanted further education
or specialisation[iv] .’

inal texts are not translated by a corresponding coordination in German. Two general
translation strategies can be identified for these examples of translation mismatch: In
22,4 % of the Norwegian coordination examples the coordination marker is dropped,
and the discourse relation holding between the clauses is left implicit in the German
version (as in ex. (3), (4) and (8) in this paper); in the remaining 26,3 % of the mis-
match examples some alternative element (e.g., a pronominal adverb or a demon-
strative nominal phrase) is used to signal the discourse relation holding between the
clauses; this strategy is illustrated by ex. (5), (6), (7) and (9) in this paper. For some of
the non-correspondence examples in the corpus translation by coordination in Ger-
man certainly could have been an option. The frequency of this type of translation
mismatch in the corpus, however, should be seen as a strong indication for some
language-specific difference regarding the use of coordination.
19. In this and the following examples English glosses of the Norwegian and German text
examples keeping the word order particularities of the original languages are given
in single quotes. Due to the length of the examples, we refrained from presenting
interlinear glosses and idiomatic translations.

In (3a) the lack of a common topic between the two conjuncts seems to block
the use of coordination in the German translation (3b). A further problem is the
fact that the second conjunct alone is elaborated by the sentence following the
colon. In the translation the coordinated clauses are split into two separate sen-
tences which leads to a change of the discourse structure assigned to the text:
In the RST model, the German translation can be analysed as a Background
or Circumstance relation – with (3b[ii] ) as satellite, its nucleus covering (3b[iii] )
and (3b[iv] ) –, and the span (3b[ii] )–(3b[iv] ) functioning as Elaboration to (3b[i] ).
The analysis of the Norwegian original, however, would possibly have to as-
sign a (multinuclear) Conjunction (or Joint) relation to (3a[ii] ) and (3a[iii] ). But
where does this span attach to its discourse context? To the left (as Elabora-
tion or Background of (3a[i] ) – which does not fit very well), or to the right (as
Background)? But then – at least as a non-native speaker of Norwegian – one
runs into problems with how to coherently interpret the sentence following the
colon, since (3a[iv] ) clearly elaborates the second conjunct (3a[iii] ), but not the
first (3a[ii] ). Thus, the grouping of (3a[ii] ) and (3a[iii] ) as a joint, non-hierarchical
span leads to attachment problems with the following discourse segment.
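
The attachment problem described above can be made concrete with a small sketch. The Python fragment below is an illustration only and not part of the RST formalism or of any RST tool: the class `Span`, the function `promoted_units`, and the inner Elaboration between (3b[iii]) and (3b[iv]) are assumptions introduced here. It encodes the two candidate analyses and collects, for each span, the elementary units reached by following nuclei only (what computational work on RST often calls the promotion set of a span).

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Span:
    """A hypothetical RST span: one relation over nuclei and satellites."""
    relation: str
    nuclei: List["Node"]
    satellites: List["Node"] = field(default_factory=list)


# A node is either an elementary unit (its label) or a complex span.
Node = Union[str, Span]


def promoted_units(node: Node) -> List[str]:
    """Collect the elementary units reached by following nuclei only."""
    if isinstance(node, str):
        return [node]
    units: List[str] = []
    for nucleus in node.nuclei:
        units.extend(promoted_units(nucleus))
    return units


# German (3b): (3b[ii]) is a Background satellite of the span (3b[iii])-(3b[iv]).
# The inner Elaboration is an assumption made for this sketch.
german = Span(
    "Background",
    nuclei=[Span("Elaboration", nuclei=["3b[iii]"], satellites=["3b[iv]"])],
    satellites=["3b[ii]"],
)

# Norwegian (3a): the coordination is analysed as a multinuclear Joint.
norwegian = Span("Joint", nuclei=["3a[ii]", "3a[iii]"])

print(promoted_units(german))
print(promoted_units(norwegian))
```

Under these assumptions the German span promotes only the counterpart of the second conjunct, whereas the Norwegian Joint span promotes both conjuncts, so that anything attached to it would have to relate to (3a[ii]) and (3a[iii]) alike.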
Using the SDRT approach one runs into similar problems: In the Norwe-
gian version the reader probably first tries to interpret (3a[ii] ) as providing back-
ground information (Background1 in the sense of Asher and Lascarides 2003,
backward-looking Background in the sense of Asher, Prévot and Vieu 2008)20
or as adding some kind of explanation (Explanation being one of the subordi-
nating discourse relations in SDRT) to the preceding sentence (3a[i] ). But which
relation holds between (3a[ii] ) and (3a[iii] )? In English or German the use of the
coordination marker would presuppose the existence of some kind of common
topic between the linked elements, but obviously Norwegian is not that strict in
this respect. For the German version, an SDRT-style analysis is less problem-
atic: a relation of Background1 /backward-looking Background or Explanation
may be assigned between the independent sentence corresponding to the first
conjunct (3b[ii] ) and the sentence preceding it (3b[i] ), whereas the counterpart of
the second conjunct (3b[iii] ) can be interpreted as elaborating sentence (3b[i] ).
Jasinskaja’s (2009) suggestion to treat the coordination marker as a signal
that the current utterance (sentence) is not yet completed (i.e. does not conform
to the exhaustiveness condition) would not work properly for the interpretation
of the Norwegian og-coordination either. For the German version, however, it
does: the independent sentence corresponding to the first SL conjunct (3b[ii] )

20. A problem with the relation Background in this example could be the restriction that,
according to the definition in SDRT, it is restricted to hold between an event and a
state.

can be interpreted as a complete utterance, connecting it to the preceding context
(3b[i] ). And also the counterpart of the second conjunct (3b[iii] ) can be interpreted
as an utterance with independent discourse contribution, namely as
an elaboration of the preceding sentence (3b[i] ), an interpretation which is not
compatible with a coordinated structure.

(4) a. Andre problemer var ikke mindre alvorlige[i] . Malmforekomstene
holdt ikke hva de lovet[ii] , og tapte raskt sin edelhet nedover i fjellet[iii] .
Driften gikk med underskudd, og innskyterne trakk seg etter
hvert ut[iv] .
‘Other problems were not less serious[i] . The ore deposits were not
what they promised[ii] , and lost quickly their preciousness down-
wards in the mountain[iii] . The operation ran with deficit, and the
financial supporters gradually backed down[iv] .’
b. Andere Probleme waren nicht weniger gravierend[i] . Die Vorkom-
men hielten nicht, was sie versprachen[ii] ; der Metallgehalt nahm
mit zunehmender Tiefe rasch ab[iii] . Die Erzgewinnung war ein
Zuschussgeschäft und die Geldgeber machten nach und nach einen
Rückzieher[iv] .
‘Other problems were not less serious[i] . The deposits did not hold
what they promised[ii] ; the metal content decreased quickly with
increasing depth[iii] . The ore winning was a lossmaking business
and the financial supporters gradually backed down[iv] .’

Similar problems occur in (4), where the second conjunct (4a[iii] ) should be sub-
ordinated (as an Explanation in SDRT, and as an Evidence satellite in RST) in
relation to the first conjunct (4a[ii] ), since the following sentence (4a[iv] ) obvi-
ously is related only to (4a[ii] ) and not to (4a[iii] ). This discourse representa-
tion is precisely what we get in the German translation (4b) – where the co-
ordination marker og (and) is replaced by a semicolon. But which discourse
structure should be assigned to the Norwegian version, where both SDRT and
RST would be urged to assign a coordinating/multinuclear discourse relation
to the sentential coordination, blocking the right frontier (in the SDRT frame-
work) or not providing an appropriate nucleus (in the RST framework) to attach
(4a[iv] )?
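
The blocking effect on the right frontier can be sketched as follows. This is a strongly simplified reading of the right frontier constraint, written for illustration only: complex segments are ignored, and the function name `right_frontier`, the edge format, and the labels `"coord"` and `"subord"` are assumptions introduced here rather than SDRT notation. A unit counts as open for attachment if it is the last unit or dominates it through subordinating relations; the first argument of a coordinating relation is skipped.

```python
from typing import List, Tuple

# (attachment point, attached unit, "coord" or "subord")
Edge = Tuple[str, str, str]


def right_frontier(edges: List[Edge], last: str) -> List[str]:
    """Return the units open for attachment in this simplified model."""
    frontier = [last]
    seen = {last}
    stack = [last]
    while stack:
        node = stack.pop()
        for source, target, kind in edges:
            if target != node or source in seen:
                continue
            seen.add(source)
            # Only a subordinating relation keeps its first argument open;
            # a coordinating relation closes it off.
            if kind == "subord":
                frontier.append(source)
            # Keep climbing in either case to reach higher units.
            stack.append(source)
    return frontier


# Norwegian (4a): og-coordination read as a coordinating relation.
norwegian = right_frontier([("4a[ii]", "4a[iii]", "coord")], last="4a[iii]")

# German (4b): the semicolon allows a subordinating Explanation.
german = right_frontier([("4b[ii]", "4b[iii]", "subord")], last="4b[iii]")

print(norwegian)
print(german)
```

In this toy model the first conjunct (4a[ii]) is no longer available when (4a[iv]) is to be attached, while its German counterpart (4b[ii]) remains on the frontier.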
The two examples above illustrate that Norwegian seems to be less restricted
as to the types of elements that can be coordinated. They call into question
the universality of the definitions of discourse relations in theories like
SDRT or RST (cf. 3.4). Our examples show that at least the function
of the coordination marker (og/und/and) is not precisely the same cross-linguistically:
syntactic coordination seems to be compatible with discourse relations
like Background or Explanation in Norwegian, while it is blocked in German.21
(5) a. Reformasjonen bragte etterhvert denne direkte norsk-tyske forbin-
delse til opphør. Den dansk-norske konge ønsket å sentralisere
presteutdannelsen til universitetet i København (grunnlagt 1479),
og det tok slutt med at norske studenter dro til tyske universiteter
for å få sin utdannelse. 1500- og 1600-tallets universitet ble et in-
strument for å befeste den sentraliserte fyrstestat, ved å gi utdan-
nelse for statens embedsmenn.
‘The reformation brought eventually this direct Norwegian-Ger-
man connection to a stop. The Danish-Norwegian king wished
to centralise the priest education to the university in Copenhagen
(founded in 1479), and it took an end with that Norwegian stu-
dents went to German universities to get their education. The 16th
and 17th century’s university became an instrument to stabilise the
centralised princely state, by giving education to the officials of the
state.’
b. Diese direkte norwegisch-deutsche Verbindung wurde von der
Reformation nach und nach zum Erliegen gebracht. Der dänisch-
norwegische König wünschte eine Konzentration der Pfarreraus-
bildung an der 1479 gegründeten Universität Kopenhagen.
Damit reisten keine norwegischen Studenten mehr zur Ausbil-
dung an deutsche Universitäten. Die Universitäten des 16. und
17. Jahrhunderts wurden durch die Ausbildung höherer Staats-
beamter zu Instrumentarien der Stärkung des zentralisierten
Fürstenstaates.

21. We are aware of the fact that the data presented in this paper are based on parallel
corpora only, i.e. we are comparing linguistic features of original texts with features
of their respective translations only, and that the properties of translations might
deviate from the properties of original texts of a language – e.g., in translations the
original language may “shine through” (cf. Teich 2003: 61) in some way. We have
not analysed the use of sentential coordination in comparable corpora of Norwegian
and German, but if we assume that there is at least some “shining through” from
the Norwegian SL texts regarding the use of coordination, it can be expected that
the differences of use are even clearer in comparable corpora (see again note 18
on the frequency of non-correspondence regarding coordination in Norwegian and
German).

‘This direct Norwegian-German connection was by the reformation
eventually brought to a stop. The Danish-Norwegian king
wished a concentration of the priest education to the in 1479
founded university in Copenhagen. Damit (‘with/by this’) trav-
elled no longer Norwegian students for education to German uni-
versities. The universities of the 16th and 17th century became by
the education of higher state officials instruments for the strength-
ening of the centralised princely state.’
(6) a. Den andre kraftlinjen, like fra middelalderen av, har gått til Eng-
land og den angel-saksiske verden. Bruddet etter 1945 førte til at
denne forbindelsen ble dominerende, både politisk og kulturelt,
og Norge fremstår i dag som trolig et av de mest amerikaniserte
samfunn i Europa. Men Tyskland får igjen økende betydning, som
Norges største handelspartner og den viktigste støttespiller innen
EU-systemet.
‘The second line of power, directly from the Middle Ages on, has
gone to England and the Anglo-Saxon world. The breaking-off af-
ter 1945 led to that this connection became dominating politically
as well as culturally, and Norway appears today as probably one
of the most americanised societies in Europe. But Germany gets
again increasing importance, as Norway’s largest trade partner and
the most important supporter within the EU system.’
b. Die zweite Kraftlinie, ebenfalls seit dem Mittelalter, nimmt ihren
Ursprung in England und der angelsächsischen Welt. Der Bruch
nach 1945 führte dazu, dass diese zweite Verbindung zur wichtig-
sten wurde, sowohl politisch als auch kulturell. Deswegen ist Nor-
wegen heute wahrscheinlich eines der am stärksten amerikanisier-
ten Länder Europas. Doch Deutschland gewinnt wieder als der
größte Handelspartner Norwegens und als die wichtigste Stütze im
EU-System an Bedeutung.
‘The second line of power, also from the Middle Ages on, has
its origin in England and the Anglo-Saxon world. The breaking-
off after 1945 led dazu (pron.adv., lit. ‘there-to’), that this sec-
ond connection became the most important one, both politically as
well as culturally. Therefore is Norway today probably one of the
most americanised countries in Europe. But Germany gains again
as Norway’s largest trade partner and the most important support
within the EU system.’

Also in examples (5) and (6) a sentential coordination in the Norwegian original
text is translated by a sequence of independent sentences, but in these examples
a connective (pronominal adverb) explicitly signalling the temporal-causal re-
lation between the sentences corresponding to the two conjuncts in the Norwe-
gian version is added. The same relations (of “consequentiality”, cf. Section 2.3)
hold between the conjuncts in the Norwegian text, but here they have to be con-
textually inferred, i.e. they are less explicitly marked. A structurally equivalent
translation by a coordination would have been possible for both examples, i.e.
would not have been in contradiction with the discourse relations licensed by
und-coordination in German. However, the option chosen in the actual trans-
lation seems to be more natural with respect to the standard patterns of text
organisation in German.
(7) a. I Bergen hadde det mektige tyske Kontoret hindret tyskere i å ta
norsk borgerskap av frykt for at de skulle bli konkurrenter[i] . I 1560
måtte Kontoret oppgi denne politikken[ii] , og en stadig strøm av
tidligere hanseater tok i den følgende tida frivillig norsk borger-
skap[iii] . I 1766 ble den siste vintersitteren borger i Bergen, det var
Jochen Krämer fra Bremen[iv] .
‘In Bergen had the powerful Comptoir hindered the Germans to
acquire Norwegian citizenship because of fear that they might be-
come competitors[i] . In 1560 the Comptoir had to give up this poli-
tics[ii] , and a continuous stream of previous Hanseats acquired in
the following time deliberately Norwegian citizenship[iii] . In 1766
became the last winter-sitter a citizen of Bergen, it was Jochen
Krämer from Bremen[iv] .’
b. In Bergen hatte das mächtige Comptoir aus Angst vor deren
Konkurrenz Deutsche daran gehindert, norwegische Bürger zu
werden[i] . 1560 musste das Comptoir diese Politik aufgeben[ii] .
Während der darauf folgenden Zeit ließ ein ständiger Strom
ehemaliger Hanseaten sich freiwillig einbürgern[iii] . 1766 wurde
der letzte Wintersitzer, Jochen Krämer aus Bremen, Bürger von
Bergen[iv] .
‘In Bergen had the powerful Comptoir because of fear of com-
petition hindered Germans to acquire Norwegian citizenship[i] . In
1560 the Comptoir had to give up this politics[ii] . In the following
time acquired a continuous stream of previous Hanseats deliberately
Norwegian citizenship[iii] . In 1766 became the last winter-sitter,
Jochen Krämer from Bremen, a citizen of Bergen[iv] .’

A similar type of sentence splitting of a Norwegian sentential coordination, leading
to a more explicit marking of the discourse relations and the progression
of the text in the German version, is illustrated by (7). As in (5) and (6), the
discourse relation between the Norwegian conjuncts (here a purely temporal
one) is compatible with the meaning of og/and/und, so sentential coordination
would in principle have been an option for the German translation (in contrast
to (3) and (4)). Nevertheless the sentential coordination is split up and the con-
stituent order in the sentence corresponding to the second conjunct is changed
in the translation, placing the temporal adverbial während der darauf folgenden
Zeit (‘in the following time’) in sentence-initial position. This leads to a more
explicit signalling of the thematic progression in this text fragment (temporal
sequence), indicated by the parallelism of three temporal adverbials in sentence-
initial position in (7b[ii]–[iv] ): 1560 – während der darauf folgenden Zeit (‘in the
following time’) – 1766. The temporal progression is not as explicitly marked
in the Norwegian version, since the choice of a coordinated construction only
leaves a thematically less prominent position for the adverbial i den følgende
tida (‘in the following time’). So this example seems to give further evidence for
the preference of German texts (or at least of translations from Norwegian) to
signal the type of text progression and the relations holding between discourse
units more explicitly than the corresponding Norwegian original texts.
(8) a. Riktignok prøvde noen av bergmennene å drive videre i 1550-åre-
ne[i] . Men det er tvilsomt om de lyktes[ii] , og driften var nok i alle
tilfelle svært beskjeden[iii] . Noen av dem slo seg ned i Skien[iv] .
‘Indeed tried some of the mineworkers to continue to run (the mine)
in the 1550s[i] . But it is doubtful whether they succeeded[ii] ,
and the running was in any case very limited[iii] . Some of them
settled in Skien[iv] .’
b. Zwar versuchten einige der Bergleute, den Betrieb nach 1550 noch
fortzusetzen[i] . Doch es ist nicht sicher, ob sie Erfolg hatten[ii] .
In jedem Fall war die Ausbeute recht bescheiden[iii] . Einige der
Deutschen ließen sich in Skien nieder[iv] .
‘Indeed tried some of the mineworkers to continue running (the
mine) after 1550[i] . But it is not sure whether they succeeded[ii] .
In any case was the profit very limited[iii] . Some of the Germans
settled in Skien[iv] .’
(9) a. Jeg vil ikke påta meg å besvare spørsmålet[i] . Mange har vært opptatt
av det unike ved den tyske universitetsmodell[ii] , og det foreligger
en stor litteratur som det vil føre for langt å gjøre rede for
her[iii] . Men det kan være interessant å peke på noen forhold som
kan ha betydning[iv] .
‘I will not try to answer the question[i] . Many have been engaged
to stress the uniqueness of the German university model[ii] , and
there exists big literature which it would take too long to discuss
here[iii] . But it might be interesting to point to some circumstances
that might have importance[iv] .’
b. Ich werde hier nicht versuchen, diese Frage zu beantworten[i] . Vie-
le Forscher haben sich damit beschäftigt, worin das Einmalige des
deutschen Universitätsmodells bestand[ii] . Zu diesem Thema liegt
eine umfangreiche Literatur vor, deren eingehendere Erörterung
hier zu weit führen würde[iii] . Doch dürfte es von Interesse sein,
auf einige Verhältnisse, die für die Beantwortung der Frage von
Bedeutung sein könnten, etwas genauer einzugehen[iv] .
‘I will not try to answer the question[i] . Many researchers have been
engaged to define what characterised the uniqueness of the German
university model[ii] . On this topic exists a vast amount of literature
the discussion of which would take too long here[iii] . But it might
be of interest, for some circumstances which are important for an-
swering the question, to go into more detail[iv] .’
Sentence splitting in the translations of (8) and (9) above seems to be motivated
by different preferences regarding whether the conjoined clauses are expected
to contribute to the incrementally constructed discourse representation as one
joint unit or whether it is also possible that only one of the conjuncts consti-
tutes a discourse relation with the preceding or following discourse units. In
(8a) the first conjunct (8a[ii] ) is in a (concessive) discourse relation to the pre-
vious sentence (8a[i] ), indicated by the sequence of the connectives riktignok
(‘although, indeed’) and men (‘but’), but it is not clear whether this concessive
relation also holds for the second conjunct (8a[iii] ). This indeterminacy does
not exist in the German version, where only the sentence corresponding to the
first conjunct (8b[ii] ) is in a concessive relation to the previous sentence (8b[i] )
(zwar ‘although, indeed’ – doch ‘but, however’). The sentence corresponding
to the second conjunct (8b[iii] ) rather implies an Evidence (RST) or Explanation
(SDRT) relation to the sentence corresponding to the first conjunct, signalled
by es ist nicht sicher (‘it is not sure’) and the (topicalised) adverbial in jedem
Fall (‘in any case’). Thus, again, the discourse relations holding between some
of the discourse units are more clearly inferable in the German version of the
text. In (9) the coordination in the Norwegian version leads to some indetermi-
nacy with respect to the interpretation (attachment in discourse structure) of the
following sentence: The following sentence (9a[iv] ), starting with men (‘but’),
is in a contrastive relation to the non-restrictive relative clause som det vil føre
for langt å gjøre rede for her (‘which it would take too long to discuss here’)
which is a part of the second conjunct (9a[iii] ), but this is somewhat blurred in the
Norwegian version due to the existence of the coordination which might imply
some joint contribution of the two conjuncts to the further development of the
discourse. By dropping the coordination in the German translation the sentence
corresponding to the second conjunct (9b[iii] ) is interpreted as an elaboration of
the sentence corresponding to the first conjunct (9b[ii] ) – further emphasised by
adding zu diesem Thema (‘on this topic’) and placing it in sentence-initial po-
sition. The inference of the contrastive connection between the relative clause
deren eingehendere Erörterung hier zu weit führen würde (‘the discussion of
which would take too long here’) and the following sentence (9b[iv] ) is not dis-
turbed by the attempt to assign some joint relevance to the conjuncts of a sen-
tential coordination. So, as in (8), sentence splitting in the German translation
guarantees a clearer correlation between sentence boundaries and the attach-
ment of discourse units to the incrementally growing discourse representation.
These examples of Norwegian sentential coordination translated by non-
coordinated sequences of sentences in German indicate that sentential coordi-
nation serves somewhat different functions in discourse in the two languages:
1. The discourse relations compatible with (licensed by) coordination seem
to be more constrained in German than in Norwegian, as illustrated by (3)
and (4). In German the use of the coordination marker und appears to be
restricted to the types of relations (e.g. additive and temporal-causal) also
compatible with and in English (cf. Blakemore and Carston 2005, see Sec-
tion 2.3), whereas in Norwegian these constraints obviously are not taken
that seriously.
2. In cases where sentential coordination would be compatible with a discourse
relation this is often not the preferred realisation in German. Rather, an op-
tion is chosen which more explicitly signals the discourse relation holding
between adjacent discourse units, e.g. by using a connective as in (5) and (6).
3. German also seems to take und as a means to signal that two conjuncts
should be processed as a joint unit and jointly contribute to discourse struc-
ture more seriously than Norwegian does, as illustrated by (8) and (9).
Paratactic clause linkage with og seems to function as a kind of default
sequentialisation strategy in Norwegian, which is applied without attaching too
much meaning to the coordination marker. In this way the use of og in written
Norwegian texts seems to be similar to the functions that and (and its equivalents
in other languages) can take in oral narratives, i.e. signalling that the story
goes on without being too specific about the relation holding between the dis-
course units, or explicitly marking the transitions between discourse units (cf.
e.g. Schiffrin 1986 on the functions of and in (English) conversations). This
observation would also fit into the picture of written Norwegian as still being
under the pressure of the norms holding for oral language (see e.g. Torp and
Vikør 2000: Chapt. 14; Solfjeld 2000: 46–48). In German written genres, how-
ever, sentential coordination and sentence boundaries in general appear to be
taken more seriously as signals indicating the structuring of the discourse into
hierarchically and non-hierarchically related units.

3.2. From verb phrase/nominal phrase adjunction to coordination (German > Norwegian)

In this and the following section examples are discussed where coordinated
structures are found as translations of syntactically subordinated structures.
(10) and (11) below are typical examples of what Fabricius-Hansen (1999)
has termed backward information extraction, which occurs quite frequently in
translations from German into Norwegian (Solfjeld 2004): Syntactically down-
graded information encoded in an adjunct at verb phrase level in the source
sentence is rendered in a conjunct to the left of the conjunct corresponding
most closely to the main predicate of the source sentence, the latter having
neutral focus. (The source-text adjunct and its target-text counterpart are given
in bold face.)
(10) a. Für die Trennung des Kindes von der Mutter wurden medizinische
und pädagogische Begründungen angeführt und anhand […] be-
glaubigt. Eine perfekte medizinisch-technische Versorgung be-
kam die größte Bedeutung. Im Interesse der Infektionsverhü-
tung […] wurde die Sterilität groß geschrieben.
‘For the separation of the child from its mother were medical and
pedagogical reasons given and by means of […] supported. A per-
fect medical-technical care got vital importance. In the interest
of infection prevention […] was sterility emphasized.’
b. Det ble anført medisinske og pedagogiske grunner til at mor og
barn skulle skilles ad, og dette ble forklart ved […]. En perfekt
medisinsk-teknisk omsorg ble av største betydning. Infeksjoner
skulle unngås […], og steriliteten ble skjøvet i forgrunnen.
‘It were given medical and pedagogical reasons for that mother
and child should be separated, and this was explained by […]. A
perfect medical-technical care became of vital importance. Infec-
tions were to be avoided […], and sterility was moved into the
foreground.’
(11) a. Als es feststand, daß die Alliierten nicht hier, sondern an der Ka-
nalküste landen würden, disponierte man um und schickte alle
Boote dorthin. Der Gegner, uns überhörend, faßte seine Beob-
achtungen präzise zusammen.
‘When it was clear that the Allies not here, but on the Channel
coast would land, we reorganized and sent all the boats there. The
opponent, us bugging, summarised his observations precisely.’
b. Da det nå ble klart at de allierte ikke ville lande her, men i Nor-
mandie, ble vi omdirigert dit. Motstanderne våre avlyttet våre
radiomeldinger og samlet omhyggelig sammen opplysninger.
‘As it now got clear that the Allies would not land here, but in
Normandy, we were redirected to-there. Our opponents bugged
our radio messages and gathered information carefully.’
In both examples a structurally equivalent Norwegian translation would not
have been possible or would at least have been stylistically marked. Although
both languages are V2, Norwegian is less open to place informationally “heavy”
constituents in Vorfeld position than German is, making it difficult to render the
prepositional phrase in (10a), which furthermore is based on nominalisations
(Interesse ‘interest’, Infektionsverhütung ‘infection prevention’), by a corre-
sponding prepositional phrase in Norwegian. Neither is it possible to translate
the present participle construction in (11a), uns überhörend (‘us bugging’), by
a corresponding participle construction in Norwegian. Choosing a coordinated
structure in (10) and (11) in the Norwegian translation can be seen as a strategy
which tries to compensate for the lack of equivalent structural options or
preferences in Norwegian. The Norwegian versions are more sentential than the
original texts and exploit the inference mechanisms triggered by the coordina-
tive structure (cf. 2.3, Blakemore 2002) in order to gain a similar interpretation
as the German version. The syntactically downgraded function of the German
adjunct is “simulated” by the first conjunct which gets the discourse function
of “leading up to” the second, i.e. entering into a consequentiality relation with
the second conjunct. In this way coordination works as a backgrounding device,
establishing the second conjunct as part of the “main story” – equivalent to the
source text. The frequent use of coordination also illustrates the tendency that
Norwegian prefers to organize discourse paratactically where German tends to
use hypotactic/hierarchical structures (Fabricius-Hansen 1996).

3.3. From ing-adjuncts to coordination (English > German/Norwegian)


Free ing-adjuncts are adjuncts of some sort but more “sentential” and less inte-
grated (see 2.4, Lehmann 1988), than the German adjectival/adverbial adjuncts
translated as a sentential coordination in (10) and (11) above. Quite often such
adjunct constructions are rendered as verb phrase coordination in German and
Norwegian (cf. Behrens 1998 for English/Norwegian). This is the case in (12),
for instance, where the ing-adjunct, representing backgrounded information,
precedes its matrix clause and is rendered as first conjunct in both target texts.
(12) a. Then, using a flat pack of slim steel files from his top pocket he
started to work on the softer metal of the skeleton.
b. Dann holte er einen Satz dünner Stahlfeilen aus der Brusttasche
und bearbeitete damit den Weichmetallteil des Dietrichs.
‘Then took he a set of thin steel files from his top pocket and
worked with it (lit. ‘there-with’) the soft metal part of the skeleton
key.’
c. Så tok han en flat pakke tynne stålfiler opp av brystlommen og
ga seg til å arbeide på det bløtere metallet i nøkkelen.
‘Then took he a flat pack of thin steel files up from his top
pocket and started to work with the softer metal in the key.’
However, also when postponed to their matrix clause, ing-adjuncts are often
subordinated from a discourse structural point of view, describing e.g. an “ac-
companying circumstance”22 to the matrix clause eventuality as in (13a) and
(14a). In such cases, German translations by coordination may preserve the
order of the two segments but explicitly mark the relation of temporal over-
lap between them by adding the connective dabei (lit. ‘there-by’, ‘at the same
time / on the same occasion’) in the second conjunct, as in (13b), thus block-
ing a (con)sequential interpretation which might otherwise be preferred. But
the order of presentation may also be switched so that the first conjunct in the
translation corresponds to the postponed ing-adjunct in the original, as in (14b)
and (15b).

22. The relation “accompanying circumstance” is discussed in more detail in Behrens
and Fabricius-Hansen (2010).

The Norwegian translations in (14c) and (15c), on the other hand, use coordination
without changing the order of the verb phrases corresponding to the
matrix clause and the ing-adjunct of the source text – and without overtly marking
the temporal relation between the eventualities described in the two conjuncts.
It may be objected that the translations are ambiguous and/or not
particularly good. But nevertheless these examples seem to give further evidence
for the hypothesis that coordination functions somewhat differently in Norwe-
gian than in German and English. The dispensability of a marker of the tempo-
ral overlap in (13c) indicates that Norwegian may be less biased to interpreting
clause/verb phrase coordination as a temporal sequence (in narration) than Ger-
man is. And (14c) and (15c) show that Norwegian possibly is also more open to
placing background(ed) information in the second conjunct, the position where
focused/foregrounded information is strongly preferred in German.
(13) a. He smiled slyly, nodding.
b. Er lächelte verstohlen und nickte dabei.
‘He smiled furtively and nodded thereby.’
c. Han smilte litt lurt og nikket.
‘He smiled somewhat slyly and nodded.’
(14) a. Tony went home, taking his tool box with him.
b. Tony griff nach seinem Werkzeugkasten und ging nach Hause.
‘Tony reached for his tool box and went home.’
c. Tony gikk hjem og tok med seg verktøykassen sin.
‘Tony went home and took his tool box with him.’
(15) a. Things suddenly got very tense in the bar and Dad drank heavily,
sweating.
b. Auf einmal wurde die Atmosphäre in der Bar äußerst angespannt,
und Papa schwitzte und trank immer mehr.
‘Suddenly the atmosphere got very tense in the bar, and Dad
sweated and drank more and more.’
c. Stemningen i baren ble plutselig meget spent, og pappa drakk tett
og svettet.
‘The atmosphere in the bar got suddenly very tense, and Dad
drank heavily and sweated.’

3.4. Discussion: coordination and the marking of (non-)salience in discourse


The examples discussed in the previous sections illustrate that clause and verb
phrase coordination can serve various discourse functions in Norwegian (as
source and target language), and some of them seem to be different from those
in German or English. This allows for some cross-linguistic reflections on how
coordination relates to the marking of (non-)salience in discourse. In Section 3.1
we related “salience” to the use of coordination with respect to the assignment
of a coordinating/multinuclear vs. subordinating/nucleus-satellite discourse
relation between two clauses. The examples presented in this section indicate
that sentential coordination can be used in Norwegian in cases where only a
subordinating/nucleus-satellite discourse relation would be possible in German.
This observation may lead to two conclusions: a) that the coordination marker
og does not always function as a marker of equal salience of two clauses in Norwegian
(where “equal salience” is understood as implying a coordinating/multinuclear
discourse relation), or b) that the differentiation between the two types
of discourse relations simply is not as strict (is not taken as seriously) in
Norwegian as it is in German.
The translation changes made in the examples in 3.1 illustrate a further as-
pect of discourse structur(ing) that might be interesting in the context of a dis-
cussion of the notion of salience. It seems that sentence boundaries marked by
full stop are taken more seriously as a discourse segmentation signal in Ger-
man than in Norwegian, i.e. as a delimited step in the incremental construction
of the discourse representation, or – viewed from the opposite direction – that
not using such a segmentation signal as in the case of sentential coordination
is also taken more seriously as a signal that the incremental construction of a
representation of the respective discourse segment is not yet finished. If this ob-
servation is correct, this would also put Jasinskaja’s (2009) claim that the most
interesting constraints on discourse relations (i.e. which relations are inferred
by default) are imposed by the full stop, not by the coordination marker, into a
new perspective: maybe Norwegian behaves a bit differently from other languages
also in this respect (cf. Fabricius-Hansen 1999: 212, for a similar view)?
In any case, the examples in 3.1 support the view that discourse-structural
salience and the necessity to mark the salience status of a discourse unit (such
as a clause) might be a relative or language-dependent concept: whereas some
languages (such as German) operate with clear structuring signals (segmenta-
tion, hierarchical and non-hierarchical organisation of discourse structure) to
indicate how pieces of discourse should be put together, others (such as Nor-
wegian) organise text by relying less on sentence boundaries and explicit struc-
turing signals such as the coordination marker or discourse connectives.
The examples in Sections 3.2 and 3.3 illustrate the choice of coordination
as a translation strategy compensating for language-systematic differences re-
garding the realisation of certain types of adjuncts. Here we assumed that the
adjuncts in the SL version function as some kind of “background(ed)” or less
salient type of information (correlating with its syntactically downgraded func-
tion in syntax), and that this downgradedness is remodelled/simulated by ex-
ploiting the inference mechanisms triggered by the use of the coordination
marker in the TL. As mentioned before, the adjuncts in Sections 3.2 and 3.3
differ as regards the clause linkage type they realise (cf. Section 2.4) – the ad-
juncts in 3.3 being more sentential and less integrated than the adjuncts in 3.2.23
This means that more hierarchical upgrading (towards parataxis) and more sen-
tentialisation is required in (10) and (11) than in (12) to (15). This implies also
that – at least in (10) and (11) – the relations expressed in the SL vs. TL text
change their status from semantic relations holding between units/constituents
within a clause to discourse relations holding between propositions/clauses.
This can be viewed as a change in salience associated with a piece of informa-
tion in the SL vs. TL text – from non-propositional (or less propositional), con-
tributing to sentence semantics (in the first place) to propositional, contributing
to discourse semantics/structure.
In sum, discourse salience emerges as a concept that can have many facets
in a contrastive perspective. Such a multi-dimensional nature of salience (more
specifically, of “nuclearity”) – yet not in a contrastive perspective – has also
been argued for by Stede (2008), who demonstrates that discourse units may
be assigned salience on different levels of description and that various factors –
such as referential structure, thematic development, intentional structure or ex-
plicit linguistic markers – come into play here.
Another “factored” approach to discourse is pursued by Webber and her col-
leagues (e.g. Webber, Knott, and Joshi 1999, 2003). Working in the framework
of Tree-Adjoining Grammar, they distinguish between (discourse) relations that
are induced structurally by punctuation or (coordinating or subordinating) con-
junctions like and, although on the one hand, and relations that are established
by presupposition-bearing anaphoric adverbials like then, instead, otherwise
on the other hand. Whereas relations of the former type hold between the inter-
pretation of adjacent or conjoined discourse units, thus creating a (discourse)
structure in the strict sense, anaphoric adverbials signal “a relation between the
interpretation of their matrix clause and an entity in or derived from the dis-
course context” (Webber, Knott, and Joshi 2003: 547) which may cross such
structural dependencies. Webber, Knott, and Joshi suggest that this “factored”
approach may have “a better chance of providing a cross-linguistic account of
discourse than one that relies on a single premise” (Webber et al., 1999: Sect. 5).
Their approach does not, as we see it, offer an immediate solution to the specific
problems discussed in connection with examples (3) and (4) (Sect. 3.1). But the
proposed distinction between structurally induced discourse relations (triggered
by punctuation and conjunctions) creating discourse structure in the strict sense
and the relations triggered by presupposition-bearing anaphoric adverbials may
give a lead as to how the “strange” discourse behaviour of the Norwegian conjunction
og could be explained: possibly og doesn’t always function as a conjunction,
i.e. doesn’t (always) create discourse structure in the same way as the
corresponding conjunctions und and and in German or English do.

23. The semantics and discourse properties of adjuncts of various kinds are taken up in
detail in Fabricius-Hansen and Haug (eds., in prep.). The problem of “competing
structures” across languages is particularly discussed in Chapter 5.

4. Conclusions

We have shown that special conditions seem to hold as regards the use of sen-
tential and verb phrase coordination with (counterparts of) and in Norwegian
as compared to German and English. In translations from German or English
into Norwegian, coordination is often used as a compensation for language-
specific – structural and stylistic – restrictions on hypotactic complexity at sen-
tence level (Sections 3.2 and 3.3). Apparently, Norwegian is also less con-
strained as to which kinds of (discourse) elements can be linked by the co-
ordination marker (Section 3.1) and in which order the conjuncts appear (Sec-
tion 3.3).
To put it the other way round, it appears that the function of the coordination
marker (og/und/and) is not precisely the same cross-linguistically, so that e.g.
syntactic coordination may be compatible with discourse relations like Back-
ground, Explanation or Elaboration in Norwegian, while blocked in German
or English. These observations cast some doubt on the cross-linguistic valid-
ity of the definition of discourse relations in theories like SDRT or RST. In
particular, they seem to challenge the assumption (see 2.3) that syntactic coor-
dination with (equivalents of) the connective and necessarily implies a coordi-
nating/multinuclear discourse relation.
The translation examples furthermore illustrate that salience can be express-
ed by various linguistic means and that these means may differ cross-linguisti-
cally (3.4). Salience may be assigned by the hierarchical vs. non-hierarchical
organisation of discourse in the form of subordinating vs. coordinating discourse
relations holding between clauses/propositions. But salience can also manifest
itself by the choice of the size/granularity of the linguistic unit to communicate a
piece of information, in particular by the choice between propositions (clauses)
and linguistic units that do not have proposition status, e.g. phrases. Finally,
discourse segmentation into (complex) sentences separated by a full stop (or
other “major” punctuation marks such as question mark and exclamation mark)
also relates to the assignment of salience in discourse. Segmentation into sentences
provides the “temporal” dimension of discourse interpretation, by determining
which “portions” of information should be integrated into the incrementally
constructed (mental) discourse representation at a certain point in the development
of the text.

Acknowledgements
This article is a modified and extended version of Ramm and Fabricius-Hansen
(2005), and many of the ideas presented here have been developed in cooper-
ation with Cathrine Fabricius-Hansen. Moreover, the work has profited from
cooperation with Bergljot Behrens (Univ. of Oslo) and Kåre Solfjeld (Østfold
Univ. College, Halden) who have contributed with examples and helpful discus-
sions. I am also grateful to the Faculty of Humanities at the University of Oslo,
for supporting me with a PhD scholarship (2003–2006). The research has been
carried out in connection with the project SPRIK (Språk i kontrast / Languages
in Contrast)24 at the University of Oslo, Faculty of Humanities funded by the
Norwegian Research Council under project number 158447/530 (2003–2008).

24. Project URL: http://www.hf.uio.no/ilos/forskning/projekter/sprik//index.html (visited
16 Sep 2010)

References

Asher, Nicholas
1999 Discourse and the focus/background distinction. In: Peter Bosch and
Rob A. van der Sandt (eds.), Focus: Linguistic, Cognitive, and Computational
Perspectives, 247–267. Cambridge/New York: Cambridge
University Press.
Asher, Nicholas and Alex Lascarides
2003 Logics of Conversation. (Studies in Natural Language Processing.)
Cambridge/New York: Cambridge University Press.
Asher, Nicholas, Laurent Prévot and Laure Vieu
2007 Setting the background in discourse. Discours 1: 1–29.
URL: http://discours.revues.org/index301.html
Asher, Nicholas and Laure Vieu
2005 Subordinating and coordinating discourse relations. Lingua 115: 591–610.
Behrens, Bergljot
1998 Contrastive discourse: An interlingual approach to the interpretation
and translation of free ING-participial adjuncts. Ph.D. dissertation,
Department of Linguistics, University of Oslo.
Behrens, Bergljot and Cathrine Fabricius-Hansen
forthc. The discourse relation Accompanying Circumstance across langua-
ges. Conflict between linguistic expression and discourse subordina-
tion? In: Dingfang Shu and Ken Turner (eds.), Contrasting Meaning
in Languages of the East and West. Frankfurt: Peter Lang.
Blakemore, Diane
1987 Semantic Constraints on Relevance. Oxford: Blackwell.
Blakemore, Diane
2002 Relevance and Linguistic Meaning: The Semantics and Pragmatics
of Discourse Markers. (Cambridge studies in linguistics 99.) Cam-
bridge: Cambridge University Press.
Blakemore, Diane and Robyn Carston
2005 The pragmatics of sentential coordination with “and”. Lingua 115:
569–589.
Büring, Daniel
1997 The meaning of topic and focus: The 59th Street Bridge accent. (Rout-
ledge studies in German linguistics.) London: Routledge.
Carlson, Lynn and Daniel Marcu
2001 Discourse tagging manual. Technical Report ISI-TR-545, ISI.
Fabricius-Hansen, Cathrine
1996 Informational density: A problem for translation and translation the-
ory. Linguistics 34: 521–565.
Fabricius-Hansen, Cathrine
1999 Information packaging and translation. Aspects of translational sen-
tence splitting (German – English/Norwegian). In: Monika Doherty
(ed.), Sprachspezifische Aspekte der Informationsverteilung, 175–
213. Berlin: Akademie-Verlag.
Fabricius-Hansen, Cathrine and Dag T. T. Haug (eds.)
in prep. Big Events, Small Clauses: The Grammar of Elaboration. (Language,
Context and Cognition.) Berlin/New York: Mouton de Gruyter.
Fabricius-Hansen, Cathrine, Wiebke Ramm, Kåre Solfjeld and Bergljot Behrens
2005 Coordination, discourse relations and information packaging – cross-
linguistic differences. In: Aurnague, M., Bras, M., Le Draoulec, A.,
and Vieu, L. (eds.), First International Symposium on the Exploration
and Modelling of Meaning (SEM-05), 85–93.
Fabricius-Hansen, Cathrine and Wiebke Ramm
2008 Editor’s introduction: Subordination and coordination from differ-
ent perspectives. In: Cathrine Fabricius-Hansen and Wiebke Ramm
(eds.), ‘Subordination’ versus ‘coordination’ in sentence and text. A
cross-linguistic perspective. (Studies in Language Companion Series
98). Amsterdam/Philadelphia: John Benjamins.
Jasinskaja, Ekaterina
2009 Pragmatics and prosody of implicit discourse relations: The case of
restatement. Ph.D. dissertation, University of Tübingen.
Klein, Wolfgang and Christiane von Stutterheim
1987 Quaestio und referentielle Bewegung in Erzählungen. Linguistische
Berichte 109: 163–183.
Lambrecht, Knud
1994 Information Structure and Sentence Form: Topic, Focus, and the
Mental Representations of Discourse Referents. (Cambridge studies
in linguistics 71.) Cambridge: Cambridge University Press.
Lehmann, Christian
1988 Towards a typology of clause linkage. In: John Haiman and Sandra
A. Thompson (eds.), Clause Combining in Grammar and Discourse,
181–225. Amsterdam/Philadelphia: John Benjamins.
Mann, William C. and Sandra A. Thompson
1988 Rhetorical Structure Theory: Toward a functional theory of text orga-
nization. Text 8: 243–281.
Ramm, Wiebke
2010 Satzgrenzenveränderungen in der Übersetzung: Satzverbindung und
lokale Diskursorganisation im Norwegischen und Deutschen. Ph.D.
(submitted), University of Oslo.
Ramm, Wiebke and Cathrine Fabricius-Hansen
2005 Coordination and discourse-structural salience from a cross-linguistic
perspective. In: Manfred Stede, Christian Chiarcos, Michael Grabski
and Luuk Lagerwerf (eds.), Salience in Discourse: Multidisciplinary
Approaches to Discourse 2005, 119–128. Münster: Stichting/Nodus.
Rooth, Mats
1992 A theory of focus interpretation. Natural Language Semantics 1: 75–
116.
Sandström, Görel
1993 When-clauses and the temporal interpretation of narrative discourse.
Ph.D. dissertation, Department of General Linguistics, University of
Umeå.
Schiffrin, Deborah
1986 Functions of ‘and’ in discourse. Journal of Pragmatics 10: 41–66.
Solfjeld, Kåre
2000 Sententialität, Nominalität und Übersetzung. Eine empirische Unter-
suchung deutscher Sachprosatexte und ihrer norwegischen Überset-
zungen. Frankfurt M.: Peter Lang.
Solfjeld, Kåre
2004 Informationsspaltung nach links in Sachprosaübersetzungen Deutsch-
Norwegisch. In: Eva Lambertsson Björk and Sverre Vesterhus (eds.),
Kommunikasjon, 111–130. Halden: Høgskolen i Østfold.
Stede, Manfred
2008 RST revisited: Disentangling nuclearity. In: Cathrine Fabricius-Han-
sen and Wiebke Ramm (eds.), ‘Subordination’ versus ‘Coordination’
in Sentence and Text. A Cross-linguistic Perspective, (Studies in Lan-
guage Companion Series 98.) Amsterdam/New York: John Benja-
mins.
Talmy, Leonard
2000 Toward a Cognitive Semantics. Volume I: Concept Structuring Sys-
tems. Volume II: Typology and Process in Concept Structuring. Cam-
bridge, MA: The MIT Press.
Teich, Elke
2003 Cross-Linguistic Variation in System and Text. A Methodology for the
Investigation of Translations and Comparable Texts. (Text, Trans-
lation, Computational Processing 5.) Berlin/New York: Mouton de
Gruyter.
Torp, Arne and Lars S. Vikør
2000 Hovuddrag i norsk språkhistorie. Oslo: Gyldendal.
Txurruka, Isabel G.
2000 The semantics of ‘and’ in discourse. Technical Report ILCLI-00-LIC-
9, University of the Basque Country.
Vallduví, Enric and Elisabet Engdahl
1996 The cross-linguistic realization of information packaging. Linguistics
34: 459–519.
Vieu, Laure and Laurent Prévot
2004 Background in Segmented Discourse Representation Theory. In:
Workshop Segmented Discourse Representation Theory, 11th Con-
ference on Natural Language Processing (TALN), 485–494.
Webber, Bonnie, Alistair Knott, Matthew Stone and Aravind Joshi
1999 Discourse relations: A structural and presuppositional account using
lexicalised TAG. Paper presented at 1999 Meeting of the Association
for Computational Linguistics, College Park MD.
Webber, Bonnie, Alistair Knott, Matthew Stone and Aravind Joshi
2003 Anaphora and discourse structure. Computational Linguistics 29:
545–587.
Rhetorical relations and verb placement
in Old High German

Roland Hinterhölzl and Svetlana Petrova

1. Introduction

The present paper approaches the issue of salience in discourse from the
perspective of historical linguistics and the theory of language change. In
particular, we are interested in discerning and describing linguistic phenomena which
are formal correlates of salience and related notions in the system of Old High
German (henceforth OHG). More specifically, we want to find out how
the expression of features related to salience influences the development of
novel forms and patterns in the history of German.
According to the common definition employed in this volume, salience re-
flects “the degree of relative prominence of a unit of information, at a specific
point in time, compared to other units of information” (Introduction, p. 2ff.).
A variety of linguistic factors which determine the referent’s current degree of
salience have been discussed in the literature, foremost cognitive status (given
vs. new), grammatical role (subject vs. non-subject) and animacy (animate vs.
non-animate). It is also claimed that there is a special matching relation between
the referent’s current degree of salience and the form of the linguistic expression
used to refer to it (Gundel et al. 1993), also called ‘referential choice’
(Krasavina, this volume). At the same time, languages employ special strategies to mark
shifts in the degree of salience with respect to the preceding context, e.g. when
a referent with a lower degree of salience is promoted to a higher degree of
prominence at a particular stage of the discourse, a process also called ‘salience
promotion’ (see also Filchenko, Chiarcos, both this volume). Addressing the issue of
referential choice and the form of anaphoric expressions in OHG, Petrova and
Solf (2010) have argued that salience promotion, a main principle governing
the use of demonstratives vs. personal pronouns in modern German (see Bosch
et al. 2003, Bosch and Umbach 2007), already applied at the earliest stages
of the language.
Yet the use of anaphoric expressions is only one domain in which salience-
related features find a formal expression in the system of OHG. In the following
contribution, we will argue that pragmatic factors related to salience and discourse
coherence are formally realized in syntax as well, more precisely in
the structure of the left periphery of main clauses in OHG. In particular, we
will focus on the principles determining the position of the finite verb in the
sentence. In this respect, the notion of salience and its realization are crucial
for the explanation of structural variation in the left periphery of main clauses
in OHG.
On the basis of evidence from the OHG Tatian, a major representative of the
OHG corpus (see section 2 below), we distinguish verb-initial (V1) and verb-
second (V2) as the two basic word order patterns at this particular stage of the
development of German. In approaching the principles governing the distribution
and functional properties of these patterns, we first draw attention to
the correlation between salience and syntactic position in the clause. Following
initial observations outlined in Hinterhölzl et al. (2005), we show that the po-
sitional realization of referring expressions in OHG is sensitive to the degree
of salience of the particular referent in the sense of givenness and accessibil-
ity in the discourse. So expressions referring to salient, i.e. pre-mentioned or
situationally inferable, referents are realized in clause-initial position followed
immediately by the finite verb, which results in V2 structures on the surface. In
contrast, non-salient, i.e. discourse-new, referents are placed postverbally, yielding
V1 on the surface. From this, we conclude that V2 is used as a means
of marking prominence on the constituent placed in clause-initial position and
separated from the rest of the utterance by the finite verb.
However, this correlation can be overridden by discourse-structural factors,
as is evidenced by the occurrence of V1 orders with given discourse referents.
In some of the cases, the factors leading to V1 clearly pertain to discourse
organization proper, i.e. they mark the beginning of a new chapter or episode
in the structure of the text. With Grüning and Kibrik (2005), we can assume
that referential distance across episode/paragraph boundaries lowers the salience
status of the antecedent, which results in postverbal realization of the
referring expression. In this case, a process of ‘salience demotion’ takes place
(see also Filchenko, this volume). But in other cases, V1 with given referents
occurs within one and the same episode. In these cases, however, the sentence
conveys an especially important event or state which is crucial to the further
development of the discourse. In attempting to provide a unified account for
all cases of V1, we invoke the distinction between coordination vs. subordina-
tion in discourse as outlined in the Segmented Discourse Representation Theory
(SDRT, Asher and Lascarides 2003; see also Ramm, this volume). We analyze
the instances of V1 and V2 from the perspective of the features viewed as con-
stitutive for the definition of two basic types of rhetorical relations in discourse.
As a result, we relate V2 to subordination, while all types of V1 are attributed to
the realization of coordination in discourse. We conclude that word order and
especially verb placement in OHG contribute to the realization of a dynamic,
multi-layered discourse structure and are therefore best described as a formal
correlate of text coherence and discourse relations in the system of OHG.
The implications of this study are twofold. For language theory, it outlines
the interaction between the word order of constituents and their rhetorical and
discourse-functional contribution in the text. For historical linguistics, it pro-
poses an alternative approach to the research on word order variation and the
development of V2 in the Germanic languages which sheds new light on these
issues.

2. Philological issues and empirical data base

The OHG corpus comprises texts of different length, genre, and quality of trans-
mission composed in the time between around 750 and 1050. Of course not all of
them are equally appropriate for syntactic research (cf. Fleischer 2006). One of
the largest prose texts from the beginning of the OHG period is the Tatian text,
a gospel harmony translated from Latin and written down in the scriptorium of
Fulda by at least 6 scribes. This text has been deliberately chosen for the purpose
of the present investigation. Although having been considered for a long time a
slavish word-for-word translation of the Latin original and therefore unsuitable
for any investigation on word order, this text has been rediscovered as a good
basis for research due to novel insights into the main principle of translation
applied in it. In the manuscript, as Figure 1 of the Appendix shows, the Latin
source and the OHG translation are attested as two juxtaposed columns. Only
recently has it been observed that each line in the OHG text translates exactly
the same material found in the corresponding Latin line; departures from this
basic principle are extremely rare within the whole text. A new diplomatic edi-
tion made available by Masser (1994) reflects these major characteristics and
makes it possible to compare the source and target text, cf. Figure 2 of the Ap-
pendix. The translating technique applied in the Tatian text certainly imposes
restrictions on the possibility of rendering genuine word order patterns in the
translation (cf. Masser 1997 a and b), while the deviations from the Latin source
can be viewed as evidence for genuine OHG structures (cf. Dittmer and Dittmer
1998; Fleischer, Hinterhölzl and Solf 2008).
Therefore, we base our study on deviating examples from the Tatian text ex-
clusively. The corpus of the study comprises the complete sample of deviations
in constituent order found in the text portions of three scribes, a total of 1,658
clausal structures. These examples were fed into a corpus and annotated for
various morpho-syntactic and information-structural features by project B4 of
Collaborative Research Center (SFB) 632 “Information Structure” at Humboldt
University Berlin. The corpus is searchable via the ANNIS database (Chiarcos
et al., 2008; Zeldes et al., 2009) developed by project D1 of SFB 632 (Uni-
versity of Potsdam, Humboldt University Berlin). For more details concerning
the design of this corpus and the use of the ANNIS database, see Petrova et al.
(2009).

3. The point of departure

3.1. Distribution of patterns and aim of the study


Some of the most puzzling questions in the diachronic syntax of the Germanic
languages in general, and of German in particular, concern the principles de-
termining the placement of the finite verb in the earliest records as well as the
subsequent establishment of the word order regularities in the modern systems
of these languages. To illustrate the degree of word order variation in early Ger-
manic, we provide some examples from one of the earliest OHG records, the
Isidor translation, dated to around 800. Here, the finite verb may
occur in any position in a main declarative clause, for example in initial posi-
tion (1), in second position (2), or in a later position, following more than two
and sometimes all of the remaining constituents of the clause (3). Note that all
sentences deviate in word order from the corresponding Latin original:1
(1) Quhad got, see miin chnecht (V1)
spoke God behold my child
‘God spoke: “Behold my child”’ (I 330)
Latin Ecce, inquit, puer meum
(2) Ih faru dhir fora (V2)
I go you-dat. before
‘I’ll go before you’ (I 156)
Latin Ego ante te ibo

1. The examples from the Isidor [I] text are cited by line number according to the edition
of Eggers (1964). The examples from the Tatian [T] text are cited by manuscript
page and line number according to Masser (1994). A slash in the Tatian examples
represents the end of line according to the manuscript. The inflected verb in both
OHG and Latin is underlined for clarity throughout the paper.
(3) Dher selbo forasago auh in andreru stedi chundida (Vend)
the same prophet also in another place announced
‘The same prophet announced in another place too’ (I 348)
Latin […] alias […] testatur idem propheta
Table 1 provides the absolute number of word order patterns in main declarative
clauses formed against the Latin original in Isidor. This overview shows that
patterns like those in (2) and (3) appear with considerable frequency in the document,
while V1 is found only rarely in clauses formed against the Latin word order.2
Table 1. Frequency of word order patterns in main declarative clauses in Isidor formed
against the Latin original

type of pattern                       V1    V2    Vlate/end
number of occurrences in Isidor        6    74    45
(against the Latin structure)

Exploring the frequency of these word order types in the Tatian database de-
scribed in section 2 above, we discover a rather different picture. Here, mainly
V1 and V2 occur in considerable numbers against the structure of the Latin
original.3 Patterns in which the verb occurs in a position later than the second
one, as in (4), are formed against the original only rarely, and cases with the
verb at the absolute end of the sentence as in (5) are mere exceptions:4

2. Here, we only briefly refer to some previous accounts on some of these patterns in
Isidor. First, we do not subscribe to the view expressed by Robinson (1994) who
claims that V1 represents a foreign pattern used exclusively in the translation of the
biblical citations rather than of the commentary parts of the treatise in order to signal
foreign speech. Rather, we regard V1 as a common Germanic pattern which abounds
both in the remaining texts of the OHG tradition as well as in all other early Germanic
languages, i.e. in Old English, Old Saxon and Old Norse. Second, with respect to
Vlate/Vend, we reject the view of Tomaselli (1995), who reduces such examples to cases
involving pronominal or other prosodically light constituents which she explains as
clitics attached to the left of the verb after a full constituent in initial position. As our
example in (3) shows, Vlate/Vend in main clauses in Isidor also appears in sentences
with full constituents before the finite verb.
3. Note that the cases of V1 included in these statistics do not comprise elliptic non-initial
conjuncts sharing the subject of the preceding clause and therefore showing surface
V1-order.
4. In this example, the synthetic passive of the Latin original is represented by an ana-
lytic construction involving the finite form of the auxiliary sîn ‘be’ + Past Participle.
As the semantics of the Latin main verb is reflected in the OHG participle, the finite
(4) thanan tho zacharias uuard gitruobit
then then Zacharias became troubled
‘Then, Zacharias was troubled’ (T 26, 20)
Latin & zacharias turbatus est
(5) min tohter/ ubilo fon themo tiuuale giuuegit ist
my daughter badly by the devil.DAT tortured is
‘My daughter is badly tortured by the devil’ (T 129, 10–11)
Latin filia mea/ male a demonio uexatur

Table 2. Frequency of word order patterns in main declarative clauses in Tatian formed
against the Latin original

type of pattern                       V1    V2    Vlate/end
number of occurrences in Tatian       96    382   11
(against the Latin structure)

From this we can conclude that a process towards stricter verb fronting in main
declarative clauses and a considerable reduction of the Vlate/end pattern has
taken place already within the OHG period. One question arises from this ob-
servation, namely whether the distribution of the main competing patterns, V1
and V2, obeys certain rules in the system that emerges in the Tatian, and if so,
what kind of principle may be held responsible for the choice of one pattern
over the other. This question will be addressed in the following section.
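The shift between the two texts becomes easier to see when the absolute counts of Tables 1 and 2 are converted into relative frequencies. The following is a minimal sketch in Python; the counts are those given in the two tables, whereas the names COUNTS and relative_frequencies are hypothetical and chosen for illustration only.

```python
# Absolute counts of main declarative clauses formed against the Latin
# original, as given in Table 1 (Isidor) and Table 2 (Tatian).
COUNTS = {
    "Isidor": {"V1": 6, "V2": 74, "Vlate/end": 45},
    "Tatian": {"V1": 96, "V2": 382, "Vlate/end": 11},
}


def relative_frequencies(counts):
    """Return each pattern's share (in percent) of the clauses counted."""
    total = sum(counts.values())
    return {pattern: 100 * n / total for pattern, n in counts.items()}


if __name__ == "__main__":
    for text, counts in COUNTS.items():
        shares = relative_frequencies(counts)
        summary = ", ".join(f"{p}: {s:.1f}%" for p, s in shares.items())
        print(f"{text} (n = {sum(counts.values())}): {summary}")
```

On these figures, Vlate/end accounts for 36.0% of the counted clauses in Isidor (45 of 125) but only about 2.2% in Tatian (11 of 489), while the share of V1 is 4.8% in Isidor and about 19.6% in Tatian, and that of V2 is 59.2% and about 78.1% respectively. These percentages are derived arithmetically from the tables and are not reported in this form in the study itself.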

3.2. Previous accounts


In the most recent investigation on the structure of the sentence left periphery
in OHG, Axel (2007) claims that the verb-second property typical for modern
German has already developed at this early stage of the language. In the gener-
ative framework which Axel adopts, a constitutive feature of the verb-second
rule is that the inflected verb obligatorily moves to the head C⁰ of the maximal
projection CP. Additionally, in main clauses, the specifier position of CP
is filled either by i) movement of a phrase bearing one of the operator features
+topic/+focus/+wh (operator movement), or ii) movement of a phrase that oc-
cupies the highest position in the middlefield of the corresponding structure
(stylistic fronting, cf. Fanselow, 2003). If none of these movement operations
applies, a non-referential expletive es is merged in Spec,CP.

auxiliary has to be regarded as an additional constituent not present in the original.
Therefore, its placement in the OHG part is a matter of free choice.
Turning to OHG, Axel shows that both operator movement and stylistic
fronting occur, while the third option, the placement of a base-generated
expletive in Spec,CP, has not emerged yet. As a consequence, sentences in which
neither operator movement nor stylistic fronting can apply remain as V1 (analyzed
as the verb moving to C⁰ with Spec,CP remaining empty). This implies
that the rule of V2 was not fully grammaticalized yet in OHG.
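Axel's account, as summarized above, can be read as a small decision procedure for the filling of Spec,CP. The Python sketch below is only an illustration under simplifying assumptions: the function name fill_spec_cp, its boolean parameters and the returned labels are hypothetical and do not stem from Axel (2007), and the only difference modelled between modern German and OHG is the availability of the expletive.

```python
def fill_spec_cp(has_operator_phrase: bool,
                 can_stylistically_front: bool,
                 expletive_available: bool) -> str:
    """Return the surface pattern predicted once the finite verb is in C.

    has_operator_phrase:     some phrase bears +topic, +focus or +wh
    can_stylistically_front: the highest middle-field phrase can be fronted
    expletive_available:     a base-generated expletive can be merged
                             (assumed True for modern German, False for OHG)
    """
    if has_operator_phrase:
        return "V2 (operator movement)"
    if can_stylistically_front:
        return "V2 (stylistic fronting)"
    if expletive_available:
        return "V2 (expletive in Spec,CP)"
    # No option applies: Spec,CP stays empty and the verb is clause-initial.
    return "V1 (Spec,CP empty)"


# A clause with no frontable phrase: V2 with an expletive in the modern
# system, but V1 in a system in which the expletive has not yet emerged.
print(fill_spec_cp(False, False, expletive_available=True))
print(fill_spec_cp(False, False, expletive_available=False))
```

The sketch makes visible why, on this account, V1 is a residue pattern: it arises only where both movement options fail and no expletive is available.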
But what is then constitutive of the word order in OHG? To explain why
Spec,CP remains empty in OHG, Axel refers to the fact that in most of the cases
of V1, the sentence contains the adverbial tho ‘then’ in postverbal position tak-
ing the function of a narrative-expressive particle indicating sentence type just
like other particles, e.g. the interrogative particle inu/eno, the affirmative particle
ia or the imperative particle nu etc. Once sentence type has been indicated by
the particle, the application of stylistic fronting is unnecessary, leaving Spec,CP
empty in the corresponding cases.
Expressivity as a factor leading to V1 in early Germanic is known from a
number of previous works on the matter. In his very influential study, Fourquet
(1974) has put forward the idea that verb fronting in early Germanic is used to
highlight the entire contents of a sentence. Much earlier, Ries (1880, 19) had
observed for Old Saxon that V1 occurs in sentences reporting an outstandingly
important event or property. As for Old English, van Kemenade (1987, 44) re-
ports that in the Anglo-Saxon Chronicle, V1 is especially characteristic of one
particular section which is “famous for its lively narrative style”.
But expressivity and stylistic vividness are rather vague terms when it comes
to differentiating the domains in which the two main patterns in declaratives
in OHG apply. All accounts mentioned before shift the attention to the broad
field of pragmatics as the source of additional factors influencing word order
in early Germanic. In this respect, they are representative of a long tradition of
research whose attempts at explaining this issue should be reconsidered from
the perspective of modern linguistic theory. We therefore want to analyze
more thoroughly the functional domains in which the two main patterns of
OHG main-clause syntax occur in order to be able to isolate operational features
associated with each of them in OHG.

4. Information structure and word order in OHG

Hinterhölzl et al. (2005) launch a large-scale investigation on the sensitivity
of word order in OHG to factors pertaining to information structure. In line
with the account proposed by Molnár (1993) and Krifka (2007) among others,
information structure is understood as a complex linguistic phenomenon comprising
functional distinctions of categories on the following three layers: i) the
informational status of referents (theme vs. rheme or given vs. new); ii) the
predicational structure of the utterance (topic vs. comment); and iii) the com-
municative weight or relevance of sentence constituents (focus vs. background).
These layers of information structure are assumed to function independently in
the language but to interact with each other in yielding the full picture of the
information-structural shape of an utterance.
In a first step, Hinterhölzl et al. (2005) investigate the relationship between
the informational status of discourse referents and their positional realization
with respect to the finite verb in the clause. The notion of ‘discourse referents’
is understood in the sense of Karttunen (1976) who applies this term to indi-
viduals (persons, events, facts) that can be referred back to in a coherent dis-
course by coreferential definite expressions, i.e. pronouns or full noun phrases.
The identification of the informational status of discourse referents is based on
taxonomies proposed by Prince (1981) and Dik (1989) who argue for a more
fine-grained system in which ‘given’, i.e. explicitly pre-mentioned material, and
‘new’, i.e. novel, non-inferable information represent the two endpoints of a
scale including different sub-types of textually or situationally accessible enti-
ties in between.
The investigation of a possible correlation between verb placement and dis-
course status of constituents in instances of the OHG Tatian text reveals two
striking tendencies. On the one hand, there is a regular preference for V1 in
presentational sentences which introduce new referents to the context. This is
shown in (6) through (8). It can be observed that V1 in OHG is the constant
pattern corresponding to a variety of different orders in the Latin original:
(6) [The forty-days’ old Infant is presented to the Lord in the temple in
Jerusalem and blessed there by Simeon. After that, the holy family meets
the prophetess Anna.]
uuas thô thâr anna uuizzaga
was then there Ann prophetess
‘There lived there at that time the prophetess Anna’ (T 38, 22)
Latin & erat anna proph&issa
(7) [in the Nativity of Christ]
uuarun thô hirta In thero lantskeffi
were then shepherds in this region
‘There were shepherds in the same country’ (T 35, 29)
Latin Et pastores erant In regione eadem
(8) [Jesus tells a parable about an unjust judge who was asked by a widow
to avenge her against her adversary]
uuas thar ouh sum uuitua/ In thero burgi
was there also certain widow in this town
‘There was a widow too in that city’ (T 201, 2)
Latin Vidua autem quaedam erat/ In ciuitate illa

On the other hand, sentences maintaining an already introduced discourse referent
as in (9) or involving a referent considered accessible via a bridging relation
to an already established entity as in (10) show a regular tendency for V2 against
the underlying word order of the Latin original. In other words, V2 appears to
be bound to referents that are already salient in the discourse:

(9) [Jesus compares himself with a shepherd. ih bin guot hirti = ‘I am a
good shepherd’]
guot hirti/ tuot sina sela furi siniu scaph
good shepherd does his soul for his sheep
‘The good shepherd gives his soul for his sheep’ (T 225, 16–17)
Latin bonus pastor/ animam suam dat pro ouibus suis
(10) [The previous sentence introduces Zacharias who is married to one of
the daughters of Aaron]
Inti ira namo uuas elisab&h
and her name was Elizabeth
‘and her name was Elizabeth’ (T 26,2)
Latin & nomen eius elisab&h

The text also provides numerous examples of ‘minimal pairs’ where the initial
placement of the verb in a first sentence introducing new discourse referents
is immediately abandoned in favor of a V2 clause in the following utterance, which
makes a statement about the referents just established. Consider the following
small discourse:

(11) [the beginning of the story about the Nativity of John the Baptist]
a. uuas In tagun herodes […]/ sumer biscof […]/
was in days Herod.GEN some bishop
Inti quena Imo
and wife him.DAT
b. siu uuarun rehtiu beida fora gote
they were righteous both before God.DAT
‘In the days of Herod […], there was a certain priest […] and his
wife […]. They were both righteous before God’ (T 26, 3)
Latin a. Fuit in diebus herodis regis/ […] quidam sacerdos/ […]/
& uxor illi […]/ b. erant autem iusti ambo ante deum
This evidence provides significant points in favor of the interdependence be-
tween verb placement and information structure in OHG. It shows that new
referents follow the verb, while referents already salient in the context precede
it. What kind of generalization can we draw from these observations?
Looking at the data from the perspective of the model developed by Sasse
(1995), we discover that the sentences we are dealing with are typical represen-
tatives of the thetic vs. categorical type of judgments. By definition, categorical
sentences have a bipartite structure divided into a predication base, or topic of
the sentence and a comment on this topic. This is the case in (9), (10), and (11b).
Here, the finite verb separates from the rest of the utterance exactly that con-
stituent which provides the sentence topic (both in line with the familiarity as
well as the aboutness concept, for a discussion see Frey 2000, 137–138). By
contrast, the presentational sentences in (6) through (8) and in (11a) are typical
instances of the thetic type of judgments. The most significant feature of such
instances is that they represent “monominal predications” (Sasse 1995, 4) in
which no particular constituent is taken as the predication base of the utterance;
rather, the entire sentence, including all participants, is asserted as a unitary
whole.5
Therefore, we can conclude that the position of the finite verb in OHG is
firmly related to the realization of the topic-comment structure in a sentence.
As a rule, the finite verb separates the topic expression from the comment of the
utterance. In most cases, this position is occupied by an expression referring
to the most salient referent in the context, which is either previously mentioned
or situationally accessible at that particular point of the discourse. Remarkably,
novel referents serving as the predication base of a categorical utterance also share
the positional properties of canonical (i.e. salient) topics in OHG. Consider the
bare plural fohún ‘foxes’ in (12), which is not previously established in the context
but is nevertheless placed in preverbal position. The sentence receives an
interpretation according to which it makes a statement about a set of individuals
of the denoted kind. Thus, the kind-referring bare plural fohún is the aboutness
topic of the utterance:6

5. Drubig (1992) and Lambrecht (1994, 137–146) also argue that in thetic utterances
no topic-comment division applies.
6. In this respect, we follow Endriss and Hinterwimmer (2007) who argue that given-
ness is not necessary for topicality. They argue that novel constituents may provide
(12) [a chain of coordinate conjuncts claims that every creature has a home
to stay over night except the Son of the Lord]
fohún habent loh
foxes have holes
‘The foxes have holes’ (T 85, 25)
Latin vulpes foueas habent
In case no topic-comment distinction applies, the verb moves to the position in
front of all arguments to indicate that none of them functions as the sentence
topic and that the entire proposition has to be interpreted as wide (sentence)
focus.
These observations are summarized in (13):
(13) a. [V_fin … DR_new …]_FOCUS (V1)
     b. [DR]_TOP [V_fin …]_COMMENT (V2)
Lenerz (1984, 151–153) and Ramers (2005, 81) also observe that V1 is typical
for presentational sentences in OHG. They conclude that V1 in OHG is used
when the sentence conveys discourse-new, or rhematic material only. Looking
at the examples above, we nevertheless discover that new information is estab-
lished only in the subject expressions, while the remaining part of the sentence
is given; see e.g. the adverbials In thero lantskeffi ‘in this country’ in (7), or
In thero burgi ‘in this town’ in (8). From this perspective, the notion that V1
occurs in all-new sentences cannot be maintained. Rather, verb fronting signals
that none of the constituents provided in the sentence takes over the function of
the sentence topic because no topic-comment division applies in these utterances.
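The generalization in (13) can be restated procedurally: if a constituent serves as the sentence topic, it precedes the finite verb (V2); if no topic-comment division applies, the verb precedes all constituents (V1). The following Python sketch illustrates this under strong simplifications. The class Constituent, the flag is_topic and the function linearize are hypothetical names introduced for illustration, and the topic flag has to be supplied by hand rather than being derived from the context.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Constituent:
    form: str
    is_topic: bool = False  # True if it is the predication base of the clause


def linearize(finite_verb: str,
              constituents: List[Constituent]) -> Tuple[str, List[str]]:
    """Order a main clause according to the schema in (13)."""
    topic = next((c for c in constituents if c.is_topic), None)
    if topic is None:
        # (13a): no topic-comment division, the verb precedes everything
        return "V1", [finite_verb] + [c.form for c in constituents]
    # (13b): the finite verb separates the topic from the comment
    rest = [c.form for c in constituents if c is not topic]
    return "V2", [topic.form, finite_verb] + rest


# Example (12): the bare plural serves as aboutness topic
print(linearize("habent", [Constituent("fohún", is_topic=True),
                           Constituent("loh")]))
# Example (6): presentational sentence without a sentence topic
print(linearize("uuas", [Constituent("thô"), Constituent("thâr"),
                         Constituent("anna uuizzaga")]))
```

Note that the sketch deliberately keys verb placement to topichood rather than to givenness: a constituent may be given, as the adverbials in (7) and (8) are, without being flagged as the topic, in which case the clause still comes out as V1.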

5. Discourse structure and the distribution of word order patterns in OHG

5.1. Evidence for discourse relations


On a closer look, it turns out that V1 is frequent in sentences with given argu-
ments as well. Consider the subjects in (14) and (15):

the aboutness topic of an utterance if the utterance allows for a topic-comment divi-
sion in which the respective constituent takes the role of the subject of the predica-
tion.
(14) [A Pharisee invites Jesus to dine in his house. Jesus enters the house
and sits down to eat. The Pharisee realizes that Jesus has not washed
his hands before dinner and criticizes him on that occasion]
bigonda ther phariseus […] quedan
began this Pharisee speak.INF
‘The Pharisee began to speak’ (T 126, 5–6)
Latin Phariseus autem coepit […] dicere
(15) [Jesus starts telling a parable on whether it is allowed to heal on Sab-
baths]
Quad her tho zi then giladoten/ ratissa
spoke he then to the invited parable
‘Then he told to the guests a parable’ (T 180, 9–10)
Latin Dicebat autem & ad Inuitatos/parabolam

The full definite expression ther phariseus ‘the Pharisee’ in (14) as well as the
personal pronoun her ‘he’ in (15) refer to entities already introduced in the
previous discourse. But although they display pragmatic properties of sentence
topics like givenness/accessibility, definiteness and referentiality, they fail to
occupy the topic position established in (13b) above.
To explain these data, we need to find a common basis to account for the
postverbal placement of both given and new referents in OHG. In our opinion,
this may be achieved if one broadens the account on information packaging
beyond the scope of the informational status of individual discourse referents
in the sentence and takes into consideration the discourse-functional role of the
utterance in the narrative structure of the text.

5.2. Basic notions of discourse analysis


We shall briefly outline some basic notions and distinctions in current research
on discourse structure in order to show in our analysis that important discourse-
related features of utterances correlate with the two main word order patterns
in OHG, thus allowing the conclusion that variation in verb placement in OHG
is pragmatically driven.
A particular model relating word order in early Germanic to discourse or-
ganization is proposed by Hopper (1979a and b) who distinguishes between
the part of main action, i.e. foregrounding, and the part of supportive information,
i.e. backgrounding, in text structure. Hopper identifies some distinctive
features associated with these notions. Typically, foregrounding is conveyed by
dynamic, perfective verb meanings providing temporal progression on the level
of main action. By contrast, backgrounding establishes temporal relations of simultaneity
to main actions induced by the durative semantics of the predicates
involved. In this way, Hopper establishes a relation between discourse structure
and aspectuality of the verb in the sentence, a feature which shall turn out to be
important in our interpretation of the examples as well.
Moreover, in his survey of formal realizations of foregrounding and back-
grounding in a variety of non-related languages, Hopper comes across a funda-
mental matching relation between word order, especially verb placement, and
discourse structure in the text of the Old English Anglo-Saxon (Parker) Chron-
icle as a representative of the early Germanic tradition. He observes that back-
grounding parts employ SVO order, i.e. medial verb placement, whereas fore-
grounding parts generally display peripheral verb placement, either verb-final
or verb-initial. The distribution of the latter two patterns is said to be a matter
of further “discourse considerations” (cf. Hopper 1979b, 221): verb-initial is
viewed to occur in introductory parts, that is, at the beginning of new episodes,
whereas verb-final is bound to episode-internal sentences.
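Hopper's observations on the Chronicle, as summarized above, amount to a small decision rule. The following Python sketch merely restates that rule; the function name and the string labels are hypothetical and chosen for illustration only, and the rule captures reported tendencies rather than exceptionless laws.

```python
def hopper_verb_placement(grounding: str, episode_initial: bool = False) -> str:
    """Return the verb placement reported as typical by Hopper (1979b).

    grounding: "foreground" or "background" (labels assumed for this sketch).
    episode_initial: whether the clause opens a new episode.
    """
    if grounding == "background":
        # Backgrounding parts employ SVO order, i.e. medial verb placement.
        return "verb-medial"
    if grounding == "foreground":
        # Foregrounding parts display peripheral verb placement:
        # verb-initial at episode onsets, verb-final episode-internally.
        return "verb-initial" if episode_initial else "verb-final"
    raise ValueError(f"unknown grounding: {grounding!r}")
```

For instance, `hopper_verb_placement("foreground", episode_initial=True)` yields `"verb-initial"`, the pattern Hopper associates with the beginning of new episodes.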
Recent approaches to discourse semantics also take into consideration the
temporal relation between clauses as a major device for text organization and
coherence (see Claus, this volume, for the role of discourse participants in im-
posing a temporal structure on the narrated world). Two approaches that we
will take into account are the Rhetorical Structure Theory (RST) by Mann and
Thompson (1988) and the Segmented Discourse Representation Theory (SDRT)
by Asher and Lascarides (2003). A basic assumption in both of them is that dis-
course coherence is achieved only if each utterance makes an illocutionary con-
tribution to another utterance in the context. This is achieved when discourse
units establish different kinds of rhetorical relations among each other, thus cre-
ating a dynamic hierarchical structure in discourse.
According to RST and SDRT, the rhetorical relations linking together the
contents of single discourse units can be of the following two kinds:
– two units can display no dependency relation among each other but share
the same level of discourse hierarchy thus creating a multi-nuclear relation
in the terms of RST or a relation of coordination in the terms of SDRT
– two units can build a dependency relation creating a hierarchical structure
in discourse, i.e. a nucleus-satellite relation in RST or a relation of subordi-
nation in SDRT.
In order to show how verb placement participates in achieving discourse hierar-
chy in texts of the early Germanic tradition, we chose the model of SDRT. Al-
though the inventory of individual discourse relations is still under discussion,
there is overwhelming agreement on the basic features distinguishing coordination
vs. subordination as the two basic types of linking. Both are associated with
a particular prototypical rhetorical relation displaying some well-defined,
complementary features (Asher and Vieu 2005). Subordination is typically
represented in elaboration, i.e. when a unit β provides more detail on another
unit α situated on a higher level of discourse structure. In this case, the two
events (α, β) temporally overlap. Further, the rhetorical relation of
continuation applies if two or more subsequent units β and γ are equally
situated on a lower level of dependency with respect to a higher unit α such
that both β and γ elaborate on α. By contrast, coordination, which holds between
units situated on the same level of discourse hierarchy, is typically
represented in the relation of narration. Narration is established e.g. if two
discourse units (α, β) display a temporal relation of succession and β continues
the narrative sequence in discourse.
Looking at the distinctive features of coordination vs. subordination in
SDRT, we discover a number of parallels between them and the discourse prop-
erties of the word order patterns discussed in the foregoing data analysis. These
will be discussed in turn in the following two sections.
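The two prototypical relations described above can be summarized as a small data structure. This is a minimal Python sketch, not an implementation of SDRT: the class name, field names and string labels are assumptions made for illustration, and only the two prototypes named in the text (elaboration and narration) are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RhetoricalRelation:
    name: str      # label of the relation
    linking: str   # "subordination" or "coordination"
    temporal: str  # temporal relation between the two events


# Prototypical relations as characterized in the passage above.
ELABORATION = RhetoricalRelation("elaboration", "subordination", "overlap")
NARRATION = RhetoricalRelation("narration", "coordination", "succession")


def prototype_for(temporal: str) -> RhetoricalRelation:
    """Return the prototypical relation associated with a temporal relation."""
    for relation in (ELABORATION, NARRATION):
        if relation.temporal == temporal:
            return relation
    raise ValueError(f"no prototypical relation for {temporal!r}")
```

Under these assumptions, `prototype_for("overlap").linking` gives `"subordination"` and `prototype_for("succession").linking` gives `"coordination"`, which are the two features compared with the word order patterns in the following sections.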

5.3. V2 as a means of subordination in discourse


From the perspective of the discourse relations distinguished above, the in-
stances of V2 in (9), (10) and (11b) immediately evoke parallels to the sub-
ordinative type of linking. Consider also the following small discourse:7
(16) [Jesus and his disciples approach the gates of a city called Nain and
witness the following scene]
a. senu arstorbaner/ uúas gitragan einag sun sinero
behold dead man was carried only son his.GEN
muoter
mother
b. Inti thiu uuas uuituuua
and she was widow
‘behold, a dead man was being carried out, the only son of his
mother and she was a widow’ (T 84, 22–24)
Latin a. ecce defunctus/ efferebatur. filius unicus/ matris suae.
b. & haec uidua erat.

7. Unfortunately, any significant reordering of constituents in the first sentence of this
small discourse is impossible for reasons of the line-for-line principle of translation
outlined in section 2 above. Therefore, the placement of the indefinite subject
expression arstorbaner ‘a dead man’ introducing a new referent does not illustrate the
distributional properties of such constituents outlined in this study.

In (16b), the finite verb is shifted from the sentence final position in the Latin
source to the position between the topic and the comment in the OHG trans-
lation. Clearly, the sentence in (16b) provides additional information on the
discourse referent muoter ‘mother’ introduced by the preceding sentence. With
respect to the temporal relation of the two sentences, we can observe that the
event in (16b) overlaps with the event in (16a). Taken together, all these features
favor the identification of elaboration between (16b) and (16a) as the prototype
of the subordinating kind of linking in discourse.
In other parts of the text, we discover chains of utterances equally depending
on a higher unit in discourse structure. Consider (17b–e) which assign different
properties to the referent scrîbera ‘the scribes’ introduced in the opening sen-
tence (17a). V2 is established by the regular insertion of the pronominal subject
referring to the topic referent of the entire text portion (topic continuity):

(17) a. obar stuol/ moyses sâzzun scrîbera/ Inti
over seat Mose.GEN sat scribes and
pharisej […]
Pharisees
b. sie quedent/ Inti nituont
they say and NEG.PRT.do
c. sie bintent suuara burdin […]/
they bind heavy burdens
d. sie breitent Iro ruomgiscrib/ […]
they make broad their phylacteries
e. sie minnont furista sedal
they love first seats
‘in Moses’ seat sit the scribes and the Pharisees. They say and
they do not do, they bind heavy burdens, they make their phylacteries
broad, they love the best places at feasts’ (T 242, 18–243, 5)
Latin a. super cathedram/ moysi sederunt scribe/ & pharisej. […]
b. dicunt enim/ et non faciunt. c. Alligant autem onera grauia […]
d. dilatant enim philacteria sua/ […]/ e. Amant enim primos re-
cubitos

We interpret instances like these as cases of continuation, i.e. as a series of
utterances serving to elaborate on the same unit situated higher in the discourse.
In other cases, a discourse unit provides additional, explanatory information
with respect to a previous event. Consider (18b) which provides a motivation
for the proposition denoted in the previous utterance (18a):

(18) [an angel prophesies to Zacharias the near birth of his son, John the
Baptist, and explains that he will be a person of special qualities]
a. Inti manage in sineru giburti mendent
and many in his birth have joy
b. her ist uuârlihho mihhil fora druhtine
he is truly great before God.DAT
‘And many people will rejoice at his birth. For he will be great in
the eyes of the Lord’ (T 26, 29–30)
Latin & multi in natiuitate eius gaudebunt/ erit enim magnus coram
domino

To conclude, we relate the distribution of V2 in OHG to sentences establishing
relations of subordination in discourse. First, V2 appears in sentences
assigning properties to individuals or explaining the circumstances of events or
actions established in previous discourse units. Second, the events provided by
V2 sentences temporally overlap with those of the discourse units on which
they elaborate. In terms of discourse hierarchy, V2 creates units that depend on
higher units in discourse structure, thus instantiating subordination in discourse.

5.4. V1 as a means of coordination in discourse


Previous descriptive accounts, summarized in Schrodt (2004, 144–145), pro-
vide the following two conditions favoring the use of V1 in OHG: first, V1
occurs in text-initial sentences or at the beginning of new episodes; and second,
V1 is frequent with certain types of predicates like verbs of motion, verbs of
saying etc. We shall look in more detail for a unified explanation of these func-
tions of V1 in OHG, especially with respect to the kind of rhetorical relations
they constitute in discourse.

5.4.1. V1 signals episode boundaries


The use of V1 as an indication of episode boundaries directly invites the as-
sumption that this pattern functions as a discourse-structuring device. As re-
ported for some modern colloquial registers as well as for some orally trans-
mitted genres like jokes etc. (Lenerz 1984, 153; Önnerfors 1997, 53), V1 has
survived in text-opening sentences to the present day. The most numerous ex-
amples for this function in the Tatian involve the introductory formula uuard thô
for Latin factum est ‘it happened’ followed by an extraposed subject clause. In
the following example, both the original and the translation involve the con-
struction ‘auxiliary + past participle’. However, the scribe of the OHG text
opted for V1 although a precise corresponding linearization pattern would have
been possible by leaving the participle in the sentence-initial position, as in the
original:
(19) uuard thô gitân In then tagon
became then done in those days
‘It happened in those days’ (T 35, 7)
Latin Factum est autem In diebus illis
But also apart from this introductory formula, V1 applies more widely as a
text-structuring device in OHG (cf. Petrova 2006; Petrova and Solf 2008). In
the Tatian text, which combines the events of the four gospels in one harmony,
episode onsets are signaled by concordance notes in the left-hand margin of the
Latin column or between the Latin and the OHG text (see Figure 1, Appendix).
Additionally, as is known for both Latin and vernacular manuscripts of Car-
olingian provenance, the beginnings of new text units are marked by different
size and color of the initial letter (cf. Bästlein 1991, 59 and 214–242). As for
the Tatian manuscript, Simmler (1998, 306–307) observes that the strategy of
dividing episodes and sub-episodes by means of initial capital letters predom-
inantly applies for the Latin section of the text and only rarely occurs in the
OHG part. Petrova (2006, 158–159) notices that the graphical distinction of
new episodes in the Latin original correlates with the regular preposing of the
finite verb in the OHG translation. Consider (20), next to (14) and (15) given
above, which demonstrates that the syntactic means of verb fronting systemati-
cally applies for marking episode boundaries in OHG as a functional equivalent
to the graphical highlighting of the episode onsets in the Latin original:
(20) [Joseph of Arimathea and Nicodemus take the body of Jesus to conduct
a Jewish burial]
Intfiengun sie tho thes heilantes lichamon
took they then the.GEN Saviour.GEN body
‘Then they took the body of Jesus’ (T 321, 29)
Latin Acceperunt autem corpus ihesu
This example is remarkable in some more respects. First, it shows that the strong
preference for V1 at the beginnings of new episodes does not only account for
the post-verbal position of full subject constituents as in (14), but quite obvi-
ously affects the positioning of pronominal subjects inserted against the Latin
original like sie ‘they’ in (20) or her ‘he’ in (15) above. Second, it shows that
V1 in episode-initial position applies generally, not only with impersonal intran-
sitive predicates as in (19) but also with transitive verbs like the one in (20).

The fact that V1 is used to indicate the beginning of a new episode is rather
suggestive for the role of this pattern in the structuring of the discourse. In
particular, it is clear that no elaboration on the discourse referents involved
in the sentences is at issue here. Rather, the information in the sentences un-
der scrutiny is part of the core scheme of the narrative, providing the basis for
further elaboration in the discourse.

5.4.2. Types of predicates favoring V1


Next to its function to mark episode boundaries, V1 is said to be frequent with
certain groups of predicates. According to our empirical investigation, the most
common groups of predicates favouring V1 – apart from existential verbs in
presentational sentences discussed in section 4 above – are motion verbs, verbs
of saying as well as perfective, inchoative verbs signaling the initiation of a new
state of affairs, very often a new physical or cognitive state of the referent.
Among these predicate groups, verbs of motion constitute the largest class.
Some of the examples as in (21) introduce novel discourse referents and thus
functionally overlap with the type of presentational sentences. But in a great
number of other cases, the appearance or withdrawal of a given discourse ref-
erent is reported, cf. (22) and (23):

(21) [Zacharias conducts service as a priest when suddenly an angel appears
in the temple]
quam thara gotes engil
came there God.GEN angel
‘There came God’s angel’ (T 35, 32)
Latin & ecce angelus domini
(22) [A centurion asks Jesus to heal his servant. Jesus demands his faith and
sends him back to his house.]
uuarb tho ther centenary in sin hús
returned then this centurion to his home
‘Then the centurion returned to his home’ (T 84, 8)
Latin & reuersus est centurio in domum suam
(23) [The archangel Gabriel departs from Mary after the revelation]
Inti arfuor tho/ fon Iru ther engil
and flew away then from her this angel
‘And then the angel left her’ (T 29, 6–7)
Latin & discessit/ ab illa angelus
Furthermore, V1 is attested in clauses with motion verbs selecting an inanimate
subject as in (24). It is not the appearance or withdrawal of a discourse referent
that is reflected here, but rather the establishment of a new state in the overall
development of the plot:

(24) [Jesus has healed lots of people and performed many miracles]
Inti argieng thó úz thiu liumunt
and spread then out this fame
‘And this fame spread around’ (T 97, 5)
Latin & exiuit fama haec

Next to verbs of motion, verbs of saying form another group of stable V1 occur-
rences in sentences involving context-given referents. The instances indicate a
change of interlocutors in a dialogue sequence and therefore a shift in perspec-
tive. Consider (25):

(25) [Within a dialogue scene]
antlingota thô sîn muoter Inti quad
responded then his mother and said
‘Then his mother responded and said’ (T 30, 24)
Latin & respondens mater eius & dixit

Finally, V1 regularly occurs in contexts where a previously given discourse
referent undergoes a transition into a new mental or physical state. Verbs of
cognitive or sensory perception are common representatives of this group of
predicates triggering V1:

(26) [A woman suffering from a flow of blood becomes healed by secretly
touching the garment of Jesus]
furstuont siu thó in ira lihhamen/ thaz siu heil uuas
realized she then in her body that she healed was
fon theru suhti
from this.DAT plague
‘She realized on her body that she was recovered from this plague’
(T 95, 14–15)
Latin & sensit corpore/ quod sanata ess& a plaga
(27) [Jesus heals a paralyzed boy]
uuard tho giheilit ther kneht in thero ziti
became then healed the boy in this moment
‘Then the boy was healed at this very moment’ (T 84, 7)
Latin & sanatus est puer in illa hora
(28) [Salomé demands from King Herod the head of John the Baptist on
a platter. The king is troubled because he has promised to fulfill any
wish of the girl]
Inti uuard gitroubit ther kuning
and became troubled this king
‘And the king was troubled’ (T 116, 21)
Latin & contristatus est rex
These instances show that V1 is a widespread syntactic pattern in OHG, which
at first glance appears to be highly heterogeneous in use. But from the
perspective of discourse relations, the uses of V1 in the examples above actually
allow for a unified interpretation. On the one hand, it is evident that the sen-
tences with verbs of motion and verbs of saying affect the narrative setting of
the situation with respect to the participants involved in the action or the speaker
from whose perspective the event or action is reflected. As such, sentences in-
cluding a predicate of one of these groups automatically indicate a change in
the narrative situation. On the other hand, the inchoative predicates convey im-
portant, extraordinary or unexpected events which reveal a turning point in the
course of the story and therefore establish the initiation of a new situation in the
structure of the narrative.
From this perspective, sentences with V1 do not provide more information
on a discourse referent distinguished as the predication base of the utterance, but
assert the contents of the entire proposition, including all participants, as new
information representing a unitary whole. In this respect, V1 sentences with
these predicates represent thetic judgments with no topic-comment division.
From the point of view of temporal relation to the previous context, the
examples with V1 discussed here also reveal one important common feature.
Without any exceptions, they establish relations of temporal succession with
respect to the previous context, quite often indicated by temporal adverbials
like tho ‘then, after that’ included in the sentence.
From this, we can conclude that sentences with V1 serve to establish new
situations by providing narratively important information and carrying forward
the discourse. We assume that they continue the discourse on the level of main
action and share important properties with coordinative discourse linking like
temporal succession and progress in narration.

6. Implications for the generalization of V2 in modern German

If the distribution of V1 and V2 was ruled by discourse-organizational principles
and each of these patterns was associated with one particular, well-defined
functional field in the system of early German, then the question arises why and
how this functional opposition was lost in the course of language development
and how V2 became generalized in main clauses.
We assume that the reason for this development is already present in the sys-
tem of OHG. Note that V2 has already been generalized in wh-interrogatives
at the stage of development represented in the Tatian text (cf. Petrova and Solf
2009). Apart from this, we encounter cases of variation in one functional do-
main of the opposition described for V1, namely in the domain of the coordina-
tive type of discourse relations. Here, next to V1, V2 structures with a sentence-
initial adverbial co-occur. This pattern mainly applies to thô ‘then’ used as a
connective marking the coordinative relation to the previous event. Note that
(29) through (31) have the same discourse function as the V1 clauses discussed
above:
(29) tho uuas man In hierusalem
then was man in Jerusalem
‘There was a man in Jerusalem’ (T 37, 23)
Latin & ecce homo erat In hierusalem
(30) thó uuvrdun sie gifullte […]/ gibuluhti
then became they filled anger.DAT
‘then they became full of anger’ (T 115, 7)
Latin & repl&i sunt omnes/ in sinagoga ira
(31) tho fragata inan petrus
then asked him.DAT Peter
‘then Peter asked him’ (T 128, 18)
Latin interrogabat eum p&rus
This means that we encounter competition between V1 and thô+V2 in the do-
main of coordinative linking in OHG. This is represented in (32):
(32) coordination in discourse:
a. [V_fin … DR_new/giv …]_FOCUS (V1)
b. thô [V_fin … DR_new/giv …]_FOCUS (thô+V2)
We have to consider these two structures as optional varieties in OHG. This
can be inferred from the fact that according to the database, in 52 of the 96
instances involving V1, the adverbial thô is put independently of the original in
the position after the finite verb thus supporting V1 on the surface, see (6), (7) as
well as (24) through (27) above. However, in 122 of the 382 V2-cases included
in the database, the structure in (32b) occurs. We assume that this situation
shows the beginning of a process whereby the initial position in a sentence,
which was originally reserved for the most salient constituent of sentences
with a topic-comment division, was reanalyzed and extended by analogy to
adverbials used to link the sentence to the discourse situation established in the
previous discourse. Note that adverbials in anaphoric relation to a previously
mentioned location or goal share the positional properties of nominal referential
expressions as topics described so far. See thar ‘there’ referring to the pre-
established place of the wedding ceremony in (33):
(33) [at the wedding at Cana]
thar uuas thes heilantes muoter
there was the.GEN Saviour.GEN mother
‘The mother of the Saviour was also there’ (T 81, 15)
Latin erat mater ihesu & ibi
As a result of this unification process, the preverbal position cannot be identified
with any specific information-structural category anymore and is neutralized
leading to V2 in modern German declaratives.
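The database counts cited above (52 of the 96 V1 instances with thô inserted after the finite verb, and 122 of the 382 V2 cases showing sentence-initial thô) can be expressed as percentages. The short Python sketch below does only this arithmetic; the helper name `share` is hypothetical, and the percentages are derived here rather than reported in the study.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# V1 instances in which thô is placed after the finite verb
# independently of the Latin original.
v1_with_postverbal_tho = share(52, 96)

# V2 instances that show the thô+V2 structure in (32b).
v2_with_initial_tho = share(122, 382)

print(v1_with_postverbal_tho, v2_with_initial_tho)  # 54.2 31.9
```

On these figures, slightly more than half of the V1 instances and roughly a third of the V2 instances involve thô, which is consistent with treating the two structures in (32) as competing variants.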
Note that there was a different preference for one or the other structure in
(32) among the different scribes of the Tatian text. Although it has to be clarified
if the scribes are the actual translators of the text, we can detect some interesting
patterns. First, within the text portion supplied by scribe ε, there is 100 per
cent consistency in using the structure in (32b) in sentences indicating a
change of speaker in dialogue. The investigation of the same amount of text in
the portions of three different scribes reveals quite different preferences for V1
against thô+V2 in sentences with verbs of saying, namely 16 to 3 for scribe α,
3 to 9 for scribe β and 1 to 12 for scribe ζ, respectively.
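To make the scribal preferences easier to compare, the counts just given can be converted into the share of V1 per scribe. This is a minimal Python sketch using only the figures reported above; the variable and function names are hypothetical, and scribe ε is omitted because no absolute counts are given for that portion.

```python
# (V1, thô+V2) counts in sentences with verbs of saying, as reported above.
counts = {"α": (16, 3), "β": (3, 9), "ζ": (1, 12)}


def v1_share(v1: int, tho_v2: int) -> float:
    """Return the percentage of V1 among V1 and thô+V2 clauses."""
    return round(100 * v1 / (v1 + tho_v2), 1)


shares = {scribe: v1_share(*pair) for scribe, pair in counts.items()}
print(shares)  # {'α': 84.2, 'β': 25.0, 'ζ': 7.7}
```

The resulting shares range from a clear majority of V1 for scribe α to a marginal use of V1 for scribe ζ, which illustrates the variation within one functional domain discussed in the text.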
The fact that we encounter variation within one and the same functional
domain indicates a language change in progress. In the framework of Lightfoot
(1999), language change is viewed as a new type of parameter setting in the
internal grammar of young generations of speakers resulting from a shift in the
frequency relation of competing structures in the input data during language
acquisition. In this sense, the existence of competing structures in the domain
of sentences attributed to the coordinative type of discourse relations can be
viewed as a pre-condition and indication of language change.

7. Conclusion

In the Old High German (OHG) Tatian text we find systematic variation be-
tween V1 and V2 clauses that is pragmatically driven. In particular, the distri-
bution of V1 and V2 clauses correlates with coordination and subordination as
the two basic types of discourse relations in the framework of SDRT by Asher
and Lascarides (2003). First, instances of V2 are regularly found in structures
providing additional descriptive or explanatory information on a discourse ref-
erent representing the topic of the sentences. These clauses provide additional
information about elements located higher in discourse structure. From this we
conclude that V2 correlates with elaboration and continuation, more precisely
with the realization of subordination in discourse structure.
In contrast, V1 comes in two main functions signaling main line sequential-
ity and progress in narration: i) it provides information which constitutes the
basis for a subsequent elaboration on a lower level of discourse hierarchy; or
ii) it signals that a previous chain of subordinative units is suspended and that
the discourse returns to the main line of the story. In both cases, we assign to
V1 properties of coordination in discourse.
Thus, we arrive at the conclusion that verb placement in the earliest stages
of German was governed by pragmatic, more precisely, by discourse-related
properties. Our main claim is that at a certain stage in the history of the Ger-
manic languages, the position of the verb was a means for distinguishing the
type of rhetorical relation which the sentence holds with respect to the previ-
ous context. In this way, word order and verb placement were involved in the
creation of dynamic text structure and discourse coherence.

Acknowledgements
This is a modified and extended version of Hinterhölzl and Petrova (2005).
The investigation was conducted during our work in Research Project B4 of
the Collaborative Research Center 632 “Information Structure” at Humboldt-
University Berlin8 and was presented at the International Workshop on Salience
in Discourse in Chorin, Germany, on 5th–8th Oct 2005. We are grateful to all
participants in the workshop for a fruitful exchange of ideas. For useful com-
ments and discussions we also thank Karin Donhauser, Milena Kühnast, Sonja
Linde, Michael Solf, Eva Schlachter as well as two anonymous reviewers.

8. Project URL: http://www.linguistik.hu-berlin.de/sprachgeschichte/forschung/informationsstruktur/index.php;
http://www.sfb632.uni-potsdam.de/projects_b4ger.html.

Appendix

Figure 1. The beginning of Luke 2, 8 in the manuscript of the St. Gallen, Stiftsbibl. Cod.
56. Facsimile, pag. 35. In Sonderegger (2003, 130).

Figure 2. The same part of the text in the edition of Masser (1994, 85).

References

Asher, Nicholas and Alex Lascarides
2003 Logics of Conversation. Cambridge: Cambridge University Press.
Asher, Nicholas and Laure Vieu
2005 Subordinating and coordinating discourse relations. Lingua 115:591–
610.
Axel, Katrin
2007 Studies in Old High German Syntax. Left sentence periphery, verb
placement and verb-second. Amsterdam / Philadelphia: John Benjamins
Publishing Company.
Bästlein, Ulf Christian
1991 Gliederungsinitialen in frühmittelalterlichen Epenhandschriften.
Studie zur Problematik ihres Auftretens, ihrer Entwicklung und Funk-
tion in lateinischen und volkssprachlichen Texten der Karolinger-
und Ottonenzeit. Frankfurt a. M.: Peter Lang.
Bosch, Peter, Tom Rozario and Yufan Zhao
2003 Demonstrative Pronouns and Personal Pronouns. German der vs. er. In
Proceedings of the EACL Workshop on the computational treatment
of anaphora in Budapest 2003.
Bosch, Peter and Carla Umbach
2007 Reference Determination for Demonstrative Pronouns. ZAS Papers
in Linguistics 48:39–51.
Chiarcos, Christian
this vol. The Mental Salience Framework: Context-adequate generation of re-
ferring expressions. this volume, 105–139.
Chiarcos, Christian, Stefanie Dipper, Michael Götze, Ulf Leser, Anke Lüdeling, Julia
Ritz and Manfred Stede
2008 A Flexible Framework for Integrating Annotations from Different
Tools and Tagsets. TAL (Traitement automatique des langues) 49.
Claus, Berry
this vol. Establishing salience during narrative text comprehension: A simula-
tion view account. this volume, 291–277.
Dik, Simon C.
1989 The Theory of Functional Grammar. Part I: The Structure of the
Clause. Dordrecht: Foris Publications.
Dittmer, Arne and Ernst Dittmer
1998 Studien zur Wortstellung - Satzgliedstellung in der althochdeutschen
Tatianübersetzung. Göttingen: Vandenhoeck & Ruprecht.
Drubig, H. Bernhard
1992 Zur Frage der grammatischen Repräsentation thetischer und kategori-
scher Sätze. Linguistische Berichte Sonderheft 4 / 1991–92:142–195.
Eggers, Hans ed.
1964 Der althochdeutsche Isidor. Nach der Pariser Handschrift und den
Monseer Fragmenten. Altdeutsche Textbibliothek 63. Tübingen: Nie-
meyer.
Endriss, Cornelia and Stefan Hinterwimmer
2007 Direct and Indirect Aboutness Topics. In The notions of information
structure, eds. Caroline Féry, Gisbert Fanselow and Manfred Krifka,
83–96. Potsdam: Universitätsverlag.
Fanselow, Gisbert
2003 Münchhausen-style head movement and the analysis of verb-second.
In Syntax at Sunset: Head movement and Syntactic Theory. UCLA
Working Papers in Linguistics 10, ed. Anoop Mahajan, 40–76.
Filchenko, Andrey Y.
this vol. Parenthetical Agent-demoting Constructions in Eastern Khanty: Dis-
course Salience vis-à-vis Referring Expressions. this volume, 57–79.
Fleischer, Jürg
2006 Zur Methodologie althochdeutscher Syntaxforschung. Beiträge zur
Geschichte der deutschen Sprache und Literatur 128:25–69.
Fleischer, Jürg, Roland Hinterhölzl and Michael Solf
2008 Zum Quellenwert des AHD-Tatian für die Syntaxforschung: Über-
legungen auf der Basis von Wortstellungsphänomenen. Zeitschrift für
germanistische Linguistik 36:210–239.
Fourquet, Jean
1974 Genetische Betrachtungen über den deutschen Satzbau. In Studien
zur deutschen Literatur und Sprache des Mittelalters. Festschrift für
Hugo Moser zum 65. Geburtstag, eds. Werner Besch, Günther Jung-
bluth, Gerhard Meissburger and Eberhard Nellmann, 314–323. Berlin:
Erich Schmidt.
Frey, Werner
2000 Über die syntaktische Position der Satztopiks im Deutschen. ZAS Pa-
pers in Linguistics 20:137–172.
Grüning, André and Andrej A. Kibrik
2005 Modelling Referential Choice in Discourse: A Cognitive Calculative
Approach and a Neural Network Approach. In Anaphora Processing.
Linguistic, Cognitive and Computational Modelling, eds. António
Branco, Tony McEnery and Ruslan Mitkov, 163–197. Amsterdam /
Philadelphia: John Benjamins Publishing Company.
Gundel, Jeanette K., Nancy Hedberg and Ron Zacharski
1993 Cognitive status and the form of referring expressions in discourse.
Language 69 (2):274–307.
Hinterhölzl, Roland and Svetlana Petrova
2005 Rhetorical Relations and Verb Placement in Early Germanic Lan-
guages. Evidence from the Old High German Tatian translation (9th
century). In Salience in Discourse. Multidisciplinary Approaches to
Discourse, eds. Manfred Stede, Christian Chiarcos, Michael Grabski
and Luuk Lagerwerf, 71–79. Münster: Stichting / Nodus.
Hinterhölzl, Roland, Svetlana Petrova and Michael Solf
2005 Diskurspragmatische Faktoren für Topikalität und Verbstellung in
der althochdeutschen Tatianübersetzung (9. Jh.). In Interdisciplinary
Studies on Information Structure (ISIS) 3, eds. Shinichiro Ishihara,
Michaela Schmitz and Anne Schwarz, 143–182. Potsdam: Univer-
sitätsverlag.
Hopper, Paul J.
1979a Some Observations on the Typology of Focus and Aspect in Narrative
Language. Studies in Language 3.1:37–64.
Hopper, Paul J.
1979b Aspect and Foregrounding in Discourse. In Syntax and Semantics, ed.
Talmy Givón, 213–241. San Diego / New York / Berkeley / Boston /
London / Sydney / Tokyo / Toronto: Academic Press, INC.
Karttunen, Lauri
1976 Discourse Referents. In Syntax and Semantics 7: Notes from the Lin-
guistic Underground, ed. James McCawley, 363–385. New York /
San Francisco / London: Academic Press.
Kemenade, Ans van
1987 Syntactic Case and Morphological Case in the History of English.
Dordrecht: Foris Publications.
Krasavina, Olga
this vol. Demonstratives and salience: Towards a functional taxonomy. this
volume, 31–55.
Krifka, Manfred
2007 Basic notions of information structure. In The Notions of Information
Structure, eds. Caroline Féry, Gisbert Fanselow and Manfred Krifka,
13–56. Potsdam: Universitätsverlag.
Lambrecht, Knud
1994 Information structure and sentence form. Topic, focus and the mental
representations of discourse referents. Cambridge: Cambridge Uni-
versity Press.
Lenerz, Jürgen
1984 Syntaktischer Wandel und Grammatiktheorie. Eine Untersuchung an
Beispielen aus der Sprachgeschichte des Deutschen. Tübingen: Max
Niemeyer.
Lightfoot, David
1999 The Development of Language. Acquisition, Change, and Evolution.
Malden / Oxford: Blackwell.
Mann, William C. and Sandra A. Thompson
1988 Rhetorical Structure Theory: Toward a functional theory of text orga-
nization. Text. An interdisciplinary journal for the study of discourse
8:243–281.
Masser, Achim ed.
1994 Die lateinisch-althochdeutsche Tatianbilingue Stiftsbibliothek St. Gal-
len Cod. 56. Göttingen: Vandenhoeck & Ruprecht.
Masser, Achim
1997a Syntaxprobleme im althochdeutschen Tatian. In Semantik der syntak-
tischen Beziehungen. Akten des Pariser Kolloquiums zur Erforschung
des Althochdeutschen 1994, ed. Yvon Desportes, 123–140. Heidel-
berg: Carl Winter.
Masser, Achim
1997b Wege zu gesprochenem Althochdeutsch. In Grammatica Ianua Ar-
tium. Festschrift für Rolf Bergmann zum 60. Geburtstag, eds. Elvira
Glaser and Michael Schlaefer, 49–70. Heidelberg: Carl Winter.
Molnár, Valéria
1993 Zur Pragmatik und Grammatik des TOPIK-Begriffs. In Wortstellung
und Informationsstruktur, ed. Marga Reis, 155–202. Tübingen: Max
Niemeyer.
Önnerfors, Olaf
1997 Verb-erst-Deklarativsätze. Grammatik und Pragmatik. Stockholm:
Almquist & Wiskell International.
Petrova, Svetlana
2006 A discourse-based approach to verb placement in early West-Germa-
nic. In Interdisciplinary Studies on Information Structure (ISIS) 5,
eds. Shinichiro Ishihara, Michaela Schmitz and Anne Schwarz, 153–
185. Potsdam: Universitätsverlag.
Petrova, Svetlana and Michael Solf
2008 Rhetorical relations and verb placement in early Germanic. A cross
linguistic study. In ‘Subordination’ vs. ‘coordination’ in sentence and
text – from a cross-linguistic perspective, eds. Cathrine Fabricius-
Hansen and Wiebke Ramm, 329–351. Amsterdam / Philadelphia:
John Benjamins Publishing Company.
Petrova, Svetlana and Michael Solf
2009 Die Entwicklung von Verbzweit im Fragesatz. Die Evidenz im Al-
thochdeutschen. Beiträge zur Geschichte der deutschen Sprache und
Literatur 131 (1):6–49.
Petrova, Svetlana and Michael Solf
2010 Pronominale Wiederaufnahme im ältesten Deutsch. Personal-
vs. Demonstrativpronomen im Althochdeutschen. In Historische
Textgrammatik und historische Syntax des Deutschen: Traditionen,
Innovationen, Perspektiven, ed. Arne Ziegler and Christian Braun,
339–365, Berlin: Walter de Gruyter.
Petrova, Svetlana, Michael Solf, Julia Ritz, Christian Chiarcos and Amir Zeldes
2009 Building and using a richly annotated interlinear diachronic corpus:
the case of Old High German Tatian. In Natural Language Process-
ing for Ancient Languages, eds. Joseph Denooz and Serge Rosmor-
Rhetorical relations and verb placement in Old High German 201

duc, special issue of Traitement Automatique des Langues / Natural


Language Processing 5 (2):47–71
Prince, Ellen F.
1981 Toward a Taxonomy of Given-New Information. In Radical Prag-
matics, ed. Peter Cole, 223–255. New York: Academic Press.
Ramers, Karl Heinz
2005 Verbstellung im Althochdeutschen. Zeitschrift für Germanistische
Linguistik 33:78–91.
Ramm, Wiebke
this vol. Discourse-structural salience from a cross-linguistic perspective: Co-
ordination and its contribution to discourse (structure). this volume,
143–173.
Ries, John
1880 Die Stellung von Subject und Prädicatsverbum im Heliand. Nebst
einem Anhang metrischer Excurse: Quellen und Forschungen zur
Sprach- und Culturgeschichte der germanischen Völker. Strassburg /
London: Karl J. Trübner.
Robinson, Orrin W.
1994 Verb-First Position in the Old High German Isidor Translation. Jour-
nal of English and Germanic Philology 93:356–373.
Sasse, Hans-Jürgen
1995 “Theticity” and VS order: A case study. Sprachtypologie und Univer-
salienforschung 48, 1/2:3–31.
Schrodt, Richard
2004 Althochdeutsche Grammatik II. Syntax: Sammlung kurzer Grammati-
ken germanischer Dialekte. A. Hauptreihe, Nr. 5/2. Tübingen: Max
Niemeyer.
Simmler, Franz
1998 Makrostrukturen in der lateinisch-althochdeutschen Tatianbilingue.
In Deutsche Grammatik. Thema in Variationen. Festschrift für Hans-
Werner Eroms zum 60. Geburtstag, eds. Karin Donhauser and Lud-
wig M. Eichinger, 299–335. Heidelberg: Carl Winter.
Sonderegger, Stefan
2003 Althochdeutsche Sprache und Literatur. Eine Einführung in das äl-
teste Deutsch. Darstellung und Grammatik. 3. durchgesehene und
wesentlich erweiterte Auflage. Berlin, New York: Walter de Gruyter.
Tomaselli, Alessandra
1995 Cases of Verb Third in Old High German. In Clause Structure and
Language Change, eds. Adrian Battye and Ian Roberts, 345–369.
New York / Oxford: Oxford University Press.
202 Roland Hinterhölzl and Svetlana Petrova

Zeldes, Amir, Julia Ritz, Anke Lüdeling and Christian Chiarcos


2009 ANNIS: A Search Tool for Multi-Layer Annotated Corpora. In Pro-
ceedings of Corpus Linguistics 2009, July 20–23, Liverppool, UK.
Part III.
Beyond purely linguistic salience
Visual salience and the other one

John D. Kelleher

This paper describes a salience-based approach to visually situated reference
resolution. The framework uses the relationship between referential form and
preferred mode of interpretation as a basis for a weighted integration of
linguistic and visual salience scores for each entity in the multimodal context.
The resulting integrated salience scores are then used to rank the candidate
referents during the resolution process, with the highest-scoring candidate
selected as the referent. One advantage of this approach is that the resolution
process occurs within the full multimodal context, in so far as the referent is
selected from a full list of the objects in the multimodal context. As a result,
situations where the intended target of the reference is erroneously excluded,
due to an individual assumption within the resolution process, are avoided.

1. Introduction

In a dialog, human participants expect their discourse partner to construct and
maintain a model of the evolving discourse context. This discourse model
provides a context against which the references in the dialog can be understood.
A referring expression is a natural language expression that denotes an entity,
called a referent, in the discourse context. Each referring expression introduces
a representation into the semantics of its utterance and this representation must
be bound to an element in the context for the utterance’s semantics to be fully
resolved.
Situated dialog occurs within a shared visual context. In a situated dialog,
a referring expression may denote not only entities introduced through the lin-
guistic interaction but also entities within the spatio-temporal context of the di-
alog. As a result, the scope of the discourse model must expand to include both
linguistic and perceptually available entities and the resolution process must be
able to match linguistic referring expressions to both linguistic and perceptual
entities.
Referring expressions come in a variety of forms, including definite
descriptions, indefinites, pronouns and demonstratives. Referring expressions
that access a representation in the linguistic context are interpreted
anaphorically. Referring expressions that access a representation of an object
that has not previously been referred to in the dialog but has entered the
context through a non-linguistic modality (such as vision) are interpreted
exophorically.

11.7: M: I see an engine_i and a boxcar_j both at Elmira_k
12.1: S: right
13.1: M: this looks like the best thing to do
13.2: M: so we should get
13.3: M: ... the eng / engine_i to picks up the boxcar_j
13.4: M: and head for Corning_l
13.5: M: ’s that sound reasonable
14.1: S: sure

Figure 1. Excerpt from dialog d91-1.1 of the TRAINS-91 corpus.

Figure 2. Map of TRAINS Domain

The dialog excerpt listed in Figure 1, taken from the TRAINS corpus (Allen
and Schubert 1991; Heeman and Allen 1995), illustrates the distinction
between anaphoric and exophoric references. The excerpt is taken from a
collaborative dialog between two participants, M and S, who are trying to ship
goods within a railroad freight system. Figure 2 illustrates the schematic
representation of the railroad freight system that provided the visual context
for the dialog. In this example, the indices i, j, k and l indicate that all the
referring expressions marked by a particular index refer to the same entity.
The references an engine, a boxcar, and Elmira in 11.7 and Corning in 13.4
are examples of exophoric references. The entities these expressions denote
have not been previously mentioned in the dialog. As a result, these references
must be resolved relative to a set of representations in the context model that
entered the model via the non-linguistic modalities, in this instance the visual
context of the dialog. By contrast, the references the engine and the boxcar in
13.3 are examples of anaphoric references. The reference the engine can be
resolved relative to the linguistic context by binding it to the representation
of an engine introduced to the linguistic context by the resolution of 11.7.
Similarly, the reference the boxcar can be resolved relative to the linguistic
context by binding it to the representation of a boxcar introduced by the
resolution of 11.7.
Most forms of referring expression have a preferred mode of interpretation
(anaphoric or exophoric). For example, pronouns are typically interpreted
anaphorically. However, there is no one-to-one relationship between form and
mode of interpretation: definite descriptions, for instance, can be used
either anaphorically or exophorically. Indeed, the two most common cases of
definite descriptions in the TRAINS corpus of situated dialogue were anaphoric
and exophoric definites (Poesio 1993). One consequence of the one-to-many
relationship between referential form and mode of interpretation is that a
multi-modal reference resolution process should define a strategy to deal with
cases where different modes of interpretation are suggested for the same
reference.
One solution to this issue is to define a preference ordering over the
different interpretation rules. In Sect. 2 several reference resolution
frameworks that adopt such an approach are reviewed: Poesio (1993); Kievit et
al. (2001); Salmon-Alt and Romary (2001); Landragin and Romary (2003); Gorniak
and Roy (2004); Kelleher et al. (2005).
In contrast with these rule-based approaches, in this paper we develop a
probabilistic framework that addresses the issue of selecting between different
modes of interpretation through a salience-based modelling of the attentional
spread across the set of entities in the discourse domain. This attention-based
approach is inspired by psycholinguistic findings (see Sect. 2) that point to a
strong interaction between attention and linguistic reference. We will use the
concept of salience to describe the factors and associated processes that direct
attention. The framework consists of a set of salience models, one for each
modality, and a reference resolution process that, for each entity in the
discourse context, computes an overall attention score by integrating the
scoring of that entity by each of the salience models. The attention score
assigned to an entity represents the probability of that entity being the
intended referent of the referring expression. Thus, the entity with the highest
overall attention score after the processing of a referring expression is
selected as the referent for that reference.
The paper is structured as follows: Sect. 2 reviews related work; Sect. 3
describes the reference resolution framework; Sect. 4 describes an
implementation of the framework; Sect. 5 contains a worked example illustrating
how the framework functions; the paper finishes, in Sect. 6, with conclusions
and future work.

2. Related work

In recent years a number of psycholinguistic experiments have pointed to the
interaction between language and vision. For example, Spivey-Knowlton et
al. (1998) and Tanenhaus et al. (1995) indicate that language comprehension
affects visual attention. More recently, the interaction between visual
attention and linguistic reference has been highlighted. Studies such as Duwe
and Strohner (1997) and Strohner et al. (2000) have shown that people often use
perceptual salience to resolve linguistic references.
These experimental findings support attention based theories of discourse
processing. Grosz (1977) is arguably the seminal work on language and vision
integration. Grosz’s work highlighted that attention constrained and structured
the processing of discourse. Moreover, Grosz was the first to observe the rela-
tionship between focus of attention and the use of exophoric definite descrip-
tions: when an object is in the current mutual focus of attention it can be referred
to by means of a definite description even though other objects fulfilling the de-
scription have been introduced into the linguistic discourse or are present in the
shared visible context.
Building on this work, Grosz and Sidner (1986) developed a focus stack
model of global discourse attentional state. According to this model the
common ground¹ can be divided into three parts: the linguistic structure, which
contains information about the linguistic structure of utterances in the dialog;
the intentional structure, which contains information about the goals of the
participants in the conversation; and the attentional structure, which contains
information about the objects introduced into the discourse and their relative
salience. Furthermore, due to attentional constraints, discourse is segmented or
chunked and when a definite description is used anaphorically, the only
antecedents² considered are those in the same discourse segment.
Assuming Grosz and Sidner’s (1986) focus stack model to be generally correct
as a model of global discourse structure,³ the issue of how focus of attention
and reference interact within a discourse segment must still be addressed.

1. The dialog participants’ mutually developed public view of what they are
talking about.
2. The antecedent of an anaphoric reference is the representation of the reference’s
referent that was introduced to the discourse model by a prior referring expression.
3. For alternate models see Hobbs (1985); Mann (1987); Asher and Lascarides (2003).

Several theories of discourse reference have attempted to address this issue
by providing accounts of the relationship between types of referential
expressions on the one hand, and degrees of mental activation of discourse
referents on the other (e.g. Alshawi 1987, Ariel 1990, Gundel et al. 1993,
Hajičová 1993, Lappin and Leass 1994, Grosz et al. 1995).⁴ A common theme among
these accounts is that referential expressions need more coding material as the
referent is less activated. However, none of these models explicitly
accommodate multi-modal contexts.
Poesio (1993) reformulates the attentional model of Grosz and Sidner (1986)
in situation theoretic terms. Interestingly, Poesio’s framework separates the
attentional common ground into several anchoring resource situations. For
example, one anchoring resource is called the discourse situation and consists
of a record of what has been said. This anchoring resource is used to interpret
anaphoric references. Another anchoring resource situation, called the situation
of attention, models the subset of information in the visual field of the
discourse participants that they are attending to and is used to interpret
exophoric⁵ definite descriptions. Furthermore, he defines rules within a default
logic, called principles for anchoring resource situations, that predict whether
a definite description is going to be interpreted anaphorically or
exophorically. However, one of the issues with this approach is how to deal with
conflicting defaults. Consequently, the framework cannot handle situations in
which two principles of anchoring resource situations apply, one suggesting an
anaphoric interpretation, the other an exophoric interpretation.
Many computational frameworks for multi-modal reference resolution have
also been developed. Recent systems that focus on multi-modal reference reso-
lution include: Kievit et al. (2001), Salmon-Alt and Romary (2001), Landragin
and Romary (2003), Gorniak and Roy (2004) and Kelleher et al. (2005).
Kievit et al. (2001) define separate resolution strategies for each form of
referring expression. A strategy consists of one or more resolution steps
applied in a predefined order. A resolution step consists of 4 stages: (1) the
selection of possible referents from a single sub-context (dialog, visual
domain, etc.), (2) the filtering of this set of candidates, (3) the ordering of
the candidates based on saliency, (4) an evaluation of the result. The algorithm
halts as soon as one of the resolution steps finds a unique object or finds
several objects and cannot choose which is the intended one. This approach is
equivalent to a preference ordering being defined over the different modes of
interpretation for each form of reference. One issue with this approach is that
the set of candidates considered during any one resolution step is constrained
to the set of entities within the sub-context the resolution step uses to
construct the initial set of candidates. As a result, the system cannot
recognise situations where a reference may be ambiguous between two entities in
different sub-contexts, and, consequently, it may resolve a reference
incorrectly rather than initiate a clarification process.

4. See Kruijff-Korbayová and Hajičová (1997) for a comparison of several of
these approaches.
5. Poesio uses the term visible situation use to describe exophoric definite
descriptions.

Gorniak and Roy (2004) focus on the resolution of references containing
spatial descriptions. They propose a feed-forward filtering process to reference
resolution. In their framework, each lexical item in the system’s lexicon is asso-
ciated with one or more composer functions. A composer function takes one or
more candidate referents as input and filters this set of candidates by computing
how well each of the candidates fulfils the semantic model defined for the lexi-
cal term. Reference resolution is carried out by chaining the composer functions
associated with the lexical terms in the reference together, i.e. the filtered set of
candidates output by one composer function is used as the input set by the next
composer function in the chain. Gorniak and Roy note that this strategy can
fail if one of the composer functions excludes the target object from the set of
candidates. For example, when interpreting “the leftmost one in the front” the
composer for “leftmost” selects the leftmost objects in the scene, not including
the obvious example of “front” that is not a good example of “leftmost”.
The reference resolution frameworks presented in Salmon-Alt and Romary
(2001), Landragin and Romary (2003) and Kelleher et al. (2005) use the notion
of a reference domain. A reference domain is a structured contextual subset of
the multi-modal dialog context. Reference domains are created in the context
model due to perceptual or linguistic events or conceptual knowledge and are
intended to reflect the mental representation of the event they model. In these
frameworks the resolution process involves: (1) the construction of an under-
specified reference domain, using templates associated with the form of the
reference given; (2) the unification of this underspecified domain with a suit-
able reference domain within the context model; (3) the selection of one of the
elements within the unified reference domain to function as the referent. How-
ever, similar to the frameworks proposed in Kievit et al. (2001) and Gorniak
and Roy (2004), there is the potential for these frameworks to overcommit to a
particular subset of the context during the resolution process. As the resolution
process occurs within a sub-context, whose selection is at least partially driven
by the form of the reference being interpreted, if the wrong reference domain
is selected the intended target object and/or plausible distractor referents, that
may indicate the need for reference clarification, may be excluded from consid-
eration.

3. Approach

Resolving a referring expression involves two main tasks:

1. creating and maintaining a model of the discourse context (this model should
contain representations for all the entities that are available for reference);
2. matching/binding the representation introduced by a given referring
expression to an element (or elements) in the set of possible referents.

The matching or binding of a referring expression’s representation to an
element in the discourse model depends to a large extent on how the system
searches the model. The multi-modal frameworks reviewed in Sect. 2 search the
discourse model by incrementally identifying a subset of the model that fulfils
a set of constraints defined by the form of reference being resolved and then
selecting an element within that subset as the referent.
Building on the psycholinguistic evidence and theoretical models of dis-
course, reviewed in Sect. 2, that relate attention and linguistic reference, we
propose a reference resolution framework that treats a given referring expres-
sion as a set of instructions that directs how the spread of attention across the set
of objects within the discourse context should be modified before the selection
of the referent. The binding of the representation of the referring expression
consists of selecting the entity with the highest attention within the updated
context as the referent.
A concept closely related to attention is salience. In this paper, the concept
of salience is used to describe the factors and associated processes that direct
attention. From a computational perspective, a model of an entity’s salience
within a context predicts the amount of attention that will be given to that en-
tity. Moreover, the mechanism that drives the focusing of attention towards the
intended referent during the interpretation of a referring expression is the align-
ing of the parameters of the salience models underpinning the framework with
the features described within the reference. For example, interpreting a refer-
ring expression such as the blue house will result in all the blue entities in the
context becoming more salient.
Once the salience models underpinning the framework have been updated
to reflect the selection preferences encoded in the referring expression, the
next stage in the interpretation process is to integrate the salience scores for
each entity in the discourse model. This integration enables the framework to
predict the overall focus of attention within the context provided by the
referring expression. Reflecting the relationship between the form of referring
expression and preferred mode of interpretation, noted in Sect. 1, the
integration process weights the salience scores of an entity before integration.
Equation 1 defines how the linguistic salience (LS) and visual salience (VS)
scores are combined into an integrated salience (IS). The weighting factors α
and β accommodate the preferential relationship between the form of referring
expression and the mode of interpretation. These weights range in value between
0 and 1. As an example of the possible values these weights might take, when
resolving a pronominal reference, setting α = 0.1 and β = 0.9 would bias the
resolution process towards entities with a high linguistic salience. Ideally,
however, these weights should be derived from an empirical analysis of the
preferred mode of interpretation for each form of reference. The resulting
integrated salience scores are then used to rank the candidate referents during
the resolution process, with the highest-scoring candidate selected as the
referent.
$$\mathit{IS} = \frac{\alpha \cdot \mathit{VS} + \beta \cdot \mathit{LS}}{2} \tag{1}$$
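
As a minimal sketch of Equation 1, the weighted integration can be written as a small Python function. The names `FormWeights`, `WEIGHTS` and `integrated_salience` are hypothetical, and only the pronoun weights (α = 0.1, β = 0.9) come from the text; the values for definite descriptions are an illustrative assumption, not an empirically derived setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormWeights:
    """Integration weights for one form of referring expression."""
    alpha: float  # weight on visual salience (VS)
    beta: float   # weight on linguistic salience (LS)


# Hypothetical weight table. Only the pronoun values appear in the text;
# the definite-description values are an assumption for illustration.
WEIGHTS = {
    "pronoun": FormWeights(alpha=0.1, beta=0.9),
    "definite": FormWeights(alpha=0.5, beta=0.5),
}


def integrated_salience(vs: float, ls: float, w: FormWeights) -> float:
    """Equation 1: IS = (alpha * VS + beta * LS) / 2."""
    return (w.alpha * vs + w.beta * ls) / 2
```

With the pronoun weights, an entity that is visually prominent but unmentioned (VS = 0.8, LS = 0.2) scores lower than one that is visually marginal but recently mentioned (VS = 0.2, LS = 0.8), which is the bias towards linguistic salience described above.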
The framework distinguishes three stages of salience integration.
Stage 1: This stage computes the salience of the entities in the context prior
to the processing of a given referring expression. It includes the basic
visual salience (i.e. the prominence of an entity due to bottom-up visual
cues) and the basic linguistic salience (i.e. the prominence of an entity
due to previous discourse) of an object.
Stage 2: This stage computes the salience of each entity within the context
provided by the referring expression being resolved. For each entity two
salience scores result from this stage of processing: a reference relative
visual salience and a reference relative linguistic salience. These salience
scores are computed by integrating each entity’s basic visual salience and
basic linguistic salience with a rating of how well the entity fulfils the
description provided in the referring expression.
Stage 3: For each entity in the context this stage computes an overall
attention score by integrating the entity’s salience scores that resulted
from Stage 2. This overall score is computed using a weighted integration,
where the weights used reflect the biasing associated with different forms
of reference toward a particular information source.
The flow of information during reference resolution is from Stage 1 to Stage 3.
Algorithm 1 lists the basic steps in reference resolution.

Algorithm 1 Reference Resolution Algorithm


1. compute the reference relative saliences for each object in the context
2. compute the attention score for each object in the context
3. return the object with the highest attention score as the referent
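
The three steps of Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the LIVE implementation: the `Entity` type, the `match` callback and the toy colour feature are hypothetical, and combining basic salience with the match rating by multiplication is an assumption, since the text does not specify the Stage 2 operator.

```python
from typing import Callable, Dict, NamedTuple


class Entity(NamedTuple):
    basic_visual: float      # Stage 1 visual salience, in [0, 1]
    basic_linguistic: float  # Stage 1 linguistic salience, in [0, 1]
    colour: str              # toy feature used by the description matcher


def resolve(context: Dict[str, Entity],
            match: Callable[[Entity], float],
            alpha: float, beta: float) -> str:
    """Return the id of the entity with the highest attention score."""
    scores = {}
    for eid, e in context.items():
        fit = match(e)  # how well the entity fulfils the description
        # Step 1 (Stage 2): reference relative saliences. Multiplying by
        # the match rating is an assumption made for this sketch.
        rr_visual = e.basic_visual * fit
        rr_linguistic = e.basic_linguistic * fit
        # Step 2 (Stage 3): weighted integration, as in Equation 1.
        scores[eid] = (alpha * rr_visual + beta * rr_linguistic) / 2
    # Step 3: every object was scored; the best one is the referent.
    return max(scores, key=scores.get)


context = {
    "house1": Entity(0.6, 0.0, "blue"),
    "house2": Entity(0.3, 0.0, "red"),
    "car1": Entity(0.1, 0.9, "blue"),
}
is_blue = lambda e: 1.0 if e.colour == "blue" else 0.0
referent = resolve(context, is_blue, alpha=0.9, beta=0.1)
```

With weights favouring vision (α = 0.9, β = 0.1) the sketch selects `house1`; with the weights reversed it selects `car1`, the previously mentioned blue object. No entity is filtered out before scoring, so the choice depends only on the final ranking.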

One advantage of this approach is that the resolution process occurs within the
full multi-modal context of the dialog, in so far as the referent is selected
from a full list of the objects in the multi-modal context ordered by a model
of integrated salience. Consequently, none of the objects in the context are
excluded from consideration. As a result, situations where the intended target
of the reference is erroneously excluded, due to an individual assumption within
the resolution process, are avoided. Also, the framework can recognise
cross-modal ambiguity by comparing the integrated salience of the primary
candidate with the integrated salience of all the other objects in the context.
In these ambiguous cases the initiation of a clarification dialog may be a
better system response than the selection of the primary candidate referent. By
contrast, many of the previous multi-modal resolution frameworks exclude
entities in the multi-modal context model from consideration before the
selection of the referent. In some cases, for example Kievit et al. (2001);
Salmon-Alt and Romary (2001); Landragin and Romary (2003); Kelleher et al.
(2005), the initial set of candidate referents is restricted to a subset of the
context based on preferences with respect to the mode of interpretation relative
to the form of reference. In other frameworks, for example Gorniak and Roy
(2004), candidate referents are incrementally excluded from consideration as the
resolution process progresses, due to the sequential manner in which the
semantics of the terms within the reference are processed.
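
The ambiguity check described above can be sketched as a comparison between the top-ranked candidate and the runner-up. The function names and the `margin` parameter are hypothetical; the text does not state how close two scores must be before a clarification dialog is preferred, so the default of 0.05 is purely an assumption.

```python
from typing import Dict, List, Tuple


def rank_candidates(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order every object in the context by integrated salience, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def needs_clarification(scores: Dict[str, float], margin: float = 0.05) -> bool:
    """True if the runner-up is within `margin` of the primary candidate.

    `margin` is a hypothetical threshold; the text does not give one.
    """
    ranked = rank_candidates(scores)
    if len(ranked) < 2:
        return False  # a single candidate cannot be ambiguous
    return ranked[0][1] - ranked[1][1] < margin
```

Because the ranking covers the whole context, a close competitor is detected whether it entered the context through language or through vision.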
Moreover, from a functional perspective this approach has the advantage of
modularity and the potential to accommodate learning within the system. The
modularity of the framework stems from the fact that the only information re-
quired by the resolution process from each of the information sources (language
and vision) within the context are the salience scores for each entity. As a result,
the resolution process is, to a large extent, decoupled from the representations
and processes used within the linguistic and visual context models. The learn-
ing aspect of the system arises from the ease (relative to rule based approaches)
with which the integration weightings associated with a particular form of ref-
erence could be updated, for example, using machine learning techniques such
as reinforcement learning. Finally, from a cognitive perspective, an attention
based model fits the theoretical and psychological data that points to the role of
attention within human reference resolution.

4. Implementation

The resolution framework described in Sect. 3 has been used to update the LIVE
system’s (Kelleher et al. 2005) approach to reference resolution. In the
following sections the data structures, salience algorithms and reference
resolution algorithms used by the updated system are described.

4.1. Data Structures


The basic data structure used by the framework is called a coreference class.
Each coreference class stores the saliency information for one object in the
context model. Figure 3 illustrates the internal structure of a coreference
class. The coreference class id is a unique string identifier. Each coreference
class contains components for storing the basic and reference relative visual
and linguistic salience scores and the integrated salience score for the object
the class represents in the context model.

id = String value
visual salience
    a) basic = [0 ... 1]
    b) reference-relative = [0 ... 1]
linguistic salience
    a) basic = [0 ... 1]
    b) reference-relative = [0 ... 1]
integrated salience = [0 ... 1]

Figure 3. A Coreference Class

New coreference classes are added to the context model as a result of visual
processing. Each time an object is detected in the visual scene, the context
model is queried for the coreference class representing the object. If there is
no coreference class for the object, a new coreference class is created and is
assigned the id used by the vision processing. The basic visual salience
component is initialised to the value created by vision processing when the
object was detected. This is updated after each scene is rendered. All the other
salience scores are initialised to 0. These components are updated after each
utterance has been processed. Coreference classes are removed from the context
model when both their basic visual and linguistic saliences fall below a
threshold (0.0001). In the following sections the algorithms that provide and
use the information stored in these structures are described.
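
The coreference class of Figure 3 and its life cycle can be sketched as a Python dataclass held in a simple context model. The class, field and method names below are hypothetical; only the field inventory, the initialisation to 0 and the removal threshold of 0.0001 are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict

PRUNE_THRESHOLD = 0.0001  # removal threshold stated in the text


@dataclass
class CoreferenceClass:
    id: str
    basic_visual: float = 0.0
    reference_relative_visual: float = 0.0
    basic_linguistic: float = 0.0
    reference_relative_linguistic: float = 0.0
    integrated: float = 0.0


class ContextModel:
    def __init__(self) -> None:
        self.classes: Dict[str, CoreferenceClass] = {}

    def on_object_detected(self, vision_id: str,
                           visual_salience: float) -> CoreferenceClass:
        """Look up the class for a detected object, creating it if missing."""
        cc = self.classes.get(vision_id)
        if cc is None:
            # New class: id from vision processing, other scores start at 0.
            cc = CoreferenceClass(id=vision_id, basic_visual=visual_salience)
            self.classes[vision_id] = cc
        return cc

    def prune(self) -> None:
        """Drop classes whose basic visual AND basic linguistic salience
        have both fallen below the threshold."""
        self.classes = {
            key: cc for key, cc in self.classes.items()
            if not (cc.basic_visual < PRUNE_THRESHOLD
                    and cc.basic_linguistic < PRUNE_THRESHOLD)
        }
```

An object that is no longer visible therefore stays in the context model for as long as its linguistic salience remains above the threshold, and vice versa.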

4.2. Modelling Basic Visual Salience


Most computational models of visual attention focus on bottom-up processing;
see Koch and Itti (2001) and Heinke and Humphreys (2004) for reviews. In
most of these models several feature maps (such as colour, intensity etc.) are
computed in parallel across the visual field and these are then combined into
a single saliency map. Then a selection process deploys attention to locations
in decreasing order of salience. In Kelleher and van Genabith (2004) a simple
model of visual salience (based on object size and centrality relative to a
focus of visual attention) was presented. In this paper we adopt this model to
capture the information entering the discourse through vision.
The visual salience algorithm uses a false-colouring technique. Each object
in the simulation is assigned a unique colour or vision-id. This colour differs
from the normal colours used to render the object in the world; hence the term
false colouring. Each frame is rendered twice: firstly, using the objects’ normal
colours, textures and shading, and secondly, using the vision-ids. The first ren-
dering is on screen (i.e. the user sees it), the second rendering may be off screen.
After each frame is rendered, a bitmap image of the false colour rendering is
created. The bitmap is then scanned and a list of the colours in the image is
created. Using this list the system can recognise which objects are visible and
which are not. Moreover, the system can identify, at the pixel level, the area
covered by each object in the scene. This pixel information is used to compute
the basic visual saliency of each object.
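
The scan of the false colour rendering can be sketched as a pixel count per vision-id. The sketch assumes the bitmap is available as a list of rows of integer vision-ids and that one id is reserved for the background; both the representation and the `BACKGROUND` constant are assumptions, not details given in the text.

```python
from collections import Counter
from typing import Dict, List

BACKGROUND = 0  # hypothetical vision-id reserved for "no object"


def visible_objects(false_colour: List[List[int]]) -> Dict[int, int]:
    """Map each visible vision-id to the number of pixels it covers."""
    counts = Counter(pixel for row in false_colour for pixel in row)
    counts.pop(BACKGROUND, None)  # the background is not an object
    return dict(counts)
```

An object whose vision-id does not occur in the result is not visible in the current frame, for instance because it is occluded or outside the view.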
Mimicking the spread of visual acuity across the retina, the algorithm
weights each pixel in the image based on its distance from the point of visual fo-
cus. The weighting is computed using Equation 2. In this equation, D equals the
distance between the pixel being weighted and the point of focus, M equals the
maximum distance between the point of focus and any point on the border of the
image. The point of focus can be determined using eye tracking technology to
compute the user’s gaze at each scene rendering. However, if eye tracking is not
being used the point of focus defaults to the center of the image or to the center
of the silhouette of the last object referred to. Algorithm 2 lists the procedure used
to compute basic visual saliency and to update the coreference classes. For each
scene processed the algorithm returns a list of objects in the scene each with a
relative salience between 0 and 1, with 1 representing maximum salience.

\[ \mathrm{Weighting} = 1 - \frac{D}{M + 1} \tag{2} \]
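Equation 2 and the update procedure of Algorithm 2 can be sketched together as follows. This is an illustrative reading under stated assumptions: pixels are (x, y) tuples, M is taken as the distance from the focus to the farthest image corner, coreference classes are reduced to a mapping from object id to salience, and the names pixel_weight and update_basic_visual are hypothetical.

```python
import math


def pixel_weight(pixel, focus, width, height):
    """Equation 2: weight = 1 - D / (M + 1)."""
    d = math.dist(pixel, focus)
    # The border point farthest from a focus inside the image is a corner.
    corners = [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]
    m = max(math.dist(focus, corner) for corner in corners)
    return 1.0 - d / (m + 1.0)


def update_basic_visual(previous, pixels_by_object, focus, width, height):
    """One pass of the Algorithm 2 update over all coreference classes."""
    avg = {
        obj: sum(pixel_weight(p, focus, width, height) for p in pts) / len(pts)
        for obj, pts in pixels_by_object.items()
        if pts
    }
    total_aw = sum(avg.values())
    updated = {}
    for obj in set(previous) | set(avg):
        score = previous.get(obj, 0.0) / 2.0  # decay the old salience
        if obj in avg and total_aw > 0:
            score += avg[obj] / total_aw  # add the visible object's share
        updated[obj] = score
    total = sum(updated.values())
    return {obj: (s / total if total > 0 else 0.0) for obj, s in updated.items()}


# Hypothetical 5 x 3 frame with one pixel per house and the focus on H2.
pixels = {"H1": [(0, 1)], "H2": [(2, 1)], "H3": [(4, 1)]}
scores = update_basic_visual({}, pixels, focus=(2, 1), width=5, height=3)
```

In this toy frame the object under the focus receives the largest share, and the two equidistant objects receive equal shares.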

Algorithm 2 The basic visual salience algorithm.

Let TotalAW = 0, Totalbvs = 0
for each object Oi in the scene do
    AW(Oi) = average weighting of the pixels covered by Oi
    TotalAW = TotalAW + AW(Oi)
endfor
for each coreference class CRi do
    if CRi is the coreference class representing an object Oi in the scene then
        CRi.basic_visual = (CRi.basic_visual/2) + (AW(Oi)/TotalAW)
    else
        CRi.basic_visual = CRi.basic_visual/2
    endif
    Totalbvs = Totalbvs + CRi.basic_visual
endfor
for each coreference class CRi do
    CRi.basic_visual = CRi.basic_visual/Totalbvs
endfor

4.3. Modelling Basic Linguistic Salience


The basic linguistic salience of objects in the context is computed using an
algorithm that is similar to that of Krahmer and Theune (2002). The algorithm is
based on the ranking of the so-called forward looking centers (Cf) of an utterance.
The set of forward looking centers of an utterance contains the objects referred
to in that utterance. This set is partially ordered to reflect the relative
prominence of the referring expressions within the utterance. Grammatical roles are
a major factor here, so that subject > object > other. The central component
of the algorithm is a function sf that maps the objects in a domain D to the
interval [0, 1], with the intuition that 0 represents non-salience and 1 maximal
salience. Figure 4 defines the salience function sf used by the framework. The
algorithm assumes that in the initial situation s0 all the objects in the domain
are equally (not) salient: sf(s0, d) = 0 for all d ∈ D.
It should be noted that, although this algorithm is inspired by the Centering
framework of Grosz et al. (1995), it deviates from Centering in so far as
it is recursive and assigns salience scores to entities not mentioned in the
immediately preceding utterance. In this respect, it is similar to Hajičová’s (1993)
framework. Also, it is not claimed that the function sf is the best way to assign
linguistic salience. However, it does provide a reasonable, transparent and
operational model of linguistic salience. Algorithm 3 defines the procedure used
to update linguistic salience after an utterance has been processed.

Let U_i be a sentence uttered in state s_i, in which reference is made to
{d_1, ..., d_n} ⊆ D. Let Cf(U_i) (the forward looking centers of U_i) be a partial
order defined over {d_1, ..., d_n} ⊆ D. Then the salience weight of objects in
s_{i+1} is determined as follows:

\[
sf(s_{i+1}, d) =
\begin{cases}
1 & \text{if } d = \mathrm{subject}(U_i) \\
(sf(s_i, d)/2) + .5 & \text{if } d = \mathrm{object}(U_i) \\
(sf(s_i, d)/2) + .25 & \text{if } d = \mathrm{other}(U_i) \\
sf(s_i, d)/2 & \text{if } d \notin Cf(U_i)
\end{cases}
\]

Figure 4. Linguistic Salience Weight Assignment

Algorithm 3 The basic linguistic salience algorithm. BLS = basic linguistic salience.
Let TotalDS = 0
for each coreference class CRi do
    CRi.BLS = sf(sj, CRi)
    TotalDS = TotalDS + CRi.BLS
endfor
for each coreference class CRi do
    CRi.BLS = CRi.BLS/TotalDS
endfor
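The salience function of Figure 4 and the normalisation step of Algorithm 3 can be sketched as below. This is a minimal illustration that assumes each utterance has already been parsed into a subject, an optional object and a set of other referents; the names sf_update and normalise are hypothetical, and the guard against a zero total is an addition not stated in the pseudocode.

```python
def sf_update(previous, subject=None, obj=None, others=()):
    """Apply the Figure 4 update to every entity in the domain."""
    updated = {}
    for d, old in previous.items():
        if d == subject:
            updated[d] = 1.0
        elif d == obj:
            updated[d] = old / 2.0 + 0.5
        elif d in others:
            updated[d] = old / 2.0 + 0.25
        else:
            updated[d] = old / 2.0  # not mentioned: salience decays
    return updated


def normalise(scores):
    """Algorithm 3: scale the scores so that they sum to 1."""
    total = sum(scores.values())
    return {d: (s / total if total > 0 else 0.0) for d, s in scores.items()}


# Initial situation s0: nothing is salient.
state0 = {"H1": 0.0, "H2": 0.0, "H3": 0.0}
state1 = sf_update(state0, subject="H2")
state2 = sf_update(state1, subject="H3")
bls = normalise(state2)
```

Because the update halves the previous score rather than resetting it, an entity mentioned two utterances ago (H2 in state2) keeps a non-zero salience, which is the recursive behaviour that distinguishes this function from Centering.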

4.4. Computing Reference Relative Saliences


The first step in resolving a reference is to compute for each object in the context
the salience of that object within each modality within the context provided by
the reference. These reference relative saliences are computed for each object
by integrating each object’s basic visual and linguistic saliences with a rating of
how well the object fulfils the selectional preferences6 encoded in the reference.
The rating of how well an object fits the description provided by a refer-
ence is called an f-score. Two f-scores are computed for each object for each
reference: a visual and a linguistic f-score. Currently, the system can rate ob-
jects relative to their type, colour, size7 and location.8 Table 1 lists the ratings
ascribed to an object for each type of selectional preference. An object’s vi-
sual f-score is initialised to 0 and its ratings are integrated using addition. An
object’s linguistic f-score is initialised to 1 and its ratings are integrated using
multiplication.
Once the f-scores have been computed, the object’s reference relative visual
and linguistic saliences are computed by integrating the f-scores with its
basic visual and linguistic salience. Again, addition is used for integration in the
visual context and multiplication is used for integration in the linguistic
context. Consequently, an object’s reference relative visual salience will be > 0 if
it fulfils any of the selectional preferences in the description, and its reference
relative linguistic salience will be = 0 if it does not fulfil all of the selectional
preferences in the description. Algorithm 4 lists the procedure for computing
the reference relative saliences.

Table 1. Selectional Preferences Scores

            Fulfils    Does Not Fulfil
TYPE        1          0
COLOUR      1          0
SIZE        [1 ... 0]
LOCATION    [1 ... 0]

6. The semantics of the descriptive terms used in the reference.
7. An object’s size rating is based on the number of pixels it covers relative to the other
objects in the scene.
8. An object’s location rating is computed using the AVS model described in Regier and
Carlson (2001).

Algorithm 4 Computing the reference relative saliences. RLS = reference relative
linguistic salience, RVS = reference relative visual salience, BLS = basic linguistic
salience, BVS = basic visual salience.
Let Totalrls = 0, Totalrvs = 0
for each coreference class CRi do
    f_score_linguistic = 1
    f_score_vision = 0
    for each selectional preference spj in the description do
        f_score_linguistic = f_score_linguistic ∗ rating(CRi, spj)
        f_score_vision = f_score_vision + rating(CRi, spj)
    endfor
    CRi.RLS = CRi.BLS ∗ f_score_linguistic
    Totalrls = Totalrls + CRi.RLS
    CRi.RVS = CRi.BVS + f_score_vision
    Totalrvs = Totalrvs + CRi.RVS
endfor
for each coreference class CRi do
    CRi.RLS = CRi.RLS/Totalrls
    CRi.RVS = CRi.RVS/Totalrvs
endfor
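The computation of the reference relative saliences can be sketched as follows. The sketch follows the prose description (addition in the visual context, multiplication in the linguistic context); the function names, the dictionary representation and the binary ratings are assumptions made for illustration, as is the guard against a zero total. The input values are the bvs and bls figures that Table 2 reports for the red house on the right.

```python
def normalise(scores):
    """Scale scores to sum to 1; an all-zero context stays all zero."""
    total = sum(scores.values())
    return {k: (v / total if total > 0 else 0.0) for k, v in scores.items()}


def reference_relative(bvs, bls, ratings):
    """Return (RVS, RLS) given basic saliences and per-preference ratings.

    `ratings` maps each object to one rating in [0, 1] per selectional
    preference in the referring expression.
    """
    rvs, rls = {}, {}
    for obj in bvs:
        f_vision, f_linguistic = 0.0, 1.0
        for r in ratings.get(obj, []):
            f_vision += r        # visual f-score: additive
            f_linguistic *= r    # linguistic f-score: multiplicative
        rvs[obj] = bvs[obj] + f_vision
        rls[obj] = bls[obj] * f_linguistic
    return normalise(rvs), normalise(rls)


# "the red house on the right": ratings for red, house, on the right
# (binary ratings are assumed here for simplicity).
bvs = {"H1": 0.2812, "H2": 0.4376, "H3": 0.2812}
bls = {"H1": 0.0, "H2": 1.0, "H3": 0.0}
ratings = {"H1": [1, 1, 0], "H2": [0, 1, 0], "H3": [1, 1, 1]}
rvs, rls = reference_relative(bvs, bls, ratings)
```

With these inputs the visual saliences round to 0.3259, 0.2054 and 0.4687, the rvs values listed in rows 10 to 12 of Table 2, and every linguistic salience is 0 because no previously mentioned object fulfils all three preferences.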

4.5. Creating the Integrated Context and Selecting the Referent


The final step before the selection of the referent is the integration of each
object’s reference relative saliencies. This is done using a weighted combination.
The weightings are dependent on the form of referring expression (e.g. definite
descriptions versus pronominal references) being resolved and reflect the
preferential interpretation associated with each type of reference. For
example, in general, a pronoun is used to refer to a referent that is prominent within
the linguistic context. By contrast, a definite description can be used to refer
to an object from the visual scene and to previously mentioned objects.
Ideally, these weights would be set based on an empirical examination of a
multimodal corpus. However, the system currently uses predefined weights for this
integration. When resolving a definite description, visual and linguistic salience
are integrated evenly. When resolving a pronominal reference, the integration
weightings are biased towards linguistic salience. Algorithm 5 defines the
procedure used to construct the integrated context and select the referent. It also
defines the mechanism used to check for ambiguous references. This ambiguity
check uses a predefined confidence interval and simply checks that, within
the context provided by the referring expression, the integrated salience of the
object selected as the referent is sufficiently larger than that of the other objects
in the context to ensure that the reference is not ambiguous. In situations where the
ambiguity check fails the algorithm returns 0.

Algorithm 5 Constructing the integrated context and selecting the referent. RVS =
reference relative visual salience, RLS = reference relative linguistic salience.

Let index = 0, max = 0, interval = 0.3
for each coreference class CRi in the context model do
    if reference = definite description then
        CRi.integrated = (CRi.RVS ∗ 0.5) + (CRi.RLS ∗ 0.5)
    elseif reference = pronominal reference then
        CRi.integrated = (CRi.RVS ∗ 0.1) + (CRi.RLS ∗ 0.9)
    endif
    if CRi.integrated > max then
        index = i
        max = CRi.integrated
    endif
endfor
for each coreference class CRj in the context model do
    if j <> index then
        if CRj.integrated > CRindex.integrated − interval then
            return 0 //Reference Deemed Ambiguous
        endif
    endif
endfor
return CRindex
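The integration and ambiguity check of Algorithm 5 can be sketched as below. The weights and the interval of 0.3 are the ones given in the pseudocode; the function name, the dictionary representation, the use of None in place of 0 for an ambiguous reference, and the input values are assumptions made for illustration.

```python
# (visual weight, linguistic weight) per form of referring expression.
WEIGHTS = {"definite": (0.5, 0.5), "pronoun": (0.1, 0.9)}


def select_referent(rvs, rls, form, interval=0.3):
    """Return the winning object id, or None if the reference is ambiguous."""
    w_vis, w_ling = WEIGHTS[form]
    integrated = {o: w_vis * rvs[o] + w_ling * rls[o] for o in rvs}
    best = max(integrated, key=integrated.get)
    for obj, score in integrated.items():
        # A competitor within `interval` of the winner makes the choice unsafe.
        if obj != best and score > integrated[best] - interval:
            return None
    return best


# Hypothetical context: H2 dominates linguistically, so a pronoun resolves to it.
clear = select_referent(
    {"H1": 0.2, "H2": 0.6, "H3": 0.2},
    {"H1": 0.0, "H2": 1.0, "H3": 0.0},
    form="pronoun",
)

# Hypothetical context: two equally salient objects, so the reference is ambiguous.
unclear = select_referent(
    {"H1": 0.5, "H2": 0.5},
    {"H1": 0.0, "H2": 0.0},
    form="definite",
)
```

Returning a sentinel instead of the best candidate lets a calling dialogue component ask for clarification rather than act on a guess.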

5. Worked Example

The functioning of the framework can be illustrated using a worked example.
The example uses Figure 5 as a visual context, and the utterances (1) through
(4) as the example discourse.
1. make the blue house bigger
2. make it smaller
3. make the red house on the right bigger
4. make the other one bigger
The example will illustrate how the framework interprets a definite description,
a pronominal reference, a locative expression, and an other-anaphoric definite
description with a pronominal head. Within a visually situated discourse this last
form of reference is particularly interesting, as the interpretation of words like
other and one illustrates the need to closely interrelate visual and linguistic
contexts.
Other-anaphora occurs when a definite description contains the modifier
other, e.g. the other house. Other designates an object that has been excluded
from a specified or implied grouping. For example, the other house presupposes
that the context contains a set of houses within which a subset has already been
specified in some manner (e.g. the blue house) and the referent of the NP the
other house is then selected from the set of houses that are not elements of this
subset (e.g. the houses that are not blue). Importantly, interpreting an
other-anaphoric NP may require both visual and linguistic information. The definition
of the set of elements that other designates its referent as being excluded from
often requires information from prior discourse, and the set of elements that the
referent may be selected from may contain elements from the visual context
that have not been mentioned previously.

Figure 5. Example visual context. H1,H3 = red house, H2 = blue house.

It has long been observed that the nominal anaphor one can be resolved only
in relation to the discourse context. However, it gains new importance where
references of the form the green one or the other one get evaluated. For one
thing, it is perfectly possible that, while the pronoun one is resolved to
a noun used in the antecedent discourse, the referent of the noun phrase (NP)
belongs only to the visual context. Thus the red one can refer to a house that
is visible in the scene but has not yet been mentioned, although it is only
because the word house occurs in an earlier utterance that the red one can be
interpreted as the red house.
Neither of these phenomena affects the weightings used to integrate the
visual and linguistic salience. However, one-anaphora does affect the selectional
preferences used in the computation of the overall salience of its NP, and
other-anaphora affects the selection of its NP’s referent once the overall saliencies
have been computed.
When interpreting a one-anaphoric NP, the pronoun one is resolved to the
most recent NP in the discourse that referred to the referent with the highest
linguistic salience in the context model. It inherits all the selectional
preferences specified in its antecedent NP that do not contradict selectional
preferences specified in the one-anaphoric NP. For example, assuming the red house
on the right is the most recent reference in the discourse, the reference the green
one is interpreted as the green house on the right.
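The inheritance of selectional preferences by one can be sketched as a merge of two attribute maps. This is an assumed representation, not the framework's own data structure: each NP is reduced to a dictionary from attribute to value, and a preference is taken to contradict another when both constrain the same attribute. The name resolve_one is hypothetical.

```python
def resolve_one(antecedent, anaphor):
    """Merge preferences; the one-anaphoric NP overrides conflicting attributes."""
    merged = dict(antecedent)   # start from the antecedent NP's preferences
    merged.update(anaphor)      # the anaphoric NP's own preferences win
    return merged


antecedent = {"type": "house", "colour": "red", "location": "on the right"}
anaphor = {"colour": "green"}   # "the green one"
interpreted = resolve_one(antecedent, anaphor)
```

Here the colour specified in the anaphoric NP replaces the antecedent's colour, while type and location are inherited, giving the reading the green house on the right.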
Other also functions as an anaphor of sorts – it presupposes that the
context contains a set X of objects of the relevant kind and a proper subset Y of
X that its referent is excluded from. The referent of the NP containing other is
then selected from the difference between these two sets, X − Y. Consequently,
in order to resolve other-anaphora, the subset of the context that other specifies
its referent is excluded from (i.e. Y) must be defined. In this framework this
subset is defined as the set containing the element with the highest overall
referential salience computed relative to the selectional preferences encoded in the
NP modified by other. Assuming a reasonable distribution of visual salience,
the updating of linguistic salience and the constraint that an object’s reference
based linguistic salience can only be > 0 if it fulfils all the selectional
preferences encoded in the current NP impose a partial ordering on the reference
resolution context:
1. objects that have previously been mentioned and that fulfil the selectional
preferences encoded in the NP being interpreted (internally ordered by lin-
guistic and visual salience)
2. objects that fulfil the selectional preferences encoded in the NP being inter-
preted (internally ordered by visual salience)
3. other objects in the context (internally ordered by visual salience)
Consequently, by excluding the most salient element in the reference resolution
context the most recently mentioned object that fulfils the selectional prefer-
ences encoded in the NP is excluded. For example, during the interpretation of
a reference the other house the most recently mentioned house would be ex-
cluded from the set of candidate referents and the reference would be resolved
to the next most salient house in the context.
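The exclusion step for other can be sketched as follows. This is a simplified illustration that omits the ambiguity check; the function name resolve_other is hypothetical, and the scores are the overall saliences that Table 2 lists for the other one (rows 13 to 15).

```python
def resolve_other(integrated):
    """Exclude the most salient candidate and return the next most salient one."""
    ranked = sorted(integrated, key=integrated.get, reverse=True)
    if len(ranked) < 2:
        return None  # nothing is left once the excluded element is removed
    return ranked[1]


overall = {"H1": 0.1571, "H2": 0.0963, "H3": 0.7476}
referent = resolve_other(overall)
```

H3 has the highest overall salience and is therefore treated as the excluded subset Y, so the reference resolves to H1, the next most salient house.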
Table 2 lists the data computed by the framework during the different stages
of this interaction. Rows 1, 2 and 3 present the visual and linguistic saliencies
of the objects in Figure 5 before any commands are input; H2’s visual salience
is higher than H1 and H3’s because it is closer to the centre of the image, which
is the default point of visual focus in the visual saliency algorithm.
Rows 4, 5 and 6 present the data computed during the interpretation of the
blue house. This is a definite description so visual and linguistic salience are
weighted equally in the integration process. However, at this point, there has
been no previous discourse so each object’s overall salience is dependent on
the interaction between its visual salience and its visual f-score. H2 fulfils both
of the selectional preferences encoded in the reference: blue and house.
Consequently, its visual f-score is 2. The other houses in the scene only fulfil the
type selectional preference and their visual f-scores are 1. Primarily due to the
difference in f-scores H2 achieves the highest overall salience and is selected
as the referent. The asterisk in the overall salience column (os column) of row
5, H2’s row, highlights this.
Rows 7, 8, and 9 list the data computed when the reference it is processed.
Note that H2 has a linguistic salience of 1 because it was selected as the referent
for the subject in the preceding utterance. The pronoun it does not encode any
selectional preferences. Consequently, all the objects’ visual and linguistic
f-scores are set to the default values. The biasing towards linguistic salience in
the computation of the overall saliencies is apparent in the dominance of H2’s
integrated salience and again it is selected as the referent.
Rows 10 to 12 list the data computed when the red house on the right is
processed. There are three selectional preferences in this reference (red, house and
on the right) and the visual and linguistic f-scores are computed accordingly. As
H2 does not fulfil all the selectional preferences its linguistic f-score is 0. This
results in its linguistic salience being discounted in the resolution process.
Consequently, H3 achieves the highest overall salience due to its dominant visual
f-score.
Rows 13 to 15 list the data computed during the processing of the other one.
As noted in Sect. 4.5, the pronoun one inherits its selectional preferences from
the most recent NP in the preceding utterance that referred to the referent with
the highest linguistic salience in the context model. This results in the reference
being interpreted as the other red house on the right. In Sect. 4.5, the effect of
the modifier other on its NP was also presented. It does not introduce a
selectional preference into the process of computing the overall saliencies; rather, it
affects the selection of the referent once these saliencies have been computed.
Consequently, the overall saliencies are computed using the same selectional
preferences as the preceding reference: red, house and on the right. As a result,
H1, H2 and H3 have the same visual and linguistic f-scores as in the preceding
step. However, as H3 was selected as the referent of the preceding subject NP it
now has the highest linguistic salience in the context (0.8889). Moreover, it has
the maximum visual f-score. These two factors result in H3 achieving the highest
overall salience. However, due to the modifier other in the reference it is excluded
as a candidate referent. The next most salient object is H1 and it is selected as the
referent. H1’s higher overall ranking relative to H2 is due to two factors: (1) H2
did not fulfil all the selectional preferences so its linguistic salience was
discounted from the resolution process; (2) H1 fulfilled more of the selectional
preferences than H2, which resulted in its visual reference based salience being
higher. Figure 6 illustrates the visual context at the end of the interaction.

Figure 6. The final state of the visual context.



Table 2. Salience scores computed during the example interaction. Acronyms: bvs =
basic visual salience, bls = basic linguistic salience, vfs = visual f-score, lfs = linguistic
f-score, rvs = reference based visual salience, rls = reference based linguistic salience,
os = overall salience.

Row  Object  bvs     bls     vfs     lfs     rvs     rls     os
Initial Context
 1   H1      0.2812  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
 2   H2      0.4376  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
 3   H3      0.2812  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
make the blue house bigger
 4   H1      0.2812  0.0000  1.0000  0.0000  0.2556  0.0000  0.2556
 5   H2      0.4376  0.0000  2.0000  1.0000  0.4887  0.0000  0.4887*
 6   H3      0.2812  0.0000  1.0000  0.0000  0.2556  0.0000  0.2556
make it smaller
 7   H1      0.2879  0.0000  0.0000  1.0000  0.2879  0.0000  0.1438
 8   H2      0.4348  1.0000  0.0000  1.0000  0.4348  1.0000  0.7124*
 9   H3      0.2879  0.0000  0.0000  1.0000  0.2879  0.0000  0.1438
make the red house on the right bigger
10   H1      0.2812  0.0000  2.0000  0.0000  0.3259  0.0000  0.3259
11   H2      0.4376  1.0000  1.0000  0.0000  0.2054  0.0000  0.2054
12   H3      0.2812  0.0000  3.0000  1.0000  0.4687  0.0000  0.4687*
make the other one bigger
13   H1      0.1990  0.0000  2.0000  0.0000  0.3141  0.0000  0.1571*
14   H2      0.3477  0.1111  1.0000  0.0000  0.1925  0.0000  0.0963
15   H3      0.4534  0.8889  3.0000  1.0000  0.4833  1.0000  0.7476

6. Conclusions and Future Work

This paper presented an attention based reference resolution framework for vi-
sually situated discourse. The framework uses a weighted integration of visual
and linguistic attention to order the candidate referents within the context. The
candidate with the highest integrated attention score is taken to be the referent.
One advantage of this approach is that the resolution process occurs within the
full multimodal context. As a result situations where the intended target of the
reference is erroneously excluded, due to an individual assumption within the
resolution process, are avoided. Moreover, the system can recognise situations
where attentional cues from different modalities make a reference potentially
ambiguous. From a cognitive perspective the framework meshes well with
psycholinguistic results that point to the role of attention within human reference
resolution processes.
Finally, it should be noted that the framework as it currently stands is
intended to represent an abstract and preliminary attempt. Several issues need to
be addressed if it is to be used as a component within dialog systems for less
constrained contexts. In particular, the use of predefined weights for salience in-
tegration is overly simplistic. This issue could be addressed by using a machine
learning algorithm, such as reinforcement learning, to automatically compute
these weights. The visual and linguistic salience algorithms should also be im-
proved. For dialog systems interfacing with virtual environments, the visual
salience algorithm should be extended to at least handle attentional cues such
as colour, motion and location of gaze. If the framework were to be used within a
real-world system, such as a robot dialog system, a computer vision saliency al-
gorithm, such as Itti and Koch (2000), could be adopted. The linguistic saliency
algorithm should also be revised and extended. In particular, the relationship be-
tween the framework’s model of local level attention and a more global model
of discourse structure, such as Grosz and Sidner’s focus stack model or Asher
and Lascarides’ SDRT framework, should be clarified. Fortunately, the mod-
ular nature of the framework makes such modifications possible without major
changes to the overall approach.

References
J.F. Allen and L.K. Schubert
1991 The TRAINS project. Technical report, University of Rochester, De-
partment of Computer Science.
H. Alshawi
1987 Memory and Context for Language Interpretation. Cambridge Uni-
versity Press, Cambridge, UK.
M. Ariel
1990 Accessing noun-phrase antecedents. London: Routledge.
N. Asher and A. Lascarides
2003 Logics of Conversation. Cambridge University Press.
I. Duwe and H. Strohner
1997 Towards a cognitive model of linguistic reference. Report: 97/1 – Situierte
Künstliche Kommunikatoren 97/1, Universität Bielefeld.
P. Gorniak and D. Roy
2004 Grounded semantic composition for visual scenes. Journal of Artifi-
cial Intelligence Research, 21:429–470.
B.J. Grosz and C.L. Sidner
1986 Attention, intentions, and the structure of discourse. Computational
Linguistics, 12(3):175–204.
B.J. Grosz, A.K. Joshi, and S. Weinstein
1995 Centering: A framework for modelling local coherence of discourse.
Computational Linguistics, 21(2):203–255.
B.J. Grosz
1977 The Representation and Use of Focus in Dialogue Understanding.
PhD thesis, Stanford University.
J.K. Gundel, N. Hedberg, and R. Zacharski
1993 Cognitive status and the form of referring expression in discourse.
Language, 69:274–307.
E. Hajičová
1993 Issues of Sentence Structure and Discourse Patterns, volume 2 of
Theoretical and Computational Linguistics. Charles University Press.
P.A. Heeman and J.F. Allen
1995 The TRAINS 93 dialogues. Trains Technical Note 94-2, Department
of Computer Science, University of Rochester.
D. Heinke and G. Humphreys
2004 Computational models of visual selective attention: A review. In
G. Houghton, editor, Connectionist Models in Psychology. Psychol-
ogy Press.
J.R. Hobbs
1985 On the coherence and structure of discourse. Technical Report CSLI-
85-37. Center for the Study of Language and Information.
L. Itti and C. Koch
2000 A saliency-based search mechanism for overt and covert shifts of
visual attention. Vision Research, 40(10–12):1489–1506.
J. Kelleher and J. van Genabith
2004 Visual salience and reference resolution in simulated 3d environ-
ments. AI Review, 21(3-4):253–267.
J. Kelleher, F. Costello, and J. van Genabith
2005 Dynamically structuring, updating and interrelating representations of
visual and linguistic discourse context. Artificial Intelligence,
167(1-2):62–102, September 2005.
L. Kievit, P. Piwek, R.J. Beun, and H. Bunt
2001 Multimodal cooperative resolution of referential expressions in the
DenK system. In H. Bunt and R.J. Beun, editors, Cooperative Multimodal
Communication, Lecture Notes in Artificial Intelligence 2155,
pages 197–214. Springer-Verlag, Berlin Heidelberg.
C. Koch and L. Itti
2001 Computational modelling of visual attention. Nature Reviews
Neuroscience, 2(3):194–203, March 2001.
E. Krahmer and M. Theune
2002 Efficient context-sensitive generation of referring expressions. In
K. van Deemter and R. Kibble, editors, Information Sharing: Refer-
ence and Presupposition in Language Generation and Interpretation.
CSLI Publications, Stanford.
I. Kruijff-Korbayová and E. Hajičová
1997 Topics and centers: A comparison of the salience-based approach and
the Centering Theory. Prague Bulletin of Mathematical Linguistics,
67:25–50, Charles University, Prague, Czech Republic
F. Landragin and L. Romary
2003 Referring to objects through sub-contexts in multimodal human-
computer interaction. In DiaBruck 7th Workshop on the Semantics
and Pragmatics of Dialogue, Sept 4th-6th 2003, University of Saar-
land, Germany, 2003.
S. Lappin and H. Leass
1994 An algorithm for pronominal anaphora resolution. Computational
Linguistics, 20(4):535–561.
W.C. Mann and S.A. Thompson
1987 Rhetorical Structure Theory: Description and construction of text
structures. In G. Kempen, editor, Natural Language Generation:
New Results in Artificial Intelligence, Psychology and Linguistics,
pages 83–96. Nijhoff, Dordrecht.
M. Poesio
1993 A situation-theoretic formalization of definite description interpreta-
tion in plan elaboration dialogues. In P. Aczel, D. Israel, Y. Kata-
giri, and S. Peters, editors, Situation Theory and its Applications, vol-
ume 3. CSLI.
T. Regier and L. Carlson
2001 Grounding spatial language in perception: An empirical and
computational investigation. Journal of Experimental Psychology: General,
130(2):273–298.
S. Salmon-Alt and L. Romary
2001 Reference resolution within the framework of cognitive grammar. In
Proceedings of the Seventh International Colloquium on Cognitive
Science (ICCS-01), pages 284–299, Donostia, Spain.
M. Spivey-Knowlton, M. Tanenhaus, K. Eberhard, and J. Sedivy
1998 Integration of visuospatial and linguistic information: Language
comprehension in real time and real space. In P. Olivier and K.P. Gapp,
editors, Representation and Processing of Spatial Expressions, pages
201–214. Lawrence Erlbaum Associates.
H. Strohner, L. Sichelschmidt, I. Duwe, and K. Kessler
2000 Discourse focus and conceptual relations in resolving referential am-
biguity. Journal of Psycholinguistic Research, 29:497–516.
M. Tanenhaus, M. Spivey-Knowlton, K. Eberhard, and J. Sedivy
1995 Integration of visual and linguistic information in spoken language
comprehension. Science, 268:1632–1634.
Salience in hypertext: Multiple preferred centers in
a plurilinear discourse environment

Birgitta Bexten

In this paper, I demonstrate that in hypertext, coherence relies on linguistic as
well as paratextual salience marking. Linguistically marked, e.g. discourse-old,
entities promise a coherent discourse connection in a current hypertext node,
whereas paratextual link marks promise that the link-marked entity is
coherently connected to the target node. Thus, both kinds of salience markers allow
the reader to predict the ‘aboutness’ of the ensuing discourse. Bridging the gap between
linguistic centers and hyperlinks with findings concerning common paralinguis-
tic ways of salience marking, like typographic marks of prosodic attributes, I
propose an integrative description of the two kinds of salience marks within the
framework of Centering.

1. Introduction

Hypertexts are special. Not only do they increasingly combine semiotic
functions and create new ways of associative writing and reading; their
mere tree- or networklike structure also poses intellectual challenges for
hypertext authors, readers and, of course, for discourse linguists. None of them can
completely rely on already existing, polished conventions. Accordingly, all of
them wrestle with new attempts to get a grip on text in hypertext. What are the
criteria for an appropriate text unit? How can I combine those units to form a
coherent whole? How can I mark – and recognise – various text connections,
various link types? How can I decide where to read on when I come across a
hyperlink? From a discourse linguistic point of view, one of the central ques-
tions is what a hypertext’s network structure does to the text. What remains
the same if one compares it to traditional linear texts? What changes? Using the
framework of Centering, in this paper, I demonstrate how hypertext differs from
linear text in its capability and its need to provide multiple so-called preferred
centers in a single utterance.

2. The starting point

Hypertexts, narrative as well as scientific ones, go beyond the scope of
traditional, linear texts. While traditional texts usually only come up with a single
reading sequence,1 hypertexts are bound to provide a selection of different se-
quences. Without this possibility, neither treelike nor networklike hypertexts
would be possible. This quality affects the text itself just as well as the hyper-
text’s author and reader. As far as the text is concerned, it has to adapt to the
exceptional discourse structure. Due to the hyperlinks, the text splits up2 and
becomes plurilinear.3 Consequently, to form a coherent whole, not only the text
within every single hypertext node has to be coherent but also the connections
between the linked nodes. For the hypertext author, this means that he has to
arrange text and links very carefully, especially as he cannot tell for sure
which units the reader has already perceived; this is only possible in extremely
well-planned hypertexts. The easiest way to master a networked text is
to cut off direct pronominal connections between the information units and to
use referring nouns instead (Kuhlen 1991), but this way of granulating text is
not desirable for every kind of hypertext. Authors of fictional hypertexts
in particular can fall back on direct utterance-connecting devices.
Surely, the author can add hypertext-specific coherence cues like overviews,
lists of currently available target units and so forth (Storrer 2002), but the read-
ability of the hypertext mainly depends on coherent connections between the
single units. The sequence of the hypertext units depends on each individual
reader’s decision to follow hyperlinks and this might vary every time the hy-
pertext is being read. From the reader’s point of view, this means that the pluri-
linear structure requires the reader to decide whether he wants to read on in the
current unit or in the link’s target unit.

1. Some texts do offer various reading sequences via footnotes but, in
contrast to hypertext units, footnotes are not constitutive for the text (Nielsen 1995).
2. This description only applies to treelike hypertexts with a simple structure. In a more
complex, networklike hypertext, the text not only bifurcates, but also merges at
various points. Even if the reader relinearises (parts of) the text while reading, the text
would structurally remain a network. For the purpose of this paper, however, the idea
of recombining text strings can be left aside as it depends more on the
backward-looking aspect of text connections than on the forward-looking one relevant here.
3. For the concept of plurilinear texts see Harweg (1974) and for an application of his
model to hypertexts Bexten (2006).

The reader’s decision is influenced by his expectations about how the two
offered text strings might proceed. He can deduce clues about this from the
utterance he is reading at the very moment: usually the most salient entities of a
given utterance are most likely to be the topic of the next one. What does this
mean for the two text strings in hypertext? As for the string announced by the
hyperlink, the reader’s curiosity can be satisfied quite easily even without the
need of following the link: he can expect the target unit to provide additional
information about the link-marked entity. Hyperlinks are of course quite salient
just because they differ from the surrounding text by their highlighted design:
conventionally, links are visually marked, e.g. by a different colour.4 In addition
to the link marks, there are also hints that concern the continuation of the cur-
rent unit’s text. This is because not only non-linguistic, paratextual 5 features can
increase an entity’s degree of salience but also linguistic ones. Which entities
might be central in the ongoing text, the reader can tell from their grammatical
role or from their information value, as has been widely described in Centering
Theory. Obviously, this kind of discourse marking does not only apply to pluri-
linear hypertext structures but also to traditional linear discourses. With regard
to hypertexts, however, I consider a combination of the two ways of salience
marking a decisive factor for the establishment of coherence. Linguistic
salience facilitates coherent discourse processing in the current hypertext node,
and paratextual salience facilitates cross-node discourse continuation.
What is of interest here is the question of how far the available results from
research on linguistic salience can be used to describe paratextual salience.
To answer this question, it is necessary to find out whether it is justifiable
to equate linguistic and paratextual salience. For that purpose, I consider the
framework of Centering a promising candidate, for it takes into account the
forward-looking, promise-making character of text. In other words: it detects
how the text enables the reader to conjecture about the text’s continuation. What
is more, the various approaches to Centering indicate that salience can depend
on varying factors. Therefore, to cover more than traditional linear texts, an inte-
gration of a paratextual way to mark salience is, as the following considerations
will point out, not only possible but desirable.
The basic Centering model, as originally developed by Grosz et al. (1995)
and extended, for example, by Strube & Hahn (1999), only covers questions
of linguistic salience. Therefore, though Centering is a thought-out method to

4. The question of labeling various link types will remain out of consideration here. For
an overview of research on the impact of labeled hyperlinks on the cognitive load
placed upon the reader see DeStefano & LeFevre (2007). For a model of bottom-up
visual salience in a situated-dialog context, see Kelleher (this volume).
5. Genette (1987) defines as paratext those linguistic elements (index, title, etc.) and
non-linguistic elements (fonts, illustrations, etc.) that accompany and present text in
a medium-specific way.

identify the most salient discourse entity, for the purpose of my argumentation,
its focus needs to be extended to non-linguistic ways of salience marking. Af-
ter a short exemplification of the hypertextual double salience, I will illustrate
the fundamental comparability of linguistic and non-linguistic salience in the
framework of Centering.6

3. Hypertext and multiple salient centers in the framework of Centering

3.1. A first example


Consider the following example (1) from the fictional hypertext “About time”.7
The sequence
(1) a. “There are two more continents,” Mouth said.
b. “Maybe more.”
c. “Wow. Continents?
is continued in the same hypertext unit with
d. Those are really big, aren’t they?”
and in another unit8 with
d.’ Tuber asked where these continents were.
The discourse bifurcates due to the hypertext link Continents. It offers the reader
a connection to a second text string in addition to the one in the current node.
Obviously, the word Continents is salient because of its link marks. It is, however,
also marked linguistically because of its information value “discourse-old”. In
terms of Centering, this latter kind of salience qualifies Continents as the pre-
ferred forward-looking center of the current utterance. This kind of salience
guarantees that the reader will expect the next utterance to be about continents,

6. For different adoptions of Centering Theory see for example Chiarcos, this volume,
who incorporates a variant of Centering Theory into his Mental Salience Frame-
work, and Kelleher, this volume, who computes basic linguistic salience by using an
algorithm that is inspired by the Centering framework.
7. Swigart and Strange (2002, “Discoveries” in “Mouth’s Journey 40.000 Years Ago.
Part 1.”).
8. Swigart and Strange (2002, “Continents” in “Mouth’s Journey 40.000 Years Ago.
Part 1.”)

too; the link marks on the other hand call the reader’s attention to the fact that
in the link’s target node he will find a subsequent text string about continents
as well.9 Naturally, the two types do not always coincide in one and the same
entity. If different entities are salient in the same utterance, one linguistic and
one paratextual, the reader can expect that not only the discourse but also the
topic bifurcates.
The linguistically and the paratextually marked entities are equal insofar
as both enable the reader to predict the ‘aboutness’ of subsequent utterances.
Hence, both can be regarded as preferred centers. In hypertext, their promis-
ing10 character is far more prominent than in linear texts: Linear texts allow the
reader simply to read on while they draw his attention more or less unnoticed
to the preferred centers. Hypertext forces the reader at least partly to give up
this comparatively passive role. With every paratextually marked hyperlink, the
hypertext obliges the reader’s attention to undergo a rapid oscillation
between looking through and looking at the text (Bolter 1991), for hypertext
offers two different promises about two separated text strings by two different
kinds of salience.
Text bifurcations in hypertext are founded on the possibility to provide a sin-
gle utterance with various preferred centers. According to Grosz et al. (1995), in
principle, every utterance can have more than one preferred center. Centers are
constructs of the discourse: different discourse situations in which one and the
same utterance may be made, can lead to different centers. This multiplication
of centers is nevertheless not on the same level as multiple preferred centers
in hypertext. While in Grosz et al.’s argumentation it is a question of either/or,
of various discourse situations, discourse in hypertext construes several
parallel discourse situations. In a networklike hypertext environment, i.e., in a
discourse with a multitude of bifurcations and mergers, some utterances on all
accounts structurally belong to different discourse situations at the same time.
Therefore, multiple preferred centers are an essential presupposition for hyper-
textuality.

9. Admittedly, not all connections between information units are as smooth as this one.
Especially in scientific hypertexts, they are rather rare. For the purpose of my argu-
mentation, however, a quantitative inventory is less important than highlighting the
possibilities. In addition, the discussion below will show that the observations from
this neat example are transferable to less direct connections.
10. Genette (1987) ascribes to all paratexts, to some degree, an illocutionary character: they
inform, demand, promise, etc. Hammwöhner (1997), who describes the changes that
paratext undergoes in hypertext, points to the promising character of hypertext links.

Before going into detail about forward-looking centers in hypertext, I will
give a brief overview of the aspects of Centering Theory which are relevant
here, focussing primarily on the question of how preferred forward-looking cen-
ters can be identified.

3.2. The basic Centering model


The aspect of Centering Theory that is most relevant for the argumentation in
this paper is that it examines the possibilities of discourse to give hints about its
own continuation. For this purpose, Centering Theory focuses on the interaction
between so-called centers of attention and the attentional state.
It depends on the connection of these two whether a discourse is perceived
as coherent or not. Centers of attention are those discourse entities a sequence
of utterances is about. The attentional state dynamically focuses on the most
salient center at each point of the progressing discourse. It has direct influence
on the inference load placed upon the language user. Usually, a recipient can
more or less predict how a discourse would proceed: people tend to continue
speaking or writing about the same entity for a while by referring to it intersen-
tentially. The attentional state, then, continues centering around a single topic.
In this case, the inference load is quite low. But when the discourse flips back
and forth between different centers, the thematic progression becomes less pre-
dictable. The attentional state undergoes rough shifts, and the discourse is very
likely to be perceived as less coherent. What is true for linear discourses def-
initely holds for connected hypertext units. If the topic that is announced by
the link-marked entity is not presented in the link’s target node, the inference
load placed upon the reader is even higher than with a rough shift in linear dis-
courses, because the reader relies on the convention that hyperlinks are placed
to announce additional information about the marked entity. While a thematic
shift in linear discourse is possible because several centers are eligible for being
topicalised in the next utterance, in hypertext the link marks determine the topic
of the connected utterance bindingly.
The attentional state is directly related to the most salient center of a given
utterance. Usually, it is this prominent discourse entity that is semantically
linked to an entity of the following utterance. To describe the relation between
these connected entities, Grosz et al. (1995) distinguish between forward- and
backward-looking centers. The forward-looking centers, Cf, of an utterance Un,
are those discourse entities that can be referred to in the following utterance.
A set of them can be found in every utterance. Accordingly, the backward-
looking center, Cb, of an utterance Un+1 refers to one of the preceding forward-
looking centers.11 Consequently, it cannot be part of the initial utterance.
The following sequence
(2) a. John loves Mary.
b. He wants to kiss her.
is connected by the anaphoric resolution of the forward-looking center John.
John, in Grosz et al.’s model, is marked linguistically by its subject position
and is therefore more salient than Mary. Naturally, the entity Mary, too, is a
forward-looking center. But, because of its lower degree of salience it does not
function as antecedent of the backward-looking center in (b): (b) is about John,
not about Mary. If the second utterance were But she does
not like him, we would be dealing with a rather smooth thematic shift which
Grosz et al. would call retaining. The pronoun him still functions as the Cb, but
the chance that the next utterance will be about John has clearly decreased.
She becomes the most salient forward-looking center of the current utterance.
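The relation between Cf, Cb and the transition type in example (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entity lists are annotated by hand, the helper names backward_center and transition are hypothetical, the Cb is taken to be the highest-ranked element of Cf(Un) that is realised in Un+1, and the three-way distinction between continue, retain and shift follows the formulation commonly given for the basic Centering model rather than anything spelled out in this paper.

```python
from typing import List, Optional

def backward_center(prev_cf: List[str], current_entities: List[str]) -> Optional[str]:
    """Cb(Un+1): highest-ranked element of Cf(Un) that is realised in Un+1."""
    for entity in prev_cf:
        if entity in current_entities:
            return entity
    return None

def transition(prev_cb: Optional[str], cb: Optional[str], cp: str) -> str:
    """Classify the transition (assumed three-way scheme: continue, retain, shift)."""
    if prev_cb is not None and cb != prev_cb:
        return "shift"
    return "continue" if cb == cp else "retain"

# Example (2), hand-annotated; each list is ordered by salience (subject first).
cf_a = ["John", "Mary"]      # (a) John loves Mary.
cf_b = ["John", "Mary"]      # (b) He wants to kiss her.
cf_b_alt = ["Mary", "John"]  # alternative (b): But she does not like him.

cb_b = backward_center(cf_a, cf_b)
cb_b_alt = backward_center(cf_a, cf_b_alt)

# The initial utterance (a) has no Cb, hence prev_cb is None.
print(cb_b, transition(None, cb_b, cf_b[0]))
print(cb_b_alt, transition(None, cb_b_alt, cf_b_alt[0]))
```

In both continuations the Cb is John; only in the alternative continuation does the Cp (Mary) differ from the Cb, which is what the sketch labels as retaining.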
Walker et al. (1998) state that the most salient forward-looking center, the
preferred center, Cp, can be seen as a prediction about the Cb of the following
utterance. Hence, identifying the Cp helps the recipient to form expectations
about the continuation of the discourse. Marking preferred centers, therefore,
has a fundamental impact on how smoothly readable a text be-
comes.
The question of how forward-looking centers are marked and which entity
is ranked highest in terms of salience is widely discussed in Centering The-
ory. Here, I will briefly introduce two approaches. One of them, namely the
predominantly grammatical model by Grosz et al. (1995), represents the basic
idea of Centering, while the other, namely Strube and Hahn’s (1999) functional
model, seems most applicable to hypertext because of its flexibility.

3.3. The preferred center


According to Grosz et al., Cf ordering is considerably affected by the gram-
matical role and the surface position of an entity. The hierarchy they suggest is
subject > object(s) > other, for in English, the surface position often cor-
responds to the grammatical role on the one hand and to topic position on the
other. Furthermore, the authors postulate a transition rule that constrains the

11. Backward-looking centers are not the same as anaphora, e.g. pronouns. Grosz et al.
illustrate that every utterance can contain several pronouns while only a single entity
can function as backward-looking center.

movement of centers between utterances. In short: continuation of the center of
attention to the subsequent utterances is preferred over every form of change.
Thus, the reader can assume that the subject will be the preferred center, i.e., in
this model, the preferred antecedent of a successive backward-looking center,
or, informally speaking, the topic of subsequent utterances.12
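The grammatical ranking can be illustrated with a minimal Python sketch. It assumes that each entity has already been annotated with one of the three roles of the hierarchy subject > object(s) > other; the numeric encoding ROLE_RANK and the function name rank_cf are hypothetical, and entities with the same role simply keep their surface order.

```python
from typing import List, Tuple

# Assumed numeric encoding of the hierarchy subject > object(s) > other.
ROLE_RANK = {"subject": 0, "object": 1, "other": 2}

def rank_cf(entities: List[Tuple[str, str]]) -> List[str]:
    """Order (entity, role) pairs by grammatical role; ties keep surface order."""
    ordered = sorted(
        entities,
        key=lambda pair: ROLE_RANK.get(pair[1], ROLE_RANK["other"]),
    )
    return [name for name, _ in ordered]

# "John loves Mary." with hand-annotated roles, deliberately given out of order.
utterance = [("Mary", "object"), ("John", "subject")]
cf = rank_cf(utterance)
cp = cf[0]  # the preferred center is the highest-ranked forward-looking center
print(cf, cp)
```

Because Python's sort is stable, the sketch needs no separate tie-breaking rule for several objects in one utterance.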
While Grosz et al. provide syntactic principles – even though they admit that
several factors may have influence on the ranking of the Cf –, Strube and Hahn
(1999) focus on functional mechanisms.
Their model of Centering Theory is language independent due to the fact
that functional criteria can be applied to both fixed- and free-word-order lan-
guages. In contrast to Grosz et al., as far as anaphor resolution is concerned, Func-
tional Centering does away with backward-looking centers13 and traditional
transition types. Starting from the relatively free-word-order language German,
Strube and Hahn ground the ordering of the forward-looking center list in the
functional information structure of utterances in a discourse. If an entity pro-
vides information that has already been introduced, i.e., if it is discourse-old or
hearer-old, it is ranked higher than a discourse-new or hearer-new entity that
does not refer to another discourse entity.14
In addition to this basic Cf ranking, which is valid for texts in which pronom-
inal reference is dominant, Strube and Hahn introduce an extended version. The
refined criteria of this ranking help analysing texts where pronouns occur rather
infrequently, like texts from technical domains. This criterion applies to many
hypertexts, too. In a hypertext network, one and the same hypertext node can
usually occur at several moments of the reading process. It is therefore eas-
ier for the author to avoid the usage of entities without autonomous reference,
like pronouns, for cross-node reference. Kuhlen (1991) advises using referring
nouns instead. How exactly the discourse connection between hypertext nodes

12. The interpretation of the centers can differ between languages. In his research on
Eastern Khanty, Filchenko (this volume) resorts to the use of the terms foregrounded
center and backgrounded center to mark special pragmatic states.
13. For a first attempt of Centering without backward-looking centers see Strube (1998).
14. The authors make reference to the terms of information status proposed by Prince
(1981) and (1992). The sets of discourse-old and discourse-new entities can be fur-
ther categorised in terms of Prince’s (1981) familiarity scale: hearer-old entities can
be split into evoked > unused while the hearer-new entities can be split into in-
ferable > containing inferable > anchored brand-new > brand-new. An
investigation that compares the relative contribution of syntactic and semantic promi-
nence to the salience of entities is presented by Rose in this volume. Rose’s corpus
analysis indicates that the prediction of subsequent reference is enhanced when join-
ing the information of two salience factors (syntactic and semantic prominence).

is construed anaphorically cannot be discussed in this paper. Here, I focus on
the forward-looking part of discourse connections rather than on the question
of anaphora resolution.15
Strube and Hahn accommodate the basic Cf ranking to e.g. technical texts
by separating a third set of expressions from the hearer-new discourse entities:
mediated discourse entities.16 This helps to incorporate the availability of world
knowledge, which is needed to understand inferables, in a more detailed way.
Mediated discourse entities have a status between the set of hearer-old discourse
entities (which remains the same) and hearer-new discourse entities (which in
this model only consists of brand-new entities). Thus, the extended Cf ranking
prefers old discourse entities over mediated ones, and mediated ones over new
ones.
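The extended functional ranking can be sketched as follows. The sketch assumes that every entity comes with a hand-assigned information status from Prince's scale; the grouping into old, mediated and new follows the description above, ties within one class are resolved by text position as in Strube and Hahn's model, and the names STATUS_CLASS, CLASS_RANK and rank_cf_functional as well as the sample utterance are purely illustrative.

```python
from typing import List, Tuple

# Information-status classes of the extended ranking: old > mediated > new.
STATUS_CLASS = {
    "evoked": "old",
    "unused": "old",
    "inferable": "mediated",
    "containing inferable": "mediated",
    "anchored brand-new": "mediated",
    "brand-new": "new",
}
CLASS_RANK = {"old": 0, "mediated": 1, "new": 2}

def rank_cf_functional(entities: List[Tuple[str, str]]) -> List[str]:
    """Rank (entity, status) pairs given in text order; ties keep text position."""
    indexed = list(enumerate(entities))
    indexed.sort(key=lambda item: (CLASS_RANK[STATUS_CLASS[item[1][1]]], item[0]))
    return [name for _, (name, _) in indexed]

# Hypothetical annotation of an invented utterance, listed in text order.
entities = [
    ("a new device", "brand-new"),
    ("its display", "inferable"),
    ("the program", "evoked"),
]
print(rank_cf_functional(entities))
```

Unlike the grammatical ranking, nothing in this ordering refers to syntactic roles, which is why it carries over to free-word-order languages.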
What is important here is that both models – which, according to Walker
et al. (1998), is one of the key aspects of Centering Theory in general – allow
for projecting preferences for interpretation in the subsequent discourse. Due
to the forced decision-making, i.e., the fact that the networked text forces the
reader to decide where to read on with every link, this forward-looking aspect
of coherence becomes more dominant in hypertext.
In the next section, I illustrate the function of preferred centers in hyper-
text and show why both Centering models, the grammatical as well as the
functional one, would fail to describe them sufficiently.

3.4. … and its application in hypertext


The degree of a discourse element’s salience is directly linked to its ability
to attract the addressee’s attention. Scharl (2000) points out that in hypertext,
salience is considerably affected by the design of the hypertext units. The de-
gree of an element’s salience depends on its placement, its relative size, colour,
etc. When it comes to the text itself, it is the link-marked words or phrases
that catch the reader’s attention and, therefore, can be assumed to be most
salient.17 But a hypertext link does not necessarily correspond to the preferred
forward-looking center. Nevertheless, I regard link-marked entities as high-
ranked forward-looking centers as well, as they, too, allow predicting the ‘about-

15. Nevertheless, I assume that Strube and Hahn’s extended version of the Cf ranking is
an applicable Centering model for hypertexts. It does at least allow for the fact that
pronominal anaphor resolution is hard to combine with the use of hypertext links.
16. This set consists of inferable, containing inferable, and anchored brand-new entities.
17. The difference between linguistic and non-linguistic salience is also described by
Kelleher, this volume, who provides an approach that integrates linguistic and visual
salience to account for situated reference resolution.

ness’ of subsequent utterances. The only difference is, that these utterances do
not occur in the same hypertext node as the link-marked entity itself but in the
link’s target node.
Thus, in hypertext, a single utterance can contain multiple preferred centers.
Speaking in terms of Grosz et al.’s model, each of the salient centers of an
utterance Un, i.e. the one that is ranked highest by grammatical – or functional
– criteria as well as the one that is purely salient by means of link marks, can
be connected to its own backward-looking center. This does not mean that, at
the same time, a single utterance Un+1 could have multiple backward-looking
centers. The Cb’s belong to different utterances which, again, usually belong to
different hypertext nodes: one to the node in which the link occurs and one to
the target node.18
Link marks can be conceived of as a promise to the reader that he can find
more information about the link-marked entity in the target node. This promise
can only be kept if the target node contains a coreferent entity, or at least one
that shows some kind of connection with the link-marked entity. In other words:
whether the promise is kept depends on a coherent connection between the hy-
pertext nodes. But even – and this is crucial here – if the reader’s prediction
does not come true, at the time he comes across the hypertext link, the link-
marked entity has to be regarded as a preferred center nevertheless. What is more,
such a case would support my argumentation because it illustrates the impact
which the hints about the discourse continuation exert on the question of con-
nectivity. The link mark attracts the reader’s attention and offers him an addi-
tional way to read on. The discourse bifurcates. The reader has to decide which
sequence he wants to follow. When choosing the link, he relies on the link’s
promise that there is a coherent connection with the target node. When stick-
ing to the current node, he can anticipate the discourse’s progress by the usual,
syntactically or functionally marked preferred center of the current utterance.
In the remainder, I discuss some examples illustrating this phenomenon.
As already mentioned, it is not necessarily the preferred center which is
marked as hyperlink. I can think of two basic positions a hypertext link can have
with respect to the Cp of an utterance. Either the link corresponds to the preferred
forward-looking center, as was the case in example (1) – here repeated as (3)
–, or it does not, as in example (4).19 Naturally, both cases can occur in one and
the same utterance for there can be various link-marked entities at once. To il-

18. For a discussion of the question whether an utterance can have more than one
backward-looking center see Kruijff-Korbayová and Hajičová (1997).
19. Swigart and Strange (2002, “Who’s on First” in “The Granville Files. Present Day.
Part 2.”)

lustrate the phenomenon, however, it is sufficient to concentrate on occurrences
of the two mentioned positions.
To make the centers more visible, I have added relevant elliptic forms in
square brackets.
(3) a. “There are two more continents,” Mouth said.
b. “Maybe more [of them].”
c. “Wow. Continents?
d. Those are really big, aren’t they?”
(4) a. “[…] Conquering another nation is easy.
b. There are models [for conquering another nation].
c. People had done it before.”
d. “Yeah [they did it before].”
e. “I’m thinking about the ones who did it the very first time.”
In example (3), the link-marked entity Continents is the preferred center of the
given utterance. At the same time, the typical marks show that it serves as a link
anchor.
In example (4), the center of attention in utterance (e) is obviously not the
link-marked verb thinking. The forward-looking centers are I, the ones who, and
it. In Grosz et al.’s grammatical model, the subject I would be highest ranked.
One could argue that the finite verb is congruent with the subject and therefore
could be considered to be at least closely connected to the preferred center.
But obviously, the author decided not to mark the subject but only the verb.
Apart from that, it can be argued in terms of Functional Centering that the sit-
uationally evoked entity I is not the preferred center of the utterance after all.
Without the link marks, the reader’s attention would be focused neither on I nor
on thinking but on the discourse-old entity people evoked by the ones who.20
The reader would not expect the subsequent utterances to deal with the speaker
himself or with his process of thinking, but with the people who conquered
another nation for the very first time. I’m thinking is no more than a prelude.
But is it generally impossible that a verb like thinking can function as a tradi-
tional Cp ? Well, technically, it is possible. This requires, however, an adequate
linguistic and paralinguistic environment, for example a suitable initiation that
enables the speaker to emphasise the verb – or every other entity that is ranked
low in terms of salience – contrastively. I will return to this phenomenon be-
low.

20. In Strube and Hahn’s model, forward-looking centers of the same type are ranked
according to their position in text. Thus, in (4)e the ones who is ranked higher than it.

Apart from these two basic cases, two more arrangements can be found. It
can also happen that, as in example (5),21 it is not a whole constituent that is link-marked
but only, e.g., an adjective. In such a case, too, the link may or may not co-
incide with – or rather be part of – the preferred center. The other possibility is
that the link marks go beyond the borders of one constituent, as is the case in
example (6).22
(5) a. Bei Simulationen handelt es sich um spezielle
By simulations be it itself about special
interaktive Programme, die dynamische Modelle von
interactive programs, which dynamic models of
Apparaten, Prozessen und Systemen abbilden.
devices, processes and systems represent.
b. Simulations are special interactive programs which represent dy-
namic models of devices, processes and systems.
(6) a. Er nimmt die Sonnenbrille ab, lässt seinen Blick ins
He takes the sunglasses off, lets his gaze in_the
Wolkenkratzergetümmel tauchen, sieht Details und nimmt
skyscraper_turmoil dive, sees details and takes
sein Schreiben wieder auf. Die Buchstaben rennen den
his writing again on. The letters run the
Ereignissen hinterher.
events after.
b. He takes off his sunglasses, lets his gaze dive into the turmoil of
skyscrapers, sees details and resumes his writing. The letters run
after the events.
In example (5), the link only promises more information about interaktive and
not about spezielle interaktive Programme.
In example (6) – apart from the linguistically marked forward-looking center
the letters –, the utterance as a whole functions as a preferred center.
In all examples, due to the links, the discourse forks into two distinct text
strings. In example (3), for instance, the reader can predict the Cp Continents
as being connected to two different subsequences of utterances. Whether this
prediction is right, the reader can only find out by reading on either in the cur-
rent hypertext node or in the target node. For the time being, he can only con-

21. Blumstengel (1998, /Klassifikation-computerunterstuetzter-Lehr-Lernsysteme.html)


22. Ohler (1995, /anfang.htm)

sider the linguistic and paratextual marks of Continents as promises for such
connections.
Strictly speaking, Continents should therefore be analysed as two Cp’s, a
linguistic center Cp1 and a paratextual center Cp2. Each belongs to its own text
string. At first, both strings are combined in the same hypertext node. Then they
split up, and one half continues in the current node and the other in the link’s
target node. This phenomenon becomes more obvious in example (4). Here, the
two different Cp’s are divided. The linguistically marked Cp1, the ones who, is
connected to the discourse segment of the current node while the second, Cp2,
thinking, opens a discourse connection to the segment of another node.
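The double analysis can be made concrete in a small Python sketch. It is a simplification under several assumptions: the linguistic ranks are assigned by hand (for example (4) they follow the functional reading given above, with the relative order of it and I and the low rank of thinking chosen only for illustration), only the first link-marked entity of an utterance is considered, and the names Entity and preferred_centers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Entity:
    text: str
    rank: int              # linguistic salience rank, 0 = highest (assumed given)
    is_link: bool = False  # True if the entity carries link marks

def preferred_centers(entities: List[Entity]) -> Dict[str, Optional[str]]:
    """Return the linguistic Cp1 and the paratextual Cp2 (None if there is no link)."""
    cp1 = min(entities, key=lambda e: e.rank)
    links = [e for e in entities if e.is_link]
    cp2 = links[0] if links else None  # simplification: first link only
    return {"Cp1": cp1.text, "Cp2": cp2.text if cp2 else None}

# Example (3c): both kinds of marking fall on the same entity.
ex3 = [Entity("Continents", 0, is_link=True)]

# Example (4e): the linguistic and the paratextual center are divided.
ex4 = [
    Entity("the ones who", 0),
    Entity("it", 1),
    Entity("I", 2),
    Entity("thinking", 3, is_link=True),
]

print(preferred_centers(ex3))
print(preferred_centers(ex4))
```

In the first case both centers name the same entity, in the second they diverge, so that the text string of the current node and that of the target node are announced by different entities.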
At this point, one could argue that in hypertext it is not utterances that are connected
but longer sub-texts and that the question of cross-node forward-looking cen-
ters therefore goes beyond traditional23 Centering. Nevertheless, for two rea-
sons, I consider the Centering model a reasonable approach to reveal coherent
discourse connections in hypertext.
On the one hand, one can doubt that Centering in hypertext necessarily is
about connected sub-texts. Especially – but not exclusively – in fictional24 hy-
pertext like the one in example (3), hypertext links are quite often constructed
as direct connections between utterances. Due to the network structure, the
linked utterances cannot occur as a visible linear sequence but have to be written
down in different hypertext nodes. Nevertheless, the structural linguistic con-
nection between the link-marked entity and the coreferent entity in the – usually
first – utterance in the target node supports an interpretation as a sequence of
utterances.
On the other hand, even if this is not the case, Centering can be used to
describe the role link marks play for the coherence of a discourse. Coherence,
in the Centering framework, can be seen as the discourse’s means of supporting a
prediction about how it continues and of finding the prediction fulfilled. Both
prediction and fulfillment are founded in the discourse itself. The predicting
part of this progress, which is central here, is fundamentally affected by link
marks.

23. Apart from traditional Centering theories that focus on local coherence, there are
attempts to concentrate on global discourse structure as well (e.g. Hahn and Strube
(1997) and Walker (2000)).
24. In non-fictional hypertexts, information retrieval is often more important than a co-
herent text. Links, then, target text units that do not continue the text, but rather
define the link-marked words. Todesco (1997) even advises to write hypertexts in
such a lexicon-like manner. In this case, the hypertext nodes are rather independent
of each other. An interpretation of linked sub-texts, therefore, seems indeed more
appropriate.

The interpretation of link-marked entities as preferred centers seems to ex-
tend the traditional perspective in quite a substantial way. In addition, there are,
as the examples have demonstrated, parallels with link marks in lexicons and
with the use of emphasis in spoken language. Also footnote marks and typo-
graphic attributes of speech,25 like bold, italics or underline, can be considered
as in line with these findings. Therefore, and also against the background of the
question how special hypertexts really are, it seems appropriate to get a bit more
granular on the parallels between preferred centers and link marks.

4. Preferred centers and hyperlinks

The most logical properties by which to compare preferred centers and hyperlinks are
their form and their function.
Concerning their function, the preceding elaborations have already demon-
strated that both phenomena match considerably. Structurally, both function as
a forward-looking connecting element between two discourse elements. From
a pragmatic viewpoint, they function as ways to assess the text’s continuation.
But hyperlinks go beyond these functions. Apart from the fact that they rep-
resent an executable connection between virtual information units, they also
announce the bifurcation of the current text string. The same is true for footnote
marks and links in lexicons, but only the latter always announce the same kind
of connected unit, namely relatively context independent information about the
marked word. Hyperlinks as well as footnote marks can be connected to a much
broader range of targets. The difference between them is that footnotes can only
announce side strings of the text while hyperlinks often connect equally impor-
tant discourse units.26

25. I limit the argumentation here to those typographic attributes that map emphasis to
written language. On the one hand, taking into consideration typographic marks in
general, would go beyond the scope of this paper – just think of typographic indi-
cators of propositional macro-structures or the marking of technical terms. As far as
I know, there are no studies which concentrate on the effects those markings have
on connections between utterances. On the other hand, results from research on ver-
bal salience, which are even at hand in the framework of Centering, can be adopted.
Which marks are to be regarded as typographic reproduction of prosodic qualities
depends on the given discourse context (Wehde 2000). Therefore, it is impossible to
in- or exclude specific ways of marking.
26. For a further discussion of this matter see Bexten (2006).

As for form, the marked entity and the way it is marked can
be distinguished.27 First, I will concentrate on the marked entity itself.
The preferred centers and hyperlinks in the examples above both mark
linguistic entities. How large these entities are is variable for both kinds. As for
linguistic centers, Grosz et al. (1995) point out that centers are semantic
objects and therefore must not be confounded with words, phrases, etc.
Other discourse theories also suggest that anaphoric connections can be
heterosyntactical: anaphora can be resolved, for example, between phrases,
subphrases or even conjunctional subordinate clauses (Harweg 1979).28 The question
here is whether centers are as flexible in size as hyperlinks.
In hypertext, every entity can theoretically be used as a link anchor, no
matter how small or large it is. The examples have already shown that whole
sentences or mere adjectives can serve as such.29 Whether a whole constituent or
just a noun or an adjective is link-marked depends on several factors. One thing
that plays a role is the link’s target: if it provides a definition or explication of
the marked word, the surrounding words most likely are not link-marked. This
kind of connection usually comes along with a change of dimension (Harweg
1979), which means that a concrete phenomenon, like the interactive character
of computer programs in example (5), is linked to a general explanation of the
concept of interactivity. Thus, especially non-fictional hypertexts show obvious
parallels to printed lexicons.30 In fictional texts, or generally: when a concrete

27. For pragmatic reasons I exclude links that are placed outside the actual text
and non-linguistic elements, like pictures, that function as link anchors. The fact that
non-linguistic elements can be used for thematic connections points to the smooth
transition between the various medial elements, though.
28. In addition to the focus on traditional, linguistic centers, some theories extend the
concept to other conceptions of salience, esp. discourse-structural salience (as used
by Ramm and by Hinterhölzl and Petrova, this volume). They describe salience for both
discourse referents and events/discourse segments.
29. And even smaller entities, like letters, are link-marked on some websites. In
alphabetical lists, this even makes sense. In cases like that of the proudly presented
superlink (http://www.stangl-taller.at/4711/SIEB.10/NETERATUR/LITTERATUR/
Litteratur.html), however, where nearly every letter works as a hyperlink but lacks
any thematic connection to its target, the function of hyperlinks is completely
undermined. The reader can by no means tell which website he will be sent to when
clicking on one of the links. These kinds of hyperlinks only function on very
experimental websites where text-constitutive phenomena such as coherence play a
secondary role.
30. The fact that the link marks in print lexicons conventionally follow the word in ques-
tion comes second here, especially as this kind of marks can be found in hyper-

referent should be referred to, it makes more sense to link-mark the whole con-
stituent.
How about linguistic preferred centers then? First, I will focus on their
ability to cross the borders of a single constituent. In the following dialogic
sequence of utterances31
(7) a. A: But you said that you’d agree!
b. B: I never said that!
the whole indirect quotation functions as a forward-looking center. To be a re-
alistic candidate32 for anaphora resolution in the next utterance, however, in
contrast to larger hyperlinks, such a linguistic center makes certain demands: it
has to be preceded by an appropriate preamble, like But you said. This preamble,
on the other hand, causes the whole quotation to have the value of a constituent.
As example (6) has shown, this is not necessary for hyperlinks.
The next question is whether forward-looking centers can be smaller than
constituents. As already implied in the discussion of the verb thinking in
example (4), this seems possible, but here, too, certain demands have to be met.
Consider the following example:
(8) a. In his opinion, the blue sweater did not suit her.
b. Blue just wasn’t her colour.
In a., blue can only be judged as a preferred center if the thematic context allows
it and if, in addition, it is emphasised contrastively. As for the context,
it requires that several sweaters in various colours have been mentioned before. As
for the emphasis, it could, leaving the context out of consideration, fall on
her or even on sweater just as well, which would cause the preferred center
to shift to these entities. Thus, emphasis as a paralinguistic instrument can
influence the salience hierarchy. The matter of an entity’s size and the ways
of marking it apparently overlap at this point. As this example indicates and
as Grosz et al. (1995) have already conceded, preferred centers can be marked
in various ways. Certainly, it does not necessarily have to be a small entity,
like an adjective, that is made more salient by means of emphasis. The same

texts, too. Grether (n.d.), for example, uses them to differ between intra- and extra-
hypertext links.
31. For an application of the Centering concept to dialogues see for example Byron &
Stent (1998).
32. Theoretically, every clause can be referred to by a meta statement, and therefore can
count as a forward-looking center, but it has to be regarded as being ranked very low
in the salience hierarchy.

holds, as Navarretta’s (2002) supplement to Centering Theory makes clear, for
forward-looking centers in general. On the basis of spoken Danish, Navarretta
recognises word order, prosodic marking, and syntax as methods for salience
marking.33 According to her, only entities that are explicitly focally marked
have the highest degree of salience. By marking entities, the language producer
announces to the addressee that the ‘aboutness’ of the discourse will change.
This allows the recipient to prepare to shift his attention as well.
Navarretta’s approach refers to spoken, linear discourse, but it can without
difficulty be adapted to written language in general and to hypertexts in
particular, if one takes into account the possibility of using the already mentioned
ways of typographically marking prosodic qualities. In the example above, the
adjective blue could be highlighted typographically in bold, italics, underlining, etc.
Both emphasis and typographic marks are paralinguistic phenomena that can
be used to influence an entity’s degree of salience. Typographic marks in
particular are, in this regard, aligned with hyperlinks. Hyperlinks, too, depend on
typographic highlighting to fulfil their function. In both cases, even an entity that
normally would not be expected to be salient can form the center of attention.
In example (4), the writer has explicitly marked thinking, an entity that under
normal circumstances would hardly serve as a preferred center. By doing so, he
pushes the low-ranked entity to a very salient position and informs the reader
that, apart from the ones who (i.e. people), there is a second entity the
subsequent discourse (or rather one of the two different subsequent text strings) can
be about. The reader can prepare to go in one of the two different directions
the discourse offers: one with an entity that has a coherent connection with
people and one that continues the string evoked by the paratextually marked entity
thinking.
In example (3), the two different Cp’s are not separated, but overlay each
other. The doubly salient entity Continents promises the reader that there are
two different sequences of utterances which are both about continents.
There are, however, also clear differences because hyperlinks are more
flexible than linguistic centers. On the one hand, the possibility of emphasising
an entity depends on the entity’s discourse environment, whereas
this does not apply to hyperlinks (remember example (4)). The same holds for
the question of marking larger entities: here, too, linguistic centers depend on

33. Her analysis is supported by findings outside Centering Theory. Caldwell (2002),
for example, underlines that in English, syntax alone is not enough to foreground an
entity and names verbal emphasis among other devices.

their surroundings, while this does not necessarily apply to hyperlinks.34 On
the other hand, emphasis or typographic marks force both the forward-looking
center and the reader’s attention to shift. In contrast, in hypertext, the center of
attention only shifts in the reader’s mind but not in the discourse itself. In the
discourse segment, the first Cp remains where it is while, in addition, a second
Cp arises from the link marks.
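
The contrast just drawn can be made concrete with a small sketch. The Python
fragment below is only an illustration under simplifying assumptions and is not
part of Centering Theory or of any existing implementation: the class Utterance,
its field link_marked, and the function preferred_centers are hypothetical
names, and the ranking of forward-looking centers is taken as given rather than
computed. It shows that link marks add a further Cp to an utterance without
displacing the linguistically determined one.

```python
from dataclasses import dataclass, field


@dataclass
class Utterance:
    # Forward-looking centers, ranked from most to least linguistically salient.
    # The ranking is assumed as input; it is not computed here.
    forward_centers: list
    # Entities the writer has link-marked (hypothetical field for illustration).
    link_marked: set = field(default_factory=set)


def preferred_centers(utterance):
    """Return the linguistic Cp followed by any additional link-marked Cp's."""
    if not utterance.forward_centers:
        return []
    linguistic_cp = utterance.forward_centers[0]
    # Link-marked entities become additional Cp's, however low they rank.
    additional = [c for c in utterance.forward_centers[1:]
                  if c in utterance.link_marked]
    return [linguistic_cp] + additional


# Simplified version of example (4): "people" is the linguistic Cp,
# "thinking" is a low-ranked entity that has been link-marked.
example_4 = Utterance(["people", "thinking"], {"thinking"})
# Simplified version of example (3): the link mark falls on the linguistic Cp.
example_3 = Utterance(["continents"], {"continents"})

print(preferred_centers(example_4))
print(preferred_centers(example_3))
```

For the simplified example (4) the sketch yields two centers, people and
thinking; for example (3) it yields the single, doubly marked center continents.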
The comparison of form and function of linguistic centers and hyperlinks
reveals many parallels. What is more, this comparison demonstrates that it is a
small step from merely linguistic to paralinguistic devices of salience marking.
As mentioned, the latter are already partly integrated in Centering approaches.
Against the background of the discussion presented here, including hyperlink marks
is only consistent.

5. Conclusions and future research

The analysis of utterances in hypertexts shows that the limit of one preferred
forward-looking center per utterance can be exceeded. Apart from the normal
linguistically marked Cp, the writer can explicitly highlight additional Cp’s. The
first Cp corresponds to the most salient forward-looking center usually analysed
in Centering Theory. The additional Cp’s, the hypertext links, can be entities
with a low degree of linguistic salience. They are usually marked paratextually
by being, for example, underlined and coloured. Both kinds of Cp’s are equal
insofar as both permit the reader to predict the ‘aboutness’ of subsequent
utterances. Therefore, both contribute to the coherence of discourse in a plurilinear
hypertext environment. In this paper, I proposed to describe the two types
integratively in the Centering framework. While linguistically salient Cp’s can
be analysed sufficiently with traditional Centering theories, I used Navarretta’s
description of explicit salience marking as theoretical background to show that
an integration of paralinguistic phenomena is perfectly reasonable. Hypertext
links are not necessarily salient in a linguistic way. Hence, in traditional
Centering, they would not be regarded as preferred centers. Referring to less salient
discourse entities would, in traditional Centering, mean risking that the discourse
becomes incoherent. Here, I pointed out that with regard to hypertext links this
interpretation is not accurate. Discourse in hypertext allows for rougher shifts
than traditional linear discourse because, by highlighting discourse entities as
links, the writer warns the reader to prepare for the thematic shift. Without link

34. At least, this is true from a purely descriptive viewpoint. Keeping the question of
coherent connections in mind, it is arguable whether it really is advisable to tap the full
potential of hyperlink placement.

marks, the reader could focus his attention only on the linguistically marked
preferred center in the linear sequence of utterances in one hypertext node. Only
the paratextual marks tell him that there is a connection with another text string
and that he should shift his attention if he wants to follow it. Without link
marks there would be no such information and, what is more, there would be no
hypertext. The hypertext network depends on both kinds of preferred centers.
Hence, in contrast to linear texts, coherence in hypertext can only be described
sufficiently in a model that includes linguistic as well as non-linguistic, i.e.
paratextual, salience. In future research such a model could adapt results from
Centering research to the analysis of hypertexts. It is, for example, desirable to take
a closer look at the impact that the placement of hyperlinks in an utterance has
on cross-node coherence. Arguing on the assumption that every center shift
between utterances entails a decrease of coherence, one could assume that this
holds for cross-node coherence, too. Therefore, it would be plausible to let the
linguistic and the paratextual center coincide as often as possible. On the other
hand, it is imaginable that this would confuse the reader, because the added
information value would be less comprehensible. An integrative application of
Centering Theory, as suggested above, provides a framework to deal with these
questions.

References

Bexten, Birgitta
2006 Hypertext and Plurilinearity. Challenging an old-fashioned discourse
model. International Symposium. Discourse and Document. Schedae.
Prépublications de l’Université de Caen Basse Normandie. Presses uni-
versitaires de Caen, 117–121.
Blumstengel, Astrid
1998 Entwicklung hypermedialer Lernsysteme. Ph.D. thesis, University of
Paderborn, Paderborn. http://dsor.uni-paderborn.de/forschung/publikationen/blumstengel-diss/.
Bolter, David J.
1991 Writing space: The computer, hypertext, and the history of writing.
Hillsdale etc.: Lawrence Erlbaum Associates, Inc.
Byron, Donna K. and Stent, Amanda J.
1998 A preliminary model of Centering in dialog. Proceedings of the 36th
Annual Meeting of the Association for Computational Linguistics, Mor-
ristown: Association for Computational Linguistics, 1475–1477.
Caldwell, Thomas P.
2002 Topic-Comment Effects in English. Meisei Review, 17: 49–69.
Chiarcos, Christian
this volume The Mental Salience Framework: Context-adequate generation of re-
ferring expressions. this volume, 105–139.
DeStefano, Diana and Jo-Anne LeFevre
2007 Cognitive load in hypertext reading. A review. Computers in Human
Behavior, 23: 1616–1641.
Genette, Gérard
1987 Seuils. Paris: Éditions du Seuil.
Grether, Reinhold
n.d. Die Weltrevolution nach Flusser. http://www.flusser.de/.
Grosz, Barbara J., Aravind K. Joshi, and Scott Weinstein
1995 Centering: A Framework for Modeling the Local Coherence of Dis-
course. Computational Linguistics, 21(2): 203–225.
Hahn, Udo and Michael Strube
1997 Centering in-the-Large: Computing Referential Discourse Segments.
Proceedings of the 35th Annual Meeting of the Association for Com-
putational Linguistics and the 8th Conference of the European Chap-
ter of the Association for Computational Linguistics, Madrid, 104–
111.
Hammwöhner, Rainer
1997 Offene Hypertextsysteme. Das Konstanzer Hypertextsystem (KHS)
im wissenschaftlichen und technischen Kontext. Universitäts-Verlag,
Konstanz (= Schriften zur Informationswissenschaft; 32).
Harweg, Roland
1974 Bifurcations de textes. Semiotika 12: 41–59.
Harweg, Roland
1979 [1968] Pronomina und Textkonstitution. München: Wilhelm Fink Verlag.
Hinterhölzl, Roland and Svetlana Petrova
this volume Rhetorical relations and verb placement in Old High German. this
volume, 173–201.
Kelleher, John D.
this volume Visual Salience and the Other One. this volume, 205–228.
Kruijff-Korbayová, Irina and Eva Hajicová
1997 Topics and Centers. A Comparison of the Salience-Based Approach
and the Centering Theory. The Prague Bulletin of Mathematical Lin-
guistics, 67: 25–50.
Kuhlen, Rainer
1991 Hypertext. Ein nicht-lineares Medium zwischen Buch und Wissens-
bank. Springer, Berlin, Heidelberg, New York etc. (Edition SEL-Stif-
tung).
Navarretta, Constanza
2002 Combining Information Structure and Centering-based Models
of Salience for Resolving Intersentential Pronominal Anaphora.
In: António Branco, Tony McEnery and Ruslan Mitkov, edi-
tors, Proceedings of DAARC 2002 – 4th Discourse Anaphora and
Anaphora Resolution Colloquium, Lisbon, September, 135–140.
Nielsen, Jakob
1995 Multimedia and hypertext: the internet and beyond. Boston u.a.: Aca-
demic Press Professional, Inc.
Ohler, Normen
1995 Die Quotenmaschine. http://home.ph-freiburg.de/kepser/qm/.
Prince, Ellen F.
1981 Toward a Taxonomy of Given-New Information. In: Peter Cole, edi-
tor, Radical Pragmatics. Academic Press, New York, 223–255.
Prince, Ellen F.
1992 The ZPG Letter: Subjects, Definiteness, and Information-Status. In:
William C. Mann and Sandra A. Thompson, editors. Discourse De-
scription: Diverse Analyses of a Fund Raising Text, John Benjamins
B.V., Philadelphia, Amsterdam, 295–325.
Ramm, Wiebke
this volume Discourse-structural salience from a cross-linguistic perspective: Co-
ordination and its contribution to discourse (structure). this volume,
143–172.
Rose, Ralph L.
this volume Joint Information Value of Syntactic and Semantic Prominence for
Subsequent Pronominal Reference. this volume, 81–103.
Scharl, Arno
2000 Evolutionary Web Development. Automated Analysis, Adaptive De-
sign, and Interactive Visualization of Commercial Web Information
Systems. London: Springer.
Storrer, Angelika
2002 Coherence in Text and Hypertext. Document Design, 3(2):156–168.
Strube, Michael
1998 Never Look Back: An Alternative to Centering. Proceedings of the
17th International Conference on Computational Linguistics, Asso-
ciation for Computational Linguistics, Morristown, NJ, USA, 1251–
1257.
Strube, Michael and Udo Hahn
1999 Functional Centering: Grounding Referential Coherence in Informa-
tion Structure. Computational Linguistics, 25(3):309–344.
Swigart, Rob and Allen Strange
2002 About Time. A Digital Interactive Hypertext Fiction. Two Braided
Parallel Paths. A Double Helix. http://www.wordcircuits.com/gallery/abouttime/.
Todesco, Rolf
1997 Die Definition als Textstruktur im Hyper-Sachbuch. In: Dagmar
Knorr and Eva Maria Jakobs, editors, Textproduktion in elektronis-
chen Umgebungen. Frankfurt/M: Peter Lang (= Textproduktion und
Medium; 2), 109–120.
Walker, Marilyn A., Aravind K. Joshi and Ellen F. Prince
1998 Centering in Naturally Occurring Discourse: An Overview. In: Mari-
lyn A. Walker and Aravind K. Joshi and Ellen F. Prince, editors, Cen-
tering Theory in Discourse. Oxford University Press, Oxford, Eng-
land, 1–30.
Walker, Marilyn A.
2000 Toward a Model of the Interaction of Centering with Global Dis-
course Structures. Verbum. http://www.dcs.shef.ac.uk/ walker/cent-
cache.pdf
Wehde, Susanne
2000 Typographische Kultur. Eine zeichentheoretische und kulturge-
schichtliche Studie zur Typographie und ihrer Entwicklung. Tübin-
gen: Niemeyer (= Studien und Texte zur Sozialgeschichte der Liter-
atur, Bd. 69).
Establishing salience during narrative text
comprehension: A simulation view account

Berry Claus

1. Introduction

Salience as a theoretical concept has received only little attention in
psychological research on text comprehension, that is, in research on how comprehenders
mentally represent what is described in a text. However, more than fifteen years
ago, salience was one of the controversial issues of a dispute in this field. What
had happened? McKoon and Ratcliff published a paper in which they argued
for their minimalist hypothesis according to which “readers do not automati-
cally construct inferences to fully represent the situation described by a text”
(McKoon and Ratcliff, 1992, p. 440). They also provided a minimalist account
of a by-now classical empirical finding of Glenberg, Meyer, and Lindem (1987).
In the study by Glenberg and colleagues, participants read short narratives such
as (1) in either of the two versions and then were tested for the accessibility of
one of the mentioned entities, the target entity (e.g., sweatshirt). The target
entity was found to be more accessible when it was spatially associated with the
narrative’s protagonist (put on) compared to when it was spatially dissociated
from the protagonist (took off).

(1) John was preparing for a marathon in August. After doing a few warm-
up exercises, he put on (associated) / took off (dissociated) his sweat-
shirt and went jogging. He jogged halfway around the lake without too
much difficulty. Probe: sweatshirt

This finding received much attention and has been considered an elegant sup-
port for the notion that comprehenders construct situational representations
which guide comprehension. However, McKoon and Ratcliff claimed that there
is an alternative interpretation of the finding. They argued that the difference in
accessibility between the associated and dissociated condition can be explained
without assuming that comprehenders construct situational representations. Ac-
cording to their interpretation, the result does not reflect the spatial distance in
the described situation. Rather, the differential accessibility should be attributed
to a difference between the two conditions with regard to salience in a proposi-
tional representation. According to this account, the associated entity was more
accessible than the dissociated entity because it was more salient.
It is true that the two conditions do not only differ with regard to the spatial
structure. Hence, McKoon and Ratcliff may be right that the difference in ac-
cessibility is not due to the manipulation of spatial distance but rather reflects
a difference in salience. Yet, there are two problems with their alternative in-
terpretation in terms of salience. First, McKoon and Ratcliff do not provide an
adequate explanation of why the target entity was more salient in the associated
condition compared with the dissociated condition. It should be noted that the
two conditions did not differ with regard to linguistic salience. The target entity
was mentioned only once and at the same syntactic position in both conditions.
According to McKoon and Ratcliff, the two conditions differed with regard to
salience in a propositional representation because the associated target entity
was more relevant to the discourse topic than the dissociated target entity (see
Hinterhölzl and Petrova, this volume, and Ramm, this volume, for discussions
of the term topic from a linguistic perspective). Though it seems intuitively
quite plausible that the two conditions differed with regard to discourse topi-
cality, it remains unclear what this difference is due to.
The second problem with McKoon and Ratcliff’s alternative interpretation
is the general denial of situational representations. They thereby ignore the pos-
sibility that salience may emerge from a situational representation. The notion
of salience is by no means incompatible with situational representations. On the
contrary, the salience of an entity which is mentioned in a text may well derive
from the representation of the described situation.1 Roughly speaking, the
target entity in the study by Glenberg and colleagues might have been more salient
in the associated condition due to its relation to the protagonist and the
affordances it provides. Attributing the assumed difference in salience between the
two conditions to the situational representation not only provides
an explanation of what the difference is due to. It also offers a more
parsimonious account of the finding by Glenberg and colleagues in terms of salience
than McKoon and Ratcliff’s interpretation. Hence, it seems worthwhile to
pursue and elaborate an account of salience in terms of situational representations.
The aim of the present paper is to convince the reader that what makes an
entity salient – over and above linguistic factors – may indeed depend on the
representation of the described situation (for approaches that take into account

1. It should be noted that this is also one of the arguments in Glenberg’s reply (see
Glenberg and Matthew, 1992) to McKoon and Ratcliff (1992).

other extra-linguistic factors, see Bexten, this volume, who addresses the is-
sue of hypertextual salience marking; Kelleher, this volume, who developed a
model of reference resolution in situated communication that takes into account
visual salience; see also Rose, this volume, for an account of pronominal ref-
erence that joins the information of syntactic and semantic prominence). The
theoretical point of departure is the simulation view of language
comprehension. According to this view, language comprehension is tantamount to
mentally simulating the experience of the described situation. It should be noted in
advance that the paper is not intended to provide a full account of salience.
Its purpose is to outline a simulation view account of salience which claims that
salience may derive from mental simulations constructed during narrative text
comprehension.
The next section will provide a brief overview of the simulation view of
language comprehension. Section 3 will give an account of how mental simula-
tions during language comprehension may affect salience. Section 4 will report
empirical support for the simulation view account of salience. Section 5 will
conclude with some final remarks.

2. Language comprehension as mental simulation

Currently, there is wide agreement among language comprehension researchers
that narrative comprehension involves the construction of a representation of
the described situations (e.g., Johnson-Laird, 1983; van Dijk and Kintsch, 1983;
Zwaan and Radvansky, 1998). A situational representation constructed during
language comprehension is a referential representation. It is a representation
of the nonlinguistic entities which constitute the described situation (cf. Heim,
1982: file change semantics; Kamp and Reyle, 1993: discourse representation
theory; Karttunen, 1976: discourse referents).
Yet, a controversial issue is the question as to the representational format
of a referential representation. Traditionally, language comprehension – and
cognition in general – has been viewed as being based on the manipulation
of amodal, abstract symbols. This traditional view assumes clear demarcations
between language processing on the one hand and perception and action on the
other hand.
Recently, however, a radically different view has been gaining in impor-
tance in psychological language comprehension research. According to embod-
ied theories (e.g., Barsalou, 1999; Glenberg, 1997; Zwaan, 2004), referential
representations constructed during language comprehension recruit the same
modality-specific mental subsystems as representations constructed during
nonlinguistic cognition. Thus, referential representations are assumed to be modal
representations, which are grounded in perception and action. Proponents of
embodied theories adopt a simulation view of language comprehension. Lan-
guage comprehension is assumed to involve mentally simulating the described
states of affairs. Consider, for example, the utterance in (2).

(2) A pit bull is attacking a little girl.

According to the simulation view, the utterance is understood by mentally
simulating the experience of the described situation. This simulation would utilize
the same mental subsystems as are involved when actually seeing and/or hear-
ing a pit bull attacking a little girl.
It should be noted that the simulation view does not imply that simulations
constructed during language comprehension are “life-like”. Simulations are al-
ways vague and incomplete (cf. Barsalou, 1999; Zwaan, 2004). A simulation
of described states of affairs can be considered to be a model of a part of a
world. This model cannot be complete, it has to be partial. Language never
specifies all aspects of a described situation. Moreover, comprehenders are eco-
nomical processors and usually do not fill in what language left undetermined
(e.g., Graesser, Singer and Trabasso, 1994).
There is already growing evidence for the simulation view of language com-
prehension. Neuroscientific studies have revealed a considerable overlap be-
tween the pattern of brain activation that occurs when a particular linguistic
expression is processed and the pattern of activation that is involved in actu-
ally experiencing the object or doing the activity denoted by the linguistic ex-
pression (e.g., Buccino, Riggio, Melli, Binkofski, Gallese, and Rizzolatti, 2005;
González, Barros-Loscertales, Pulvermüller, Meseguer, Sanjuán, Belloch, and
Ávila, 2006; Hauk, Johnsrude, and Pulvermüller, 2004; Moscoso del Prado
Martín, Hauk, and Pulvermüller, 2006). For example, the study by Hauk and
colleagues (2004) indicated that when verbs that refer to actions, like pick or
kick, are processed, areas in the motor cortex are activated which overlap with
the areas that are activated when actually performing the actions.
Further empirical support for the simulation view comes from behavioural
studies (e.g., Claus and Kelter, 2009; Glenberg, Havas, Becker, and Rinck,
2005; Glenberg and Kaschak, 2002; Glenberg, Sato, and Cattaneo, 2008; Mete-
yard, Bahrami, and Vigliocco, 2007; Pecher, van Dantzig, Zwaan, and Zeelen-
berg, 2009; Taylor and Zwaan, 2008; Zwaan and Taylor, 2006). For example,
the study by Zwaan and Taylor (2006) revealed an interaction between compre-
hending sentences implying a particular motor action and concurrently perform-
ing a corresponding motor action. In one of their experiments, participants read
sentences that implied either a clockwise manual rotation, (3a), or a
counterclockwise manual rotation, (3b). The sentences were presented frame-by-frame
(in the example below, the frame boundaries are indicated by the slashes).
Participants advanced through the sentences by turning a knob in either clockwise
or counterclockwise direction. Reading times for the frame that contained the
critical verb (e.g., screwed in, turned down) were longer when there was a mismatch
compared to when there was a match between the direction of the rotation implied by
the verb and the direction of the knob turning required to advance through a
sentence.

(3) a. To attach / the boards / he / took out / his / screwdriver / and /
screwed in / the / screw.
b. He / realized / that the / music / was / too loud / so he / turned
down / the / volume.
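
The logic of this match/mismatch design can be sketched in a few lines of
Python. The sketch is only an illustration: the function name condition and the
dictionary IMPLIED_ROTATION are hypothetical, the verb-to-direction mapping
covers just the two example verbs from (3), and no reading-time data are
modelled.

```python
# Rotation direction implied by the critical verb in examples (3a) and (3b).
# The mapping is limited to the two verbs given in the text.
IMPLIED_ROTATION = {
    "screwed in": "clockwise",
    "turned down": "counterclockwise",
}


def condition(critical_verb, knob_direction):
    """Classify a trial as 'match' or 'mismatch'.

    A trial is a match when the rotation implied by the verb equals the
    direction in which the participant turns the knob to advance.
    """
    implied = IMPLIED_ROTATION[critical_verb]
    return "match" if implied == knob_direction else "mismatch"


print(condition("screwed in", "clockwise"))
print(condition("turned down", "clockwise"))
```

The first call classifies the trial as a match and the second as a mismatch.
The reported effect is that reading times for the critical frame were longer on
mismatch trials than on match trials.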

Finally, there is also empirical evidence for the simulation view of language
from behavioural studies concerned with the representation of abstract infor-
mation, such as descriptions of non-physical transfer (Glenberg and Kaschak,
2002), desiderative sentence mood (Claus, 2008), or negation (Kaup, Lüdtke,
and Zwaan, 2006; Kaup, Yaxley, Madden, Zwaan, and Lüdtke, 2007).
Taken together, the findings from neuroscientific and behavioural studies
provide strong empirical support for the view that language comprehension in-
volves embodied mental simulations. The findings are difficult to align with
amodal theories of language comprehension. To be sure, amodal theories could
account for the findings by adding additional assumptions. However, such an
account would be a completely post hoc explanation.
Moreover, amodal theories suffer from two inherent problems. The trans-
duction problem refers to the lack of an account as to how amodal abstract
symbols emerge in the mind, that is, how perceptual experiences are transduced
into arbitrary symbols (Barsalou, 1999; see also Brooks, 1987). The reverse of
this problem is the symbol grounding problem. It pertains to the question as
to how amodal abstract symbols are mapped back onto the world, that is, how
the meaning of arbitrary symbols is grounded (Harnad, 1990). Neither problem
arises in embodied theories of cognition, which assume that meaning is
grounded in perception and action.
At present, the embodied account of language comprehension is not yet a
full-fledged theory. Most of the studies that investigated predictions of the sim-
ulation view were concerned with the processing and representation of narrated
concrete situations. The presently available evidence for the simulation view is
limited to narrative text comprehension. There are currently no studies within
this framework which address the issue of expository texts. What is also still
lacking are substantial theoretical approaches and empirical evidence regard-
ing the question as to how embodied theories of language comprehension can
account for issues such as abstract concepts and function words. However, the
results of the above mentioned studies concerning abstract transfer, sentence
mood, and negation are promising with regard to future research within the
simulation view framework.
The next section considers what the simulation view of language compre-
hension in its present state can contribute to the issue of salience. The scope of
the considerations is limited to nonlinguistic aspects of the described situations
during the comprehension of narrative texts.

3. Salience is derivable from mental simulations

According to the simulation view of language comprehension, comprehenders
understand the description of a situation by running a mental simulation. With
regard to the comprehension of a narrative text that describes an ongoing occur-
rence consisting of temporally contiguous events, it can be assumed that com-
prehenders construct a coherent dynamic representation as they do when expe-
riencing an evolving event sequence (cf. Kelter, Kaup and Claus 2004). That is,
they would start with the simulation of the first event, and then continue with
the simulation of the second event, and so on, gradually constructing a coherent
representation. Only when a temporal shift is encountered is the current simulation
discontinued and a new simulation initiated (cf. Kelter et al., 2004). Hence,
as long as the narrative describes a temporally contiguous sequence of events,
the representation consists in a continuously growing simulation.
However, it is beyond question that the entire hitherto constructed simu-
lation cannot be available at a given moment during text processing. Due to
working-memory capacity limits, only a few elements of the described event
sequence are available at any one time. It seems reasonable to conceive of these
elements as the most salient ones at a particular time during text processing.2
What does the simulation view imply with regard to the issue as to which
entities of a narrative constitute the available and hence salient elements at any
one time? The answer to this question emerges from two characteristics of mental
simulations constructed during language comprehension. Simulations are
assumed to be experiential and perspectival. Let’s first consider these two
characteristics and then turn back to the question.

2. Indeed, there seems to be wide agreement across different theories on text
comprehension that the most salient entities are part of the available working-memory
representation. However, there is disagreement on the question of which factors constrain
the available set of entities.

3.1. Mental simulations of described situations are experiential


Mental simulations constructed during language comprehension rest upon ex-
periences. Roughly speaking, it is assumed that incoming words re-enact mul-
timodal memory traces of previous experiences with the entities which they de-
note (cf. Zwaan, 2004; see also Barsalou, 1999). Combinations of words govern
the activation of mutually compatible experiential traces and guide their integration
in a simulation of the described situation. Take, for example, the utterance
in (2) about the pit bull and the girl. When comprehending this utterance, the
words pit bull, attacks, and girl will each re-activate experiential traces of dif-
ferent encounters with pit bulls, girls, and attacking events (originating from
actual experience as well as for example from language or films).
Thus, according to the simulation view, language comprehension is strongly
affected by the comprehender’s experiences. Hence, mental simulations are
biased. Consider the short narrative in (4), adapted from Sanford and Garrod
(1981, p. 114).

(4) John was on his way to school. He was terribly worried about the maths
lesson. He thought he might not be able to control the class again today.
It was not a normal part of a janitor’s duties.

When reading the first sentence, most people will simulate a pupil on his way
to school. From an experiential simulation view this can be attributed to the
fact that for most comprehenders, the majority of memory traces of way-to-
school experiences originate from their own school days. Hence, most com-
prehenders are led up the garden path by an experientially biased simulation,
resulting in difficulties when processing the third sentence. According to the
simulation view, a teacher reading the narrative in (4) might construct
a differently biased simulation, resulting in processing difficulties with the
last sentence, when John turns out to be a janitor.
Mental simulations of described situations are not only shaped by unique ex-
periences. In particular, they are also constrained by basic principles underlying
human experience of the world such as temporal and spatial organization.
Time plays a central role in how we experience the world. The temporal
dimension can be considered to be the most important one in structuring our
experiences (cf. Navon, 1978). In experiencing, we conceive time as continu-
ously extending from past to present to future. The present is mentally set off
against the past and the future. More precisely, the situation that exists at the
now point is mentally highlighted as it is given in perception and can be acted
upon. However, the now point is not fixed but moves forward continuously.
Empirical findings suggest that during narrative comprehension, comprehen-
ders similarly act on the assumption of a continuous progression of the now
point in the described world. Reading times for sentences implying a discontin-
uous shift of the narrative Now are prolonged compared with reading times for
sentences implying a continuous movement of the narrative Now (e.g., Bestgen
and Vonk, 2000; Rinck and Weber, 2003; Speer and Zacks, 2005).
Human experience is also organized and affected by the spatial dimension.
It is constrained by the scope of the human perceptual and motor apparatus. Per-
ception and action are confined to a limited spatial region. Objects within this
region are mentally organized by a spatial framework which is constructed by
the three axes of the body (head/feet, front/back, left/right). The axes differ in
accessibility depending on their perceptual and physical asymmetries and their
relation to gravity. Empirical findings indicate that during language comprehen-
sion, people likewise impose a spatial framework on the described world (e.g.,
Bryant, Tversky, and Lanca, 2001; Hörnig, Claus, and Eyferth, 2000; Franklin
and Tversky, 1990).
Inevitably, human experience never results in objective representations of
states of affairs. Rather, the representations are interpretations of states of af-
fairs, which are governed by the experiencer’s point of view. Hence, repre-
sentations constructed during nonlinguistic cognition are always perspectival.
According to the simulation view, this also holds for language comprehension
(cf. MacWhinney, 1977, 2005).

3.2. Mental simulations of described situations are perspectival


Narratives are usually centred around a protagonist. Hence, with regard to narra-
tive text comprehension, it can be assumed that the mental simulation of the de-
scribed events is biased by the stated or inferred perspective of the protagonist.
Indeed, empirical findings indicate that comprehenders adopt the protag-
onist’s spatial point of view (Black, Turner and Bower, 1979; Franklin and
Tversky, 1990; Rall and Harris, 2000; Ziegler, Mitchell and Currie; but see
O’Brien and Albrecht, 1992). For example, in the study by Black and colleagues
(1979), participants read sentences, such as (5a) and (5b), which consisted of
two clauses. The main clause introduced a character and his or her location; the
subordinate clause described a movement of a second character toward this lo-
cation. The movement was either referred to by a deictic term of motion (come)
that was consistent with the point of view of the first character, (5a), or by a de-
ictic term of motion (go) that implied a perspective shift, (5b).

(5) a. Bill was sitting in the living room reading the paper when John
came into the living room.
b. Bill was sitting in the living room reading the paper when John
went into the living room.

Reading times for the perspective-shift sentences were found to be prolonged
compared with reading times for the perspective-consistent sentences, indicat-
ing that the participants adopted the spatial point of view of the first character.
This conclusion is further bolstered by the additional finding that participants
made systematic errors in recalling the perspective-shift sentences by replacing
went by came.
However, spatial point of view is merely one type of perspective. There
is empirical evidence that comprehenders track the protagonist’s mental per-
spective as well. Studies concerning the representation of emotions suggest
that comprehenders infer unmentioned emotional states of protagonists (Gerns-
bacher, Goldsmith and Robertson, 1992; Gernsbacher and Robertson, 1992).
Other findings indicate that comprehenders also infer non-explicitly stated goals
of protagonists (Long and Golding, 1993; Poynor and Morris, 2003). In ad-
dition, there is evidence that comprehenders keep track of the protagonist’s
knowledge/ignorance (Barquero, 1999; de Vega, Díaz and León, 1997). A study
by Sanford, Clegg, and Majid (1998) suggests that states of affairs being men-
tioned in a narrative are generally mentally coded in terms of their significance
to the protagonist. The results indicate that a background information sentence,
such as (6), is interpreted as being experienced by the protagonist.

(6) The air was hot and sticky.

As mental simulations that are constructed during language comprehension are
assumed to be experiential in nature, they are biased by the comprehender’s
current and past personal experiences. Accordingly, mental simulations con-
structed during language comprehension should be biased also by the compre-
hender’s perspective. Indeed, empirical studies indicate that language compre-
hension is affected by the comprehender’s personality. Findings of Zwaan and
Truitt (1998) indicate that smokers and non-smokers differ with regard to pro-
cessing smoking-related sentences. A study by Holt and Beilock (2006) sug-
gests that novice and expert ice hockey players and novice and expert football
players construct different mental simulations of described hockey-specific sit-
uations and football-specific situations, respectively.
Let’s now turn back to the question as to what the simulation view implies
with regard to the issue as to which entities of a narrative constitute the salient
elements at a given moment during text comprehension.

3.3. Implications for salience


As outlined at the beginning of this section, capacity limits constrain the avail-
ability of elements of the narrated world. Only a restricted part of the narrated
world is available at any one time during comprehension. Elements which be-
long to that part can be considered to be the currently most salient entities of the
unfolding narrative. As to the question which part of the narrated world can be
assumed to be available at a given time, the prediction of the simulation view
should be straightforward by now (and might, at first glance, even appear to be
trivial). According to the simulation view, what constitutes the available part is
the current Here and Now of the protagonist. Hence, the simulation view im-
plies that the entities which make up the protagonist’s current situation are the
most salient ones at a given time in the course of comprehension. Thus, entities
pertaining to the protagonist’s Now and entities pertaining to the protagonist’s
Here should, in principle, be highly salient, whereas temporally or spatially
remote entities should be (more or less) low in salience.
However, mental simulations constructed during narrative text comprehen-
sion are assumed to be perspectival. They are biased by the protagonist’s per-
spective which additionally determines which entities of the narrated world
compose the available set of entities at a given time during comprehension.
First, the available set is not confined to entities which are physically present at
the protagonist’s current situation. It is molded by the protagonist’s mental per-
spective. Hence, the available set should also comprise the protagonist’s mental
states such as thoughts, emotions or goals. Second, the available set does not
consist of all entities which are present at the protagonist’s current situation. It
is constrained by the protagonist’s spatial point of view and his or her needs
and goals. As a result, the available set should first and foremost include those
entities of the current situation which are visible to the protagonist, which he or
she could act upon, or which are of functional importance to his or her current
situation.

4. Empirical Evidence

This section will report findings which provide empirical support for the claim
that salience may derive from mental simulations during language comprehen-
sion. Before turning to these findings, some remarks on the measurement of
salience have to be made.

4.1. Measuring salience by testing accessibility


The empirical findings reported below come from studies that investigated
whether the mental accessibility of entities mentioned in a narrative text is
affected by properties of the described situation.3 Here, the findings of these
studies are considered as being of relevance for the issue of salience as it can
be assumed that highly salient entities which are part of the available set of
elements at the time of testing are better accessible than non-salient entities.
It should be noted that a difference in accessibility does not necessarily im-
ply a difference in anticipation of anaphoric reference. Manipulating the tempo-
ral distance in the described world (large vs. less large) between a past event and
the current narrative Now at the time of testing affected the mental accessibility
of an entity involved in the past event but had no effect on ratings of the likelihood
that the upcoming text would anaphorically refer to the entity (Claus
and Kelter, 2006, control experiment). This suggests that mental accessibility
is not a reliable predictor of the degree to which an anaphoric reference is ex-
pected – at least not in case of entities which do not pertain to the protagonist’s
current Now.
However, this may not affect the scope of the findings reported below. The
findings stem from studies which compared the mental accessibility of entities
which are present in the protagonist’s current situation to the mental accessibil-
ity of entities which are absent from the current situation. A result by Glenberg
and Mathew (1992) indicates that entities that are spatially and temporally as-
sociated with the protagonist and entities that are spatially dissociated differ in
mental accessibility as well as in ratings of perceived salience.

4.2. Mental accessibility of elements of the protagonist’s current situation


According to the simulation view, the available set of elements at a given time
during text processing includes those entities which make up the protagonist’s
current situation. Hence, entities pertaining to the narrative Now at the time of
testing should be highly accessible. Indeed, numerous studies have shown that
states of affairs that obtain at the current Now are especially easy to access,
whereas states of affairs that obtained in the described world prior to that time
are less accessible (e.g., Anderson, Garrod, and Sanford, 1983; Bestgen and
Vonk, 1995; Carreiras, Carriedo, Alonso, and Fernández, 1997; Magliano and
Schleich, 2000; Speer and Zacks, 2005; Zwaan, 1996; Zwaan, Madden, and
Whitten, 2000).

3. The majority of these studies tested mental accessibility by measuring reaction times
on a probe-recognition task. In a typical probe-recognition task experiment, participants
read texts sentence by sentence at a self-paced rate. At a given moment (either
during or at the end of the text presentation), they are presented with a probe word.
Their task is to indicate as quickly and accurately as possible whether or not the word
was mentioned in the text.

In one of the experiments of the study by Carreiras and colleagues (1997,
Experiment 1), participants read short narratives and were tested for the mental
accessibility of a job description (e.g., economist) that was mentioned in the
narrative either in a sentence describing the protagonist’s current situation, as
in (7a), or in a comparable sentence referring to the protagonist’s past, as in (7b).

(7) a. Now she works as an economist for an international company.
b. Sometime in the past she worked as an economist for an international
company.

The job description was found to be more accessible when it was presented
as currently applying to the protagonist, as in (7a), compared to when it was
presented as not currently applying to the protagonist, as in (7b). Remarkably,
this effect was obtained even when the accessibility was tested immediately
after the manipulated sentence, that is, immediately after reading the sentence
in which the job description was introduced.
Findings by Zwaan, Madden, and Whitten (2000) also indicate that the pres-
ence or absence of states of affairs at the protagonist’s Now immediately affects
accessibility. Participants read sentence pairs such as (8a) or (8b).

(8) a. Thomas was programming his computer. When his drink spilled,
he continued.
b. Thomas was programming his computer. When his drink spilled,
he stopped.

After reading the second sentence, participants were tested for the accessibility
of the activity mentioned in the first sentence (e.g., programming). The activ-
ity proved to be more accessible when the second sentence stated that it was
still going on at the protagonist’s Now at the time of testing, as in (8a), com-
pared to when the second sentence stated a discontinuation of the activity, as
in (8b).
Additional evidence for the effect of temporal presence on mental accessibil-
ity stems from studies examining the role of verb aspect (Carreiras et al., 1997,
Experiment 3; Magliano and Schleich, 2000, Experiment 3). For example, in
the experiment by Magliano and Schleich (2000), participants were presented
with narratives containing a sentence which described an activity of the protag-
onist either in imperfective aspect, as in (9a), or in perfective aspect, as in (9b).
Hence, the sentence either implied that the activity was ongoing or that it was
completed.

(9) a. Stephanie was changing the flat tire.
b. Stephanie changed the flat tire.

The manipulation of verb aspect had an impact on the accessibility of the activ-
ity. The activity was more accessible when it was conveyed with an imperfec-
tive aspect compared to when it was conveyed with a perfective aspect.
The results of the studies reported so far indicate that states of affairs pertaining
to the protagonist’s Now are easier to access than states of affairs
pertaining to the (far or distant) past. This finding provides strong empirical
evidence for the assumption that the available set of elements at a given time
during text processing is composed of those entities that make up the protago-
nist’s current situation.
Additional support for the assumption comes from studies which compared
the mental accessibility of entities that are present at the current situation to the
mental accessibility of spatially distant entities. Most of the studies which inves-
tigated the impact of spatial distance on mental accessibility employed an ex-
perimental paradigm introduced by Morrow, Greenspan, and Bower (1987) or
Rinck and Bower’s (1995) variant of this paradigm (e.g., Dutke, 2003; Haenggi,
Kintsch, and Gernsbacher, 1995; Morrow, Bower, and Greenspan, 1989; Rinck,
Hähnel, Bower, and Glowalla, 1997; Rinck, Williams, Bower, and Becker,
1996). Participants first memorize the layout of rooms in a building and the
various objects in the rooms. They then read a narrative containing several mo-
tion sentences, such as (10), describing the protagonist’s movement from one
room (source room) through an unmentioned path room to a goal room. After
the presentation of a motion sentence, reading is interrupted by a test of the
mental accessibility of the objects in the rooms.

(10) Then he walked from the library to the storage room.

The typical result of studies using this “Morrow paradigm” is that objects of
the goal room, that is, objects at the protagonist’s current location at the time
of testing, are more accessible than objects of the path room, which in turn are
more accessible than objects of the source room. This finding fits well with the
assumption that entities pertaining to the protagonist’s current Here are highly
salient. Yet, it is questionable whether the results from studies involving memo-
rizing a spatial layout before reading a text about it can be generalized to natural-
istic reading conditions. The layout learning may have directed the participants’
attention to the spatial properties of the described world (cf. Zwaan, Radvansky,
Hilliard, and Curiel, 1998).4
This objection does not hold for the study by Glenberg and colleagues (1987)
which has been mentioned already in the introduction of this paper. Glenberg
and colleagues presented their participants with short narratives such as (1),
repeated here as (11), without especially encouraging them to attend to spatial
relations.

(11) John was preparing for a marathon in August. After doing a few warm-up
exercises, he put on (associated) / took off (dissociated) his sweatshirt
and went jogging. He jogged halfway around the lake without too
much difficulty.

Participants were tested for the accessibility of an entity mentioned in the manipulated
sentence (e.g., sweatshirt). It was found that the entity was more accessible
after reading the associated version than after reading the dissociated
version. That is, the probed entity was more accessible when it pertained to the
protagonist’s current Here at the time of testing compared to when it was spatially
distant from the protagonist. According to the simulation view account of
salience, this result can be attributed to the salience of the protagonist’s Here.
However, one might object that the difference in mental accessibility does
not reflect an effect of spatial presence. Indeed, the associated and dissociated
conditions differed not only with regard to spatial distance but also with regard
to the functional importance of the probed entities for the protagonist’s
current situation. Yet, an explanation of the results in terms of functionality (cf.
Radvansky and Copeland, 2000) would by no means be incompatible with the
simulation view account of salience. As Glenberg and Mathew (1992) pointed
out in their reply to McKoon and Ratcliff (1992), “what make an object or event
salient are its relations to the other objects and events and our knowledge about
what those relations imply for further action”.
Mental simulations of described situations are assumed to be perspectival
and highly selective, as are representations of directly experienced situations.
Hence, the available set of entities at a given time during processing is assumed
to be determined by the protagonist’s spatial and mental perspective. Thus, it
should include, in particular, those entities which are functionally related to the
protagonist.

4. Findings from research on spatial text information suggest the conclusion that
comprehenders do not normally construct spatial representations (at least not detailed
ones), unless such representations are necessary with regard to the specific task
demands or personal reading goals or are easy to construct. However, this conclusion
may be premature considering that in virtually all the studies the material was
presented visually. Visual text presentation as opposed to auditory text presentation can
be expected to be disadvantageous with regard to representing spatial information
about a described situation (cf. Brooks, 1967; Eddy and Glass, 1981). As reading
already involves a spatial task, namely the control of eye movements, it should
interfere with the processing of spatial text information. Indeed, empirical results indicate
that comprehenders construct detailed spatial representations of described situations
with auditory text presentation but not with visual text presentation under conditions
where neither the instruction, nor the materials, nor the experimental task highlighted
spatial information (Claus and Kelter, 2009).

4.3. Effects of the spatial and mental perspective of the protagonist on mental accessibility

Mental simulations constructed during language comprehension are assumed
to be biased by the protagonist’s point of view. Thus, according to the simula-
tion view account of salience, which entities of the narrated world compose the
available set is affected by the protagonist’s spatial and mental perspective.
As was mentioned in section 3.2, several studies indicate that comprehen-
ders adopt the protagonist’s spatial point of view. There are also empirical find-
ings suggesting that the spatial point of view of the protagonist affects the men-
tal accessibility of entities of the described situation.
In a study by Borghi, Glenberg, and Kaschak (2004, Experiment 1), partici-
pants were presented with sentences describing an object either from an inside
perspective, as in (12a), or from an outside perspective, as in (12b).

(12) a. You are driving a car.
b. You are washing a car.

After each sentence, participants had to respond to a part verification task. A
probe was presented and the participants’ task was to verify if the probe named
a part of the object mentioned in the sentence. Some of the probes named parts
usually found inside the object (e.g., fuel gauge) and some of them named parts
usually found outside the object (e.g., trunk). Reaction times on the part veri-
fication task revealed an interaction between the manipulated perspective and
the type of the probed part. Reaction times were shorter when the location of
the probed part was consistent with the perspective location (e.g., fuel gauge
– driving) compared to when it was inconsistent (e.g., trunk – driving). This
finding suggests that perspective-consistent entities of the described world are
more accessible than perspective-inconsistent entities. Admittedly, the study by
Borghi and colleagues tested the mental accessibility of unmentioned entities.
Hence, it remains an open question whether similar results would be obtained
when testing the mental accessibility of explicitly mentioned entities. However,
findings of a study by Horton and Rapp (2003) lend some credence to the con-
jecture that this question might receive a positive answer in future research.
The study by Horton and Rapp was concerned with the question of whether
the mental accessibility of an entity mentioned in a text is affected by its visi-
bility to the protagonist. Participants were presented with short narratives like
the one in (13). Each narrative mentioned a critical entity that was visible from
the protagonist’s current point of view (e.g., mailbox). There were two versions
of the last sentence of the narrative, manipulating the visibility of the critical
entity without referring to it. In the unblocked version, the sentence described
an event that did not block the protagonist’s view of the critical entity, such
that it remained visible. In the blocked version, the sentence described an event
that blocked the view, such that the critical entity became occluded from the
protagonist’s view.

(13) Mr. Ranzini was sitting outside on his front stoop. He had lived on
this block for over 30 years. Next door was a local playground for the
children. Directly across the street was the mailbox that he used. As
usual, Mrs. Rosaldo was taking her poodle for a walk.
unblocked: Suddenly, a man on a bike rode up in front of Mr. Ranzini.
blocked: Suddenly, a large truck pulled up in front of Mr. Ranzini.

At the end of the narrative, participants had to respond to a probe question about
the critical entity (e.g., Was there a mailbox across the street?). Response la-
tencies to the probe question were found to be shorter after reading the un-
blocked version of the last sentence compared to after reading the blocked ver-
sion. This result suggests that entities which are visible to the protagonist are
mentally more accessible than entities which are occluded from the protago-
nist’s view.
Hence, there is some, albeit limited, support for the assumption that the
available set of entities at a given time during narrative comprehension is
determined by the protagonist’s perceptual perspective. Let’s now turn to studies
which demonstrate effects of the protagonist’s mental perspective on accessi-
bility.
Results of a study using the “Morrow paradigm” (see section 4.2) indicate
that the mental accessibility of entities does not only depend on the protago-
nist’s current spatial location but mainly on his or her mental location, that is, a
location pertaining to the protagonist’s thoughts (Morrow et al., 1989). Partici-
pants were presented with sentences, such as (14), describing the protagonist’s
thoughts which involved a particular room.

(14) He thought the library should be rearranged to make room for a display
of current research.

It was found that after reading such a sentence, objects located in the room that
the protagonist was thinking about were more accessible than objects in any
other room.
Additional support for the impact of the protagonist’s mental perspective on
accessibility comes from studies concerning the significance of protagonists’
goals. Empirical findings of several studies indicate that protagonists’ goals are
highly accessible (e.g., Dopkins, Klin and Myers, 1993; Huitema, Dopkins, Klin
and Myers, 1993; Suh and Trabasso, 1993). Other findings suggest that this only
holds as long as the protagonist’s goal is not satisfied. Goal-related information
was found to be more accessible when it pertains to a failed goal compared to
when it pertains to a completed goal (Lutz and Radvansky, 1997; Radvansky
and Curiel, 1998; Suh and Trabasso, 1993).5
In the study by Radvansky and Curiel, participants read narratives such as
(15). There were two versions of the narrative which differed with respect to
whether an initially mentioned goal of the protagonist (buying a retirement gift)
was failed or completed.
(15) Once there was a bank teller named Roy. Roy realized his boss was
retiring in four days. He wanted to give her a retirement gift. He went
to the department store.
failed goal: He couldn’t find anything nice enough. He felt dis-
couraged.
completed goal: He bought a nice big-screen TV for his boss. He felt
pretty good.
Accessibility of the protagonist’s (ongoing or completed) goal was tested by
measuring response latencies to a probe question (e.g., Had Roy wanted to buy
his boss a gift?). Response latencies were found to be shorter after reading that
the goal had failed than after reading that the goal had been satisfied.
The findings of the studies reported in this section fit well with the assump-
tion that the available set of entities is shaped by the protagonist’s point of view.
They indicate that the mental accessibility of an entity is affected by its relation
to the protagonist’s spatial and mental perspective.

5. This resembles an effect found for non-linguistic cognition. People remember un-
completed tasks better than completed tasks (Zeigarnik, 1927).

5. Conclusion

The present paper took a look at the issue of salience from the perspective of the
simulation view of language comprehension. According to the simulation view,
extra-linguistic salience derives from the mental simulations constructed during
language comprehension. The available set of entities at a given moment during
processing is assumed to be determined by the mental simulation of the situa-
tion at the current narrative Now. The studies reported in the previous sections
provide empirical support for this assumption.
It should be noted that the simulation view account of salience by no means
implies a denial of the impact of linguistic devices on salience. Mental simu-
lations of described situations are constructed through language. They are in-
structed by linguistic means. Hence, the simulation view account of salience is,
in principle, not incompatible with accounts of salience in terms of linguistic
factors. It would be interesting to see whether and how both types of accounts
might benefit from each other.
In future research, it needs to be clarified whether and how simulation-based
salience affects the resolution of referential expressions and how this interacts
with effects of linguistic salience. There are already some promising empirical
findings in this regard. Results of a sentence-completion study by Stevenson,
Crawley and Kleinman (1994) indicate a preference to resolve an ambiguous
pronoun with the thematic role that is associated with the consequences of the
previously described event (i.e., goal, patient) rather than with the thematic
role that is associated with the starting point (i.e., source, agent) – even when
order of mention/syntactic function is controlled for. Results of Morrow (1985)
also suggest that reference resolution depends more on the event structure than
on the linguistic surface structure. He investigated the resolution of ambiguous
definite noun phrases (e.g., He noticed the room was dark) after reading a sen-
tence that described that a character moved from one room to another room.
Antecedent choices were determined more by grammatical aspect and preposi-
tion that implied the mover’s current location (e.g., John walked past the living
room into the kitchen vs. John was walking past the living room to the kitchen)
than by the order of mention. Remarkably, an additional experiment indicated
that the referential interpretation of proximal and distal demonstratives (e.g.,
this room vs. that room) was affected by spatial and/or temporal distance in
the described world rather than by surface linguistic distance. However, Mor-
row’s findings as well as the findings by Stevenson and colleagues are based on
offline measures. In future studies on the effects of linguistic versus simulation-
based salience, it would be expedient to investigate which factors influence the
time course of reference resolution during online comprehension. For example,
findings of a recent study by Kaiser, Runner, Sussman and Tanenhaus
(2009) suggest that fine-grained online measures may yield more differentiated
results. In a visual-world eye-tracking experiment, they found early effects of
both structural and semantic constraints (syntactic vs. semantic role) on the in-
terpretation of pronouns and reflexives (e.g., Peter {told / heard from} Andrew
about the picture of {him / himself} on the wall), with the two anaphoric forms
being differentially sensitive to structural and semantic information.
The present paper focused exclusively on issues concerning language com-
prehension. It will be a crucial task to account for the production side, that
is, the issue as to whether simulation-derived extra-linguistic salience does af-
fect the choice of referring expressions and how such effects may interact with
purely linguistic factors (see Filchenko, this volume, for culturally conditioned
effects on speakers’ choice of grammatical constructions that were found for
the indigenous language Eastern Khanty). An experimental study by Arnold
(2001) is germane to this question. She found that speakers more frequently
used pronouns when referring to the goal of a transfer verb than when refer-
ring to the source. However, this effect primarily occurred for object referents
rather than for subject referents6 (see also Rose, this volume, who provides
corpus-based evidence that both syntactic and semantic prominence affect
subsequent pronominal reference). Arnold’s findings do suggest that situational
factors, such as event structure, may co-determine the choice of referring ex-
pressions. Yet, it remains to be seen in future studies whether and to what extent
simulation-based salience can affect the form of anaphoric reference.
The choice as well as the interpretation of referring expressions is largely
governed by linguistic conventions. For instance, pronominal reference is usu-
ally not licensed when there is no explicit linguistic antecedent.7 Hence, it is not
surprising that the form and resolution of anaphoric expressions are crucially
determined by purely linguistic factors. However, recent findings strongly support
a form-specific multiple-constraints approach, which claims that the resolution
of anaphora is guided by multiple constraints (e.g., syntactic, semantic, dis-
course based) that are weighted differently for different anaphoric forms (e.g.,
Brown-Schmidt, Byron, and Tanenhaus, 2005; Kaiser and Trueswell, 2008;
Kaiser et al., 2009). The point of this paper is that extra-linguistic, simulation-
based salience may also play a role – at least in narrative comprehension.

6. Overall, the proportion of pronoun uses was higher for subject referents than for
object referents.
7. Exceptions are, for example, focussed entities in situated language processing and
probably also nuclear implicit referents in dialogue utterances (for empirical evi-
dence, see Cornish, Garnham, Cowles, Fossard, and André, 2005).

References

Anderson, Anne, Simon C. Garrod, and Anthony J. Sanford
1983 The accessibility of pronominal antecedents as a function of episode
shifts in narrative text. Quarterly Journal of Experimental Psychology
35A: 427–440.
Arnold, Jennifer E.
2001 The effects of thematic roles on pronoun use and frequency of refer-
ence. Discourse Processes 31: 137–162.
Barquero, Beatriz
1999 Mentale Modelle von mentalen Zuständen und Handlungen der Text-
protagonisten [Mental models of text protagonists’ mental states and
actions]. Zeitschrift für Experimentelle Psychologie 46: 243–248.
Barsalou, Lawrence W.
1999 Perceptual symbol systems. Behavioral and Brain Sciences 22: 577–
660.
Bestgen, Yves and Wietske Vonk
1995 The role of temporal segmentation markers in discourse processing.
Discourse Processes 19: 385–406.
Bexten, Birgitta
this vol. Multiple preferred centers in a plurilinear discourse environment.
this volume, 229–250.
Black, John B., Terrence J. Turner, and Gordon H. Bower
1979 Point of view in narrative comprehension, memory, and production.
Journal of Verbal Learning and Verbal Behavior 18: 187–198.
Borghi, Anna M., Arthur M. Glenberg, and Michael P. Kaschak
2004 Putting words in perspective. Memory & Cognition 32: 863–873.
Brooks, Lee R.
1967 The suppression of visualization by reading. Quarterly Journal of Ex-
perimental Psychology 19: 289–299.
Brooks, Rodney A.
1987 Intelligence without representation. Artificial Intelligence 47: 139–
159.
Brown-Schmidt, Sarah, Donna K. Byron, and Michael K. Tanenhaus
2005 Beyond salience: Interpretation of personal and demonstrative pro-
nouns. Journal of Memory and Language 53: 292–313.
Bryant, David J., Barbara Tversky, and Margaret Lanca
2001 Retrieving spatial relations from observation and memory. In Con-
ceptual structure and its interfaces with other modules of representa-
tion, Emile van der Zee and Urpo Nikanne (eds.), 116–139. Oxford:
Oxford University Press.
Buccino, Giovanni, Lucia Riggio, Gabriele Melli, Ferdinand Binkofski, Vittorio Gallese,
and Giacomo Rizzolatti
2005 Listening to action-related sentences modulates the activity of the mo-
tor system: A combined TMS and behavioral study. Cognitive Brain
Research 24: 355–363.
Carreiras, Manuel, Núria Carriedo, María A. Alonso, and Angel Fernández
1997 The role of verb tense and verb aspect in the foregrounding of infor-
mation during reading. Memory & Cognition 25: 438–446.
Claus, Berry
2008 Comprehending descriptions of non-factual desired situations: Dis-
course referents and motor actions. In Proceedings of the Workshop
Constraints in Discourse 3, Anton Benz, Peter Kühnlein, and Man-
fred Stede (eds.). Potsdam, Germany.
Claus, Berry and Stephanie Kelter
2006 Comprehending narratives containing flashbacks: Evidence for tem-
porally organized representations. Journal of Experimental Psychol-
ogy: Learning, Memory, and Cognition 32: 1031–1044.
Claus, Berry and Stephanie Kelter
2009 Embodied language comprehension: The processing of spatial infor-
mation during reading and listening. In Advances in Psychology Re-
search, Vol. 59, Alexandra M. Columbus (ed.), 1–44. Hauppauge,
NY: Nova Science.
Cornish, Francis, Alan Garnham, H. Wind Cowles, Marion Fossard, and Virginie André
2005 Indirect anaphora in English and French: A cross-linguistic study of
pronoun resolution. Journal of Memory and Language 52: 363–376.
de Vega, Manuel, José M. Díaz, and Immaculada León
1997 To know or not to know: Comprehending protagonists’ beliefs and
their emotional consequences. Discourse Processes 23: 169–192.
Dopkins, Stephen, Celia Klin, and Jerome L. Myers
1993 Accessibility of information about goals during the processing of nar-
rative texts. Journal of Experimental Psychology: Learning, Memory,
& Cognition 19: 70–80.
Dutke, Stephan
2003 Anaphor resolution as a function of spatial distance and priming: ex-
ploring the spatial distance effect in situation models. Experimental
Psychology 50: 270–284.
Eddy, John K., and Arnold L. Glass
1981 Reading and listening to high and low imagery sentences. Journal of
Verbal Learning and Verbal Behavior 20: 333–345.
Filchenko, Andrey Y.
this vol. Parenthetical agent-demoting constructions in Eastern Khanty: Dis-
course salience vis-à-vis referring expressions. this volume, 57–79.
Franklin, Nancy and Barbara Tversky
1990 Searching imagined environments. Journal of Experimental Psychol-
ogy: General 119: 63–76.
Gernsbacher, Morton Ann, H. Hill Goldsmith, and Rachel R.W. Robertson
1992 Do readers mentally represent fictional characters’ emotional states?
Cognition & Emotion 6: 89–111.
Gernsbacher, Morton Ann and Rachel R.W. Robertson
1992 Knowledge activation versus sentence mapping when representing
fictional characters’ emotional states. Language and Cognitive Pro-
cesses 7: 353–371.
Glenberg, Arthur M.
1997 What memory is for. Behavioral and Brain Sciences 20: 1–55.
Glenberg, Arthur M., David Havas, Raymond Becker, and Mike Rinck
2005 Grounding language in bodily states: The case for emotion. In
Grounding cognition: The role of perception and action in memory,
language, and thinking, Diane Pecher and Rolf A. Zwaan (eds.), 115–
128. Cambridge: Cambridge University Press.
Glenberg, Arthur M. and Michael P. Kaschak
2002 Grounding language in action. Psychonomic Bulletin & Review 9:
558–565.
Glenberg, Arthur M., and Shashi Mathew
1992 When minimalism is not enough: Mental models in reading compre-
hension. Psycoloquy 3(64), reading-inference-2.1
Glenberg, Arthur M., Marion Meyer, and Karen Lindem
1987 Mental models contribute to foregrounding during text comprehen-
sion. Journal of Memory and Language 26: 69–83.
Glenberg, Arthur M., Marc Sato, and Luigi Cattaneo
2008 Use-induced motor plasticity affects the processing of abstract and
concrete language. Current Biology 18: R290–R291.
González, Julio, Alfonso Barros-Loscertales, Friedemann Pulvermüller, Vanessa Mese-
guer, Ana Sanjuán, Vicente Belloch, and César Ávila
2006 Reading cinnamon activates olfactory brain regions. NeuroImage 32:
906–912.
Graesser, Arthur C., Murray Singer, and Tom Trabasso
1994 Constructing inferences during narrative text comprehension. Psy-
chological Review 101: 371–395.
Haenggi, Dieter, Walter Kintsch, and Morton Ann Gernsbacher
1995 Spatial situation models and text comprehension. Discourse Proces-
ses 19: 173–199.
Harnad, Stevan
1990 The symbol grounding problem. Physica D 42: 335–346.
Hauk, Olaf, Ingrid Johnsrude, and Friedemann Pulvermüller
2004 Somatotopic representation of action words in human motor and pre-
motor cortex. Neuron 41: 301–307.
Heim, Irene
1982 The semantics of definite and indefinite noun phrases. Unpublished
Dissertation. University of Massachusetts, Amherst.
Hinterhölzl, Roland and Svetlana Petrova
this vol. Rhetorical relations and verb placement in Old High German. this
volume, 173–201.
Holt, Lauren E. and Sian L. Beilock
2006 Expertise and its embodiment: Examining the impact of sensorimo-
tor skill expertise on the representation of action-related text. Psycho-
nomic Bulletin & Review 13: 694–701.
Hörnig, Robin, Berry Claus, and Klaus Eyferth
2000 In search of an overall organizing principle in spatial mental models:
a question of inference. In Spatial Cognition: Foundations and ap-
plications. Selected papers from Mind III, Annual conference of the
Cognitive Science Society of Ireland, 1998, Seán Ó’Nualláin (ed.),
69–81. Amsterdam: John Benjamins.
Horton, William S. and David N. Rapp
2003 Out of sight, out of mind: Occlusion and the accessibility of informa-
tion in narrative comprehension. Psychonomic Bulletin & Review 10:
104–110.
Huitema, John S., Stephen Dopkins, Celia M. Klin, and Jerome L. Myers
1993 Connecting goals and actions during reading. Journal of Experimental
Psychology: Learning, Memory, & Cognition 19: 1053–1060.
Johnson-Laird, Philip N.
1983 Mental models: Towards a cognitive science of language, inference,
and consciousness. Cambridge: Cambridge University Press.
Kaiser, Elsi, Jeffrey T. Runner, Rachel S. Sussman, and Michael K. Tanenhaus
2009 Structural and semantic constraints on the resolution of pronouns and
reflexives. Cognition 112: 55–80.
Kaiser, Elsi and John C. Trueswell
2008 Interpreting pronouns and demonstratives in Finnish: Evidence for a
form-specific approach to reference. Language and Cognitive Pro-
cesses 23: 709–748.
Kamp, Hans and Uwe Reyle
1993 From discourse to logic. Dordrecht: Kluwer Academic Publishers.
Kaup, Barbara, Jana Lüdtke, and Rolf A. Zwaan
2006 Processing negated sentences with contradictory predicates: Is a door
that is not open mentally closed? Journal of Pragmatics 38: 1033–
1050.
Kaup, Barbara, Richard H. Yaxley, Carol J. Madden, Rolf A. Zwaan, and Jana Lüdtke
2007 Experiential simulations of negated text information. Quarterly Jour-
nal of Experimental Psychology 60: 976–990.
Karttunen, Lauri
1976 Discourse referents. In Syntax and semantics, Vol. 7: Notes from the
linguistic underground, James D. McCawley (ed.), 363–386. New
York: Academic Press.
Kelleher, John D.
this vol. Visual salience and the other one. this volume, 209–228.
Kelter, Stephanie, Barbara Kaup, and Berry Claus
2004 Representing a described sequence of events: A dynamic view of nar-
rative comprehension. Journal of Experimental Psychology: Learn-
ing, Memory, and Cognition 30: 451–464.
Long, Debra L. and Jonathan M. Golding
1993 Superordinate goal inferences: Are they automatically generated dur-
ing comprehension? Discourse Processes 15: 55–73.
Lutz, Mark F. and Gabriel A. Radvansky
1997 The fate of completed goal information in narrative comprehension.
Journal of Memory and Language 36: 293–310.
MacWhinney, Brian
1977 Starting points. Language 53: 152–187.
MacWhinney, Brian
2005 The emergence of grammar from perspective. In Grounding cognition:
The role of perception and action in memory, language, and
thinking, Diane Pecher and Rolf A. Zwaan (eds.), 198–223. Cambridge:
Cambridge University Press.
Magliano, Joseph P. and Michelle C. Schleich
2000 Verb aspect and situation models. Discourse Processes 29: 83–112.
McKoon, Gail and Roger Ratcliff
1992 Inferences during reading. Psychological Review 99: 440–466.
Meteyard, Lotte, Bahador Bahrami, and Gabriella Vigliocco
2007 Motion detection and motion verbs. Psychological Science 18: 1007–
1013.
Morrow, Daniel G.
1985 Prepositions and verb aspect in narrative understanding. Journal of
Memory and Language 24: 390–404.
Morrow, Daniel G., Gordon H. Bower, and Steven L. Greenspan
1989 Updating situation models during narrative comprehension. Journal
of Memory and Language 28: 292–312.
Morrow, Daniel G., Steven L. Greenspan, and Gordon H. Bower
1987 Accessibility and situation models in narrative comprehension. Jour-
nal of Memory and Language 26: 165–187.
Moscoso del Prado Martín, Fermín, Olaf Hauk, and Friedemann Pulvermüller
2006 Category specificity in the processing of color-related and form-relat-
ed words: An ERP study. NeuroImage 29: 29–37.
Navon, David
1978 On a conceptual hierarchy of time, space, and other dimensions. Cog-
nition 6: 223–228.
O’Brien, Edward J. and Jason E. Albrecht
1992 Comprehension strategies in the development of a mental model.
Journal of Experimental Psychology: Learning, Memory, and Cog-
nition 18: 777–784.
Pecher, Diane, Saskia van Dantzig, Rolf A. Zwaan, and René Zeelenberg
2009 Language comprehenders retain implied shape and orientation of ob-
jects. Quarterly Journal of Experimental Psychology 62: 1108–1114.
Poynor, David V. and Robin K. Morris
2003 Inferred goals in narratives: Evidence from self-paced reading, recall,
and eye movements. Journal of Experimental Psychology: Learning,
Memory, and Cognition 29: 3–9.
Radvansky, Gabriel A. and David E. Copeland
2000 Functionality and spatial relations in memory and language. Memory
& Cognition 28: 987–992.
Radvansky, Gabriel A. and Jacqueline M. Curiel
1998 Narrative comprehension and aging: The fate of completed goal in-
formation. Psychology & Aging 13: 69–79.
Rall, Jaime and Paul L. Harris
2000 In Cinderella’s slippers? Story comprehension from the protagonist’s
point of view. Developmental Psychology 36: 202–208.
Ramm, Wiebke
this vol. Discourse-structural salience from a cross-linguistic perspective: Co-
ordination and its contribution to discourse (structure). this volume,
143–172.
Rinck, Mike and Gordon H. Bower
1995 Anaphora resolution and the focus of attention in situation models.
Journal of Memory and Language 34: 110–131.
Rinck, Mike, Andrea Hähnel, Gordon H. Bower, and Ulrich Glowalla
1997 The metrics of spatial situation models. Journal of Experimental Psy-
chology: Learning, Memory, and Cognition 23: 622–637.
Rinck, Mike and Ulrike Weber
2003 Who when where: An experimental test of the event-indexing model.
Memory & Cognition 31: 1284–1292.
Rinck, Mike, Pepper Williams, Gordon H. Bower, and Eni S. Becker
1996 Spatial situation models and narrative understanding: Some general-
izations and extensions. Discourse Processes 21: 23–55.
Rose, Ralph L.
this vol. Joint information value of syntactic and semantic prominence for sub-
sequent pronominal reference. this volume, 81–103.
Sanford, Anthony J., Michael Clegg, and Asifa Majid
1998 The influence of types of character on processing background infor-
mation in narrative discourse. Memory & Cognition 26: 1323–1329.
Sanford, Anthony J. and Simon C. Garrod
1981 Understanding written language: explorations of comprehension be-
yond the sentence. New York: John Wiley.
Speer, Nicole K. and Jeffrey M. Zacks
2005 Temporal changes as event boundaries: Processing and memory con-
sequences of narrative time shifts. Journal of Memory and Language
53: 125–140.
Stevenson, Rosemary J., Rosalind A. Crawley, and David Kleinman
1994 Thematic roles, focus and the representation of events. Language and
Cognitive Processes 9: 519–548.
Suh, Soyoung and Tom Trabasso
1993 Inferences during reading: Converging evidence from discourse anal-
ysis, talk-aloud protocols, and recognition priming. Journal of Mem-
ory and Language 32: 279–300.
Taylor, Larry J. and Rolf A. Zwaan
2008 Motor resonance and linguistic focus. Quarterly Journal of Experi-
mental Psychology 61: 896–904.
van Dijk, Teun A. and Walter Kintsch
1983 Strategies of discourse comprehension. New York: Academic Press.
Zeigarnik, Bluma
1927 Das Behalten erledigter und unerledigter Handlungen [Remember-
ing completed and uncompleted tasks]. Psychologische Forschung 9:
1–85.
Ziegler, Fenja, Peter Mitchell, and Gregory Currie
2005 How does narrative cue children’s perspective taking? Developmental
Psychology 41: 115–123.
Zwaan, Rolf A.
1996 Processing narrative time shifts. Journal of Experimental Psychology:
Learning, Memory, and Cognition 22: 1196–1207.
Zwaan, Rolf A.
2004 The immersed experiencer: Toward an embodied theory of language
comprehension. In The psychology of learning and motivation, Brian
H. Ross (ed.), 35–62. New York: Academic Press.
Zwaan, Rolf A., Carol J. Madden, and Shannon N. Whitten
2000 The presence of an event in the narrated situation affects its availabil-
ity to the comprehender. Memory & Cognition 28: 1022–1028.
Zwaan, Rolf A. and Gabriel A. Radvansky
1998 Situation models in language comprehension and memory. Psycho-
logical Bulletin 123: 162–183.
Zwaan, Rolf A., Gabriel A. Radvansky, Amy E. Hilliard, and Jacqueline M. Curiel
1998 Constructing multidimensional situation models during reading. Sci-
entific Studies of Reading 2: 199–220.
Zwaan, Rolf A. and Larry J. Taylor
2006 Seeing, acting, understanding: motor resonance in language compre-
hension. Journal of Experimental Psychology: General 135: 1–11.
Zwaan, Rolf A. and Timothy P. Truitt
1998 Smoking urges affect language processing. Experimental and Clinical
Psychopharmacology 6: 325–330.