Language Learning ISSN 0023-8333

English L1 and L2 Speakers’ Knowledge of Lexical Bundles
Tatiana M. Nekrasova
Northern Arizona University

The purpose of the present study is to contribute to the ongoing debate about the use
of lexical bundles by first (L1) and second language (L2) speakers of English. The
study consists of two experiments that examined whether L1 and L2 English speakers
displayed any knowledge of lexical bundles as holistic units and whether their knowledge
was affected by the discourse function of the lexical bundles (discourse-organizing or
referential). The participants in Experiment 1 (N = 61) completed a gap-filling activity,
whereas the participants in Experiment 2 (N = 61) carried out a dictation task. Results
showed that the participants’ knowledge differed for specific lexical bundles and that,
overall, they knew more discourse-organizing bundles than referential bundles. The
implications of the study are discussed in terms of current research about the role of
frequency-based language chunks in L1 and L2 speech processing in English.

Keywords discourse function; formulaic sequences; frequency-based chunks; language
processing; lexical bundles; psycholinguistics

Since the late 1970s, linguists have established the importance of formulaic
sequences for language processing and production (Hakuta, 1974; Nattinger &
DeCarrico, 1992; Peters, 1983; Wong Fillmore, 1976; Wray, 2002). Typically
defined as frequent multiword combinations that are stored and retrieved holis-
tically from the mental lexicon at the moment of speech, formulaic sequences
have been argued to minimize encoding work for both the speaker and ad-
dressee, thus allowing for the construction of fluent spoken discourse (Erman,

Preliminary results were reported at AAAL 2007 in Costa Mesa, CA and AAAL 2008 in Washing-
ton, DC. I am grateful to Kim McDonough for her insightful comments on earlier versions of this
article. I also thank Doug Biber, Viviana Cortes, Bethany Gray, and the editor and four anonymous
reviewers of Language Learning for their valuable input, Valeria Kashpur for assistance with data
collection, and Tony Becker for his assistance with data coding and consistent support of this
project through comments and discussion. Any errors are, of course, my own.
Correspondence concerning this article should be addressed to Tatiana M. Nekrasova, Depart-
ment of English, Northern Arizona University, P.O. Box 6032, Flagstaff, AZ 86011-6032. Internet:
[email protected]

Language Learning 59:3, September 2009, pp. 647–686 647



© 2009 Language Learning Research Club, University of Michigan
Nekrasova Knowledge of Lexical Bundles

2007; Pawley & Syder, 1983; Raupach, 1984; Wood, 2006). In addition, proper
use of formulaic sequences has been found to be critical for the acquisition of
nativelike language competence (Dufon, 1995; House, 1996).
Formulaic sequences, as a broad category, include many different sub-
classes: proverbs, lexicalized stems, clichés, speech formulae, idioms, recur-
ring utterances, and others. Wray (2002) provided a list of terms that are used
to describe aspects of formulaic language in terms of their place on a contin-
uum from being completely fixed (e.g., idioms and set expressions) to more
compositional (e.g., semi-preconstructed phrases, sentence builders, patterns).
Although formulaic language has been the focus of linguistic inquiry for sev-
eral decades, only a few relatively fixed subclasses of formulaic sequences
have been targeted in traditional linguistic studies conducted in phraseology
and pragmatics. As a result, more compositional subclasses of formulaic se-
quences that differed from idioms and set expressions in their structural and
functional characteristics were largely ignored in linguistic research until the
development of corpus-based methods of data analysis. Ever since corpus-driven
research revealed that, in addition to idioms and set phrases, a much
greater number of language constructions have a tendency to occur together,
attempts have been made to formally describe and classify these units in order
to examine their formulaic nature and identify their importance for language
production and language acquisition. These co-occurring constructions and,
more specifically, the question of their psychological reality are the focus of
the present study. Before turning the current discussion to the topic at hand, the
following sections provide a brief overview of research conducted on formulaic
sequences in phraseology, pragmatics, and corpus linguistics, thus situating the
present study within a broader scope of formulaic language.

Previous Research on Formulaic Sequences


Phraseology
Early research on formulaic language has directed a considerable amount of
attention to idioms, which have been traditionally defined as chunks of frozen
syntax that are not constructed from the generative grammar rules and are re-
trieved holistically at the moment of use (e.g., raining cats and dogs, kick the
bucket, spill the beans). As nontransparent constructions, idioms were com-
monly viewed as archetypical formulaic sequences; and as some researchers
argued, it was the nontransparency of the idioms that defined their formulaic sta-
tus (Hudson, 1998; Nattinger & DeCarrico, 1992; Williams, 1994). At the same
time, other scholars advocated for a broader definition of an idiom to include
partly analyzable constructions together with nontransparent ones (Cowie,
1988; Wray, 2002). Specifically, Cowie (1988) suggested that instead of separat-
ing nontransparent idioms from other (partially) transparent expressions, there
should be a continuum of idiomatic expressions that would incorporate “very
many semantically evolved composites which are still partially analysable”
(p. 135). This alternative account of idiomatic expressions expanded the cat-
egory of formulaic sequences by going beyond traditional idioms and recog-
nizing other types of constructions, more transparent in nature, as formulaic.

Pragmatics
Another subclass of formulaic sequences—speech formulas (or routine
formulas)—has been largely explored in pragmatic research within the frame-
work of speech acts through the work of the linguists who examined the lan-
guage of routine social encounters (Coulmas, 1979, 1981; Ferguson, 1976;
House, 1996). Speech formulas were identified as set expressions that are tied
to particular predictable situations and are used to realize such speech func-
tions as thanking, apologizing, and others (e.g., thank you very much, I am very
sorry). Although semantically more transparent compared to traditional idioms,
speech formulas acquired their formulaic status from their ability to meet cer-
tain functional demands that, subsequently, led to their high predictability and
frequency of occurrence in certain types of social situations. At the same time,
speech formulas were described to be similar to idioms in terms of their form:
both subclasses of formulaic sequences are considered to be relatively fixed,
with certain types of speech formulas, however, being defined as more compo-
sitional than others (e.g., Nattinger & DeCarrico, 1992; Van Lancker-Sidtis &
Rallon, 2004).

Corpus Linguistics
The development of corpus-based techniques introduced a new way to explore
a language. Whereas previous language research relied exclusively on native-
speaker intuition when describing language units, corpus linguistics brought
in a more objective frequency-based approach to not only offer new insights
about existing language regularities but also to reveal previously unobserved
phenomena (e.g., Conrad, 2000; Lindemann & Mauranen, 2001). Thus, when
analyzing a range of oral and written corpora, it became obvious that, in addition
to already established classes of formulaic sequences (sayings, proverbs, speech
formulae, and idioms), a large number of language units were found to co-
occur in preferred order without being governed by specific grammar rules
(Altenberg, 1998; Biber & Conrad, 1999; Granger, 1998; Moon, 1998; Sinclair,
1999; Wray, 2002).

Observations from corpus-driven research motivated a growing number of
studies that explored different structural types of recurrent multiword chunks
in various kinds of oral and written corpora: recurrent word combinations
(Altenberg, 1998), prefabricated patterns (Granger, 1998), phrasal lexemes
(Moon, 1998), highly recurrent word combinations (De Cock, 2000), and lex-
ical bundles (Biber & Conrad, 1999; Biber, Johansson, Leech, Conrad, &
Finegan, 1999), all of which are identified in a language on the basis of their
frequency of co-occurrence. These structural and semantic distinctions of re-
current multiword constructions from other subclasses of formulaic sequences
thus led to questions about the formulaic status of recurrent combinations.
However, whereas much research has investigated the psychological reality of
traditional idioms and set expressions (see Gibbs & Gonzales, 1985; Gibbs,
Nayak, & Cutting, 1989; Swinney & Cutler, 1979; Van Lancker & Kempler,
1987), little has been done to examine the psychological status of recurrent
multiword chunks (but see Schmitt, Grandage, & Adolphs, 2004). The present
study examines the issue of psychological validity of a particular type of recur-
rent chunks: lexical bundles. The following sections briefly outline the research
on lexical bundles conducted to date and provide the rationale for the present
study.

Identification and Characteristic Features of Lexical Bundles


First introduced by Biber and colleagues, lexical bundles are defined as the
most frequently occurring sequences “of three or more words that show a sta-
tistical tendency to co-occur” (Biber & Conrad, 1999, p. 183). Lexical bundles
were initially identified in two major registers of The Longman Grammar of
Spoken and Written English (Biber et al., 1999)—conversation and academic
prose—as units that occurred at least 10 times per million words. One charac-
teristic feature that distinguishes lexical bundles from other types of recurrent
multiword chunks is that they are often structurally incomplete units that occur
at the phrase and clause boundaries (e.g., in the case of, the point of view of,
I don’t know if ). Structurally, these units can consist of incomplete nominal
chunks (i.e., prepositional or noun phrase: the nature of the, as a result of ) or
clausal chunks (i.e., verb phrase and the beginning of a complement clause: I
don’t know how, I thought that was). Furthermore, shorter lexical bundles can
often be incorporated within longer lexical bundles, sometimes more than one
(e.g., I don’t think as a part of well I don’t think, I don’t think so, but I don’t
think). In addition, lexical bundles can be classified in terms of their discourse
functions, which are described in the next section.
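
The frequency-based identification described above can be illustrated with a minimal sketch. It assumes whitespace-tokenized text and a single normalized-frequency cutoff; the function name `extract_bundles`, its parameters, and the toy sentence are hypothetical and are not taken from Biber et al. (1999), whose procedure may involve further criteria that are not modeled here.

```python
from collections import Counter


def extract_bundles(tokens, n=4, min_per_million=10.0):
    """Return n-word sequences whose frequency per million words meets the cutoff.

    Hypothetical helper: a simplification that ignores punctuation,
    sentence boundaries, and any criteria other than raw frequency.
    """
    total = len(tokens)
    if total < n:
        return {}
    # Count every contiguous n-word sequence in the token list.
    counts = Counter(tuple(tokens[i:i + n]) for i in range(total - n + 1))
    # Scale raw counts to a rate per one million running words.
    per_million = 1_000_000 / total
    return {
        " ".join(gram): count * per_million
        for gram, count in counts.items()
        if count * per_million >= min_per_million
    }


# Toy input (invented for illustration); the cutoff is raised artificially
# because an 18-word text cannot be compared with a multimillion-word corpus.
toy = "as a result of the test in the case of the test as a result of the delay".split()
bundles = extract_bundles(toy, n=4, min_per_million=100_000)
```

On a real corpus the default cutoff of 10 occurrences per million words corresponds to the threshold reported for the Longman registers; the toy cutoff only keeps the sequences that occur twice in the invented sentence.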


According to Biber, Conrad, and Cortes (2004), the four primary functions
of lexical bundles identified in English academic registers and conversation in-
clude (a) stance bundles that convey interpersonal meanings, such as attitudes
and assessments (e.g., it is important to, I don’t think so, I want you to); (b)
discourse organizers that help reveal relationships between prior and coming
discourse, such as topic introduction and topic elaboration (e.g., nothing to do
with, on the other hand, as well as the); (c) referential bundles that perform an
ideational function and are used to make direct reference to physical or abstract
entities, such as time, place, and text references (e.g., is one of the, in the form
of, as a result of, the nature of the); and (d) special conversational bundles
that are mostly used in conversation to express politeness, inquiry, and report
(e.g., thank you very much, what are you doing, I said to him/her). These four
discourse functions of lexical bundles should be distinguished from pragmatic
functions that other multiword constructions, such as speech formulas, can have
(Coulmas, 1979, 1981). Pragmatic functions are usually associated with highly
conventionalized expressions that are more salient and, thus, are used to effec-
tively communicate certain pragmatic meanings, such as expressing requests,
apologies, or gratitude. Unlike speech formulas that are more interactional in
nature and are highly dependent on conversational context, the majority of
lexical bundles operate on a textual level and are relatively “context-free.” For
example, whereas the occurrence of a speech formula Nice to see you is closely
bound to a social situation of greeting a person, the occurrence of a lexical
bundle nothing to do with is not associated with any specific situation and
can be equally frequent in a variety of contexts. In this regard, lexical bundles
that serve special conversational functions in discourse are more likely to have
pragmatic functions.

L1 and L2 Research on Lexical Bundles


First language (L1) research on lexical bundles has mostly focused on the
identification of these units and a description of their patterns of occurrence in
different English L1 registers, including both written academic prose and con-
versation (Biber & Conrad, 1999; Biber et al., 1999; Biber, Conrad, & Cortes,
2004). More recent L1 research on lexical bundles has focused on identifying
the discourse functions that lexical bundles serve in different texts (Biber, Con-
rad, & Cortes; Cortes, 2004), as discussed in the previous section. In addition,
a number of corpus studies have adopted a contrastive approach to the analysis
of the use of corpus-derived constructions (including lexical bundles as well
as other recurrent multiword chunks) by comparing L1 and L2 written and
oral corpora (De Cock, 2000; De Cock, Granger, Leech, & McEnery, 1998;
Granger, 1998; Warga, 2005). These studies indicated that L1 and L2 (second
language) speakers’ use of recurrent phrases was different both quantitatively
and qualitatively (De Cock, 2000; De Cock et al.). More specifically, L2 speak-
ers were found to be unaware of the more common, yet less salient L2 chunks,
and in order to compensate for their lack of awareness, they often resorted to
L1 transfer. The process of L1 transfer was realized in several ways. First, L2
speakers were found to either modify or avoid using certain L2 constructions
that did not have L1 equivalents. Second, L2 speakers tended to overuse those
L2 constructions whose L1 equivalents were more common. Finally, L2 speak-
ers showed the misuse of those constructions whose L2 equivalents did not
match their L1 counterparts. As De Cock (2000) argued, turning to L1 transfer
during L2 production could potentially lead to the “foreign-soundness” of L2
speakers’ speech and writing.
Because lexical bundles are defined as combinations that occur frequently
in a text or a collection of texts, it is logical to assume that the frequency counts
serve as an indication of these units being conventionalized by the speech
community, which would suggest their formulaic nature. At the same time,
some corpus linguists argue that simple frequency counts do not provide enough
grounds to view any corpus-derived construction as formulaic (De Cock, 2000;
De Cock et al., 1998). One of the reasons for skepticism is that frequency
information may not be relevant to how language structure is represented in
one’s mind. For example, a combination of the two words “it” and “is” is extremely
frequent in the English language, mostly because the individual words included
in this combination are closed-class items that occur very frequently in any
corpus. Thus, it is very unlikely that this combination is represented in the
mind as a holistic unit and can be defined as formulaic. Another reason to
question the assumptions about the formulaic nature of lexical bundles comes
from their structural and functional differences from other established classes
of formulaic sequences. In order to contribute to the existing body of research
on lexical bundles and define their place within a broader category of multiword
constructions, the present study explores the issue of psycholinguistic validity
of lexical bundles.

Research on Psychological Reality of Lexical Bundles


The extent to which lexical bundles (or other recurrent lexical sequences) could
be considered to be psycholinguistically real has not been fully examined, as
corpus linguists have become interested in this topic only in the last decade
and the majority of lexical bundle studies generally describe the distribution of
these units in different registers (see Biber & Conrad, 1999; Biber, Conrad, &
Cortes, 2004; Biber et al., 1999; Cortes, 2004).
In the only study to date that has examined the issue of psycholinguistic
validity of lexical bundles, Schmitt, Grandage, et al. (2004) questioned whether
corpus-derived recurrent clusters (i.e., lexical bundles) are psycholinguistically
valid and, therefore, stored and processed holistically. After identifying 25
sequences from previous publications, they created a text about a hitchhiker
and embedded the target sequences in it. Both English L1 (n = 34) and L2
(n = 45) participants performed a dictation task during which they listened
to the recorded text and orally reconstructed it sentence by sentence. The
authors argued that the bundles the participants were able to reproduce could
be considered formulaic and were holistically stored in the mind. The findings of
the study suggested that not all corpus-driven clusters were psycholinguistically
valid according to their criteria, with many of them being used idiosyncratically
by the individual speakers. The researchers concluded that both corpus and
psycholinguistic approaches should be used when deciding whether corpus-
driven clusters share the same psycholinguistic characteristics as holistically
stored formulaic chunks. By employing the sequences that varied in length,
frequency, and transparency of meaning, the study did not provide conclusive
evidence to either bridge the two categories (i.e., lexical bundles and formulaic
chunks) or distinguish them.
Schmitt, Grandage, et al.’s (2004) study is innovative in that it put a commonly
accepted assumption to empirical testing. At the same time, this study
displayed several limitations that need to be addressed here. First, not all target
sequences employed in the study could qualify as recurrent bundles: Some of
them were much more frequent in the corpus than others (e.g., you know vs.
to make a long story short). Second, whereas some of the bundles could be
described as more salient in terms of the pragmatic functions they realized in
certain language situations (e.g., I don’t know what to do, go away, to make a
long story short, it’s not too bad), other bundles did not have any pragmatic
function and served more like cohesive devices in a text (e.g., as shown in
figure, is one of the most, what I want to, etc.). Finally, some of the bundles
examined in the study were clearly extracted from academic register, whereas
other bundles came from and were characteristic of a more informal register
(i.e., conversation). Both types of bundles were then embedded in a story about
a hitchhiker, a narrative that had a rather informal tone, which, as the authors
acknowledged, might have created some difficulties for the academic register
bundles to be equally produced by the participants. Thus, the choices the
authors made during the initial selection of the target structures and the context
in which they were embedded might have contributed to the inconclusive results.

Present Study
The present study was designed to contribute to the debate concerning the psy-
cholinguistic validity of lexical bundles by addressing some of the limitations
of the previous research conducted in this area. First, all target structures em-
ployed in the study were lexical bundles; thus, they were identified strictly on
the basis of frequency counts. Second, all lexical bundles were homogeneous in
terms of their functional characteristics: They all performed discourse functions
signaling the relationships between different elements (i.e., phrases, clauses,
sentences) in a text. None of the bundles had an advantage of expressing a prag-
matic function by carrying out the meaning related to a certain conversational
context (e.g., See you later in a situation of saying “good-bye” to someone).
Furthermore, the findings of previous L1 corpus-based studies indicated that
the discourse function of lexical bundles was related to the frequency of their use by the
participants (Cortes, 2004, 2006). Therefore, the present study also investigates
the possible effect of two discourse functions of lexical bundles—referential
bundles and discourse organizers—that may affect their production by L1 and
L2 English speakers. Finally, an attempt was made to ensure that all contexts in
which target lexical bundles were embedded were register-appropriate, that is,
both the target lexical bundles and the contexts belonged to the same registers:
university teaching and textbooks.
The main purpose of the study was to examine if lexical bundles are recog-
nized by L1 and L2 participants as holistic units and, therefore, have psycholog-
ical validity. Following Schmitt, Grandage, et al.’s (2004) study, it was assumed
that no direct nonlaboratory measure was available to determine whether L1
and L2 participants recognize lexical bundles as holistic units. For that reason,
participants’ recognition of lexical bundles as holistic units was operationalized
as (a) their ability to produce them as fixed units in both short and extended
pieces of discourse, (b) their ability to produce lexical bundles in a contextually
appropriate manner, and (c) participants’ use of lexical bundles to ease
the processing burden during text comprehension and subsequent production
(see Wray, 2000, 2002; Wray & Perkins, 2000). The study consists of two
experiments that employed different measures to assess whether L1 and L2
English speakers have knowledge of lexical bundles as holistic units. Whereas
Experiment 1 involved a controlled-production activity (a gap-filling task), Ex-
periment 2 employed an extended production activity (a timed dictation task).

Both experiments addressed the same research question: Do English L1 and
L2 speakers differ in their knowledge of lexical bundles that serve different
discourse functions? Because previous research has indicated that native and
nonnative speakers use lexical bundles differently (De Cock, 2000; Granger,
1998; Warga, 2005), it was predicted that L1 speakers would display greater
knowledge of lexical bundles than L2 learners. In addition, because previous
corpus-based studies have illustrated that lexical bundles performing certain
discourse functions were used more frequently by the participants than lexical
bundles performing other functions (Cortes, 2004, 2006), it was predicted that
the participants’ production of lexical bundles in the present study would be
affected by the discourse function performed by these units in context.

Experiment 1
Method
Participants
The participants were L1 English speakers (n = 20), advanced L2 English
speakers (n = 18), and intermediate L2 English speakers (n = 23), all of
whom were undergraduate and graduate students at a regional university in
the western United States. None of the participants were majoring in applied
linguistics or TESL. The L1 speakers consisted of 4 males and 16 females,
aged between 18 and 45 years (M = 24.3, SD = 7.88). The advanced L2
speakers included 8 males and 10 females, aged between 20 and 43 years (M
= 28.44, SD = 8.32), who had completed between 3 and 16 years (M = 10.56,
SD = 3.71) of formal high school/college education in English and reported
the length of residence in the United States ranging from 1 to 127 months
(M = 20.17, SD = 30.56). The intermediate L2 speakers included 12 males
and 11 females, aged between 17 and 37 years (M = 20.7, SD = 3.85), who
had completed between 4 and 11 years (M = 7.17, SD = 1.99) of formal
English instruction, and their length of residence in the United States was
reported to be between 2 and 48 months (M = 6.39, SD = 9.65). The advanced
and intermediate groups were established on the basis of the participants’
enrollment status. Whereas the advanced L2 speakers were degree-seeking
undergraduate or graduate students, the intermediate L2 speakers were enrolled
in an Intensive English Program. The participants volunteered to take part in
the study and were not compensated.
Materials
Gap-filling task. Following Schmitt, Grandage, et al. (2004), it was assumed
that no direct measures were available to assess participants’ knowledge
of lexical bundles as holistic units. Thus, if the participants could, based on
the context, reproduce the missing parts of lexical bundles correctly, it would
suggest that they had the knowledge of these units as holistic entities. The
gap-filling task was designed to measure whether participants were able to
recognize and produce the missing parts of the target lexical bundles based
on the surrounding context. The test materials consisted of 32 sentences with
embedded lexical bundles that performed two different discourse functions:
discourse-organizing bundles (n = 15) and referential bundles (n = 17). The
two sets of lexical bundles were matched for frequency. Only these two func-
tional types of lexical bundles were selected to be used in the study because,
compared to the stance bundles, they contained less repetition of the same word
sequence in the form and, thus, were less synonymous (e.g., stance bundles:
I don’t know if, I don’t know what, I don’t know how, etc.). The vast use of
synonymous structures in the task could potentially make it difficult for the
participants to display their knowledge of a specific lexical bundle in a context
that allows multiple alternatives. Finally, special conversational bundles were
not targeted in the materials because the Biber, Conrad, and Cortes (2004)
original list included only three of these bundles.
The gap-filling task was designed in several steps. First, 32 lexical bun-
dles and their functions (see Appendix A) were identified from Biber, Conrad,
and Cortes’s (2004) corpus-based study of university discourse. The study was
based on an analysis of texts from university registers in the TOEFL 2000
Spoken and Written Academic Language Corpus (Biber, Conrad, Reppen,
et al., 2004) and focused on four-word bundles that occurred 40 or more times
per million words. Because one’s language experience is shaped by information
coming from different types of registers (i.e., academic and everyday language),
an attempt was made to match the bundles in both functional categories in terms
of their frequency distribution in academic prose as well as conversation. Oth-
erwise, some bundles that were found to be frequent in both academic prose
and conversation could potentially be more salient (i.e., more recognizable)
than those bundles that were frequent only in academic prose. Thus, 40% of
discourse-organizing bundles were frequent in both the academic prose and
conversation, whereas 60% of the bundles were frequent only in academic
prose. For the referential bundles, the frequency distribution was 41% for the
academic prose and conversation and 59% for the academic prose. Next, using
the academic subcorpus of The Longman Grammar of Spoken and Written En-
glish Corpus (Biber et al., 1999), each lexical bundle was embedded within an
attested context in which the function of the bundle (i.e., discourse-organizing
or referential) would remain the same as initially identified. Then one content
word within a bundle was deleted with space provided to be filled in by the
participants. Finally, all test items were randomly ordered and presented as a
list of sentences (see Appendix B).
The decision as to which word within a lexical bundle to delete was moti-
vated by two criteria. First, lexical bundles are typically described as incomplete
structural units (Biber et al., 1999), so they usually include a limited set of func-
tion words, such as articles, particles, and prepositions, that are often used to
construct the frame of a lexical bundle (e.g., to __ with the, in the __ of, the __
of the, etc.). Because one frame could be employed in several different lexical
bundles, it would be easier for the participants to produce the missing elements
of the frame; this, however, would not necessarily illustrate their knowledge of
a specific bundle. Thus, the decision was made to delete a content word that is
used uniquely in each bundle to explore if the participants could produce each
individual bundle rather than the frame (e.g., the bundles the rest of the and the
top of the are created from the same frame the __ of the). Second, each frame
could potentially be filled with a number of different content words (e.g., in the
absence of, in the form of, in the case of ). Therefore, selecting an appropriate
content word associated with a particular frame in a certain context would provide
more support for the idea that participants recognize a certain lexical bundle
as a unit.
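
The relation between a bundle, its frame, and a gap-filling item can be made concrete with a short sketch. Everything in it is an illustrative assumption: the function-word list, the helper names `frame_of` and `make_gap_item`, the rule of blanking the first content word, and the carrier sentence are hypothetical and do not reproduce the actual test materials in Appendix B.

```python
# Hypothetical, deliberately small set of function words that build frames.
FUNCTION_WORDS = {"the", "of", "in", "to", "with", "a", "as", "on", "at"}


def frame_of(bundle):
    """Replace every content word with a blank, leaving the function-word frame."""
    return " ".join(w if w in FUNCTION_WORDS else "__" for w in bundle.split())


def make_gap_item(sentence, bundle):
    """Blank out the first content word of the bundle inside the sentence.

    Returns the gapped sentence and the deleted word (the expected answer).
    Choosing the first content word is an assumption made for this sketch.
    """
    words = bundle.split()
    target = next((w for w in words if w not in FUNCTION_WORDS), None)
    if target is None or bundle not in sentence:
        raise ValueError("bundle has no content word or is not in the sentence")
    gapped = " ".join("__" if w == target else w for w in words)
    return sentence.replace(bundle, gapped, 1), target


# Two bundles cited in the text share one frame.
shared = frame_of("the rest of the") == frame_of("the top of the")

# Invented carrier sentence, not taken from the corpus.
item, answer = make_gap_item("The data appear in the form of a table.", "in the form of")
```

Deleting the content word rather than a function word means that a correct response cannot be produced from the frame alone, which is the rationale given above for the deletion criterion.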
The materials were pilot-tested with 10 L1 and 10 L2 speakers of English.
Based on the pilot test, a few sentences were judged to be too difficult for
intermediate learners and the contexts for the target bundles were replaced. The
replaced contexts were also selected from the corpus. A split-half reliability
procedure was used to measure the internal consistency among the items in the
gap-filling task. Because there were two different structures targeted in the task,
separate Guttman split-half coefficients were obtained for discourse-organizing
and referential bundles, which were .86 and .84, respectively, suggesting that
both sections of the task had sufficient internal consistency.
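The split-half procedure can be illustrated with a minimal sketch. It assumes the standard Guttman split-half formula, 2(1 − (var A + var B) / var(A + B)), applied to each participant's scores on the two halves of a task. The function name, the odd/even split, and the scores are invented for illustration; the article does not report how the items were divided or the underlying data.

```python
from statistics import pvariance


def guttman_split_half(half_a, half_b):
    """Guttman split-half coefficient from per-participant half-test scores."""
    totals = [a + b for a, b in zip(half_a, half_b)]
    # Ratio of summed half variances to the variance of the total scores.
    ratio = (pvariance(half_a) + pvariance(half_b)) / pvariance(totals)
    return 2 * (1 - ratio)


# Invented scores for five participants (assumption, not data from the study).
odd_items = [3, 5, 4, 6, 2]
even_items = [4, 5, 3, 6, 3]
coefficient = guttman_split_half(odd_items, even_items)
print(round(coefficient, 2))
```

A coefficient is computed separately for each bundle type in the study, so the function would be called once per section of the task.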

Design and Procedure


The study had a 2 × 3 mixed design, with the discourse function (discourse-
organizing vs. referential) as a within-group variable and participant group (L1
vs. advanced L2 vs. intermediate L2 speakers) as a between-groups variable.
The L1 and advanced L2 speakers scheduled a 30-min appointment with the
researcher, during which they completed the gap-filling task. The intermediate
L2 speakers completed the task during their scheduled English as a second
language (ESL) class. The session lasted approximately 30 min.

Analysis
The data were scored by the researcher by giving one point to each lexical
bundle for which the participant provided a contextually appropriate word.
Because some lexical bundles could be equally possible (e.g., at the beginning
of, at the end of) or synonymous (e.g., what I want to, what I have to, what I need
to) in certain contexts, the decision was made to give one point to each bundle
that was produced as contextually appropriate. All modifications of the original
bundles were checked in terms of their frequency range against the TOEFL
2000 Spoken and Written Academic Language Corpus and were given one
point if they occurred at least 10 times per million words in any written or oral
register in the corpus (e.g., what I have to do, the order in which). The decision
to use this frequency cutoff was based on the definition of a lexical bundle
as a frequently occurring sequence in a register, originally identified by Biber
et al. (1999) as a unit that occurred at least 10 times per million words. No points
were given if the resultant sequence did not occur frequently in the corpus (i.e.,
did not qualify as a lexical bundle), was not contextually appropriate, or if the
item was left blank. Spelling errors were ignored. An independent rater scored
36% of the test data, and simple percentage agreement with the researcher was
98%. After the data were scored, each lexical bundle was analyzed in terms
of how frequently and how accurately it was completed as well as how much
modification to the original form it exhibited. Due to the unequal number of
discourse-organizing and referential bundles in the gap-filling task, raw scores
were converted into proportions, which were then used in the statistical tests.
In addition to the significance tests, the results were also analyzed in terms of
the effect size to estimate the magnitude of the observed differences, measured
by the standardized difference between the means (Rosenthal & Rubin, 1982).
Alpha was set at .05 for all statistical tests.
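The scoring rule described above can be sketched as follows. This is an illustrative reading of the procedure, not the researcher's actual scoring instrument: contextual appropriateness was a human judgment and is passed in as a flag, and the frequency values are hypothetical rather than taken from the TOEFL 2000 corpus.

```python
FREQ_CUTOFF_PER_MILLION = 10  # cutoff stated in the text


def score_item(response, original, appropriate, corpus_freq):
    """Return 1 or 0 for one gap-filling item.

    response: the bundle as completed by the participant ("" if left blank)
    original: the target bundle
    appropriate: rater's judgment of contextual appropriateness
    corpus_freq: bundle -> occurrences per million words (hypothetical here)
    """
    if not response.strip() or not appropriate:
        return 0
    if response == original:
        return 1
    # A modified bundle earns a point only if it is frequent enough.
    return 1 if corpus_freq.get(response, 0) >= FREQ_CUTOFF_PER_MILLION else 0


def proportion_correct(points):
    """Convert raw points to a proportion, as done for unequal item counts."""
    return sum(points) / len(points)


# Hypothetical frequencies, for illustration only.
freqs = {"what I have to": 25.0, "what I wish to": 1.0}
points = [
    score_item("what I want to", "what I want to", True, freqs),
    score_item("what I have to", "what I want to", True, freqs),
    score_item("what I wish to", "what I want to", True, freqs),
    score_item("", "what I want to", False, freqs),
]
print(points, proportion_correct(points))
```

Spelling errors were ignored in the study; a fuller version would normalize spelling before the comparison.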

Results
The research question asked whether English L1 and L2 speakers differed in
their knowledge of lexical bundles that served different discourse functions.
As shown in Table 1, L1 speakers scored the highest on the gap-filling task,
with a mean score of .88 (SD = .06). In terms of the two L2 groups, advanced
L2 speakers scored higher (M = .84, SD = .08) than the intermediate L2
speakers (M = .53, SD = .11). Furthermore, compared to the mean scores for
the referential bundles, the mean scores for discourse organizers were higher
for all three groups, with advanced learners’ scores (M = .93, SD = .06) being
the same as the native speakers’ scores (M = .93, SD = .07).

Table 1 Completion of lexical bundles by English L1 and L2 speakers

                                   Referential  Discourse-org.           Total
Group                        N       M      SD       M      SD       M      SD
L1 speakers                 20     .83     .08     .93     .07     .88     .06
Advanced L2 speakers        18     .77     .11     .93     .06     .84     .08
Intermediate L2 speakers    23     .43     .11     .63     .17     .53     .11

Table 2 The Bonferroni analysis of multiple comparisons (Experiment 1)

(I) groups              (J) groups                  Mean difference  Std. Error    Sig.
L1 speakers             Advanced L2 speakers                    .03         .03    .708
L1 speakers             Intermediate L2 speakers                .35         .02    .000
Advanced L2 speakers    Intermediate L2 speakers                .32         .02    .000

To make statistical comparisons between the participant groups and lexical
bundle types, the data for each bundle type from all three participant
groups were submitted to a linear mixed model, with participant group as the
between-groups, three-level factor (L1 speakers, advanced L2 speakers, and
intermediate L2 speakers) and lexical bundle type as the within-group, two-level
factor (referential and discourse-organizing bundles). Results of the evaluation
of assumptions of normality and homogeneity of variance-covariance were
satisfactory. There was a significant main effect for group,
F(2, 112.91) = 132.90, p < .05, ω² = .39, and for bundle type,
F(1, 112.91) = 59.97, p < .05, Cohen’s d = .80. There was no significant
interaction between group and bundle type,
F(2, 112.91) = 2.73, p > .05. A pairwise comparison of the test scores
for the three participant groups using a Bonferroni adjustment (Table 2) in-
dicated that there was a significant difference in the scores between L1 and
intermediate L2 speakers, p < .001 (Cohen’s d = 3.88) and between advanced
and intermediate L2 speakers, p < .001 (Cohen’s d = 3.22). However, no sig-
nificant difference was found between L1 speakers and advanced L2 speakers
(p > .05).
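The pairwise effect sizes can be approximated from the rounded means and standard deviations in Table 1. The sketch below assumes the common pooled-standard-deviation form of Cohen's d; the article does not state which variant was used, and because the inputs are rounded the results match the reported values only approximately.

```python
from math import sqrt


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using a pooled standard deviation (assumed variant)."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var)


# Total scores from Table 1: (mean, SD, N)
d_l1_vs_intermediate = cohens_d(0.88, 0.06, 20, 0.53, 0.11, 23)
d_advanced_vs_intermediate = cohens_d(0.84, 0.08, 18, 0.53, 0.11, 23)
print(round(d_l1_vs_intermediate, 2), round(d_advanced_vs_intermediate, 2))
```

Both values come out well above 3, in line with the reported effect sizes of 3.88 and 3.22.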
Further analysis of the results indicated that L1 speakers completed eight
lexical bundles appropriately 100% of the time (at the end of, at the same time,
on the other hand, I would like to, know what I mean, one of the most, the
rest of the, to look at the), whereas 18 bundles were completed appropriately
80–99% of the time (e.g., what I want to, nothing to do with, is one of the, take
a look at, in the form of, or something like that). Six of the bundles completed
appropriately at least 80% of the time were completed with the same word by
all L1 participants (at the same time, on the other hand, know what I mean, one
of the most, nothing to do with, or something like that). Only one bundle (in
the absence of) was appropriately completed by L1 speakers less than 50% of
the time. Advanced L2 speakers completed 11 bundles appropriately 100% of
the time, six of which overlapped with those completed by L1 speakers (at the
end of, at the same time, on the other hand, I would like to, know what I mean,
one of the most). Furthermore, advanced L2 speakers completed eight lexical
bundles as fixed units (i.e., with the same word), six of which were the same
as those completed by L1 speakers; the two additional bundles were I would
like to and if you look at. Three bundles were completed appropriately less than
50% of the time (e.g., in the absence of, on the basis of, as a result of). Finally,
intermediate L2 speakers appropriately completed only 1 lexical bundle 100%
of the time (what I want to), 7 lexical bundles 80–99% of the time, and 14
bundles less than 50% of the time (e.g., in the absence of, on the basis of, in
terms of the, as a result of, in the case of, in the form of, in the presence of).
In addition, intermediate L2 speakers completed three lexical bundles as fixed
units, which overlapped with those produced by L1 and advanced L2 speakers
(at the same time, on the other hand, one of the most).

Summary of the Findings


To summarize the findings of Experiment 1, whereas intermediate L2 speakers
scored significantly lower on the gap-filling task than the other two groups,
L1 and advanced L2 speakers did not show any difference in their knowledge
of target structures, which does not support the findings of previous studies
that compared L1 and L2 speakers (De Cock et al., 1998; Granger, 1998). The
results of Experiment 1 also suggest that L2 speakers become capable
of accurately producing more lexical bundles as their proficiency increases,
which is consistent with previous studies (Schmitt, Grandage, et al., 2004). Finally,
all three participant groups displayed greater familiarity with the targeted
discourse-organizing bundles than referential bundles, a finding which will be
discussed in more detail in the general discussion section.
Experiment 1 demonstrated that in a controlled-production task participants
displayed knowledge of some bundles more frequently than others. This
task, however, prompted participants’ recognition of lexical bundles by providing
them with immediate contexts for each missing word. In order to further
explore this issue, Experiment 2 was carried out to investigate participants’
production of lexical bundles in a more extensive production activity: a dictation
task. Furthermore, Experiment 2 was designed to investigate whether participants’
use of lexical bundles helped them free up additional attentional resources to
be able to retain more information when comprehending a text, which would
provide additional support for the holistic status of lexical bundles. The first
research question asked in Experiment 2 was similar to the research question
in Experiment 1: Do English L1 and L2 speakers differ in their knowledge of
lexical bundles that serve different discourse functions? The second research
question asked in Experiment 2 was as follows: Does the use of lexical bundles allow L1
and L2 speakers to retain more information during discourse comprehension
and subsequent production? Based on the findings of Experiment 1, it was pre-
dicted that higher proficiency L2 speakers would be very similar to L1 speakers
in their recall of lexical bundles and both groups would outperform lower profi-
ciency L2 speakers. Additionally, because Experiment 1 demonstrated that the
discourse function of lexical bundles affected their completion by L1 and L2
English speakers, it was predicted that participants in all three groups would
recall discourse-organizing bundles more often than referential bundles.

Experiment 2
Method
Participants
L1 speakers. The L1 speakers in this study were L1 speakers of American
English who were students at a regional university in the western United States.
Twenty-one participants were recruited on a voluntary basis from among stu-
dents enrolled in a Freshmen Composition course and were offered five extra
credit points for their participation in the study. None of the participants in
Experiment 2 took part in Experiment 1. The participants consisted of 9 males
and 12 females, aged between 18 and 23 years (M = 18.9, SD = 1.09). Only
one participant reported that they had taken a course that discussed language
acquisition.
L2 speakers. In order to account for possible L1 influence, all of the L2
participants in Experiment 2 were from the same L1 background. The L2
speakers were English as a Foreign Language (EFL) learners (N = 40) enrolled
in a public university in western Siberia, Russia. The participants consisted
of 6 males and 34 females, aged between 19 and 22 years (M = 20.38, SD
= .93), who were all native speakers of Russian. They had completed between 7
and 17 years (M = 11.86, SD = 2.29) of formal English instruction, and none
of them reported that they had lived in or visited countries where English was
spoken as a native language. All participants reported that they had never taken
courses in Second Language Acquisition, Psycholinguistics, Genre/Discourse
Studies, Semantics, or Pragmatics. No objective measure of the L2 speakers’
general proficiency in English was available because they had never taken
standardized tests, such as the Test of English as a Foreign Language (TOEFL).
Therefore, a cloze test, described in the materials section, was used to assess the
L2 speakers’ English abilities and assign them to different proficiency groups
relative to each other. On the basis of the cloze test results, the higher proficiency
group included 23 participants, whereas the lower proficiency group consisted
of 17 participants.

Target Structures
The target structures were lexical bundles as defined previously, which repre-
sented two discourse functions: referential and discourse-organizing. Based on
Biber, Conrad, and Cortes’s (2004) corpus-based study of university discourse,
12 lexical bundles (see Appendix C) were selected from the corpus of classroom
teaching and textbooks. Three criteria were considered when selecting target
bundles. First, bundles in both functional categories were matched for word
length (12.8 letters/bundle for discourse organizers and 12.2 letters/bundle
for referential bundles). Second, the bundles in both groups were matched for
the frequency range with which they occurred in classroom teaching and text-
books: In each category, three more frequent (40–99 times per million words)
and three less frequent (10–19 times per million words) sequences were used.
Third, an attempt was made to select bundles that were frequent in
both academic prose and conversation: Of the six discourse-organizing bundles,
three were frequent in both registers and three were frequent in academic
prose only. Of the six referential bundles, two were frequent in both registers and
four were frequent in academic prose only. Of the 12 lexical bundles
tested in Experiment 2, seven bundles were previously employed in Experiment
1 and five bundles were new (was one of the, than or equal to, the nature of
the, has to do with, in this chapter we).

Materials
The materials consisted of the dictation activity, a follow-up questionnaire, and
a cloze test.
Dictation activity. To elicit participants’ immediate recall of lexical bun-
dles, a dictation activity was used. Dictation is widely used in L2 classrooms
as a part of a dictogloss, which is claimed to be an effective language learn-
ing task that provides a context for negotiation and facilitates L2 learning
(Kowal & Swain, 1997; Swain, 1998; Wajnryb, 1990). For research purposes,
dictation is mostly used in experimental studies as a way to help learners establish
form-meaning connections as they receive L2 input, which is found to
be crucial for developing L2 competence (Izumi, 2002). In the present study,
a dictation activity was employed as a way to elicit participants’ production
data and encourage them to engage in syntactic processing. During the activity,
the participants listened to a recorded text divided into sections and recalled
the text section by section. The text used for the dictation was a section from
an “Introduction to Sociology” textbook (Ferrante, 2003) that was adapted to
match the proficiency level of the lower proficiency L2 speakers and to provide
context for the lexical bundles described previously. The adapted text consisted
of 14 sentences that included the 12 target lexical bundles (see Appendix D).
Twelve of the 14 sentences were based on sentences in the original text, with
over half of the words (57%) in the modified sentences identical to those in the
original. The other two sentences were created by the researcher to maintain
coherence. After the text was modified, it was divided into 13 sections (with 1
of the 13 sections containing two sentences), so that participants could process
and recall the text section by section, rather than the whole text at once. The
mean length of a section was 25 words. Some sections contained more than
one lexical bundle, and four sections did not include any lexical bundles. A
native speaker of American English read each section of the text twice at a
normal speed, which was digitally recorded as an audio file using the Voice
Studio Version 2 software. The audio recording was used to test all of the
participants in the study. The dictation task was pilot tested with L1 (n = 7)
and L2 (n = 10) English speakers. The task successfully elicited both types of
lexical bundles. Based on the pilot test, a practice item was included to help
the participants understand the task. To measure the reliability of the task, the
Guttman split-half coefficient was used to calculate the internal consistency of
the instrument. Because the task was designed to measure the production of
two different discourse types of lexical bundles, two coefficients were obtained:
.75 for discourse-organizing bundles and .73 for referential bundles.
Questionnaire. In order to collect information about linguistic choices
that participants made during the dictation, a follow-up questionnaire was
developed by the researcher. The questionnaire consisted of four items: two
multiple-choice questions in which the participants were asked to circle those
lexical bundles that they remembered from the original text and used in their
own texts, and two open-ended questions in which they were asked to explain
why they used (or did not use) lexical bundles in their recall and which bundles
they found to be particularly useful for understanding the meaning of the text.

Cloze test. As mentioned in the participants subsection of this experiment,
no objective measure of the L2 speakers’ language proficiency was available,
and a cloze test was used to rank them relative to each other. A cloze test was
used in the present study due to its ease of development and administration and
its reputation in L2 research as an effective testing instrument that can function
as an integrative measure of EFL language proficiency (Brown, 1980; Fotos,
1991; Heilenman, 1983). The cloze test employed in the study was adapted
from the text used in Sasaki (2000) and included content that was culturally
familiar to Russian participants. The 393-word text included 55 blanks, with
1 blank every 7 words, following Sasaki’s original design. The cloze test was
pilot-tested with 10 L1 English speakers, after which two changes were made
to the test: an indefinite article in the original test was substituted with a definite
article and one blank was eliminated from the test due to L1 speakers’ difficulty
completing it with a contextually appropriate lexical item. Cronbach’s alpha
was used to measure the reliability of the cloze test, and the index obtained was
.85, suggesting that the instrument had high internal consistency.

Design
This study employed a cross-sectional design to test the effect of the partici-
pants’ L1 background and the discourse function of lexical bundles on their
immediate recall of lexical bundles during the dictation activity. The dependent
variable was the participants’ immediate recall of lexical bundles, which was
operationalized as their score on the dictation activity.

Procedure

For the dictation activity, the instructions and the task were recorded as a single
audio file. All participants were tested in a computer lab. The participants
listened to the instructions, completed the practice item, and did the dictation
task during which they listened to the recorded text and recalled it section by
section. Each section of the text was recorded twice and was followed by
a 1–2-min pause for the participants to write down their recall of the section. The
participants were not allowed to take any notes while they were listening to the
recording. The same procedure was repeated for all 13 sections.

L1 Speakers
The L1 speakers enrolled in the Freshmen Composition course were tested by
their instructor during their scheduled class time. The instructor informed the
students about the experiment and reviewed the consent form with them. Those
students who agreed to participate in the study completed the consent form and
did the dictation task, followed by the questionnaire. Two additional students
who had been absent on the day of testing completed the tasks several days
later.

L2 Speakers
The data from the Russian participants were collected by their instructor during
their scheduled English class. The researcher electronically mailed all test ma-
terials and detailed instructions to the instructor and had a phone conversation
with her to ensure that the same procedure for task administration was fol-
lowed for both participant groups. The instructor informed the students about
the experiment and those students who volunteered to participate in the study
completed the cloze test, followed by the dictation task and the questionnaire.
All typed test answers completed by the students were saved as separate files
and sent electronically to the researcher shortly after the testing. All paper-and-
pencil answers were collected by the instructor, scanned, and electronically
mailed as attached files to the researcher as well.

Analysis
Cloze Test
The cloze test was scored by using the two most frequently supplied
answers in the L1 speakers’ pilot tests as the baseline for scoring L2 speakers’
responses. Thus, the answer to each test item was scored as correct if it
matched one of the two possible responses or as incorrect if it did not. Each
correct response was given a score of one point, for a possible total of 55
points.

Dictation
Two analyses of the recalled texts were carried out. First, the recalled texts were
analyzed by the researcher for the participants’ use of the target lexical bundles.
Each lexical bundle used by a participant in their texts was given a score of
one point. All modifications of the original bundles and new sequences were
checked in the TOEFL 2000 Spoken and Written Academic Language Corpus
to ensure that the resulting sequences occurred at least 10 times per
million words in the corpus. Those new structures and the modifications of
the original form that resulted in sequences that could not be identified in the
corpus were not given any points, as they did not qualify as lexical bundles
(e.g., to do about this, equal to or more, just as if not). Spelling errors were
ignored. The subscores for each discourse type of lexical bundles were totaled
for each individual participant and submitted to statistical analysis. A subset
of the data (30%) was scored by an independent rater and simple percentage
agreement was 94%.
Second, each recalled text was analyzed in terms of the information units
that were retained from the original text to explore the relationship between
the quantity of text recalled in each section and the presence or absence of
lexical bundles in those sections in the original text. The information unit
(i-unit) was adopted from Chiu and Savignon (2006) and operationalized as
“the richness of information in a t-unit, consisting of one basic i-unit and
several or no supporting i-units” (p. 105). A t-unit, in turn, is defined as the
smallest unit “into which a piece of writing can be divided without fragments of
sentences being left over” (p. 104). The number of i-units was counted for each
section (see examples in Appendix E) and two totals were calculated for every
person: the number of recalled i-units in all sections that originally contained
lexical bundles and the number of recalled i-units in those sections that did not
contain any bundles in the original text. Finally, the numbers for the two section
types—with and without lexical bundles—were converted into proportions (the
number of recalled i-units divided by the number of i-units in the original text)
and submitted to statistical analysis.
To supplement the results of the significance tests, ω² and Cohen’s d were
calculated to measure the effect sizes for major statistical differences. Alpha
was set at .05 for all statistical tests.

Results
Cloze Test
The mean score for the cloze test was calculated (M = 37.35, SD = 7.58), and
all L2 speakers who scored less than or equal to the mean score were assigned
to a lower proficiency group, with the range of scores from 17 to 37 (M =
30.00, SD = 5.26). The participants who scored higher than the mean score
were assigned to a higher proficiency group (38–46, M = 42.78, SD = 3.04).
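The grouping rule amounts to a split at the mean, which can be sketched as below. The function name and the example scores are invented; only the group sizes (17 and 23, from the participants subsection) and the group means are taken from the article. The final lines check that the two reported group means are consistent with the reported overall mean.

```python
def split_at_mean(scores):
    """Assign scores at or below the mean to the lower group, the rest to the higher."""
    mean = sum(scores) / len(scores)
    lower = [s for s in scores if s <= mean]
    higher = [s for s in scores if s > mean]
    return mean, lower, higher


# Invented cloze scores, only to show the rule (not the study's data).
example_mean, example_lower, example_higher = split_at_mean([17, 30, 37, 38, 46])

# Consistency check on the reported statistics: the weighted average of the
# two group means should reproduce the overall mean of 37.35.
overall_mean = (17 * 30.00 + 23 * 42.78) / (17 + 23)
print(round(overall_mean, 2))
```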

Dictation
The first research question asked if English L1 and L2 speakers differed in their
knowledge of lexical bundles that served two different discourse functions. The
scores that the participants received on the dictation activity are presented in
Table 3. The higher proficiency L2 speakers recalled more lexical bundles
(M = 6.83, SD = 2.29) than both the L1 speakers (M = 5.14, SD = 1.85) and
the lower proficiency L2 speakers (M = 3.65, SD = 1.90). Table 3 also shows
that all three participant groups recalled the discourse-organizing bundles more
frequently than the referential bundles.

Table 3 Immediate recall of lexical bundles by English L1 and L2 speakers

                                   Referential  Discourse-org.           Total
Group                        N       M      SD       M      SD       M      SD
L1 speakers                 21    1.19    1.17    3.95    1.07    5.14    1.85
Higher proficiency          23    2.57    1.47    4.26    1.05    6.83    2.29
Lower proficiency           17    0.88    1.22    2.76    1.03    3.65    1.90

Table 4 The Bonferroni analysis of multiple comparisons (Experiment 2)

(I) groups              (J) groups                  Mean difference  Std. Error    Sig.
L1 speakers             Higher proficiency                    −0.84         .25    .004
L1 speakers             Lower proficiency                      0.75         .27    .022
Higher proficiency      Lower proficiency                      1.59         .27    .000

To address the first research question, the data were analyzed using a linear
mixed model with group as a between-subjects, three-level factor (L1 speakers
vs. higher proficiency L2 speakers vs. lower proficiency L2 speakers) and
function as a repeated two-level factor (discourse-organizing or referential
bundles). Results of evaluation of assumptions of normality and homogeneity
of variance-covariance were satisfactory. The results indicated that group was
a significant factor, F(2, 111.112) = 17.84, p < .05, which suggests that there
were significant differences among the three participant groups in terms of their
recall of lexical bundles. The effect size, ω² = .13, indicated that approximately 13% of
the variation in participants’ scores was attributable to the differences among the
three participant groups. Function was also a significant factor, F(1, 111.112) =
95.34, p < .05, Cohen’s d = 1.56, showing that there was a significant difference
of a large magnitude between the recall of the two types of lexical bundles by
the participants. However, there was no significant interaction between group
and function (p > .05).
A pairwise comparison of the participant groups using a Bonferroni adjustment
(Table 4) indicated that higher proficiency L2 speakers recalled significantly
more lexical bundles than both L1 speakers (p < .05, Cohen’s
d = .81) and lower proficiency L2 speakers (p < .001, Cohen’s d = 1.51). L1
speakers recalled significantly more bundles than lower proficiency L2 speakers
(p < .05, Cohen’s d = .79).
Additionally, the results of Experiment 2 indicated that only 45% of all
possible bundles were recalled during the dictation activity. Furthermore, L1
speakers recalled only one lexical bundle 100% of the time (nothing to do with)
and three lexical bundles 80–99% of the time (in this chapter we, to do with
the, on the other hand), all of which served a discourse-organizing function. The
other eight bundles were recalled less than 50% of the time, with the three least
recalled bundles being in terms of the (9%), was one of the (14%), and the
nature of the (19%). The higher proficiency L2 speakers recalled one bundle
100% of the time (in this chapter we), seven bundles 80–99% of the time (the
nature of the, than or equal to, nothing to do with, in terms of the, to do with
the, the top of the, on the other hand), and four bundles less than 50% of the
time, two of which were recalled the least: the rest of the (13%), and was one
of the (17%). Finally, the lower proficiency L2 speakers recalled three bundles
80% of the time (in this chapter we, nothing to do with, to do with the) and the
other nine bundles less than 50% of the time, with the least recalled bundles
being as well as the (6%), in terms of the (6%), the rest of the (12%), was one
of the (12%), and than or equal to (12%). All three participant groups recalled
three discourse-organizing bundles as fixed units more than 90% of the time
(in this chapter we, on the other hand, nothing to do with).
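The overall recall rate of 45% can be checked against the group sizes and mean total scores in Table 3. The sketch below is a consistency check under the assumption that "all possible bundles" means 12 target bundles for each of the 61 participants; the variable names are illustrative.

```python
N_TARGET_BUNDLES = 12  # target bundles in the dictation text

# Group size and mean total recall per participant, from Table 3.
groups = {
    "L1 speakers": (21, 5.14),
    "Higher proficiency": (23, 6.83),
    "Lower proficiency": (17, 3.65),
}

recalled = sum(n * mean for n, mean in groups.values())
possible = sum(n for n, _ in groups.values()) * N_TARGET_BUNDLES
recall_rate = recalled / possible
print(f"{recall_rate:.0%}")
```

Because the table means are rounded, the reconstructed rate agrees with the reported figure only to the nearest percentage point.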
Research question 2 asked if the use of lexical bundles allowed L1 and
L2 speakers to retain more information during discourse comprehension and
subsequent production. The results of a subsequent analysis of the effect of
lexical bundles on i-unit density indicated that only one group, the higher
proficiency L2 learners, recalled considerably more i-units for the sections that
contained lexical bundles in the original text, t(22) = 13.21, p < .001, Cohen’s
d = 2.75. The other two participant groups did not show any difference in their
recall of i-units in relation to the presence of lexical bundles in the text (p >
.05). A Pearson correlation test indicated that there was no reliable relationship
between the length of a section and the number of i-units produced by the
participants (p > .05).
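For a within-group comparison of this kind, the reported effect size is consistent with the usual conversion of a paired t statistic into a standardized mean difference, d = t / √n. The sketch below assumes that conversion and takes n = 23 from the degrees of freedom (df + 1); the article does not state how d was computed.

```python
from math import sqrt


def paired_d_from_t(t_value, n_pairs):
    """Standardized mean difference for paired data, derived from t (assumed formula)."""
    return t_value / sqrt(n_pairs)


# Reported test for the higher proficiency group: t(22) = 13.21, so n = 23.
d_higher = paired_d_from_t(13.21, 23)
print(round(d_higher, 2))
```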

Questionnaire
To supplement the results of statistical tests, the results of the questionnaire
are reported here. Overall, 64% of the participants reported that they noticed
more bundles than they actually used. To the question of which bundles they
used in their own text reconstructions, 69% of L1 speakers, 44% of higher
proficiency L2 speakers, and 14% of lower proficiency L2 speakers reported
fewer bundles than they actually used. In terms of accuracy, 43% of L1
speakers, 17% of higher proficiency L2 speakers, and 65% of lower proficiency
L2 speakers reported that they either noticed or used the bundles that were
not used in the original text. Although L1 speakers reported a variety of new
bundles, there were only two new bundles that lower proficiency L2 speakers
indicated as noticed or used: at the same time and in the case of. To the question
of why they used (or did not use) certain expressions in their recalled texts, the
majority of the participants reported that they were easy to remember (77%),
whereas some said that these expressions stood out in a sentence (12%), helped
to capture the main idea (3%), or helped to link different ideas (2%). To the last
question about which expressions they thought were particularly helpful for
understanding the meaning of the text, the answers for the two language groups
varied. L1 speakers gave more descriptive answers, explaining that the most
useful expressions were the phrases that “introduced the ideas, such as in this
chapter we or the nature of the,” “were the basis of the sentence,” “consisted of
words that often go together,” and “helped combine the phrases in a sentence.”
L2 speakers were more specific in their answers and listed the bundles that
they found particularly useful, among which the most frequent were on the
other hand (63%), nothing to do with (54%), and has to do with (52%), all
discourse-organizing bundles.

Summary of Findings
To summarize the findings of Experiment 2, the three participant groups were
different in their recall of lexical bundles, with higher proficiency L2 speakers
outperforming the L1 speakers and the lower proficiency L2 speakers. Addi-
tionally, all three participant groups recalled more discourse-organizing bundles
than referential bundles. Finally, although participants produced some bundles
more frequently than others, only the higher proficiency L2 group showed dif-
ferent recall rates of i-units for the sections that contained lexical bundles in
the original text compared to those sections that did not. A general discussion
of the findings of both experiments follows.

General Discussion
Holistic Status of Lexical Bundles
The study employed a production criterion to explore if L1 and L2 English
speakers demonstrated any knowledge of lexical bundles as holistic units. In
their study on the psychological validity of corpus-derived recurrent clusters,
Schmitt, Grandage, et al. (2004) argued that not all clusters were produced
intact, hence not all of them could be considered formulaic. At first glance,
Schmitt, Grandage, et al.’s findings seem to be similar to the findings obtained
in the present study. However, because the target sequences utilized in Schmitt,
Grandage, et al.’s study were heterogeneous in terms of their structure (i.e.,
some more structurally complete than others) and functions served in a text
(pragmatic versus discourse), one should be cautious when generalizing these
findings to lexical bundles. Furthermore, Schmitt, Grandage, et al.’s study
employed only one criterion—intact form—to judge whether corpus-derived
clusters were psychologically real; this criterion cannot always be applied to
lexical bundles, which, due to their structural characteristics, allow more vari-
ation in their form. The results from the present study indicated that L1 and L2
speakers did not use all lexical bundles the same way. Although some of the
bundles were consistently produced in a fixed form more frequently than others
(e.g., on the other hand, at the same time, nothing to do with), other bundles
were contextually appropriate but showed more variation in form (e.g., what I
want to, if you look at), whereas other bundles were contextually appropriate
less than 50% of the time (e.g., in the absence of, in terms of the, in the case
of, in the form of, on the basis of, the nature of the). This distribution of lexical
bundles suggests two things. First, form-fixedness could, on the one hand, indi-
cate that a bundle itself is psychologically fixed. On the other hand, the fact that
L1 speakers produced fewer fixed lexical bundles than advanced L2 speakers in
Experiment 1 could suggest that L1 speakers had a larger inventory of lexical
bundles that allowed them to select the most contextually appropriate variant.
Thus, form-fixedness could be an indicator of one’s language proficiency level,
which is discussed in greater detail in the following section. Next, the results of
this study also suggest that more than one criterion should be considered before
a bundle can be defined as a holistic unit: how frequently it is appropriately pro-
duced in a certain context and how frequently it is produced in a fixed form. The
data from both experiments imply that, instead of a binary distinction of either
being treated as a holistic unit or not, a bundle should be described in terms of
its place on a continuum from more holistic to more compositional units. For
example, based on the two criteria discussed here, the following bundles from
Experiment 1 could be defined as leaning more toward the holistic end—one
of the most, at the same time, know what I mean, I would like to, on the other
hand, what I want to, if you look at, a little bit about, the beginning of the—
whereas in the form of, in the case of, in terms of the, in the absence of, and on
the basis of would lean more toward the compositional end of the continuum.
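The two criteria discussed above can be sketched as a simple ranking procedure. The following is only an illustrative sketch: the rates are invented for the example and are not the study's data, and averaging the two criteria into a single score is an assumption made here for simplicity, not a measure proposed in the study.

```python
from dataclasses import dataclass


@dataclass
class BundleScores:
    bundle: str
    appropriate_rate: float  # share of responses in which the bundle fit the context
    fixed_rate: float        # share of responses in which it kept its fixed form


def holisticity(s: BundleScores) -> float:
    """Average the two criteria into one 0-1 position on the continuum (an assumption)."""
    return (s.appropriate_rate + s.fixed_rate) / 2


def rank_on_continuum(scores):
    """Order bundles from the more holistic end to the more compositional end."""
    return sorted(scores, key=holisticity, reverse=True)


# Hypothetical rates, invented for illustration only.
scores = [
    BundleScores("in the absence of", 0.40, 0.35),
    BundleScores("on the other hand", 0.95, 0.90),
    BundleScores("if you look at", 0.85, 0.60),
]

for s in rank_on_continuum(scores):
    print(f"{s.bundle:20s} {holisticity(s):.2f}")
```

Any other way of combining the two rates (e.g., weighting contextual appropriateness more heavily) would yield a different ordering, which is why a continuum, rather than a cutoff, is the more cautious description.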
Second, different production tasks seemed to feature different lexical bun-
dles as holistic units. The issue here is not necessarily that one task was more
accurate than the other. Rather, the two tasks measured two different types of
knowledge of lexical bundles: whereas the gap-filling task measured L1 and
L2 English speakers’ knowledge of the particular word needed to complete the
frame, the dictation task measured participants’ knowledge of an entire bundle.
The fact that only 45% of all possible bundles were recalled in Experiment 2 (as
opposed to 74% in Experiment 1) suggests that it was easier for the participants
to display the knowledge of a lexical bundle when a frame was provided, as
they were prompted to refer to this knowledge. Thus, the smaller number of
lexical bundles produced in Experiment 2 does not necessarily indicate that
participants did not have the knowledge of these structures; they simply might
not have been prompted to fully demonstrate this knowledge.
In addition, the results of Experiment 2 showed that the relationship between
the presence of lexical bundles in a text section and the number of i-units
recalled by the participants was found significant only for higher proficiency
L2 speakers. This, however, could lead to two different conclusions. On the
one hand, higher proficiency learners could, indeed, have employed lexical
bundles in order to reduce the processing burden during L2 comprehension
and subsequent production. In this case, it would suggest that the participants
could have recognized the holistic status of lexical bundles. On the other hand,
this difference among the groups might be attributable to the difference in
learning skills acquired by L2 speakers, which increased with L2 proficiency.
This issue is discussed in more detail in the next subsection.
Finally, because the results of Experiment 2 indicated that higher proficiency
L2 learners not only recalled more lexical bundles during the dictation activity,
with most of them being recalled in the original form as presented in the input,
but also recalled significantly more idea units for those sections of the text that
contained lexical bundles, it is reasonable to assume that
producing these units unmodified during text recall helps a language user to
retain more information, which, again, could be an indication of the holistic
nature of lexical bundles. Thus, an interesting question to explore is whether
the holistic nature of lexical bundles, as indicated by the ease of the processing
burden during text recall, is necessarily reflected in a greater number of lexical
bundles produced intact. Consequently, a post hoc analysis was carried out to
determine whether there was any relationship between the number of idea units
recalled during the dictation activity and the overall number of lexical bundles
produced intact by the three participant groups.
In the post hoc analysis, the total number of lexical bundles produced
intact and the number of idea units recalled during the dictation activity were
calculated and then correlated. Whereas the Pearson correlation coefficient
obtained for L1 speakers indicated a weak correlation between the two variables
(r = .26, p > .05), the coefficient obtained for the two L2 speaker groups
showed moderate correlations between the variables (r = .46, p < .05 for the
higher proficiency L2 speakers and r = .57, p < .05 for the lower proficiency
L2 speakers). Thus, the post hoc analysis indicated that whereas there was
a positive moderate relationship between the number of idea units recalled
and the number of lexical bundles produced intact for both L2 groups, the same
did not hold true for L1 speakers, for whom the recall of text information (i.e.,
idea units) did not seem to be related to the intact production of lexical bundles. Taken
together, the findings of Experiment 2 suggest that the holistic nature of lexical
bundles might not necessarily be reflected in a greater number of these units
produced in the exact form in which they appeared in the input. This, again,
provides additional support for the idea that more than one criterion (e.g., intact
production) should be taken into account in order to identify a lexical bundle
as a formulaic unit.
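The correlation used in the post hoc analysis can be illustrated with a minimal sketch of the Pearson coefficient. The per-participant counts below are hypothetical and serve only to show the computation; they are not the study's data, and the significance test reported above is not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return cross / (dev_x * dev_y)


# Hypothetical counts for five participants (illustration only).
bundles_intact = [2, 4, 5, 7, 9]     # lexical bundles produced intact
idea_units = [10, 14, 13, 18, 21]    # idea units recalled

print(round(pearson_r(bundles_intact, idea_units), 2))
```

In the analysis described above, one such coefficient would be computed separately for each of the three participant groups.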

Proficiency Differences
The results of both experiments in terms of the difference (or no difference in
Experiment 1) between L1 and advanced L2 speakers were unexpected. In their
dictation study in which the participants orally reconstructed a story, Schmitt,
Grandage, et al. (2004) found that, on the whole, L1 speakers performed better
in terms of both the accuracy of reproduction and the number of accurately
reproduced chunks. They also discovered that L1 speakers did very little mod-
ification of the original sequence and either used the exact string or did not use
it at all. Furthermore, L2 speakers were found to partially reproduce the strings
or produce them inaccurately. Contrary to these findings, higher proficiency
L2 speakers in Experiment 2 of the present study were found to not only recall
a larger number of lexical bundles but also to use very few modifications of
the original bundles. L1 speakers, however, showed more creativity within the
reproduced bundles and created strings that were very different from the target
bundles (e.g., the following variations of the target sequence than or equal to
were used: better or equal to, important or equal to, equal to or more, etc.).
A great number of the sequences that were modifications of the target bundles
could not be identified in the corpus, suggesting that L1 speakers showed id-
iosyncratic use of these bundles, which might be unrecognizable to others and
not flagged by frequent occurrence in a corpus, a feature that is considered to
be characteristic of L2 speakers’ production (Foster, 2001; Schmitt, Grandage
et al., 2004).
One of the possible explanations to account for these results is the nature
of the L2 use that the L2 speakers practiced in their L2 classroom. Being EFL
learners, the L2 participants learned English in a classroom and were constantly
engaged in activities that focused on memorization of lexical items and oral
reproduction of recorded texts. Thus, although gaining more proficiency in
English, higher proficiency learners could have developed the skills both to
hold longer stretches of words in short-term memory and to attend to the
language units used in the text in order to reproduce them exactly. In contrast,
L1 speakers seemed to grasp an overall idea of the text without paying too much
attention to the units of language used in the original text. The L1 speakers’
answers to the questionnaire, which they completed after the dictation activity,
provide some evidence for the idea expressed above. L1 speakers not only used
more lexical bundles in their recalled texts than the number they reported but
also reported more bundles that were not present in the original text as noticed
and used in their own texts.
Comparing the two nonnative groups, one of the reasons why higher pro-
ficiency L2 speakers performed better than lower proficiency L2 speakers is
that they may have acquired greater lexical knowledge, which may lead to the
enhanced knowledge of lexical bundles.

Discourse Function Differences


According to Biber, Conrad, and Cortes (2004), discourse-organizing bun-
dles serve two major functions: topic introduction/focus and topic elabora-
tion/clarification. Topic introduction bundles in academic discourse provide
signals that a new topic is being introduced, whereas topic elaboration bun-
dles provide additional explanation or clarification to the topic that is being
discussed. Referential bundles are used to introduce a particular attribute of an
entity as especially important (e.g., indicating imprecision, specific features,
time/place reference). Comparing the functions that the two groups of
lexical bundles perform in a text makes it clear that discourse-organizing
bundles operate on a higher level in a text: they are used to indicate the
connections between larger pieces of discourse (e.g., new information versus
old information, changes of a topic, providing information that is crucial for
comprehension). Referential bundles, on the other hand, function on a lower,
phrasal or sentence, level and provide additional information to characterize
one of the sentence constituents (i.e., a noun). With this distinction in mind,
discourse-organizing bundles prove to be very important for the overall com-
prehension of the topic being discussed, as they can help the speaker with
discourse development and provide orientation for the listener. This results in
discourse-organizing bundles being recognized as more marked features that
require learners’ attention, compared to referential bundles that play a less
significant role in the comprehension process. Being recognized as “crucial”
for comprehension, discourse-organizing bundles may have an advantage of
being noticed more by the learners and, thus, have more chances to be acquired
sooner. The data from Experiment 2 provide more support for this argument:
Whereas the three most frequent bundles used in 90% of all text recalls were
discourse-organizing sequences (in this chapter we, on the other hand, and
nothing to do with), the four referential bundles (the rest of the, in terms of the,
the nature of the, and was one of the) were the least frequent and occurred in
only 20% of all text recalls. Furthermore, in their questionnaires, the majority
of the participants referred to discourse-organizing bundles as being the most
helpful for understanding the meaning of the text by either describing their
characteristics (e.g., introducing new ideas) or listing specific examples (e.g.,
on the other hand, nothing to do with, and has to do with).
Another reason why discourse organizers may have been easier for the
participants to produce than referential bundles might have to do with the char-
acteristics of the frames used to produce these two classes of lexical bundles.
Although both types of bundles are usually incomplete structural units, the
same frame might be used with different discourse organizers much less than
with referential bundles (e.g., compare on the other hand to the rest of the,
the beginning of the, the end of the, the top of the). Thus, the strength of as-
sociation of a specific frame with a particular lexical bundle might be higher
for discourse organizers than for referential bundles, which would make the
former more salient and easier to retrieve than the latter. This, however, is just
a hypothesis, which needs to be further tested on larger samples.
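The notion of frame productivity in this hypothesis can be sketched by counting how many bundles share a frame once one word is replaced by a slot. This is a rough illustration only: treating every word position as a possible slot, and using the number of bundles per frame as the measure, are assumptions made here, not procedures used in the study. The bundles are the examples cited above.

```python
from collections import defaultdict


def frames(bundle):
    """Yield each frame obtained by replacing one word of the bundle with '*'."""
    words = bundle.split()
    for i in range(len(words)):
        yield " ".join(words[:i] + ["*"] + words[i + 1:])


def frame_inventory(bundles):
    """Map each frame to the set of bundles that can fill it."""
    inventory = defaultdict(set)
    for bundle in bundles:
        for frame in frames(bundle):
            inventory[frame].add(bundle)
    return inventory


bundles = [
    "on the other hand",
    "the rest of the",
    "the beginning of the",
    "the end of the",
    "the top of the",
]

inventory = frame_inventory(bundles)
# A frame filled by many bundles is more productive, so each single
# frame-bundle association is presumably weaker.
print(len(inventory["the * of the"]))   # number of bundles sharing the referential frame
print(len(inventory["on the * hand"]))  # number of bundles sharing the discourse frame
```

On this small set, the referential frame is shared by four bundles, whereas the discourse-organizing frame is filled by only one, which is the asymmetry the hypothesis relies on.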
Finally, some discourse-organizing bundles might have the advantage of
being more salient to the participants due to the fact that their usefulness as
transition phrases is explicitly discussed in many language classes in both the
L1 and L2 educational contexts (e.g., Cortes, 2006).

Implications
Pedagogical Implications
One of the findings of the study was that lower proficiency English learners
were not able to accurately produce as many lexical bundles as did L1 speakers
and higher proficiency learners. This finding could be interpreted in two ways:
the learners either did not have the knowledge of how to use certain bundles
appropriately or they preferred to use other structures instead. By underusing
lexical bundles whose function is to signal relationships between smaller and
larger pieces of discourse, L2 learners run the risk of creating texts without
cohesiveness and clear organizational structure. Thus, L2 learners need to
become aware of how using lexical bundles can help them improve their writing.
This finding is consistent with Cortes (2004), who demonstrated that by just
being exposed to lexical bundles in a specific register L1 learners could not
master the use of these structures in their own writing. Cortes argued that in
order for the learners to use lexical bundles appropriately, they need to “notice”
the contexts in which these units are typically used, as well as the discourse
functions they perform in those contexts. Cortes (2006) also suggested that the
exposure to lexical bundles should be long enough for the students to be able
to start using them in their own writing, because these expressions might be
challenging for the learners to acquire. These suggestions could work equally
well for L2 learners who could benefit from more explicit teaching of how
different classes of lexical bundles should be utilized.

Theoretical Implications
One of the most recent tendencies in present-day L2 research is to employ
lexical bundles as target structures in the studies investigating the acquisition
of formulaic sequences (see Jones & Haywood, 2004; Schmitt, Dörnyei et al.,
2004; Warga, 2005). Although it might seem logical to treat the two structures
(i.e., lexical bundles and formulaic sequences) as equivalents to each other
because some of the criteria for their identification might overlap (e.g., phrase
length and frequency of occurrence), lexical bundles and formulaic sequences
do not necessarily reflect the same phenomenon. Although formulaic sequences
are traditionally described as complete units that are stored and retrieved holisti-
cally and used as shortcuts in language processing and production (Wray, 2000,
2002; Wray & Perkins, 2000), the results of the present study indicate that not
all lexical bundles have the same psycholinguistic status. Thus, treating the
two structures as equivalent to each other and employing lexical bundles as tar-
get structures in the research exploring the acquisition of formulaic sequences
should be done with caution.
In terms of determining the psychological validity of lexical bundles, the
results of the present study indicated that participants’ knowledge of these
units was affected by their register characteristics and the discourse functions
more so than their frequency of occurrence. For example, whereas most of the
appropriately produced lexical bundles served a discourse-organizing function
(e.g., on the other hand, nothing to do with, at the same time, if you look at),
all of the least appropriately produced bundles served a referential function
and were characteristic of the academic writing register (e.g., in the absence
of, in the form of, on the basis of, the nature of the). This suggests that the
saliency of lexical bundles is determined by the interaction of at least three
factors: frequency of occurrence, distribution in a specific register, and their
discourse function. Furthermore, it appears that lexical bundles, although being
structurally incomplete units, can be further classified into different structural
patterns, with some of them having more productive frames (e.g., in the case of,
in the middle of, in the form of, in the absence of ) than others (e.g., or something
like that, is one of the, on the other hand). These structural differences of lexical
bundles could affect the way they are perceived by L1 and L2 English speakers.

Limitations and Future Research


The present study focused on participants’ knowledge of a very limited set
of lexical bundles operationalized as their ability to produce these units in a
context. The study employed a relatively small number of participants in both
experiments, and the participants’ L1 backgrounds in Experiment 1 were dif-
ferent from the participants’ L1 backgrounds in Experiment 2. In addition,
some of the bundles used in the study were simply parts of longer stretches that
were cut off by corpus software into shorter individual sequences (e.g., what
I want to, want to do is). This, however, is a common concern in research on
lexical bundles that has not been resolved yet. In order to be able to make gen-
eralizations about whether L1 and L2 English speakers recognize the holistic
status of lexical bundles and what role discourse function plays in this process,
more research should be conducted that would focus on different categories
of lexical bundles in a variety of register-specific production tasks, as well as
with different participant groups. The findings of this study outline useful di-
rections for future research on the use of lexical bundles by L1 and L2 English
speakers. Because this study provided evidence that the discourse function of
lexical bundles influenced their use by L1 and L2 speakers in different aca-
demic registers (i.e., academic prose and conversation), it is worth exploring
the recognition and production patterns of lexical bundles that perform other
discourse functions as well as examining how the patterns vary in different
registers. Furthermore, this study employed a rather indirect measure of par-
ticipants’ knowledge of lexical bundles as holistic units by investigating how
these structures were produced in controlled as well as more extended stretches
of discourse. In future research, it could be useful to also apply a recognition
measure, such as processing times, to obtain more information about how L1
and L2 English speakers process lexical bundles compared to other structures.
Revised version accepted 16 August 2008

References
Altenberg, B. (1998). On the phraseology of spoken English: The evidence of
recurrent word combinations. In A. P. Cowie (Ed.), Phraseology: Theory, analysis
and applications (pp. 101–122). Oxford: Oxford University Press.
Biber, D., & Conrad, S. (1999). Lexical bundles in conversation and academic prose.
In H. Hasselgård & S. Oksefjell (Eds.), Out of corpora. Studies in honour of Stig
Johansson (pp. 181–190). Amsterdam: Rodopi.
Biber, D., Conrad, S., & Cortes, V. (2004). If you look at. . .: Lexical bundles in
university teaching and textbooks. Applied Linguistics, 25, 371–405.
Biber, D., Conrad, S., Reppen, R., Byrd, P., Helt, M., Clark, V., et al. (2004).
Representing language use in the university: Analysis of the TOEFL 2000 Spoken
and Written Academic Language Corpus. TOEFL Monograph Series. Princeton,
NJ: Educational Testing Service.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). The Longman
grammar of spoken and written English. London: Longman.
Brown, J. (1980). Relative merits of four methods for scoring cloze tests. Modern
Language Journal, 64, 311–317.
Chiu, C-Y., & Savignon, S. (2006). Writing to mean: Computer-mediated feedback in
online tutoring of multidraft compositions. CALICO Journal, 24, 97–114.
Conrad, S. (2000). Will corpus linguistics revolutionize grammar teaching in the 21st
century? TESOL Quarterly, 34, 548–560.
Cortes, V. (2004). Lexical bundles in published and student disciplinary writing:
Examples from history and biology. English for Specific Purposes, 23, 397–423.
Cortes, V. (2006). Teaching lexical bundles in the disciplines: An example from a
writing intensive history class. Linguistics and Education, 17, 391–406.
Coulmas, F. (1979). On the sociolinguistic relevance of routine formulae. Journal of
Pragmatics, 3, 239–266.
Coulmas, F. (1981). Introduction: Conversational routine. In F. Coulmas (Ed.),
Conversational routine (pp. 1–17). The Hague: Mouton.
Cowie, A. (1998). Phraseology: Theory, analysis and applications. Oxford: Oxford
University Press.
De Cock, S. (2000). Repetitive phrasal chunkiness and advanced EFL speech and
writing. In C. Mair & M. Hundt (Eds.), Corpus linguistics and linguistic theory (pp.
51–68). Amsterdam: Rodopi.
De Cock, S., Granger, S., Leech, G., & McEnery, T. (1998). An automated approach to
the phrasicon of EFL learners. In S. Granger (Ed.), Learner English on computer
(pp. 67–79). New York: Longman.
Dufon, M. (1995). The acquisition of gambits by classroom foreign language learners
of Indonesian. In M. Alves (Ed.), Papers from the 3rd annual meeting of the
Southeast Asian Linguistic Society (pp. 27–42). Tempe: Arizona State University,
Program for Southeast Asian Studies.
Erman, B. (2007). Cognitive processes as evidence of the idiom principle.
International Journal of Corpus Linguistics, 12, 25–53.
Ferguson, C. (1976). The structure and use of politeness formulas. Language in
Society, 5, 137–151.
Ferrante, J. (2003). Sociology: A global perspective (5th ed., pp. 22–24). Belmont, CA:
Wadsworth.
Foster, P. (2001). Rules and routines: A consideration of their role in the task-based
language production of native and non-native speakers. In M. Bygate, P. Skehan, &
M. Swain (Eds.), Researching pedagogic tasks: Second language learning,
teaching, and testing (pp. 75–93). San Francisco: Pearson Education.
Fotos, S. (1991). The cloze test as an integrative measure of EFL proficiency: A
substitute for essays on college entrance examinations? Language Learning, 41,
313–336.
Gibbs, R., Jr., & Gonzales, G. (1985). Syntactic frozenness in processing and
remembering idioms. Cognition, 20, 243–259.
Gibbs, R., Jr., Nayak, N., & Cutting, C. (1989). How to kick the bucket and not
decompose: Analyzability and idiom processing. Journal of Memory and Language,
28, 576–593.
Granger, S. (1998). Prefabricated patterns in advanced EFL writing: Collocations and
formulae. In A. P. Cowie (Ed.), Phraseology: Theory, analysis, and applications
(pp. 145–160). Oxford: Clarendon Press.
Hakuta, K. (1974). Prefabricated patterns and the emergence of structure in second
language acquisition. Language Learning, 24, 287–297.
Heilenman, L. (1983). The use of a cloze procedure in foreign language placement.
Modern Language Journal, 67, 121–126.
House, J. (1996). Developing pragmatic fluency in English as a foreign language.
Studies in Second Language Acquisition, 18, 225–252.
Hudson, J. (1998). Perspectives on fixedness: Applied and theoretical. Lund, Sweden:
Lund University Press.
Izumi, S. (2002). Output, input enhancement, and the noticing hypothesis: An
experimental study on ESL relativization. Studies in Second Language Acquisition,
24, 541–577.
Jones, M., & Haywood, S. (2004). Facilitating the acquisition of formulaic sequences:
An exploratory study in an EAP context. In N. Schmitt (Ed.), Formulaic sequences:
Acquisition, processing and use (pp. 269–300). Amsterdam: Benjamins.
Kowal, M., & Swain, M. (1997). From semantic to syntactic processing: How can we
promote it in the French immersion classroom? In R. Johnson & M. Swain (Eds.),
Immersion education: International perspectives (pp. 284–309). New York:
Cambridge University Press.
Lindemann, S., & Mauranen, A. (2001). It’s just real messy: The occurrence and
function of just in a corpus of academic speech. English for Specific Purposes, 20,
459–475.
Moon, R. (1998). Fixed expressions and idioms in English. A corpus-based approach.
Oxford: Clarendon Press.
Nattinger, J., & DeCarrico, J. (1992). Lexical phrases and language teaching. Oxford:
Oxford University Press.
Pawley, A., & Syder, F. H. (1983). Two puzzles for linguistic theory: Nativelike
selection and nativelike fluency. In J. C. Richards & R. W. Schmidt (Eds.),
Language and communication (pp. 191–226). New York: Longman.
Peters, A. (1983). Units of language acquisition. Cambridge: Cambridge University
Press.
Raupach, M. (1984). Formulae in second language speech production. In H. Dechert &
M. Raupach (Eds.), Second language production (pp. 114–137). Tübingen: Narr.
Rosenthal, R., & Rubin, D. (1982). Comparing effect sizes of independent studies.
Psychological Bulletin, 92, 500–504.
Sasaki, M. (2000). Effects of cultural schemata on students’ test-taking processes for
cloze test: A multiple data source approach. Language Testing, 17, 85–114.
Schmitt, N., Dörnyei, Z., Adolphs, S., & Durow, V. (2004). Knowledge and acquisition
of formulaic sequences: A longitudinal study. In N. Schmitt (Ed.), Formulaic
sequences: Acquisition, processing and use (pp. 55–86). Amsterdam: Benjamins.
Schmitt, N., Grandage, S., & Adolphs, S. (2004). Are corpus-derived recurrent clusters
psycholinguistically valid? In N. Schmitt (Ed.), Formulaic sequences: Acquisition,
processing and use (pp. 127–152). Amsterdam: Benjamins.
Sinclair, J. M. (1991). Corpus, concordance, collocation. Oxford: Oxford University
Press.
Swain, M. (1998). Focus on form through conscious reflection. In C. Doughty & J.
Williams (Eds.), Focus on form in classroom second language acquisition (pp.
64–81). New York: Cambridge University Press.
Swinney, D., & Cutler, A. (1979). The access and processing of idiomatic expressions.
Journal of Verbal Learning and Verbal Behavior, 18, 523–534.
Van Lancker, D., & Kempler, D. (1987). Comprehension of familiar phrases by left-
but not by right-hemisphere damaged patients. Brain and Language, 32, 265–277.
Van Lancker-Sidtis, D., & Rallon, G. (2004). Tracking the incidence of formulaic
expressions in everyday speech: Methods for classification and verification.
Language and Communication, 24, 207–240.
Wajnryb, R. (1990). Grammar dictation. Oxford: Oxford University Press.
Warga, M. (2005). “Je serais très merciable”: Formulaic vs. creatively produced speech
in learners’ request-closing. Canadian Journal of Applied Linguistics, 8(1), 67–93.
Williams, E. (1994). Remarks on lexical knowledge. In L. Gleitman & B. Landau
(Eds.), The acquisition of the lexicon (pp. 7–34). Cambridge, MA: MIT Press.
Wong Fillmore, L. (1976). The second time around: Cognitive and social strategies in
second language acquisition. Unpublished doctoral dissertation, Stanford
University, Stanford, CA.
Wood, D. (2006). Uses and functions of formulaic sequences in second language
speech: An exploration of the foundations of fluency. Canadian Modern Language
Review, 63, 13–33.
Wray, A. (2000). Formulaic sequences in second language teaching: Principle and
practice. Applied Linguistics, 21, 463–489.
Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge
University Press.
Wray, A., & Perkins, M. (2000). The functions of formulaic language: An integrated
model. Language and Communication, 20(1), 1–28.

Appendix A
Lexical Bundles Tested in Experiment 1

Referential bundles          Discourse-organizing bundles

is one of the                what do you think
one of the most              if you look at
the rest of the              if you have a
the top of the               to look at the
in the form of               what I want to
in the middle of             want to do is
in the case of               want to talk about
in terms of the              I would like to
as a result of               take a look at
on the basis of              a little bit about
in the absence of            to do with the
the way in which             nothing to do with
in the presence of           on the other hand
at the same time             as well as the
the beginning of the         know what I mean
at the end of
or something like that

Appendix B
Test Materials Used in Experiment 1

Recognizing English expressions


Instructions:
In each of the following sentences, one word has been deleted. Read each
sentence and fill in the blank with an appropriate English word for the given
context. If you think other words would also be appropriate, you can write the
alternatives at the end of the sentence in parenthesis. If you have any questions,
let the researcher know. There are 32 sentences. You have 25 minutes to
complete them. Thank you for your participation!
1. If you __________ at the Figure and study it carefully you will be able to
recognize a number of “commonsense” points. (___________________)
2. Your class work and homework will be assigned by your instructor. Keep
all your work in an organized binder. At the ___________ of each chapter,
you will be assigned a series of problems to help you write a Chapter
Summary. (____________________)
3. You are responsible for material covered in the readings and in the lectures,
with particular emphasis on the latter. If you ___________ a question do
not hesitate to ask. (___________________)
4. I’m going to return some papers here and talk just a little ___________
about them. (___________________)
5. Even the most highly motivated and intelligent patients are likely
to become noncompliant in the __________ of any symptoms.
(____________________)
6. My name is Melanie Graham, I’m a first year master student in RTC
and I have no idea what I __________ to do yet. I’m still learning.
(___________________)
7. Don’t cross your arms and legs at the same ____________ because some
interviewers may think you are shutting them out (a preconception drawn
from the book Body Language). Don’t manipulate objects (like a pencil
or keys) during the interview; try to remain natural and at ease.
(____________________)
8. What I want to ___________ is quickly run through the exercise that
we’re going to do. (___________________)
9. Homework will be assigned regularly. It will be collected at the
___________ of the next class and will not be accepted late.
(____________________)
10. Socialism, on the other ___________, is a type of theory which could only
have arisen in societies where the division of labor is highly developed.
(____________________)
11. Personally, I find that I sometimes get new ideas while I am engaged in
activities that have __________ to do with my research at all, such as
gardening, painting in the house, or even shaving when I get up in the
morning. (____________________)
12. An eligible undergraduate student may be awarded a grant of $100 to
$4,000 on the __________ of financial need. A student must complete
the FAFSA in order to be considered. (____________________)
13. To avoid creating an ethical dilemma for yourself as you prepare your
resume, remember the following: Be honest in ____________ of the
information you include. (____________________)
14. The ___________ in which the logical structure of a passage is perceived
by different readers can differ enormously. (____________________)
15. To conclude this chapter, I would __________ to return to the picture,
discussed in the first chapter, of the organism as a dissipative structure,
maintained by the flow of energy through it. (____________________)
16. [. . .] if a student misses more than one week of classes you should talk
to me immediately, if you know you’re gonna be gone. Let’s say, for
example, you’re gonna go to Montana for a couple of days this week, or
___________ like that, you might let the instructor know you’re gonna
be gone. (____________________)
17. The ability to display information in the form of characters, including
both numbers and letters, graphs of various kinds and even [. . .] pictures,
is _________ of the most powerful talents that the micro possesses.
(____________________)
18. Several years ago, a young woman was stabbed to death in the
___________ of a street in a residential section of New York City.
(____________________)
19. Two aspects of natural selection, variation and competition, are the critical
factors that determine whether any particular animal and its offspring will
enjoy reproductive success. Let’s ___________ a look at each of these
aspects, beginning with variation. (____________________)
20. [. . .] now the first aspect I __________ to talk about is convenience of
the Internet. (____________________)
21. The extent to which these approaches are complementary has to
_________ with the relationship between competence and performance
and is a matter of current debate. (____________________)
22. For these reasons the proposal writer should carefully describe the location
or setting of the project. This includes geographic location as __________
as the nature of the setting. (____________________)
23. What do you __________ of that argument, you know, that law doesn’t
really have a place in the intensive care unit? (____________________)
24. You should nevertheless become acquainted with a variety of American
writers and become more careful and appreciative readers of literature;
you should also become better writers as a ___________ of this course.
(____________________)
25. To write accurately and directly, to learn how to tell the truth in writing, to
speak clearly and without clichés and fillers (like “sort of” and “you know
what I _________”), these are skills that can be taught and practiced.
(___________________)
26. One of the ___________ important factors in your grade is the amount
of time you spend reading the text and applying your knowledge.
(____________________)
27. You will be excused from exams only with a physician’s note or verifiable
personal emergency. In the __________ of an excused absence, your exam
score will be the average of your other exams. (____________________)
28. Three o’clock– I just finish off everything that I did not have time to do.
Then the ____________ of the day is my own. (____________________)
29. [. . .] it’s possible to _____________ at the same graph or table for example
and see different things. (__________________)
30. The most satisfactory way of teaching problem-solving is by the method
of guided discovery, in which the teacher presents the problem usually in
the ____________ of a question. (____________________)
31. These critiques should be typed and doubled spaced and must be one to
two pages in length. Be sure to give the name of the speaker, the title of
the lecture, and the date of the presentation at the __________ of the first
page. (____________________)
32. The will must be in writing. It must be signed by the testator or by some
person in his presence and by his express direction. The signature must be
made or acknowledged by the testator in the __________ of two or more
witnesses, both present at the same time. (____________________)

Appendix C
Lexical Bundles Tested in Experiment 2

Referential bundles     Discourse-organizing bundles

was one of the          to do with the
the rest of the         nothing to do with
the top of the          on the other hand
in terms of the         as well as the
than or equal to        has to do with
the nature of the       in this chapter we


Appendix D
Test Materials Used in Experiment 2

Remembering English texts


Instructions:
You will listen to a text about Sociology section by section. You will listen to
each section two times. After you listen to the section the second time, you will
have 2 minutes to reconstruct it, keeping as close to the original text as possible.
Please do not take any notes while you are listening to the text. Remember,
try to make your reconstruction as close to the original text as possible and
use as many expressions from the text as you can. If you have any questions,
please let the researcher know.
Now, let’s practice. I will read a short passage to you twice, after that you will
reconstruct the passage as close to the original as possible.
Between 1492 and 1800, an independent world began to emerge. Europeans
learned of, or colonized much of North America, South America, Asia, and
Africa and set the tone of international relations for centuries to come.
This is the end of practice. If you have any questions at this point, please
let the researcher know.

Sociology and Career Choices


When employers, parents, and the rest of the world ask “Why did you major
in sociology?” or “Why take sociology classes?” the reply must be convincing.
(26)¹
Responses such as “Liking people was one of the reasons” or “Sociology has
to do with people and I want to work with people” are too general. (27)
Parents, friends, and potential employers should understand the nature of the
discipline and realize that sociology is more important than or equal to other
academic disciplines. (26)
In this chapter, we will discuss how to apply sociological knowledge to a
variety of work-related tasks, as well as the real-life situations. (23)

¹ Number of words in a section.


Sociology has almost nothing to do with the individual – it looks beyond the
individual and focuses on a larger population of humans. (22)
People belong to various groups, and human behavior can be interpreted in
terms of the interaction patterns that take place in those groups. (23)
As one example of how the sociological knowledge can be used in a real-life
situation, imagine that you work for a company that employs thousands of
workers. (27)
The problem is that employees from different units have never met each
other, so a company picnic is arranged to help employees meet one another.
(24)
At this event, however, everyone talks only with people they already know.
Thus, company executives do not know what to do with the problem. (24)
As a person who understands how groups work, you would think of ways to
“make” people break out of their limited social circles. (23)
For example, you can separate the crowd into 12 groups according to birthday
month and have people introduce themselves by telling the group members
something memorable. (26)
This exercise may not be on the top of the best conversation topics list, but it
will force people to talk to one another. (24)
Sociology can lead to many careers. On the other hand, sociology is not
connected with specific skills, so students must be able to explain what they
can do with this degree. (31)
Adapted from: Ferrante, J. (2003). Sociology: A Global Perspective (5th ed.).
Belmont, CA: Wadsworth, 22–24.

Thank you for your participation!

Appendix E
Examples of the Computation of Information Units (i-units)
Section 1: When employers, parents, and the rest of the world ask “Why did
you major in sociology?” or “Why take sociology classes?” the reply must be
convincing.
Basic i-unit → When employers, parents, and the rest of the world ask
Supporting i-unit 1 → why did you major in sociology
Supporting i-unit 2 → why take sociology classes
Supporting i-unit 3 → the reply must be convincing
Total i-units per section = 4 i-units
Section 13: Sociology can lead to many careers. On the other hand, sociology
is not connected with specific skills, so students must be able to explain
what they can do with this degree.
Basic i-unit 1 → Sociology can lead to many careers.
Basic i-unit 2 → Sociology is not connected with specific skills
Supporting i-unit 1 → students must be able to explain
Supporting i-unit 2 → what they can do
Supporting i-unit 3 → with this degree
Total i-units per section = 5 i-units
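
The two tallies above can be reproduced with a short script. The following is a minimal sketch in Python, not the procedure used in the study: the segmentation into i-units is still done by hand, exactly as listed above, and the names `SECTION_1`, `SECTION_13`, and `tally_i_units` are hypothetical labels introduced only for this illustration.

```python
from collections import Counter

# Each i-unit is stored as a (kind, text) pair, where kind is either
# "basic" or "supporting". The segmentation is copied by hand from the
# two examples in this appendix; the code only counts the units.
SECTION_1 = [
    ("basic", "When employers, parents, and the rest of the world ask"),
    ("supporting", "why did you major in sociology"),
    ("supporting", "why take sociology classes"),
    ("supporting", "the reply must be convincing"),
]

SECTION_13 = [
    ("basic", "Sociology can lead to many careers."),
    ("basic", "Sociology is not connected with specific skills"),
    ("supporting", "students must be able to explain"),
    ("supporting", "what they can do"),
    ("supporting", "with this degree"),
]


def tally_i_units(units):
    """Return the number of basic, supporting, and total i-units."""
    counts = Counter(kind for kind, _ in units)
    return {
        "basic": counts["basic"],
        "supporting": counts["supporting"],
        "total": len(units),
    }


if __name__ == "__main__":
    for name, units in (("Section 1", SECTION_1), ("Section 13", SECTION_13)):
        print(name, tally_i_units(units))
```

Keeping the basic/supporting label alongside each unit means the per-section total can be checked against the figures reported above (4 and 5 i-units, respectively) without re-reading the passage.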
