
Empowering Language Education: Unlocking the Power of Corpus-Based Language Pedagogy (CBLP) for Teachers

Dr. Angel Ma

Department of Linguistics and Modern Language Studies


The Education University of Hong Kong
Outline
01 Introduction to key terms (e.g., CL, CBLP)
02 Resources: Free online corpora
03 Practices: CBLP 2.0 empowered by AI
04 Practices: CBLP lesson designs
05 CBLP research potential

Corpus & Corpus Technology

Corpus
• A large authentic language database operated through computer technology.

Corpus Technology
• Defined as “the application of technology associated with corpus linguistics and corpora for language learning and teaching” (Ma et al., 2022, p. 1).
• Learners act as researchers to discover patterns/rules through observing large quantities of examples (Johns, 1991).
• Corpora and corpus technology could provide rich, authentic language resources and guide students in studying language inductively (Boulton & Cobb, 2017).
Corpus literacy: the base for using corpus technology
Corpus Literacy (CL) is the base for using corpus technology (Mukherjee, 2006).

Heather and Helt (2012) defined CL as the ability to use corpus linguistics
technology to investigate language and enhance student language development.

CL comprises four major components (Mukherjee, 2006):

1. understanding what constitutes a corpus
2. recognising what can and cannot be achieved with corpora
3. analysing corpus data (concordance lines)
4. summarising language use patterns/trends from corpus data
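
For readers new to the term, a concordance line shows each occurrence of a search word with a few words of context on either side. The Python sketch below is only an illustration of that idea: the function name `kwic`, the sample text and the window width are assumptions made up for this example, and real concordancers work over far larger corpora with many more options.

```python
import re

# Made-up sample text for illustration only (not taken from any corpus).
SAMPLE = (
    "Companies offer services online. "
    "The offer expires tomorrow. "
    "She accepted the offer."
)


def kwic(text: str, keyword: str, width: int = 4) -> list[str]:
    """Return keyword-in-context lines with `width` words on each side."""
    tokens = re.findall(r"[A-Za-z']+", text)  # crude word tokeniser
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so the keyword lines up in a column.
            lines.append(f"{left:>30} [{tok}] {right}")
    return lines


for line in kwic(SAMPLE, "offer", width=2):
    print(line)
```

Because the tokeniser ignores punctuation, the context window can run across sentence boundaries; that is acceptable for a sketch but worth bearing in mind when reading the output.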
Corpus-based language pedagogy (CBLP): the pedagogical skills for teaching with corpus technology


Selected free online corpora for supporting teachers’ development of corpus literacy

COCA (the Corpus of Contemporary American English): https://www.english-corpora.org/coca/
(a powerful corpus search engine, registration required)
SKELL: https://skell.sketchengine.eu/
(three key search functions, registration not required)
Netspeak: https://netspeak.org/
(streamlined search functions, Google Books, registration not required)
Parallel EAP Corpora: https://corpus.eduhk.hk/eap/
(EdUHK, advanced search functions on different sections of research writing, e.g., introduction, literature review, methodology, etc.)
COCA
A powerful and user-friendly concordancer tool
Website: https://www.english-corpora.org/coca/
It contains 1 billion words divided into 8 registers
Target students: Secondary; Tertiary level
Target language/skills: Vocabulary, collocation, grammar, writing
Key search functions of COCA
• Search for individual words, parts of words, and synonyms (List)
• Search for phrases with part of speech information (List)
• Search for collocates (Collocates)
• Compare two words (Compare)

Please visit our MOOC to learn about more search functions of COCA: https://pressbooks.pub/cmtry1/
SKELL (Sketch Engine for Language Learning)
Website: https://skell.sketchengine.eu/
Target students: Primary; Secondary; Tertiary level
Target language/skills: Vocabulary, collocation, grammar, writing
Why SKELL?

• A free and easy interface (modelled on Sketch Engine) adapted to the needs of English learners.
• No registration or payment required.
• Easy for students and teachers to check how a particular phrase or word is used in standard English.
• Provision of good examples of words or phrases that learners may find difficult to learn.
Three key functions of SKELL
1. Examples: 40 student-friendly example sentences of the word or phrase in context

2. Word sketch: A summary of the most typical collocations in different parts of speech
• ‘offer’ as verb: companies (n, subject) offer; services (n, object) offered; offer up/offer along with (phrasal verbs)
• ‘offer’ as noun: offer (subject) expires (v); accepted (v) the offer (object); valid (adj) offer, etc.

3. Similar Words: A list of the 40 most similar words for a target word (e.g., similar words for ‘offer’)
Interface of SKELL

The interface of SKELL is simple and easy to use:

1 Search field
2 Search functions (“Examples”, “Word sketch” and “Similar words”)
• Choose a language
• Switch to input content with a keyboard

Please visit our MOOC website to learn about more search functions of SKELL: https://pressbooks.pub/cmtry1/
Sample CBLP teaching activities (SKELL)
Example: ‘especially’ or ‘specially’?
Focus: differentiation of easily confused words
Level: intermediate (secondary or above)
Activity 1
Ask students to work in pairs to observe the similar words of ‘especially’ and ‘specially’ respectively, and share their findings with the class. (Note: guide them to observe the most common words, i.e., the similar words shown in a larger size.)

[Figure: SKELL similar words for ‘especially’ and ‘specially’]
Activity 2
• Students go back to the “Examples” function to examine how the two words are used in context.
• The teacher can guide students to further explore the grammatical features of the two words by asking: Both words are adverbs, but what usually follows “specially” and “especially”?

Possible answer:
e.g., specially + V-ed
e.g., especially + adj.
Netspeak
A website using Google Books to provide collocations
User-friendly interface and search functions
Website: https://netspeak.org/
Corpus sources: Web-based, Google Books
Target settings: Tertiary; Higher secondary level
Target language/skills: Writing (EAP/ESP courses), grammar, vocabulary
Key search functions of Netspeak

• Search phrases with one or more words missing, e.g., waiting ? response; waiting ? ? response; waiting * response
• Find the best option, e.g., the same [like as]
• Find the best order, e.g., {blood and flesh}
• Find the best synonym, e.g., become # famous
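
To make the wildcard operators concrete, here is a toy Python matcher for three of them (`?`, `*` and `[ ]`). It is not Netspeak's engine or API: the function names, the phrase list and the reading of `*` as "zero or more words" are assumptions for illustration, and the real tool ranks results by frequency over a very large collection, which this sketch does not attempt.

```python
import re

# Hypothetical phrase list standing in for a corpus (made up for illustration).
PHRASES = [
    "waiting for response",
    "waiting for a response",
    "waiting for your response",
    "waiting response",
    "the same as",
    "the same like",
]


def parse(query: str) -> list[str]:
    """Split a query into tokens, keeping an option set like [like as] whole."""
    return re.findall(r"\[[^\]]*\]|\S+", query)


def matches(pattern: list[str], words: list[str]) -> bool:
    """Check whether a list of words fits a list of query tokens."""
    if not pattern:
        return not words
    head, rest = pattern[0], pattern[1:]
    if head == "*":  # assumed here: zero or more words
        return any(matches(rest, words[i:]) for i in range(len(words) + 1))
    if not words:
        return False
    if head == "?":  # exactly one word, any word
        return matches(rest, words[1:])
    if head.startswith("["):  # one word out of the listed options
        return words[0] in head[1:-1].split() and matches(rest, words[1:])
    return head == words[0] and matches(rest, words[1:])


def search(query: str, phrases: list[str]) -> list[str]:
    """Return the phrases that fit the query, in their original order."""
    pattern = parse(query)
    return [p for p in phrases if matches(pattern, p.split())]


for q in ["waiting ? response", "waiting ? ? response", "the same [like as]"]:
    print(q, "->", search(q, PHRASES))
```

In class, the point of the real tool is the frequency ranking of the returned phrases (which option is more common), so the sketch is useful mainly for explaining what each operator is asking for.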


Interface of Netspeak

1 Query box: type in any expression that you want to search
• Click on the “x” icon to clear the input field
• Click to see the search history
2 Click on the “i” icon to show quick instructions with examples
3 Language options

Please visit our MOOC website to learn about more search functions of Netspeak: https://pressbooks.pub/cmtry1/
Parallel EAP (English for Academic Purposes) Corpora
A one-million-word EAP learner corpus & a one-million-word EAP professional corpus
Website: https://corpus.eduhk.hk/eap/
Target students: Undergraduate and postgraduate students
Target language/skills: Vocabulary, collocation, grammar, academic writing
Basic functions of Parallel EAP Corpora

• User guide to use the website
• Search the keyword in context based on part of speech (POS)
• Generate lists for lemmas and collocations
• Compare concordance lines between learner and expert corpora
Demonstration: The reporting verb “suggest” in ELT expert genre
1. Go to “POS Search”
2. Choose “Advanced Mode”
3. Type the word “suggest”
4. Choose “Starts with”
5. Choose the POS Tag “Verb”
6. Choose “Professional Corpus” or “Learner Corpus”
7. Choose the subject “ELT Research” in this case
8. Choose the section “Literature Review”

Go to “POS Word List” to view the frequency list
Click “?” to view the meaning of POS Tag
Question 1: How many times is ‘suggest’ used in the literature review section of ELT research in expert writing?

Change Professional Corpus to Learner Corpus and repeat.

Question 2: How many times is ‘suggest’ used in the literature review section of ELT research in learner writing?
Activity 1: The most common reporting structure
Task: search ‘suggest’ in its -s form (present tense) in expert writing (ELT research, literature review)
Collocations:
Right:
• ‘that’ clauses
Left:
• “Author (year)” pattern
• nouns indicating research findings (e.g., research, evidence, finding)
• pronouns for cited authors (e.g., he, she)
Now, can you summarise the sentence structure used to report cited information?
Activity 2: Present tense and past tense of “suggest”
Note: suggest is one of the most frequently used reporting verbs in ELT Research and academic writing

Task: Search the frequencies of “suggest” in expert writing and learner writing when it is used in present tense and past tense

“suggest” and its forms                   Expert        Learner
suggest(s) (present tense) (VV0/VVZ)      35 (63.6%)    21 (29.2%)
suggested (past tense) (VVD)              2 (3.6%)      23 (31.9%)
Total (including other forms)             55            72
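
The percentages in the table are each form's share of the column total (for example, 35 of 55 expert uses). A short Python sketch of that arithmetic follows; the counts come from the table, while the dictionary layout and the helper name `share` are assumptions for illustration.

```python
# Counts taken from the Activity 2 table; the structure is just for this sketch.
COUNTS = {
    "Expert": {"present": 35, "past": 2, "total": 55},
    "Learner": {"present": 21, "past": 23, "total": 72},
}


def share(count: int, total: int) -> float:
    """Percentage of `total` accounted for by `count`, to one decimal place."""
    return round(100 * count / total, 1)


for group, c in COUNTS.items():
    present = share(c["present"], c["total"])
    past = share(c["past"], c["total"])
    print(f"{group}: present {present}%, past {past}%")
```

Seen this way, the contrast is clear: expert writers strongly prefer the present tense of “suggest”, whereas learners use the past tense about as often as the present.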
Specific citations reported by “suggest” extracted from Parallel EAP Corpora

Expert
1. Kuo (2003) suggests that a primary task for language teachers is to “discern the optimal tension between positive and negative feedback” (p. 10), striking…
2. These findings suggest that, besides instructional and learner characteristics, GEFs must be taken into account in understanding L2 proficiency development.

Learner
1. Sokmen (1997) suggested that ESL and EFL instructors need to help their students establish links and build up associations between new words and words that the students have learnt.
2. Lin (2006)’s results suggested that literature circles can be a balanced literacy approach for teachers to improve students’ reading interests and strategies, to incorporate many different aspects of learning.

Note: All cited information is about cited researchers’ suggestions in language learning and teaching.
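
The left and right collocates discussed in Activity 1 can be checked on these four citations with a few lines of Python. This is a hedged sketch only: the sentences are shortened versions of the examples above, the function name `neighbours` is an assumption, and four lines are far too few to generalise from.

```python
import re
from collections import Counter

# Shortened versions of the four citations above (quotation marks removed).
CITATIONS = [
    "Kuo (2003) suggests that a primary task for language teachers is to discern the optimal tension",
    "These findings suggest that, besides instructional and learner characteristics, GEFs must be taken into account",
    "Sokmen (1997) suggested that ESL and EFL instructors need to help their students establish links",
    "Lin (2006)'s results suggested that literature circles can be a balanced literacy approach",
]


def neighbours(lines: list[str], stem: str = "suggest") -> tuple[Counter, Counter]:
    """Count the words immediately left and right of any form starting with `stem`."""
    left, right = Counter(), Counter()
    for line in lines:
        tokens = re.findall(r"\w+", line.lower())
        for i, tok in enumerate(tokens):
            if tok.startswith(stem):
                if i > 0:
                    left[tokens[i - 1]] += 1
                if i + 1 < len(tokens):
                    right[tokens[i + 1]] += 1
    return left, right


left, right = neighbours(CITATIONS)
print("Left of 'suggest':", dict(left))
print("Right of 'suggest':", dict(right))
```

All four lines have “that” immediately to the right, and the left slot is filled either by the year of an “Author (year)” citation or by a noun indicating research findings, which matches the pattern students are asked to summarise.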
Activity 3: Search the middle/low frequency reporting verbs

Word      Frequency (expert)    Frequency (learner)
argue     30                    12
report    21                    5

“Argue” and “report” are underused by students.

Search more and see what you can find: describe, note, investigate, discuss…
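
One simple way to express underuse is the ratio of learner frequency to expert frequency. The Python sketch below applies it to the two verbs in the table. It assumes the two counts come from comparably sized searches (the slides describe both corpora as one-million-word corpora, but the size of the subsections searched is not stated), and the helper name `learner_ratio` is hypothetical.

```python
# (expert, learner) frequencies taken from the Activity 3 table.
FREQ = {"argue": (30, 12), "report": (21, 5)}


def learner_ratio(expert: int, learner: int) -> float:
    """Learner frequency as a proportion of expert frequency (2 decimal places)."""
    return round(learner / expert, 2)


ratios = {verb: learner_ratio(e, l) for verb, (e, l) in FREQ.items()}
for verb, ratio in ratios.items():
    # A ratio below 1 means learners use the verb less often than experts do.
    print(f"{verb}: learners use it {ratio} times as often as experts")
```

The same calculation can be repeated for describe, note, investigate and discuss once their frequencies have been looked up in the two corpora.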
Additional resources to support teachers and students’ corpus literacy:
• CorpusMate: A user-friendly disciplinary concordancer. https://corpusmate.com/
• COCA: Corpus of Contemporary American English. https://www.english-corpora.org/coca/
• Lextutor: Useful online tool for language analysis and learning. https://www.lextutor.ca/
• SKELL: Free version of Sketch Engine for Language Learning. https://skell.sketchengine.eu/#home?lang=en
• Collocaid: A website for learning academic collocations and sentence patterns. https://collocaid.uk/prototype/editor/public/home
• Netspeak: Learning how to use words and phrases in academic writing. https://netspeak.org/
CBLP 2.0 empowered by AI (e.g., ChatGPT)
Corpus technology: advantages                    ChatGPT: disadvantages
Knowing the data source                          Data cited without source origin
Authenticity                                     Machine-generated data (based on algorithms/probability)
Multimodality                                    Text-based
Safety                                           Unethical
Active learning                                  Passive learning (?)

Corpus technology: disadvantages                 ChatGPT: advantages
Level of technical knowledge                     Simple to use (input natural language)
Complicated/unintuitive interface                Easy to interact with
Unsuitable corpus data for younger learners      Possibility to generate language suitable for learners (?)
Difficult to track users’ corpus use over time   Track user input

(Crosthwaite & Baisa, 2023)
AI tools for CBLP practice
(by Liu Jing, EdD candidate, EdUHK )

1. Providing immediate feedback

2. Preparing teaching materials

3. Transforming students’ written work into videos to increase student engagement

https://eduhk.ap.panopto.com/Panopto/Pages/Viewer.aspx?id=a324b4e2-1d2b-4ae4-b6f6-b136006af1b9
Sample CBLP lesson: SKELL + Netspeak + AI
Designers: Dr. Eric Cheung & Ms. Sylvia Lau
Affiliation: College of Professional and Continuing Education,
The Hong Kong Polytechnic University

Increasing Vocabulary Power for Writing through Differentiating Confusing Word Pairs
“Affect” vs “Effect”, “Impact” vs “Influence”
Target students: Undergraduate Social Sciences Students
Corpora used: Netspeak, SKELL & Corpus of Journal Articles 2014 (RCPCE PolyU)
Lesson Duration: 100 minutes

The work in this competition was fully supported by a grant from the
Research Grants Council of the Hong Kong Special Administrative
Region, China (UGC/FDS24/H11/22)
Stage 1: Testing Students’ Knowledge of the Target Language Items
The teacher (T) uses ChatGPT to create a passage
✓ To include the target words and phrases to be taught in the lesson (affect, effect, impact, influence)
Students pair up, discuss and complete the cloze passage activity
✓ Stimulate them to think of the differences among the confusing word pairs
Stage 2: Hands-on Corpus Consultation to Identify Language Patterns
Flow of the Grammar Battle
1. Provide step-by-step instructions to search on SKELL
2. Explain the functions of SKELL
3. Give time to study the concordance lines
4. Based on the concordances, ask questions related to the target word

The group with the lowest mark in the Grammar Battle is required to finish the output exercise as punishment
✓ Motivate them to participate actively
✓ Provide them a chance to use the functions of SKELL
Stage 3: Inductive Discovery
- Introduce more functions of SKELL and introduce Netspeak (the “find the best option” function)
- Raise questions regarding the target word, involving the concept of semantic prosody
→ More difficult questions than those in the Grammar Battle (e.g., negatively impact vs. profoundly influence)
- Teacher-guided observation
- Students share their thoughts and discuss with their classmates and teacher

Stage 4: Output exercise

1. Paraphrasing exercise
✓ Practice using the target words in a particular part of speech
2. After-class exercise: more word pairs
✓ Provide more chances to use SKELL and Netspeak

Refer to Ma et al. (2022a) for the four-step CBLP design model.
Please visit our MOOC website to read more AI-empowered CBLP lesson materials: https://pressbooks.pub/cmtry1/
WHY MOOC? https://pressbooks.pub/cmtry1/

“Enhancing Corpus Literacy and CBLP for In- and Pre-service English Teachers”

FEATURES AND OBJECTIVES

Resources to support corpus-based language pedagogy (CBLP)
The Corpus-Aided Platform for Language Teachers (CAP)
https://corpus.eduhk.hk/cap/

What can you find on the website?

1. A package for teacher self-training
• Introduction to corpus linguistics and corpus resources
• Principles for designing corpus-based materials
2. A rich collection of resources for developing CBLP
• 65 videos of the previously mentioned corpus websites and tools (classic and 2nd generation corpus tools)
• 68 lesson plans and worksheets (primary, secondary and tertiary settings for various language skills)
• 9 videos of corpus-based classroom teaching (primary, secondary and tertiary settings)
Research: Using COCA to help secondary school students
improve collocations in English writing

Fang, L., Ma, Q., & Yan, J. (2021). The Effectiveness of Corpus-Based Training on Collocation Use in L2 Writing for Chinese Senior Secondary School Students. Journal of China Computer-Assisted Language Learning, 1(1), 80–109.
Research: Validation of CL and linking CL to CBLP
• Ma et al. (2023) constructed a five-factor model of CL and tested it empirically with SEM:
1. Understanding of corpora
2. Corpora search skills
3. Analysis of corpus data
4. Advantages of corpora
5. Limitations of corpora
• In addition, CL contributes to teachers’ intention to use corpus technology in teaching

Ma, Q., Chiu, M., Lin, S., & Mendoza, N. (2023). Teachers’ perceived corpus literacy and their intention to integrate corpora into classroom teaching: A survey study. ReCALL, 35(1), 19–39.
Research: validating a two-step framework for developing CBLP

A two-step framework for providing CBLP training (Ma et al., 2022a)

Ma, Q., Tang, J., & Lin, S. (2022a). The development of corpus-based language pedagogy for TESOL teachers: A two-step training approach facilitated by online collaboration. Computer Assisted Language Learning, 35(9), 2731–2760.
Research: In-service university teachers’ CBLP development
Methodology: case study (2 participants)

Participants:
(a) 2 experienced teachers in tertiary English education.
(b) One had great familiarity with corpus technology; the other had no knowledge of corpus technology.

Data:
1) The corpus-based lesson plans
2) Pre-interview (before classroom teaching) focusing on the preparation of materials
3) Recorded videos of classroom teaching
4) Post-interview focusing on the teachers’ instruction, evaluation and reflection of their classroom teaching.

Ma, Q., Yuan, R., Cheung, L. M. E., & Yang, J. (2022b). Teacher paths for developing corpus-based language pedagogy: a case study. Computer Assisted Language Learning, 1–32.
Research: in-service teachers’ CBLP development (Ma et al., 2022b)
Key findings:
Five components of teacher knowledge and teacher practice will influence teachers’ CBLP development:

Tim: a functional linguist and lover of corpus
Tim’s CBLP development is substantially influenced by his good knowledge of corpus technology.

May: a curriculum reformer and experimenter of pedagogy
May’s CBLP competence is influenced by the other four domains of knowledge, especially her advanced pedagogical knowledge after learning and practising this new CBLP approach.
Conclusion
• CL and CBLP are two key skills to be developed by teachers for effectively using corpus technology in teaching
• The use of corpus technology can effectively facilitate student language learning and teachers’ teaching
• More teacher-friendly corpus websites/tools should be created to facilitate language teachers’ learning of corpus literacy (e.g., Lextutor, COCA, Netspeak, SKELL) and development of CBLP in classroom teaching (e.g., CAP)
• To further promote the application of corpus technology in language education, teacher professional development involving collaborative learning plays an important role in teacher training (Ma et al., 2024)
• Investigation of CBLP is an emerging and promising research area that may enhance student learning, language teacher education and professional development.
References
Boulton, A., & Cobb, T. (2017). Corpus use in language learning: A meta-analysis. Language Learning, 67(2), 348–393.
Fang, L., Ma, Q., & Yan, J. (2021). The Effectiveness of Corpus-Based Training on Collocation Use in L2 Writing for Chinese Senior Secondary School Students. Journal of China Computer-Assisted Language Learning, 1(1), 80–109.
Heather, J., & Helt, M. (2012). Evaluating corpus literacy training for student language teachers: Six case studies. Journal of Technology and Teacher Education, 20(4), 415–440.
Johns, T. (1991). Should you be persuaded: Two examples of data-driven learning. In T. Johns & P. King (Eds.), Classroom concordancing (Vol. 4, pp. 1–16). ELR.
Ma, Q., Tang, J., & Lin, S. (2022a). The development of corpus-based language pedagogy for TESOL teachers: A two-step training approach facilitated by online collaboration. Computer Assisted Language Learning, 35(9), 2731–2760.
Ma, Q., Yuan, R., Cheung, L. M. E., & Yang, J. (2022b). Teacher paths for developing corpus-based language pedagogy: a case study. Computer Assisted Language Learning, 1–32.
Ma, Q., Chiu, M., Lin, S., & Mendoza, N. (2023). Teachers’ perceived corpus literacy and their intention to integrate corpora into classroom teaching: A survey study. ReCALL, 35(1), 19–39. doi:10.1017/S0958344022000180
Ma, Q., Lee, H. T., Gao, X., & Chai, C.-S. (2024). Learning by Design: Enhancing Online Collaboration in Developing Pre-Service TESOL Teachers’ TPACK for Teaching with Corpus Technology. British Journal of Educational Technology.
Mukherjee, J. (2006). Corpus linguistics and language pedagogy: The state of the art–and beyond. In S. Braun, K. Kohn, & J. Mukherjee (Eds.), Corpora and language pedagogy: New resources, new tools, new methods (pp. 5–24). Frankfurt am Main, Germany: Peter Lang.
Q & A & Evaluation
• Please visit our MOOC (https://corpus.eduhk.hk/cap/) to learn about CL, CBLP, and how to combine CBLP with AI in the classroom.
• Please visit our CAP website (https://corpus.eduhk.hk/cap/) for more ideas on implementing CBLP in the classroom.

Please spend 1-2 minutes evaluating this seminar.
