Unlocking the Power of Corpus-Based Language Pedagogy
Dr. Angel Ma
Heather and Helt (2012) defined corpus literacy (CL) as the ability to use corpus linguistics
technology to investigate language and enhance student language development.
Please visit our MOOC to learn more about the search functions of COCA:
https://pressbooks.pub/cmtry1/
SKELL (Sketch Engine for Language Learning)
Website: https://skell.sketchengine.eu/
2 Switch to keyboard input
Search functions (“Examples”, “Word sketch” and “Similar words”)
Please visit our MOOC website to learn more about the search functions of
SKELL: https://pressbooks.pub/cmtry1/
Sample CBLP teaching activities (SKELL)
Example: ‘especially’ or ‘specially’?
Focus: differentiating easily confused words
Level: intermediate (secondary or above)
Activity 1
Ask students to work in pairs, observe the similar words of ‘especially’ and of ‘specially’,
and share their findings with the class. (Note: guide them to observe the
most common words, i.e., the similar words shown in a larger size.)
[Figure: SKELL “Similar words” displays for ‘especially’ and ‘specially’]
Activity 2
• Students go back to the “Examples” function to examine how the two words are used in context.
• The teacher can guide students to further explore the grammatical features of the two words by asking:
Both words are adverbs, but what usually follows “specially” and “especially”?
Possible answer:
e.g., specially + V-ed (past participle)
e.g., especially + adj.
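The pattern hunt in Activity 2 can also be sketched in a few lines of Python. This is only an illustration: the sentences below are invented for the example (they are not SKELL output), and the function name following_words is hypothetical. It counts the word that comes right after each adverb.

```python
import re
from collections import Counter

def following_words(sentences, target):
    """Count the word that immediately follows `target` in each sentence."""
    counts = Counter()
    # \b stops 'specially' from matching inside 'especially'.
    pattern = re.compile(r"\b" + re.escape(target) + r"\s+([A-Za-z'-]+)", re.IGNORECASE)
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            counts[match.group(1).lower()] += 1
    return counts

# Invented example sentences (assumption: stand-ins for real corpus examples).
examples = [
    "The tool was specially designed for young learners.",
    "This cake was specially made for her birthday.",
    "The exam was especially difficult this year.",
    "She is especially good at grammar.",
    "The room was specially designed for recording.",
]

specially = following_words(examples, "specially")
especially = following_words(examples, "especially")
print("specially  +", specially.most_common())
print("especially +", especially.most_common())
```

With real example sentences copied from a corpus tool, students can compare the two lists and check them against the possible answer above.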
Netspeak
A website that uses Google Books to provide collocations
User-friendly interface and search functions
Website: https://netspeak.org/
Corpus sources: Web-based, Google Books
Target settings: Tertiary; Higher secondary level
Target language/skills: Writing (EAP/ESP courses), grammar, vocabulary
Key search functions of Netspeak
2 Click on the “i” icon to show quick instructions with examples
3 Language options
Please visit our MOOC website to learn more about the search functions of Netspeak:
https://pressbooks.pub/cmtry1/
Parallel EAP (English for Academic Purposes) Corpora
A one-million-word EAP learner corpus & a one-million-word EAP professional corpus
Website: https://corpus.eduhk.hk/eap/
Target students: Undergraduate and postgraduate students
Target language/skills: Vocabulary, collocation, grammar, academic writing
Basic functions of Parallel EAP Corpora
https://corpusmate.com/
COCA: Corpus of Contemporary American English
Website: https://www.english-corpora.org/coca/
Lextutor: Useful online tool for language analysis and learning
Website: https://www.lextutor.ca/
SKELL: Free version of Sketch Engine for Language Learning
Website: https://skell.sketchengine.eu/#home?lang=en
Collocaid: A website for learning academic collocations and sentence patterns
Website: https://collocaid.uk/prototype/editor/public/home
Netspeak: Learning how to use words and phrases in academic writing
Website: https://netspeak.org/
CBLP 2.0 empowered by AI (e.g., ChatGPT)
Corpus technology vs. ChatGPT (Crosthwaite & Baisa, 2023)
Advantages of corpus technology vs. disadvantages of ChatGPT:
• Knowing the data source vs. data cited without source origin
• Authenticity vs. machine-generated data (based on algorithms/probability)
• Multimodality vs. text-based
• Safety vs. unethical
• Active learning vs. passive learning (?)
Disadvantages of corpus technology vs. advantages of ChatGPT:
• Level of technical knowledge vs. simple to use (input natural language)
• Complicated/unintuitive interface vs. easy to interact with
• Unsuitable corpus data for younger learners vs. possibility to generate language suitable for learners (?)
• Difficult to track users’ corpus use over time vs. track user input
AI tools for CBLP practice
(by Liu Jing, EdD candidate, EdUHK)
https://eduhk.ap.panopto.com/Panopto/Pages/Viewer.aspx?id=a324b4e2-1d2b-4ae4-b6f6-b136006af1b9
Sample CBLP lesson: SKELL + Netspeak + AI
Designers: Dr. Eric Cheung & Ms. Sylvia Lau
Affiliation: College of Professional and Continuing Education,
The Hong Kong Polytechnic University
The work in this competition was fully supported by a grant from the
Research Grants Council of the Hong Kong Special Administrative
Region, China (UGC/FDS24/H11/22)
Stage 1: Testing Students’ Knowledge of the Target Language Items
The teacher uses ChatGPT to create a passage
✓ To include the target words and phrases to be taught in the lesson (affect, effect, impact, influence)
Students pair up, discuss and complete the cloze passage activity
✓ To stimulate them to think about the differences among the easily confused words
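Once a passage containing the target words exists, turning it into a cloze can be sketched as follows. This is a minimal illustration, not the designers' procedure: the passage is invented for the example, the function name make_cloze is hypothetical, and only exact word forms are blanked (inflected forms such as "affects" are left alone).

```python
import re

def make_cloze(passage, targets):
    """Replace each target word with a numbered blank; return (cloze_text, answer_key)."""
    answers = []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in targets) + r")\b",
        re.IGNORECASE,
    )

    def blank(match):
        # Record the removed word, then number the blank by its position in the key.
        answers.append(match.group(0))
        return f"({len(answers)}) ______"

    return pattern.sub(blank, passage), answers

targets = ["affect", "effect", "impact", "influence"]
# Invented passage (assumption: stands in for the AI-generated text).
passage = ("Lack of sleep can affect your mood. "
           "The new policy had a positive effect on attendance. "
           "Teachers have a strong influence on their students.")

cloze, key = make_cloze(passage, targets)
print(cloze)
print("Answer key:", key)
```

The answer key stays with the teacher; students receive only the cloze text.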
Stage 2: Hands-on Corpus Consultation to Identify Language Patterns
Flow of the Grammar Battle
1. Provide step-by-step instructions to search on SKELL
2. Explain the functions of SKELL
3. Give time to study the concordance lines
4. Based on the concordances, ask questions related to the target word
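For readers new to concordance lines, the idea can be sketched in Python as a keyword-in-context (KWIC) listing. This is a toy illustration only: the sentences are invented, the function name kwic is hypothetical, and it does not reproduce how SKELL itself builds its examples.

```python
def kwic(sentences, keyword, width=3):
    """Return (left context, keyword, right context) for each hit of `keyword`."""
    lines = []
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            # Ignore surrounding punctuation and case when matching.
            if word.strip(".,;:!?\"'").lower() == keyword.lower():
                left = " ".join(words[max(0, i - width):i])
                right = " ".join(words[i + 1:i + 1 + width])
                lines.append((left, word, right))
    return lines

# Invented sentences (assumption: stand-ins for real corpus data).
sentences = [
    "Stress can affect sleep quality in adults.",
    "The drug had no effect on blood pressure.",
    "Noise may affect how well students concentrate.",
]

hits = kwic(sentences, "affect", width=2)
for left, word, right in hits:
    # Right-align the left context so the keyword lines up in one column.
    print(f"{left:>15}  {word}  {right}")
```

Lining up the keyword in one column is what lets students scan the words to its left and right for recurring patterns.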
Fang, L., Ma, Q., & Yan, J. (2021). The Effectiveness of Corpus-Based Training on Collocation
Use in L2 Writing for Chinese Senior Secondary School Students. Journal of China
Computer-Assisted Language Learning, 1(1), 80–109.
Research: Validation of CL and linking CL to CBLP
• Ma et al. (2023) constructed a five-factor model of CL and tested it empirically with SEM:
1. Understanding of corpora
3. Analysis of corpus data
5. Limitations of corpora
Ma, Q., Chiu, M., Lin, S., & Mendoza, N. (2023). Teachers’ perceived corpus literacy and their
intention to integrate corpora into classroom teaching: A survey study. ReCALL, 35(1), 19–39.
Research: validating a two-step framework for developing CBLP
Ma, Q., Tang, J., & Lin, S. (2022a). The development of corpus-based language pedagogy for TESOL
teachers: A two-step training approach facilitated by online collaboration. Computer Assisted Language
Learning, 35(9), 2731–2760.
Research: In-service university teachers’ CBLP development
Methodology: case study (2 participants)
Participants:
(a) 2 experienced teachers in tertiary English education.
(b) One had great familiarity with corpus technology; the other had no knowledge of corpus technology.
Data:
1) The corpus-based lesson plans
2) Pre-interview (before classroom teaching) focusing on the preparation of materials
3) Recorded videos of classroom teaching
4) Post-interview focusing on the teachers’ instruction, evaluation and reflection of their classroom teaching.
Ma, Q., Yuan, R., Cheung, E. L. M., & Yang, J. (2022b). Teacher paths for developing
corpus-based language pedagogy: A case study. Computer Assisted Language Learning, 1–32.
Research: in-service teachers’ CBLP development (Ma et al., 2022b)
Key findings:
Five components of teacher knowledge and teacher practice will influence teachers’ CBLP
development:
This seminar:
learn about CL, CBLP, and how to
combine CBLP with AI in the classroom.