Li18 Lecture 1 Slides (2021)
Nigel Collier
Faculty of Modern and Medieval Languages and Linguistics
1
Summary
1. Course Admin.
3. Language Models?
2
Course Admin.
• Should be in touch with you by the end of the first week of term - if
not then please contact me (nhc30)
3
Course Admin.
Course Supervisors
4
Course Admin.
Course Textbook
5
Course Admin.
You’re ready to go
Be mindful
6
Course Admin.
7
Undergraduates – Part IIA
8
Undergraduates – IIB, MML, Erasmus
9
Course Admin.
Other Undergraduates
TAL MPhils
Welcome onboard!
10
Study and Supervisions
11
Michaelmas Overview
12
What Topics Changed in 2021?
13
Undergraduates: marking and examinations in 2021/22
14
Information about Programming
Python is one of the most popular and well-supported programming languages used
for Natural Language Processing. Python for Computational Linguists (self-paced
Jupyter notebooks):
https://github.com/cambridgeltl/python4cl
15
Information about Programming
16
Python for Computational Linguists
17
What is Computational Linguistics?
19
Computational Linguistics is a Multi-disciplinary Field
20
Computational Linguistics Splits into Two Broad Areas
21
Natural Language Analysis
NLP models automatically analyse language to produce the possible
structures/annotations that you have been taught to think about, at the levels of:
o Morphology
o Syntax
o Semantics
o Pragmatics
o ...
You have been learning to associate structure (or annotation) with linguistic units
and, in cases of ambiguity, to demonstrate that there is more than one possible
structure.
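To make the idea of "more than one possible structure" concrete, here is a minimal sketch that counts the syntactic analyses a toy grammar assigns to a sentence. The grammar, the lexicon, and the function name count_parses are hypothetical and chosen only for illustration; they are not taken from the course materials. The chart-based counting follows the CKY scheme for grammars in Chomsky normal form.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (illustration only).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]

# Hypothetical lexicon mapping words to their possible categories.
LEXICON = {
    "i": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(sentence, root="S"):
    """Count the distinct parse trees for a sentence using a CKY chart."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(i, j)][label] = number of trees with that label spanning words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for label in LEXICON.get(word, []):
            chart[(i, i + 1)][label] += 1
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    left_count = chart[(i, k)].get(left, 0)
                    right_count = chart[(k, j)].get(right, 0)
                    if left_count and right_count:
                        chart[(i, j)][parent] += left_count * right_count
    return chart[(0, n)].get(root, 0)


if __name__ == "__main__":
    print(count_parses("I saw the man"))
    print(count_parses("I saw the man with the telescope"))
```

Under this toy grammar the second sentence receives two analyses: one where the prepositional phrase modifies the seeing event and one where it modifies the man.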
23
Combining Language Models
24
Combining Language Models
25
Combining Language Models
26
Complexity of Language Tasks and Applications
28
Complexity of Language Tasks
"... Julie Delpy is far too good for this movie. She imbues Serafine with spirit,
spunk, and humanity. This isn’t necessarily a good thing, since it prevents us
from relaxing and enjoying AN AMERICAN WEREWOLF IN PARIS as a
completely mindless, campy entertainment experience. Delpy’s injection of class
into an otherwise classless production raises the spectre of what this film could
have been with a better script and a better cast ... She was radiant, charismatic,
and effective ..."
- "a good actor trapped in a bad movie" from Pang et al. (2002).
Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using
machine learning techniques. In Proceedings of the ACL-02 Conference on Empirical Methods in
Natural Language Processing, Volume 10 (pp. 79-86).
29
Complexity of Language Tasks
What features of the text help to predict the number of stars? Are the features
hard to identify and disambiguate?
"... Julie Delpy is far too good for this movie. She imbues Serafine with spirit,
spunk, and humanity. This isn’t necessarily a good thing, since it prevents us
from relaxing and enjoying AN AMERICAN WEREWOLF IN PARIS as a
completely mindless, campy entertainment experience. Delpy’s injection of class
into an otherwise classless production raises the spectre of what this film could
have been with a better script and a better cast ... She was radiant, charismatic,
and effective ..."
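One way to see why the features in this review are hard to use is to try the simplest possible predictor: counting words from a positive list and a negative list. The sketch below does exactly that. The two word lists and the function name lexicon_score are hypothetical and were picked by hand for this example; this is not the method of Pang et al. (2002), who trained machine learning classifiers.

```python
import re

# Hypothetical hand-picked word lists (illustration only).
POSITIVE = {"good", "better", "spirit", "humanity", "class",
            "radiant", "charismatic", "effective", "enjoying"}
NEGATIVE = {"bad", "mindless", "campy", "classless"}

# The review excerpt quoted on the slide.
REVIEW = (
    "... Julie Delpy is far too good for this movie. She imbues Serafine "
    "with spirit, spunk, and humanity. This isn't necessarily a good thing, "
    "since it prevents us from relaxing and enjoying AN AMERICAN WEREWOLF IN "
    "PARIS as a completely mindless, campy entertainment experience. Delpy's "
    "injection of class into an otherwise classless production raises the "
    "spectre of what this film could have been with a better script and a "
    "better cast ... She was radiant, charismatic, and effective ..."
)


def lexicon_score(text):
    """Return (positive hits, negative hits) from simple word counting."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    return positive_hits, negative_hits


if __name__ == "__main__":
    positive_hits, negative_hits = lexicon_score(REVIEW)
    print("positive:", positive_hits, "negative:", negative_hits)
```

With these lists the positive hits outnumber the negative ones, even though the review is of a film the writer dislikes. The counter cannot tell that the praise targets the actor rather than the film, or that "isn't necessarily a good thing" and "could have been ... better" reverse the polarity of the words they contain.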
30
Complexity of Language Tasks
32
Complexity of Language Tasks
33
Adherence to Linguistic Theory
34
What Types of Language Models Will We Look At?
• There are many different types of language model and ways of describing
them.
• The choice of the model will depend on the linguistic unit being
described, and often the task to which it is applied.
• In this course we will look at: rule-based models, finite-state machines,
(lexical and context-free) grammars, statistical models, and neural models.
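As a small illustration of the statistical family listed above, the sketch below estimates a bigram model by relative frequency from a three-sentence corpus. The corpus, the boundary markers <s> and </s>, and the function names train_bigram and bigram_prob are assumptions made for this example, and no smoothing is applied.

```python
from collections import Counter

# Hypothetical toy corpus (illustration only).
CORPUS = [
    "the cat sat",
    "the cat slept",
    "the dog sat",
]


def train_bigram(sentences):
    """Collect context counts and bigram counts, with sentence boundary markers."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        context_counts.update(tokens[:-1])           # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts


def bigram_prob(word, previous, context_counts, bigram_counts):
    """Maximum-likelihood estimate of P(word | previous); 0.0 for unseen contexts."""
    if context_counts[previous] == 0:
        return 0.0
    return bigram_counts[(previous, word)] / context_counts[previous]


if __name__ == "__main__":
    contexts, bigrams = train_bigram(CORPUS)
    print(bigram_prob("cat", "the", contexts, bigrams))  # "the" is followed by "cat" in 2 of 3 cases
    print(bigram_prob("sat", "cat", contexts, bigrams))  # "cat" is followed by "sat" in 1 of 2 cases
```

Unseen bigrams get probability zero here, which is one reason practical statistical models add smoothing.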
35
Exercises (see Lecture Notes for details)
Post-Lecture Exercise
Pre-Lecture Exercises
1. Read about Eliza in Weizenbaum (1966) and try it online. Think about the
process that Eliza uses to identify keywords and transform them into
responses. What linguistic knowledge would be necessary to make it more
proficient in its task?
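As a starting point for thinking about the exercise, here is a minimal sketch of keyword matching followed by a transformation into a response. The rules, the pronoun table, and the function names reflect and respond are hypothetical; they are loosely in the spirit of Eliza and are not Weizenbaum's original script.

```python
import re

# Hypothetical keyword rules: (pattern, response template). Illustration only.
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Hypothetical first-person to second-person swaps applied to the matched text.
REFLECTIONS = {"my": "your", "me": "you", "i": "you"}

DEFAULT_RESPONSE = "Please go on."


def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())


def respond(utterance):
    """Return the response of the first rule whose keyword pattern matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_RESPONSE


if __name__ == "__main__":
    print(respond("I am worried about my exams."))
    print(respond("My mother dislikes me"))
    print(respond("Hello"))
```

The sketch has no morphology, syntax, or semantics: it only matches surface strings, which is a useful baseline when considering what linguistic knowledge would improve it.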
36