
BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI


WORK INTEGRATED LEARNING PROGRAMMES
COURSE HANDOUT
Part A: Content Design
Course Title: Natural Language Processing
Course No(s):
Credit Units: 4 units
Course Author: Dr. Chetana Gavankar
Version No: 1.0
Date: September 2022

Course Objectives
No Course Objective

CO1 To learn the fundamental concepts and techniques of natural language processing (NLP), including language models, word embeddings, part-of-speech tagging, and parsing

CO2 To learn the computational properties of natural languages and the commonly used algorithms for processing linguistic information

CO3 To introduce basic mathematical models and methods used in NLP applications to formulate computational solutions

CO4 To introduce students to research and development work in natural language processing

Text Book(s)
T1 Jurafsky and Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Pearson/Prentice Hall
T2 Manning and Schütze, Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA

Reference Book(s) & other resources


R1 James Allen, Natural Language Understanding
R2 Philipp Koehn, Neural Machine Translation
R3 Grigoris Antoniou and Frank van Harmelen, A Semantic Web Primer (Information Systems)

Modular Content Structure

1. Natural Language Understanding and Generation


 The Study of Language.
 Applications of Natural Language Understanding.
 Evaluating Language Understanding Systems.
 The Different Levels of Language Analysis.
 The Organization of Natural Language Understanding Systems.

2. N-gram Language Modelling


 N-Grams
 Generalization and Zeros
 Smoothing
 The Web and Stupid Backoff
 Evaluating Language Models
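
To make the module's key idea concrete, here is a minimal sketch of a bigram language model with add-one (Laplace) smoothing; the toy corpus and the function name are illustrative, not part of the prescribed material.

    from collections import Counter

    # Toy corpus with sentence-boundary markers; real models are trained
    # on far larger text collections.
    corpus = [
        ["<s>", "i", "like", "nlp", "</s>"],
        ["<s>", "i", "like", "parsing", "</s>"],
        ["<s>", "nlp", "is", "fun", "</s>"],
    ]

    unigrams = Counter(w for sent in corpus for w in sent)
    bigrams = Counter(
        (sent[i], sent[i + 1]) for sent in corpus for i in range(len(sent) - 1)
    )
    V = len(unigrams)  # vocabulary size, used by Laplace smoothing

    def bigram_prob(prev, word):
        # Add-one (Laplace) estimate: (C(prev, word) + 1) / (C(prev) + V)
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

    print(bigram_prob("i", "like"))  # seen bigram: (2 + 1) / (2 + 8) = 0.3
    print(bigram_prob("i", "fun"))   # unseen bigram still gets non-zero mass: 0.1

Smoothing is what keeps the unseen bigram above zero; without it, any sentence containing a single unseen pair would receive probability zero.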

3. Neural Networks and Neural Language Models


 Units
 The XOR problem
 Feed-Forward Neural Networks
 Training Neural Nets
 Neural Language Models
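
The XOR problem is the classic demonstration that a single linear unit is not enough and a hidden layer is needed. The NumPy sketch below trains a one-hidden-layer sigmoid network with backpropagation; the layer sizes, seed, and learning rate are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

    W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)  # input -> hidden
    W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)  # hidden -> output

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for _ in range(10000):
        # Forward pass
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass: squared-error gradients through both layers
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

    print(out.round(2))  # typically approaches [[0], [1], [1], [0]]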

4. Part-of-Speech Tagging
 (Mostly) English Word Classes
 The Penn Treebank Part-of-Speech Tagset
 Part-of-Speech Tagging
 Markov Chains
 The Hidden Markov Model
 HMM Part-of-Speech Tagging
 Part-of-Speech Tagging for Morphologically Rich Languages
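
As a practical pointer for this module, NLTK's default tagger assigns Penn Treebank tags out of the box; a minimal usage sketch follows (the exact resource names passed to nltk.download vary slightly across NLTK versions).

    import nltk

    # One-time downloads: tokenizer model and the default perceptron tagger
    nltk.download("punkt", quiet=True)
    nltk.download("averaged_perceptron_tagger", quiet=True)

    tokens = nltk.word_tokenize("The horse raced past the barn fell.")
    print(nltk.pos_tag(tokens))
    # e.g. [('The', 'DT'), ('horse', 'NN'), ('raced', 'VBD'), ...]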

5. Hidden Markov Models and MEMM


 The Hidden Markov Model
 Likelihood Computation: The Forward Algorithm
 Decoding: The Viterbi Algorithm
 HMM Training: The Forward-Backward Algorithm
 Maximum Entropy Markov Models
 Bidirectionality
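
The Viterbi decoder at the centre of this module fits in a few lines. The sketch below runs it on a toy two-state weather HMM in the spirit of the textbook's ice-cream example; the probabilities are illustrative only.

    # obs: observation sequence; pi: start probabilities;
    # A: transition probabilities; B: emission probabilities
    def viterbi(obs, states, pi, A, B):
        # v[t][s]: probability of the best path ending in state s at time t
        v = [{s: pi[s] * B[s][obs[0]] for s in states}]
        back = [{}]
        for t in range(1, len(obs)):
            v.append({}); back.append({})
            for s in states:
                prev, p = max(
                    ((r, v[t - 1][r] * A[r][s] * B[s][obs[t]]) for r in states),
                    key=lambda x: x[1],
                )
                v[t][s], back[t][s] = p, prev
        # Follow backpointers from the best final state
        best = max(states, key=lambda s: v[-1][s])
        path = [best]
        for t in range(len(obs) - 1, 0, -1):
            path.append(back[t][path[-1]])
        return list(reversed(path))

    states = ["HOT", "COLD"]
    pi = {"HOT": 0.8, "COLD": 0.2}
    A = {"HOT": {"HOT": 0.6, "COLD": 0.4}, "COLD": {"HOT": 0.5, "COLD": 0.5}}
    B = {"HOT": {1: 0.2, 2: 0.4, 3: 0.4}, "COLD": {1: 0.5, 2: 0.4, 3: 0.1}}
    print(viterbi([3, 1, 3], states, pi, A, B))  # ['HOT', 'COLD', 'HOT']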

6. Topic Modelling
 Mathematical foundations for LDA: Multinomial and Dirichlet distributions
 Intuition behind LDA
 LDA Generative model
 Latent Dirichlet Allocation Algorithm and Implementation
 Gibbs Sampling
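
For hands-on work, the gensim library (an assumption here, not prescribed by the handout) ships an LDA implementation. Note that gensim's default trainer uses variational inference rather than the Gibbs sampling listed above, but the generative model is the one this module discusses.

    from gensim import corpora
    from gensim.models import LdaModel

    # Tiny pre-tokenized corpus; real topic models need far more text
    docs = [
        ["parse", "grammar", "syntax", "tree"],
        ["tree", "parse", "sentence", "grammar"],
        ["market", "stock", "price", "trade"],
        ["price", "market", "trade", "bank"],
    ]
    dictionary = corpora.Dictionary(docs)
    corpus = [dictionary.doc2bow(d) for d in docs]

    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
                   passes=20, random_state=0)
    for topic in lda.print_topics():
        print(topic)  # each topic as a weighted mix of words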

7. Vector Semantics and Embeddings


 Lexical semantics
 Vector semantics
 Words and Vectors
 TF-IDF
 Word2Vec, Skip-gram and CBOW
 GloVe
 Visualizing Embeddings
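
A minimal TF-IDF computation, using raw term frequency times log inverse document frequency (one of several weighting variants; the toy documents are invented for illustration):

    import math
    from collections import Counter

    docs = [
        "the cat sat on the mat".split(),
        "the dog sat on the log".split(),
        "cats and dogs make good pets".split(),
    ]

    N = len(docs)
    df = Counter()  # number of documents containing each term
    for d in docs:
        df.update(set(d))

    def tfidf(doc):
        tf = Counter(doc)
        # A term occurring in every document gets idf = log(N/N) = 0
        return {w: tf[w] * math.log(N / df[w]) for w in tf}

    print(tfidf(docs[0]))  # 'cat' and 'mat' outscore words shared across documents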

8. Grammars and Parsing


 Grammars and Sentence Structure.
 What Makes a Good Grammar
 A Top-Down Parser.
 A Bottom-Up Chart Parser.
 Top-Down Chart Parsing.
 Finite State Models and Morphological Processing.
 Grammars and Logic Programming.
9. Statistical Constituency Parsing
 Probabilistic Context-Free Grammars
 Probabilistic CKY Parsing of PCFGs
 Ways to Learn PCFG Rule Probabilities
 Problems with PCFGs
 Improving PCFGs by Splitting Non-Terminals
 Probabilistic Lexicalized CFGs
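
A compact probabilistic CKY recognizer makes the chart idea concrete: each cell stores, per nonterminal, the best probability over the corresponding span. The PCFG below is in Chomsky normal form, with rule probabilities invented (and not normalized) purely for illustration.

    lexical = {  # A -> word rules with probabilities
        ("N", "time"): 0.5, ("N", "flies"): 0.3, ("N", "arrow"): 0.2,
        ("V", "flies"): 0.6, ("V", "like"): 0.4,
        ("P", "like"): 1.0, ("Det", "an"): 1.0,
    }
    binary = {  # A -> B C rules with probabilities
        ("S", "NP", "VP"): 1.0,
        ("NP", "Det", "N"): 0.4, ("NP", "N", "N"): 0.3,
        ("VP", "V", "NP"): 0.5, ("VP", "V", "PP"): 0.5,
        ("PP", "P", "NP"): 1.0,
    }

    def pcky(words):
        n = len(words)
        # chart[i][j][A] = (best probability, backpointer) for A over words[i:j]
        chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            for (A, word), p in lexical.items():
                if word == w:
                    chart[i][i + 1][A] = (p, w)
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):  # split point
                    for (A, B, C), p in binary.items():
                        if B in chart[i][k] and C in chart[k][j]:
                            prob = p * chart[i][k][B][0] * chart[k][j][C][0]
                            if prob > chart[i][j].get(A, (0.0,))[0]:
                                chart[i][j][A] = (prob, (B, C, k))
        return chart[0][n].get("S")

    print(pcky("time flies like an arrow".split()))  # (0.00072, ('NP', 'VP', 2))
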
10. Dependency Parsing
 Dependency Relations
 Dependency Formalisms
 Dependency Treebanks
 Transition-Based Dependency Parsing
 Graph-Based Dependency Parsing
 Dependency parsing using neural networks
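
spaCy's English pipeline ships a transition-based neural dependency parser, so it doubles as a quick demo for this module (assumes the spacy package and its en_core_web_sm model are installed):

    import spacy

    # Install first: pip install spacy && python -m spacy download en_core_web_sm
    nlp = spacy.load("en_core_web_sm")
    doc = nlp("The professor teaches natural language processing.")
    for tok in doc:
        # token, its dependency relation, and the head it attaches to
        print(f"{tok.text:12} {tok.dep_:10} <- {tok.head.text}")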

11. Encoder-Decoder Models, Attention and Contextual Embeddings


 Neural Language Models and Generation
 Encoder-Decoder Networks, Attention
 Applications of Encoder-Decoder Networks
 Self-Attention and Transformer Networks
 BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 Contextual Word Representations: A Contextual Introduction
 The Illustrated BERT, ELMo, and co.
 XLM
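
The core operation behind Transformer networks is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A NumPy sketch with arbitrary toy dimensions:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)           # similarity of queries to keys
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V, weights               # weighted mix of the values

    rng = np.random.default_rng(0)
    Q = rng.normal(size=(3, 4))  # 3 query positions, d_k = 4
    K = rng.normal(size=(5, 4))  # 5 key/value positions
    V = rng.normal(size=(5, 4))
    out, w = scaled_dot_product_attention(Q, K, V)
    print(out.shape, w.sum(axis=-1))  # (3, 4), each weight row sums to 1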

12. Word Sense Disambiguation


 Word Senses
 Relations between Senses
 WordNet: A Database of Lexical Relations
 Word Sense Disambiguation
 Alternate WSD algorithms and Tasks
 Using Thesauruses to Improve Embeddings
 Word Sense Induction
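
NLTK bundles both WordNet access and a simplified Lesk baseline for WSD, enough to experiment with the topics above (requires the WordNet corpus download; Lesk can easily pick the wrong sense on short contexts):

    import nltk
    from nltk.corpus import wordnet as wn
    from nltk.wsd import lesk

    nltk.download("wordnet", quiet=True)

    # Enumerate a few senses of an ambiguous word
    for syn in wn.synsets("bank")[:3]:
        print(syn.name(), "-", syn.definition())

    # Simplified Lesk disambiguates by overlap between context and glosses
    context = "i deposited my salary at the bank".split()
    print(lesk(context, "bank"))
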
13. Semantic Web Ontology and Knowledge Graphs
 Introduction to semantic web
 Semantic web ontology
 Semantic web languages
 Ontology Engineering
 Ontology Learning
 Knowledge graphs: construction of the graph
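
A toy knowledge-graph construction with rdflib (an assumed, commonly used library; the example.org namespace and the triples are purely illustrative):

    from rdflib import Graph, Literal, Namespace, RDF, RDFS

    EX = Namespace("http://example.org/")
    g = Graph()

    # Assert a few triples: an entity, its type, a label, and a relation
    g.add((EX.NLP, RDF.type, EX.Course))
    g.add((EX.NLP, RDFS.label, Literal("Natural Language Processing")))
    g.add((EX.NLP, EX.covers, EX.DependencyParsing))

    print(g.serialize(format="turtle"))  # the graph as Turtle triples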

14. Introduction to NLP Applications


 Brief introduction to state-of-the-art applications
 Text Summarization
 Machine Translation

Part B: Contact Session Plan


Academic Term:
Course Title: Natural Language Processing
Course No:
Lead Instructor:

Course Contents

Contact Session | List of Topic Title (from content structure in Part A) | Topic # (from content structure in Part A) | Text / Ref Book / External Resource
1 | Natural Language Understanding and Generation | Chapter 1 | T2


1.1 The Study of Language.
1.2 Applications of Natural Language Understanding.
1.3 Evaluating Language Understanding Systems.
1.4 The Different Levels of Language Analysis.
1.5 The Organization of Natural Language Understanding
Systems.

2 | N-gram Language Modelling | Chapter 3 | T1


 N-Grams
 Generalization and Zeros
 Smoothing
 The Web and Stupid Backoff
 Evaluating Language Models

3 | Neural Networks and Neural Language Modelling | Chapter 4 | R2


 Units
 The XOR problem
 Feed-Forward Neural Networks
 Training Neural Nets
 Neural Language Models

4 | Vector Semantics and Embeddings | Chapter 6 | T1, lecture notes, and https://www.youtube.com/watch?v=hQwFeIupNP0

 Lexical semantics
 Vector semantics
 Words and Vectors
 TF-IDF
 Word2Vec, Skip-gram and CBOW
 GloVe
 Visualizing Embeddings

5 | Part-of-Speech Tagging | Chapter 8 | T1 and class notes

 (Mostly) English Word Classes
 The Penn Treebank Part-of-Speech Tagset
 Part-of-Speech Tagging
 Markov Chains
 The Hidden Markov Model
 HMM Part-of-Speech Tagging
 Part-of-Speech Tagging for Morphologically Rich Languages

6 | Hidden Markov Model Algorithms | Appendix Chapter A | T1 and class notes

 Likelihood Computation: The Forward Algorithm
 Decoding: The Viterbi Algorithm
 HMM Training: The Forward-Backward Algorithm
 Maximum Entropy Markov Models
 Bidirectionality

7 | Topic Modelling | | Class notes

 Mathematical foundations for LDA: Multinomial and Dirichlet distributions
 Intuition behind LDA
 LDA Generative model
 Latent Dirichlet Allocation Algorithm and Implementation
 Gibbs Sampling

8 | Review of Modules 1 to 7 | |

9 | Grammars and Parsing | Chapter 3 | T2


 Grammars and Sentence Structure.
 What Makes a Good Grammar
 A Top-Down Parser.
 A Bottom-Up Chart Parser.
 Top-Down Chart Parsing.
 Finite State Models and Morphological Processing.
 Grammars and Logic Programming.
 Parsing
10 | Statistical Constituency Parsing | Chapter 14 | T1
 Probabilistic Context-Free Grammars
 Probabilistic CKY Parsing of PCFGs
 Ways to Learn PCFG Rule Probabilities
 Problems with PCFGs
 Improving PCFGs by Splitting Non-Terminals
 Probabilistic Lexicalized CFGs

11 | Dependency Parsing | Chapter 19 | T1 and class notes

 Dependency Relations
 Dependency Formalisms
 Dependency Treebanks
 Transition-Based Dependency Parsing
 Graph-Based Dependency Parsing
 Dependency parsing using neural networks

12 | Encoder-Decoder Models, Attention and Contextual Embeddings | Chapter 10 | T1 and https://colab.research.google.com/drive/1iqs9Y5_zLI6R6mAwlnapcxcUbKjpv2CC?usp=sharing

 Neural Language Models and Generation
 Encoder-Decoder Networks, Attention
 Applications of Encoder-Decoder Networks
 Self-Attention and Transformer Networks
 BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 Contextual Word Representations: A Contextual Introduction
 The Illustrated BERT, ELMo, and co.
 XLM

13 | Word Senses and WordNet | Chapter 15 | T1


 Word Senses
 Relations between Senses
 WordNet: A Database of Lexical Relations
 Word Sense Disambiguation
 Alternate WSD algorithms and Tasks
 Using Thesauruses to Improve Embeddings
 Word Sense Induction

14 | Semantic Web Ontology and Knowledge Graphs | Chapter 24 | R1 and class notes

 Introduction
 Ontology and Ontologies
 Ontology Engineering
 Ontology Learning

15 | State-of-the-art Applications | | Class notes and web references

16 | Review of Sessions 9 to 15 | |

Detailed Plan for Lab Work

Lab No. | Lab Objective | Lab Sheet Access URL | Session Reference
1 | Introduction to NLTK, spaCy and other open-source tools | | 1
2 | Language Modelling (neural) | | 2, 3
3 | Part-of-speech tagging | | 4, 5
4 | Topic Modelling | | 7
5 | Parsing (neural dependency parsing) | | 9, 10, 11
6 | WordNet, Ontology and Knowledge Graph | | 12, 13, 14
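
A possible warm-up for Lab 1, touching both toolkits (assumes the nltk and spacy packages plus the en_core_web_sm model are installed; the sample sentence is illustrative):

    import nltk
    import spacy

    nltk.download("punkt", quiet=True)
    text = "BITS Pilani offers a course on Natural Language Processing."

    # NLTK: standalone word tokenization
    print(nltk.word_tokenize(text))

    # spaCy: a full pipeline (tokenizer, tagger, parser, NER) in one call
    nlp = spacy.load("en_core_web_sm")
    print([(tok.text, tok.pos_) for tok in nlp(text)])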

Evaluation Scheme

Evaluation Component | Name (Quiz, Lab, Project, Mid-term exam, End-semester exam, etc.) | Type (Open book, Closed book, Online, etc.) | Weight | Duration | Day, Date, Session, Time
EC-1 | Quiz | | 10% | | To be announced
EC-2 | Assignment | | 20% | | To be announced
EC-3 | Mid-term Exam | Closed book | 30% | | To be announced
EC-4 | End Semester Exam | Open book | 40% | | To be announced

Important Information
Syllabus for Mid-Semester Test (Closed Book): Topics covered in Weeks 1-8 (contact hours 1-18)
Syllabus for Comprehensive Exam (Open Book): All topics given in the plan of study

Notes
 Quiz and assignment timelines will be announced on the Canvas portal.
 Deadlines for evaluation components will NOT be extended; students are advised not to wait for the deadline to start working on a quiz or assignment.
 Syllabus for Mid-Semester Test (Closed Book): Topics in Session Nos. 1 to 8
 Syllabus for Comprehensive Exam (Open Book): All topics (Session Nos. 1 to 16)
 There are strictly NO MAKEUPS for quizzes and assignments; submissions after the announced deadlines will not be considered for evaluation.
 All assignments will be subjected to a plagiarism check; violations will attract disciplinary action in addition to nullification of the marks/grades assigned.

Important links and information:


Canvas: Students are expected to visit the Canvas portal on a regular basis and stay up to date with
the latest announcements and deadlines.

Contact sessions: Students should attend the online lectures as per the schedule provided.
Evaluation Guidelines:
1. EC-1 consists of Assignments and Quizzes. Announcements regarding the same will be made
in a timely manner.
2. For Closed Book tests: No books or reference material of any kind will be permitted.
Laptops/Mobiles of any kind are not allowed. Exchange of any material is not allowed.
3. For Open Book exams: Use of prescribed and reference textbooks, in original (not photocopies), is permitted. Class notes/slides as reference material in filed or bound form are permitted; however, loose sheets of paper will not be allowed. Use of calculators is permitted in all exams. Laptops/mobiles of any kind are not allowed. Exchange of any material is not allowed.
4. If a student is unable to appear for the Regular Test/Exam due to genuine exigencies, the student
should follow the procedure to apply for the Make-Up Test/Exam. The genuineness of the
reason for absence in the Regular Exam shall be assessed prior to giving permission to appear
for the Make-up Exam. Make-Up Test/Exam will be conducted only at selected exam centres.
It shall be the responsibility of the individual student to be regular in maintaining the self-study schedule
as given in the course handout, attend the lectures, and take all the prescribed evaluation components
such as Assignment/Quiz, Mid-Semester Test and Comprehensive Exam according to the evaluation
scheme provided in the handout.

Learning Outcomes:
No Learning Outcomes

LO1 Should have a good understanding of the field of natural language processing.

LO2 Should have knowledge of important techniques, such as language modelling and parsing, used in natural language processing.

LO3 Should be able to apply NLP algorithms along with deep learning algorithms in state-of-the-art areas such as word embeddings.
