CS 607

Seminar on Deep Learning for Natural Language Processing

Course Description

Deep learning has proved to be a powerful branch of machine learning with applications in a variety of domains (e.g., computer vision, robotics). In natural language processing (NLP), deep learning has been transformative, leading to a new generation of methods with better performance, portability, and robustness (e.g., machine translation, text generation/dialog systems). In particular, with the recent breakthroughs in pretrained language models (e.g., BERT, GPT), state-of-the-art NLP models can now be built efficiently for different domains and languages. These advances are very recent, and the demand for data scientists with deep learning expertise is growing quickly.

At the beginning of this seminar, we will cover the basic concepts of deep learning and NLP, and possibly provide some hands-on experience implementing the models. Afterward, we will review and discuss a collection of research papers on NLP with deep learning, including but not limited to the typical tasks of language modeling, question answering, information extraction, machine translation, natural language inference, dialog, summarization, domain adaptation, transfer learning, and multilingual learning. NLP is moving fast, and we expect to read many exciting recent papers in this field.
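The hands-on exercises are not finalized, but as a rough illustration of what "building on a pretrained language model" looks like in practice, the sketch below loads a pretrained BERT checkpoint with the Hugging Face transformers library and runs one sentence through a (not yet fine-tuned) classification head. The library, the checkpoint name, and the two-label setup are assumptions made for this example, not course requirements.

    # Illustrative sketch only: load a pretrained language model and run one input.
    # Assumes the Hugging Face `transformers` library and PyTorch are installed
    # (e.g., pip install transformers torch); these are not course requirements.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Load a pretrained BERT encoder with a fresh 2-label classification head.
    # The head is randomly initialized, so it would still need fine-tuning on task data.
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    # Tokenize one example sentence and get the model's (untrained) prediction.
    inputs = tokenizer("Deep learning has transformed NLP.", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    print(logits.argmax(dim=-1).item())

The point of the sketch is that the expensive pretraining has already been done; adapting such a model to a new task or domain mainly involves fine-tuning the small task-specific head (and optionally the encoder) on labeled data.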

Instructor

Thien Huu Nguyen, [email protected]

Lectures

One 80-minute presentation and discussion session is held in person each week.

Prerequisites

Textbooks and Readings

Major Topics

Expected Learning Outcomes

In this seminar, we will review and discuss a collection of research papers on NLP with deep learning, including but not limited to the typical tasks of language modeling, question answering, information extraction, natural language inference, dialog and summarization.

Upon successful completion of the course, students will be able to:

Acquired Skills

Upon successful completion of the course, students will have acquired the following skills:

Class Organization

Each student in the class will choose one or more papers on particular topics, present them in class, and lead the discussion on those topics.

Each presentation is allotted 30 minutes, followed by 5-10 minutes of discussion. Further discussion is encouraged on Piazza.

After the presentation, the presenter should submit a summary of the presented paper/topic. The summary should follow the NAACL format (a.k.a. the ACL style for 8.5x11 paper) and be 2-3 pages long.

For each presentation, one student (other than the presenter) may serve as the reviewer. The reviewer's role is to offer judgements, comments, and suggestions, or to ask questions about the paper/topic during the discussion time after the presentation. All students need to read the papers being presented before each class so they can contribute actively to the discussions, but the reviewer is expected to provide deeper judgement by reading and thinking critically about the papers/topics ahead of time.

IMPORTANT:

Please select a paper you want to present from this list. Write your name next to the papers you select (for presentations and reviews). You are welcome to choose a paper that is not on the list; please talk with the instructor if you want to do so. All paper assignments should be completed no later than April 9 so we can schedule the presentations.

Tentative Schedule

Please sign up on Piazza for discussions.
Week | Topics | Presenter | Slides | Reviewer
04/04 | Large Language Models in NLP | Thien | Link | -
04/11 | Transforming Sequence Tagging Into A Seq2Seq Task | Amir | Link | -
      | Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy | Zayd | Link | -
04/18 | Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models | Viet | Link | -
      | A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity | Gabriel | Link | -
04/25 | Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer? | Hakyung | Link | -
      | Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt | Navya | Link | -
05/02 | Interpreting Language Models with Contrastive Explanations | Hieu | Link | -
      | RankGen: Improving Text Generation with Large Ranking Models | Timmy | Link | -
05/09 | Parallel Context Windows Improve In-Context Learning of Large Language Models | Paul | Link | -
      | A Length-Extrapolatable Transformer | Gabriel | Link | -
05/16 | Structured Prompting: Scaling In-Context Learning to 1,000 Examples | Viet | Link | -
      | Prompt-based Distribution Alignment for Domain Generalization in Text Classification | Hieu | Link | -
05/23 | Active Example Selection for In-Context Learning | Amir | Link | -
      | Gradient-based Constrained Sampling from Language Models | Timmy | Link | -
05/30 | Minority Voices "Filtered" Out of Google Natural Language Processing Models | Viet | Link | -
      | Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking | Hakyung | Link | -
06/06 | Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models | Navya | Link | -
      | Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention | Paul | Link | -

Course Requirements and Grading

This course will be taught in person. We may also stream and record the lectures over Zoom upon request. We will use Canvas and Piazza for communication and discussion.

Grading will be based on P/NP.