

CS772: Deep Learning for Natural Language Processing

Announcements

  • MS Teams Code: op38ybr
  • Previous iterations of the course: 2023 | 2022

Course Details

CS772: Deep Learning for Natural Language Processing
Department of Computer Science and Engineering
Indian Institute of Technology Bombay

Time Table and Venue

  • Monday: 03:30 PM to 04:55 PM
  • Thursday: 03:30 PM to 04:55 PM
  • Venue: LH-102

Motivation

Deep Learning (DL) is a framework for solving AI problems using networks of neurons organized in many layers. DL has found heavy use in Natural Language Processing (NLP) as well, in problems such as machine translation, sentiment and emotion analysis, question answering, and information extraction, improving the performance of automatic systems by an order of magnitude. The course CS626 (Speech, NLP and the Web), taught in the first semester in the CSE Department at IIT Bombay for the last several years, creates a strong foundation in NLP, covering the whole NLP stack from morphology to part-of-speech tagging, parsing, discourse, and pragmatics. Students of the course, who typically number more than 100, acquire a grip on the tasks, techniques, and linguistics of a plethora of NLP problems. CS772 (Deep Learning for Natural Language Processing) comes as a natural sequel to CS626: language tasks are examined through the lens of Deep Learning, and foundations and advances in Deep Learning are taught integrated with NLP problems. For example, the sequence-to-sequence Transformer is covered with its application in machine translation; similarly, various word-embedding techniques are taught with applications to text classification, information extraction, etc.

Course Description

The general approach in the course will be to cover (i) a language phenomenon, (ii) the corresponding language-processing task, and (iii) techniques based on deep learning, classical machine learning, and knowledge bases. On the one hand, we will understand the language-processing task in detail using linguistics, cognitive science, utility, etc.; on the other hand, we will delve deep into techniques for solving the problem. The topics are given below.
  • Sound: Biology of Speech Processing; Place and Manner of Articulation; Peculiarities of Vowels and Consonants; Word Boundary Detection; Argmax based computations; Hidden Markov Model and Speech Recognition; deep neural nets for speech processing.
  • Morphology: Morphology fundamentals; Isolating, Inflectional, Agglutinative morphology; Infix, Prefix and Postfix Morphemes; Morphological Diversity of Indian Languages; Morphology Paradigms; Rule Based Morphological Analysis: Finite State Machine Based Morphology; Automatic Morphology Learning; Deep Learning based morphology analysis.
  • Shallow Parsing: Part of Speech (POS) Tagging; HMM based POS tagging; Maximum Entropy Models and POS; Conditional Random Fields and POS; DNN for POS.
  • Parsing: Constituency and Dependency Parsing; Theories of Parsing; Scope Ambiguity and Attachment Ambiguity Resolution; Rule Based Parsing Algorithms; Probabilistic Parsing; Neural Parsing.
  • Meaning: Lexical Knowledge Networks, Wordnet Theory and Indian Language Wordnets; Semantic Roles; Word Sense Disambiguation; Metaphors.
  • Discourse and Pragmatics: Coreference Resolution; Cohesion and Coherence.
  • Applications: Machine Translation; Sentiment and Emotion Analysis; Text Entailment; Question Answering; Code Mixing; Analytics and Social Networks; Information Retrieval and Cross Lingual Information Retrieval (IR and CLIR).

References

  • Allen, James, Natural Language Understanding, Second Edition, Benjamin/Cummings, 1995.
  • Charniak, Eugene, Statistical Language Learning, MIT Press, 1993.
  • Jurafsky, Dan and Martin, James, Speech and Language Processing (3rd ed. draft), draft chapters in progress, October 16, 2019.
  • Manning, Christopher and Schütze, Hinrich, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
  • Eisenstein, Jacob, Introduction to Natural Language Processing, MIT Press, 2019.
  • Goodfellow, Ian, Bengio, Yoshua, and Courville, Aaron, Deep Learning, MIT Press, 2016.
  • Radford, Andrew et al., Linguistics: An Introduction, Cambridge University Press, 1999.
  • Bhattacharyya, Pushpak, Machine Translation, CRC Press, 2017.
  • Journals: Computational Linguistics, Natural Language Engineering, Machine Learning, Machine Translation, Artificial Intelligence
  • Conferences: Annual Meeting of the Association for Computational Linguistics (ACL), International Conference on Computational Linguistics (COLING), European Chapter of the ACL (EACL), Empirical Methods in Natural Language Processing (EMNLP), Annual Meeting of the Special Interest Group on Information Retrieval (SIGIR), Human Language Technology (HLT).

Pre-requisites

Data Structures and Algorithms; programming skill in Python (or a similar language)

Course Instructors

Teaching Assistants

Lecture Slides

Lecture | Topics | Readings and useful links
Week 1
(Week of 4th January)
  • Introduction to DL-NLP, Motivation
  • Course Info
Week 1 Lecture
Week 2
(Week of 8th Jan)
  • Perceptron Model
  • Proof of Convergence Theorem
  • Sigmoid and Softmax
Week 2 Lecture
Week 3
(Week of 15th Jan)
  • Recurrent Perceptron
  • Deepfake, Bias, Convergence of PTA
  • Weight change rules
Week 3 Lecture
Week 4
(Week of 22nd Jan)
  • Small Language Models
  • Backpropagation
  • Word Vector, Co-occurrence matrix
Week 4 Lecture
Week 5
(Week of 29th Jan)
  • Capturing word association
  • Important concepts with FFNN-BP
  • Linguistic foundation of word representation by vectors
Week 5 Lecture
Week 6
(Week of 5th Feb)
  • Harris Distributional Hypothesis
  • Word2vec weight change rule
  • Representation Learning, WordNet
Week 6 Lecture
Week 7
(Week of 12th Feb)
  • HMM
  • Training Language Models
  • BPTT
Week 7 Lecture
Week 8
(Week of 19th Feb)
  • Long Distance Dependency
  • LSTM
  • Decoding: A* Algorithm
Week 8 Lecture
Week 9
(Week of 4th Mar)
  • Attention Mechanism
  • Transformer Architecture
Week 9 Lecture
Week 10
(Week of 11th Mar)
  • Neural Machine Translation
  • NMT Evaluation
  • Pivot-based NMT
Week 10 Lecture
Week 11
(Week of 18th Mar)
  • Summarization
  • Prompt-based Summarization
  • Opinion Summarization
Week 11 Lecture
Week 12
(Week of 25th Mar)
  • Neural Cross Attention vs Statistical Alignment Learning
  • RLHF
Week 12 Lecture
Week 13
(Week of 1st Apr)
  • Self Attention
  • Vaswani et al.
Week 13 Lecture
Week 14
(Week of 8th Apr)
  • Grammar as LM and Computation
  • Summary of generations of LMs
  • Convolutional Neural Networks
Week 14 Lecture
Week 15
(Week of 15th Apr)
  • Conversational AI
  • Pragmatics
Week 15 Lecture

Lecture videos

Lecture videos are regularly uploaded on MS Teams and will also be available here: Link.

Course Exams

Assignments

Date | Assignment # | Topic | Deadline | Link
3rd March, 2024 | Assignment 2 | Given a POS-tagged corpus, train a single recurrent perceptron to mark noun chunks in a sentence. | | Assignment Zip
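The noun-chunking task of Assignment 2 can be sketched roughly as follows. This is a hypothetical illustration, not the official assignment solution: the toy POS tag set, the training sentences, and the binary inside/outside chunk labels are all assumptions. The model is a single recurrent perceptron — one weight per POS feature plus one recurrent weight on the previous output — trained with the perceptron update rule.

```python
# Hypothetical sketch of Assignment 2: a single recurrent perceptron that
# labels each token of a POS-tagged sentence as inside (1) or outside (0)
# a noun chunk. Tag set, data, and labels are illustrative assumptions.
import numpy as np

TAGS = ["DT", "JJ", "NN", "VB", "IN"]  # toy POS tag set (assumption)

def one_hot(tag):
    v = np.zeros(len(TAGS))
    v[TAGS.index(tag)] = 1.0
    return v

def predict(tags, w, v, b):
    """y_t = step(w . x_t + v * y_{t-1} + b), thresholded at 0."""
    y_prev, out = 0.0, []
    for tag in tags:
        a = w @ one_hot(tag) + v * y_prev + b
        y = 1 if a >= 0 else 0
        out.append(y)
        y_prev = float(y)
    return out

def train(data, epochs=50, lr=0.1):
    w, v, b = np.zeros(len(TAGS)), 0.0, 0.0
    for _ in range(epochs):
        for tags, gold in data:
            y_prev = 0.0
            for tag, t in zip(tags, gold):
                a = w @ one_hot(tag) + v * y_prev + b
                y = 1 if a >= 0 else 0
                if y != t:  # perceptron update only on error
                    w += lr * (t - y) * one_hot(tag)
                    v += lr * (t - y) * y_prev
                    b += lr * (t - y)
                y_prev = float(t)  # teacher forcing: feed the gold label
    return w, v, b

# "the big dog ran in the park" -> chunks: [the big dog], [the park]
data = [
    (["DT", "JJ", "NN", "VB", "IN", "DT", "NN"], [1, 1, 1, 0, 0, 1, 1]),
    (["NN", "VB", "DT", "NN"], [1, 0, 1, 1]),
]
w, v, b = train(data)
print(predict(["DT", "NN", "VB", "DT", "JJ", "NN"], w, v, b))  # -> [1, 1, 0, 1, 1, 1]
```

The recurrent term lets the unit condition on whether the previous token was already inside a chunk, which a plain per-token perceptron cannot do; in the actual assignment the input features would come from the given POS-tagged corpus rather than this toy tag set.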

Contact Us

CFILT Lab
Room Number: 401, 4th Floor, new CC building
Department of Computer Science and Engineering
Indian Institute of Technology Bombay
Mumbai 400076, India