
CS772: Deep Learning for Natural Language Processing


  • The first assignment has been uploaded to the Moodle course page.
  • The first interaction was held on Friday, 8th January 2021, at 8:30 AM.

Course Details

CS772: Deep Learning for Natural Language Processing
Department of Computer Science and Engineering
Indian Institute of Technology Bombay

Time Table and Venue

  • Monday: 9:30 AM to 10:25 AM
  • Tuesday: 10:35 AM to 11:30 AM
  • Thursday: 11:35 AM to 12:30 PM


Deep Learning (DL) is a framework for solving AI problems based on a network of neurons organized in many layers. DL has found heavy use in Natural Language Processing (NLP) too, in problems such as machine translation, sentiment and emotion analysis, question answering and information extraction, improving the performance of automatic systems by an order of magnitude.

The course CS626 (Speech, NLP and the Web), taught in the first semester in the CSE Department at IIT Bombay for the last several years, creates a strong foundation in NLP. It covers the whole NLP stack, from morphology to part-of-speech tagging, parsing, discourse and pragmatics. Students of the course, who typically number more than 100, acquire a grip on the tasks, techniques and linguistics of a plethora of NLP problems.

CS772 (Deep Learning for Natural Language Processing) comes as a natural sequel to CS626. Language tasks are examined through the lens of Deep Learning. Foundations and advancements in Deep Learning are taught in integration with NLP problems. For example, the sequence-to-sequence transformer is covered with its application to machine translation. Similarly, various word embedding techniques are taught with their application to text classification, information extraction, etc.

CS772 is definitely the need of the hour. While CS626 concentrates on the algorithmics and linguistics of NLP, CS772 concentrates on data, distributions, neural models, non-parametric estimation, information coding, representation and related questions.

Course Content

  • Background: History of Neural Nets; History of NLP; Basic Mathematical Machinery - Linear Algebra, Probability, Information Theory etc.; Basic Linguistic Machinery - Phonology, Morphology, Syntax, Semantics
  • Introducing Neural Computation: Perceptrons, Feedforward Neural Network and Backpropagation, Recurrent Neural Nets
  • Difference between Classical Machine Learning and Deep Learning: Representation - Symbolic Representation, Distributed Representation, Compositionality; Parametric and Non-parametric Learning
  • Word Embeddings: Word2Vec (CBOW and Skip-gram), GloVe, FastText
  • Application of Word Embeddings to Shallow Parsing - Morphological Processing, Part of Speech Tagging and Chunking
  • Sequence to Sequence (seq2seq) Transformation using Deep Learning: LSTMs and Variants, Attention, Transformers
  • Deep Neural Net based Language Modeling: XLM, BERT, GPT-2, GPT-3 etc.; Subword Modeling; Transfer Learning and Multilingual Modeling
  • Application of seq2seq in Machine Translation: supervised, semi-supervised and unsupervised MT; encoder-decoder and attention in MT; Memory Networks in MT
  • Deep Learning and Deep Parsing: Recursive Neural Nets; Neural Constituency Parsing; Neural Dependency Parsing
  • Deep Learning and Deep Semantics: Word Embeddings and Word Sense Disambiguation; Semantic Role Labeling with Neural Nets
  • Neural Text Classification; Sentiment and Emotion labelling with Deep Neural Nets (DNN); DNN based Question Answering
  • The indispensability of DNN in Multimodal NLP; Advanced Problems like Sarcasm, Metaphor, Humour and Fake News Detection using multimodality and DNN
  • Natural Language Generation; Extractive and Abstractive Summarization with Neural Nets
  • Explainability


References

  • Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press, 2016.
  • Dan Jurafsky and James H. Martin, Speech and Language Processing, 3rd Edition draft, October 16, 2019.
  • Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola, Dive into Deep Learning, e-book, 2020.
  • Christopher Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
  • Daniel Graupe, Deep Learning Neural Networks: Design and Case Studies, World Scientific Publishing Co., Inc., 2016.
  • Pushpak Bhattacharyya, Machine Translation, CRC Press, 2017.
  • Journals: Computational Linguistics, Natural Language Engineering, Journal of Machine Learning Research (JMLR), Neural Computation, IEEE Transactions on Neural Networks and Learning Systems
  • Conferences: Annual Meeting of the Association for Computational Linguistics (ACL), Neural Information Processing Systems (NeurIPS), Int’l Conf on Machine Learning (ICML), Empirical Methods in NLP (EMNLP).


Prerequisites

Data Structures and Algorithms; programming skills in Python (or a similar language)

Course Instructors

Teaching Assistants

Lecture Slides

Lecture Topics Readings and useful links
Lecture 1
  • Introduction
  • Motivation
Lecture 2, 3 & 4
(Week of 11th January)
  • Neuron I-O Functions
  • Huggingface
Lecture 5, 6 & 7
(Week of 18th January)
  • Neural Network Language Model (NNLM), continued
  • Project suggestions
Lecture 8
(Week of 25th January)
  • Neural Language Model
Lecture 9 (Guest Lecture)
  • Learning Representations in NLP
Lecture 10
(Week of 1st February)
  • Backpropagation
Lecture 11 (Guest Lecture)
  • Recurrent Neural Networks
Lecture 12 (Guest Lecture)
  • Machine Translation & Sequence-2-Sequence Tasks
Lecture 13 (Guest Lecture)
  • Machine Translation & Sequence-2-Sequence Tasks
Lecture 14
(Week of 8th February)
  • Backpropagation and associated concepts
Lecture 15 & 17
(Week of 15th February)
  • Backpropagation (BP) and start of Convolutional Neural Networks (CNN)
Lecture 16 (Guest Lecture)
  • Convolutional Neural Networks For NLP
Lecture 18
(Week of 22nd February)
  • Convolutional Neural Nets
Lecture 19
(Week of 1st March, 2021)
  • Backpropagation through softmax, ReLU and cross-entropy (CE); start of Transformer
Lecture 20, 21 & 22
(Week of 8th March, 2021)
  • Attention
Lecture 23, 24 & 25
(Week of 15th March, 2021)
  • Attention & Transformer
Lecture 26
(Week of 22nd March, 2021)
  • Learning to Align and Translate Jointly
Lecture 27 (Guest Lecture)
  • Transformer applications
Lecture 28 & 29
(Week of 29th March, 2021)
  • Deep Learning for Sentiment, Emotion and Dialogue Analysis
Lecture 30, 31 & 32
(Week of 5th April, 2021)
  • Deep Learning for Dialogue Processing, Sarcasm and Politeness in Dialogues; Start of Evaluation
Week of 12th April, 2021
  • Evaluation
Week of 19th April, 2021
  • Evaluation continued; use of Deep Learning in MT evaluation; Hypothesis Testing

Lecture videos

Lecture videos are regularly uploaded to MS Teams.


Contact Us

Room Number: 401, 4th Floor, new CC building
Department of Computer Science and Engineering
Indian Institute of Technology Bombay
Mumbai 400076, India