CS772: Deep Learning for Natural Language Processing

Announcement

Join the course on MS Teams using the code 8ht1h3m.

Course Details

CS772: Deep Learning for Natural Language Processing
Department of Computer Science and Engineering
Indian Institute of Technology Bombay

Time Table and Venue

Monday: 8:30 AM to 9:25 AM
Tuesday: 9:30 AM to 10:25 AM
Thursday: 10:30 AM to 11:25 AM

Motivation

Deep Learning (DL) is a framework for solving AI problems based on a network of neurons organized in many layers. DL has found heavy use in Natural Language Processing (NLP) as well, in problems such as machine translation, sentiment and emotion analysis, question answering and information extraction, improving the performance of automatic systems by an order of magnitude.

The course CS626 (Speech, NLP and the Web), taught in the first semester in the CSE Department at IIT Bombay for the last several years, creates a strong foundation in NLP, covering the whole NLP stack from morphology to part-of-speech tagging, parsing, discourse and pragmatics. Students of the course, who typically number more than 100, acquire a grip on the tasks, techniques and linguistics of a plethora of NLP problems.

CS772 (Deep Learning for Natural Language Processing) comes as a natural sequel to CS626. Language tasks are examined through the lens of Deep Learning. Foundations and advancements in Deep Learning are taught, integrated with NLP problems. For example, the sequence-to-sequence transformer is covered with its application to machine translation. Similarly, various word embedding techniques are taught with application to text classification, information extraction, etc.

CS772 is definitely the need of the hour. While CS626 concentrates on the algorithmics and linguistics of NLP, this course concentrates on data, distributions, neural models, non-parametric estimation, information coding, representation and related questions.

Course Content

Background: History of Neural Nets; History of NLP; Basic Mathematical Machinery- Linear Algebra, Probability, Information Theory etc.; Basic Linguistic Machinery- Phonology, morphology, syntax, semantics
Introducing Neural Computation: Perceptrons, Feedforward Neural Network and Backpropagation, Recurrent Neural Nets
Difference between Classical Machine Learning and Deep Learning: Representation- Symbolic Representation, Distributed Representation, Compositionality; Parametric and non-parametric learning
Word Embeddings: Word2Vec (CBOW and Skip-gram), GloVe, FastText
Application of Word Embedding to Shallow Parsing- Morphological Processing, Part of Speech Tagging and Chunking
Sequence to Sequence (seq2seq) Transformation using Deep Learning: LSTMs and Variants, Attention, Transformers
Deep Neural Net based Language Modeling: XLM, BERT, GPT-2, GPT-3, etc.; Subword Modeling; Transfer Learning and Multilingual Modeling
Application of seq2seq in Machine Translation: supervised, semi supervised and unsupervised MT; encoder-decoder and attention in MT; Memory Networks in MT
Deep Learning and Deep Parsing: Recursive Neural Nets; Neural Constituency Parsing; Neural Dependency Parsing
Deep Learning and Deep Semantics: Word Embeddings and Word Sense Disambiguation; Semantic Role Labeling with Neural Nets
Neural Text Classification; Sentiment and Emotion labelling with Deep Neural Nets (DNN); DNN based Question Answering
The indispensability of DNN in Multimodal NLP; Advanced Problems like Sarcasm, Metaphor, Humour and Fake News Detection using multimodality and DNN
Natural Language Generation; Extractive and Abstractive Summarization with Neural Nets
Explainability

References

Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning, MIT Press, 2016.
Dan Jurafsky and James Martin, Speech and Language Processing, 3rd Edition, October 16, 2019.
Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola, Dive into Deep Learning, e-book, 2020.
Christopher Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
Daniel Graupe, Deep Learning Neural Networks: Design and Case Studies, World Scientific Publishing Co., Inc., 2016.
Pushpak Bhattacharyya, Machine Translation, CRC Press, 2017.

Journals: Computational Linguistics, Natural Language Engineering, Journal of Machine Learning Research (JMLR), Neural Computation, IEEE Transactions on Neural Networks and Learning Systems
Conferences: Annual Meeting of the Association for Computational Linguistics (ACL), Neural Information Processing Systems (NeurIPS), Int’l Conf on Machine Learning (ICML), Empirical Methods in NLP (EMNLP).

Pre-requisites

Data Structures and Algorithms; programming skills in Python (or a similar language)

Lecture Slides

Lecture	Topics	Readings and useful links
Week 1 (Week of 3rd January)	Introduction and Motivation, Applications of NLP	Week 1 Lecture
Week 2 (Week of 10th January)	Neural POS Tagging, Neural Language Models	Week 2 Lecture
Week 3 (Week of 17th January)	Skip-gram, Perceptron	Week 3 Lecture
Week 4 (Week of 24th January)	Gradient Descent, Backpropagation	Week 4 Lecture
Week 5 (Week of 31st January)	Word2Vec, Feedforward NN, Backpropagation	Week 5 Lecture
Week 6 (Week of 7th February)	Cross Entropy Loss, Softmax, ReLU, RNN	Week 6 Lecture
Week 7 (Week of 14th February)	BPTT, Hopfield net, LSTM	Week 7 Lecture
Week 9 (Week of 28th February)	RNN, Sequence Labelling, Language Modeling, LSTMs, Encoder-Decoder Models, Machine Translation, NMT, Attention	Week 9 Lecture
Week 10 (Week of 7th March)	CNN	Week 10 Lecture
Week 11 (Week of 14th March)	CNN, Eye Tracking, Sarcasm	Week 11 Lecture
Week 12 (Week of 21st March)	Attention and Transformers	Week 12 Lecture
Week 13 (Week of 28th March)	Attention and Transformers contd	Week 13 Lecture
Week 14 (Week of 4th April)	NLP Applications of Attention and Transformer	Week 14 Lecture
Week 15 (Week of 11th April)	Evaluation	Week 15 Lecture

Lecture videos

Lecture videos are regularly uploaded to MS Teams. They are also available on Google Drive.

Assignments

Date	Assignment#	Topic	Deadline	Link
18/01/2022	Assignment1	Skip-gram Implementation	No Deadline	Assignment1
