Computation for Indian Language Technology

Computation for Indian Language Technology (CFILT) was set up in 2000 at the Department of Computer Science and Engineering, IIT Bombay, with a generous grant from the Department of Information Technology (DIT), Ministry of Communication and Information Technology, Government of India. Prior to this, the Natural Language Processing (NLP) activity of the CSE Department, IIT Bombay, took off in 1996 with a grant from the United Nations University, Tokyo, to create a multilingual information exchange system for the web. That project, called Universal Networking Language (UNL; www.undl.org), involved 15 research groups across continents.

At any point in time, about 30 research members work in CFILT, including PhD, master's and bachelor's students, faculty members, linguists and lexicographers.

Deep semantics and multilinguality have played a pivotal role in the activities of CFILT throughout. The stress on semantics has led to research on the following fronts:

  • Lexical Resources: Multilingual wordnets and ontologies and their linking
  • Lexical and Structural Disambiguation: Resolving word and attachment ambiguities
  • Shallow Parsing: Identifying correct parts of speech, named entities and non-recursive noun phrases for Marathi and Hindi
  • Cross Lingual Information Retrieval: Indian language query to English and Hindi Retrieval
  • Machine Translation: Automatic translation involving Marathi, Hindi and English
  • Text Entailment: Testing if a piece of text (hypothesis) is inferable from another (text)
  • Sentiment Analysis: Detecting the polarity (positive, negative or neutral) of a given document, especially reviews
  • Cognitive NLP: Study of cognitive aspects of language processing and understanding using eye-tracking

Wordnets

Hindi WordNet

Browse the Hindi Wordnet through this interface. Inspired by the English WordNet, it is more than a conventional Hindi dictionary.

Marathi WordNet

Click here to browse the Marathi WordNet. It gives different relations between synsets (synonym sets), each of which represents a unique concept.

Sanskrit WordNet

Browse the Sanskrit Wordnet through this interface. Inspired by the English WordNet, it is more than a conventional Sanskrit dictionary.

Indo WordNet

Browse the Indo Wordnet through this interface. It is a linked lexical knowledge base of wordnets of 18 scheduled languages of India, viz., Assamese, Bangla, Bodo, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Malayalam, Meitei (Manipuri), Marathi, Nepali, Oriya, Punjabi, Sanskrit, Tamil, Telugu and Urdu.

Shabdamitra

Hindi Shabdamitra

हिंदी शब्दमित्र (Hindi Shabdamitra) is an e-learning product for teaching and learning Hindi. It uses the Hindi Wordnet as a resource, augments it with audio-visual features and grammatical properties, and presents it in a learner-friendly layered format.

Marathi Shabdamitra

मराठी शब्दमित्र (Marathi Shabdamitra) is an e-learning product for teaching and learning Marathi. It uses the Marathi Wordnet as a resource, augments it with audio-visual features and grammatical properties, and presents it in a learner-friendly layered format.

Sanskrit Shabdamitra

संस्कृत शब्दमित्र (Sanskrit Shabdamitra) is an e-learning product for teaching and learning Sanskrit. It uses the Sanskrit Wordnet as a resource, augments it with audio-visual features and grammatical properties, and presents it in a learner-friendly layered format.

CFILT Heads

Prof. Pushpak Bhattacharyya

Coordinator

Prof. Malhar Kulkarni

Investigator

Prof. Ganesh Ramakrishnan

Investigator

Prof. Preethi Jyothi

Investigator
