Computation for Indian Language Technology (CFILT) was set up in 2000 at the Department of Computer Science and Engineering, IIT Bombay, with a generous grant from the Department of Information Technology (DIT), Ministry of Communication and Information Technology, Government of India. Prior to this, the Natural Language Processing (NLP) activity of the CSE Department, IIT Bombay took off in 1996 with a grant from the United Nations University, Tokyo to create a multilingual information exchange system for the web. The project, called Universal Networking Language (UNL; www.undl.org), involved 15 research groups across continents.
At any given time, about 30 researchers work at CFILT, including PhD, master's and bachelor's students, faculty members, linguists and lexicographers.
Deep semantics and multilinguality have played a pivotal role in the activities of CFILT throughout. The stress on semantics has led to research on the following fronts:
- Lexical Resources: Multilingual wordnets and ontologies and their linking
- Lexical and Structural Disambiguation: Resolving word and attachment ambiguities
- Shallow Parsing: Identifying correct parts of speech, named entities and non-recursive noun phrases for Marathi and Hindi
- Cross-Lingual Information Retrieval: Retrieval of English and Hindi documents from Indian-language queries
- Machine Translation: Automatic translation involving Marathi, Hindi and English
- Text Entailment: Testing whether a piece of text (the hypothesis) can be inferred from another (the text)
- Sentiment Analysis: Detecting the polarity (positive, negative or neutral) of a given document, especially reviews
- Cognitive NLP: Study of cognitive aspects of language processing and understanding using eye-tracking
हिंदी शब्दमित्र (Hindi Shabdamitra) is an e-learning product for teaching and learning the Hindi language. It uses Hindi Wordnet as a resource, which is further augmented with audio-visual features and grammatical properties and presented in a learner-friendly layered format.