Questionnaire
Section I
(to be filled by the Organisations)
1. Organization Details:
Name: I.I.T Bombay
Address: Powai, Mumbai – 400 076
Telephone: 5722545
Fax: 5720290
e-mail:
URL Address of website: http://www.iitb.ac.in
2. Contact Person: Prof. Pushpak Bhattacharya
Telephone: 5767718
Fax: 5720290
e-mail: pb@cse.iitb.ac.in
Chief/Head of R&D: Prof. Pushpak Bhattacharya
Telephone: 5767718
Fax: 5720290
e-mail: pb@cse.iitb.ac.in
3. Indian Language Tools :
(Please furnish following information
separately for each tool along with
the copy of published brochure)
Name: iLeap
Nature (h/w or s/w): Software
Minimum Platform Requirement: Windows platform.
(h/w and Operating System)
Languages supported: Assamese, Bengali, English,
Gujarati, Hindi, Kannada,
Malayalam, Marathi, Oriya,
Punjabi, Sanskrit, Tamil, Telugu.
Fonts supported for each language: ---
Bought-out Fonts/Developed in-house: Developed in-house.
Functionality:
Is it web-enabled? Yes
Keyboard Lay-outs: Phonetic, Typewriter, Inscript
Coding scheme: ISCII
(ISCII/ Unicode/ Proprietary)
Converter modules for above: Converters available for
inter-conversion between ISCII
and Unicode.
Product evolution cycle: Detailed information not available
Portability/ expandability: Cannot be used with other similar
products available in the market.
(Inter-operability with other similar products
available in the market)
Date of launch: 1995
No. of copies sold so far: Not revealed
Developed in-house/
Contracted to other agencies: Developed in-house.
Development Efforts
in Man-Hours: Inadequate information.
Any Technology Upgradation plans: ---
Name: Akruti
Nature (h/w or s/w): Software
Minimum Platform Requirement: Windows platform.
(h/w and Operating System)
Languages supported: Assamese, Bengali, English,
Gujarati, Hindi, Kannada,
Malayalam, Marathi, Oriya,
Punjabi, Sanskrit, Tamil, Telugu.
Fonts supported for each language: ---
Bought-out Fonts/Developed in-house: Developed in-house.
Functionality:
Is it web-enabled? No
Keyboard Lay-outs: Phonetic, Typewriter, Inscript
Coding scheme: Proprietary
(ISCII/ Unicode/ Proprietary)
Converter modules for above: Available for offline
inter-conversion between
ISCII and Akruti DBF.
Product evolution cycle: Detailed information not available
Portability/ expandability: Cannot be used with other similar
products available in the market.
(Inter-operability with other similar products
available in the market)
Date of launch: 15-8-1999
No. of copies sold so far: Not revealed
Developed in-house/
Contracted to other agencies: Developed in-house.
Development Efforts
in Man-Hours: Inadequate information.
Any Technology Upgradation plans: ---
Name: Windows 2000
(Indian Language Support)
Nature (h/w or s/w): Software
Minimum Platform Requirement: IBM-PC Compatibles.
(h/w and Operating System)
Languages supported: Assamese, Bengali, English,
Gujarati, Hindi, Kannada,
Malayalam, Marathi, Oriya,
Punjabi, Sanskrit, Tamil, Telugu.
Fonts supported for each language: ---
Bought-out Fonts/Developed in-house: Developed in-house.
Functionality:
Is it web-enabled? Yes
Keyboard Lay-outs: Phonetic, Typewriter, Inscript
Coding scheme: Unicode
(ISCII/ Unicode/ Proprietary)
Converter modules for above: Converters available from iLeap
for inter-conversion between
ISCII and Unicode.
Product evolution cycle: Detailed information not available
Portability/ expandability: Cannot be used with other similar
products available in the market.
(Inter-operability with other similar products
available in the market)
Date of launch: 2000
No. of copies sold so far: Not revealed
Developed in-house/
Contracted to other agencies: Contracted to NCST, Mumbai.
Development Efforts
in Man-Hours: Inadequate information.
Any Technology Upgradation plans: ---
4. Products under development: Hindi Wordnet,
Marathi portal complete with
search engine,
Machine translation software
along with dictionary,
Online textbooks for schools
in Marathi,
Text to speech converter for Marathi.
For details see below.
Name and functionality: As above.
Stage of completion: Machine translation software
reasonably developed;
other activities started about
6 months ago.
Plans for commercialization:
Local industries will be contacted.
The Market assessment: Local industries are also working on
Marathi portals.
(Assessment of the likely competition from
Microsoft’s Indian language products and other vendors)
5. Technical Capabilities Developed for processing of
Indian languages:
Tools development: Hindi Wordnet,
Marathi portal complete with
search engine,
Machine translation software
along with dictionary,
Online textbooks for schools
in Marathi,
Text to speech converter for Marathi.
Fonts Development: Nil.
Web enabled applications: Marathi portal complete with
search engine, Online textbooks for
schools in Marathi.
Multilingual and Multimedia content creation:
Speech Technology for Marathi.
Capturing the Ancient heritage into knowledge based system:
Development of Marathi text corpus
in electronic form (including a Marathi
dictionary, and 10 Marathi classics)
Sorting and searching: ---
Search engines: Efficient and linguistic search engines
Optical Character recognition: ---
Text to speech systems: Speech Technology for Marathi.
Voice recognition: ---
Machine translation systems: Machine translation between
Marathi on one hand and Hindi and
English on the other.
6. Tools/contents contributed for the public domain:
Marathi portal complete with search engine.
Machine translation software along with dictionary.
Online textbooks for schools in Marathi.
Hindi WordNet.
7. Limitations with regard to technology development/
growth of market (Indicative list of parameters for
providing inputs):
- Standards (for coding and keyboard Layouts, ISCII/Unicode):
It is difficult to decide on the appropriate standard
for representing Indian language text.
This is mainly because there is no apparent consensus
on the encoding among the organizations that produce such text,
notably publishing houses, newspapers, etc.
For example: Maharashtra Times and Loksatta use totally
different encodings.
The ministry urgently needs to enforce a
common standard that is accepted throughout
the country.
- Return on investments: ---
- Potential Buyers/User Organizations:
No limitation is envisaged.
- Support from respective State Governments:
For the Marathi portal to be effective, government
organizations should all create their sites and make them
accessible. If not, this will pose a serious limitation.
- Long development Cycles and gestation periods and piracy issues:
The state of the art in Indian language text-to-speech is
still primitive. Foundation-level research will be
required before proceeding with the actual implementation.
- Intricacies with regard to technology application to multiple
Indian Languages:
The Hindi WordNet development is both linguistically and
computationally a highly challenging task. In particular
it is extremely difficult to find people combining both
linguistic finesse and computational ability.
Relevancy calculation for indexed documents
(in the context of Marathi portal) is also a technologically
challenging job.
Finally, the translation software requires an enormous amount of
lexical knowledge.
- Competition from other companies and multi-nationals and so on:
A few local industries are competitors as far as
Marathi portal development is concerned.