Monday, February 25, 2002

Indian languages get their due
V. P. Prabhakar

A number of initiatives have been taken under the Technology Development for Indian Languages (TDIL) programme of the Ministry of Information Technology to develop Indian language processing tools, human-machine interface systems, translation support systems, corpora and lexical resources.

To promote information processing in Indian languages, a project has been taken up at 13 educational and research and development institutions across the country.

The core objectives of these resource centres are to act as a repository of all knowledge tools and products concerned with computer processing of Indian languages and bring out yearly resource documents; to develop methodologies for seamless integration of language processing tools with existing and evolving software environments; to network with centres concerned with processing of Indian languages and with potential user agencies; to create content and databases on the resources available in Indian languages; and to put at least 10 well-recognised books (related to Indian heritage) in Indian languages on the Web. They will also work with local newspapers to make them available online.


These resource centres are also to create awareness and organise training programmes for agencies and personnel connected with the deployment of Indian language processing systems; to facilitate language technology research in machine-aided translation, optical character recognition, text-to-speech and speech recognition for Hindi; to organise IT localisation clinics that provide small businesses with consultancy on the use of Indian language tools in developing IT solutions; and to take up development of requisite niche technologies.

According to the report of the ministry, the TDIL has made significant achievements:

Computer courseware in Hindi: DOEACC "O" level courseware (five modules) in machine-readable form has been developed in Hindi in collaboration with the DOEACC. Two modules, viz. information technology and PC software, have been published in book form.

Sanskrit authoring system: A Sanskrit authoring system has been developed that can handle special Sanskrit conjuncts. It also allows word processing in Sanskrit and provides search/sort algorithms, transliteration facility and word split programmes for sandhi and samasa.

Multilingual content creation: Electronic dictionaries, viz. Bharat Bhasha Kosh, SAARC countries language dictionary and UN selected languages dictionary and a Health Vishwa Kosh are being Web-enabled.

Localisation of Linux: Development is being carried out for localising suitable components within the Linux operating system to enable applications to create, edit and display content in Hindi.

Text-speech synthesis system: The Alpha version of the text-speech synthesis system is being ported to the Windows platform and the speech quality is also being improved by incorporating the rules for prosody.

TDIL Website: The TDIL programme has developed its Website (http://vishwabharat.tdil.gov.in). It provides information on TDIL activities as well as free downloadable software like iLEAP, Akshar for Windows, Indian language keyboard driver and font, Samadhan Seva and Gyan Nidhi Seva.

According to the Ministry of Information Technology, the 13 resource centres for Indian languages technology solutions, covering all constitutional languages, are: The Thapar Institute of Engineering and Technology, Patiala, for Punjabi; the Indian Institute of Technology, Kanpur, for Hindi and Nepali; the Indian Institute of Technology, Mumbai, for Marathi and Konkani; the Indian Institute of Technology, Guwahati, for Assamese and Manipuri; the Indian Institute of Science, Bangalore, for Kannada, Sanskrit and cognitive models; the Indian Statistical Institute, Calcutta, for Bengali; Jawaharlal Nehru University, New Delhi, for foreign languages Japanese and Chinese and Sanskrit language learning systems; the University of Hyderabad, Hyderabad, for Telugu; Anna University, Chennai, for Tamil; MS University, Baroda, for Gujarati; Utkal University and Orissa Computer Application Centre for Oriya; ER & DC, Thiruvananthapuram, for Malayalam, and C-DAC, Pune, for Urdu, Sindhi and Kashmiri.