Machine translation (MT) is the automatic translation of human language by computers. For instance, an English --> Hindi MT system translates English (the source language) into Hindi (the target language). With the advent of the Internet and the World Wide Web, and with ever-expanding international communication and commerce, there is an increasing need for quick and inexpensive translation. New Web pages are created daily in tremendous numbers, and many Web page authors would like their material to be readable immediately all around the globe. Likewise, there is a need for fast e-mail communication between speakers of different languages. Human translation alone cannot keep up with this volume.
Importance of MT
The results of MT research could affect major aspects of life, including politics, culture, science, philosophy, and business. If MT becomes accurate and efficient enough, it can break down cultural barriers and make communication between speakers of different languages much easier. Commercially, MT can allow companies to translate product manuals or other addenda more quickly into the target language or languages. Thus, MT systems can expand a company’s market while saving translators time and companies money. Scientifically and philosophically, the results from MT can be applied to areas such as Artificial Intelligence, Linguistics, and the philosophy of language.
Different Approaches to MT
There are three different approaches to MT: direct, transfer, and interlingua.
Direct: The direct method was the strategy adopted by most early MT systems. It is the most primitive of the three approaches and uses a one-stage process in which words in the source language are replaced with words in the target language. Some systems augment direct translation with a small set of rules to deal with simple grammatical differences between languages. The direct method is also known as the transformer method.
Transfer: The transfer method first parses the source-language sentence. It then applies rules that map the grammatical segments of the source sentence to a representation in the target language. Because the sentence is parsed, syntactic ambiguities are handled much better than in the direct approach, and lexical ambiguities can sometimes be resolved as well.
Interlingua: The main idea behind the interlingua method is that the analysis of any source language should result in a language-independent representation. The target language is then generated from that language-neutral representation. In a pure interlingua system there are no transfer rules, because the representations should be common to all languages used by the system. The interlingual approach requires an analyzer for each source language and a generator for each target language. Analyzing the source text calls for deep semantic analysis, which in turn requires extensive world knowledge.
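The contrast between the three approaches can be sketched on a single toy sentence. The following Python code is only an illustration under strong assumptions: the three-word lexicon, the romanized Hindi spellings, the "parser" that accepts only subject-verb-object sentences of exactly three words, and the frame format used as the interlingua are all hypothetical and are not taken from any real MT system.

```python
# Toy English -> romanized-Hindi lexicon (illustrative assumption, not a real resource).
LEXICON = {"ram": "Ram", "eats": "khata hai", "mango": "aam"}


def lookup(word):
    """Return the target word, or the source word unchanged if it is unknown."""
    return LEXICON.get(word.lower(), word)


def direct_translate(sentence):
    """Direct method: substitute word by word, keeping the source word order."""
    return " ".join(lookup(w) for w in sentence.split())


def parse_svo(sentence):
    """Toy parser: assumes exactly three words in subject-verb-object order."""
    subject, verb, obj = sentence.split()
    return {"subject": subject, "verb": verb, "object": obj}


def transfer_translate(sentence):
    """Transfer method: parse, apply an SVO -> SOV rule, then substitute words."""
    tree = parse_svo(sentence)
    target_order = [tree["subject"], tree["object"], tree["verb"]]
    return " ".join(lookup(w) for w in target_order)


# Hypothetical language-neutral concepts for the interlingua sketch.
ENGLISH_CONCEPTS = {"ram": "RAM", "eats": "EAT", "mango": "MANGO"}
HINDI_WORDS = {"RAM": "Ram", "EAT": "khata hai", "MANGO": "aam"}


def analyze_english(sentence):
    """Analyzer: map an English sentence to a language-neutral frame."""
    tree = parse_svo(sentence)
    return {
        "agent": ENGLISH_CONCEPTS[tree["subject"].lower()],
        "action": ENGLISH_CONCEPTS[tree["verb"].lower()],
        "patient": ENGLISH_CONCEPTS[tree["object"].lower()],
    }


def generate_hindi(frame):
    """Generator: realize the frame in Hindi order (agent, patient, action)."""
    return " ".join(HINDI_WORDS[frame[role]] for role in ("agent", "patient", "action"))


if __name__ == "__main__":
    sentence = "Ram eats mango"
    print(direct_translate(sentence))                 # Ram khata hai aam
    print(transfer_translate(sentence))               # Ram aam khata hai
    print(generate_hindi(analyze_english(sentence)))  # Ram aam khata hai
```

In this sketch the direct method leaves the verb in its English position, while the transfer rule and the interlingua generator both place it at the end of the sentence, as Hindi word order expects. Note that the interlingua generator never sees the English sentence, only the frame, so adding another target language would mean writing one more generator rather than a new set of transfer rules.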
Resources for MT
Dictionary
Corpus
Knowledge base