Multilingual Enablement of an Existing Application
A crucial component in the multilingual enablement of an existing application is a "Transnizer", a tightly integrated speech recognition and translation system. A prototype system has been developed for Spanish-to-English translation in the context of "How May I Help You?" (HMIHY), a customer care application. The bilingual translation system enables a speaker to converse naturally in Spanish using the existing English-language semantic and dialog components of the HMIHY prototype system. We are working to extend the same paradigm to more languages and other speech-enabled services.
A transnizer is a stochastic finite-state transducer that integrates the language model of a speech recognizer and the translation model of a speech translator into a single finite-state transducer. A transnizer thus maps source-language phones directly into target-language word sequences. It can be used in place of the language model of a speech recognizer to obtain a speech translation system. Speech translation is achieved in one step, in contrast to the more popular two-step approach, in which a translation component acts as a backend that translates the output of a speech recognizer.
The challenge in building a transnizer is to model, using finite-state transducers, the two subproblems of language translation: (a) lexical selection, that is, selecting the appropriate target word or phrase for a given source-language word or phrase; and (b) lexical reordering, that is, rearranging the selected target words or phrases into a well-formed target-language utterance. The papers below discuss the construction of these finite-state transducers.
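The two subproblems can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the finite-state construction described in the papers: the phrase table, the bigram scores, the smoothing floor, and all function names are hypothetical, and reordering is done by brute-force search over permutations rather than by a transducer.

```python
from itertools import permutations
from math import prod

# Hypothetical toy phrase table: source word -> {target phrase: probability}.
PHRASE_TABLE = {
    "necesito": {"i need": 0.9, "i require": 0.1},
    "una": {"a": 0.8, "one": 0.2},
    "tarjeta": {"card": 1.0},
    "nueva": {"new": 1.0},
}

# Hypothetical toy bigram scores for the target language.
BIGRAM = {
    ("<s>", "i"): 0.6, ("i", "need"): 0.5, ("need", "a"): 0.4,
    ("a", "new"): 0.3, ("new", "card"): 0.4, ("a", "card"): 0.2,
    ("card", "</s>"): 0.5,
}
FLOOR = 1e-3  # assumed score for bigrams missing from the toy table


def select(source_words):
    """Lexical selection: pick the most probable target phrase per source word."""
    return [max(PHRASE_TABLE[w], key=PHRASE_TABLE[w].get) for w in source_words]


def fluency(words):
    """Score a target word sequence as a product of bigram scores."""
    padded = ["<s>", *words, "</s>"]
    return prod(BIGRAM.get(pair, FLOOR) for pair in zip(padded, padded[1:]))


def reorder(phrases):
    """Lexical reordering: keep the best-scoring permutation of the phrases."""
    candidates = (" ".join(p).split() for p in permutations(phrases))
    return max(candidates, key=fluency)


def translate(sentence):
    """Apply selection, then reordering, to a whitespace-tokenized sentence."""
    return " ".join(reorder(select(sentence.split())))


print(translate("necesito una tarjeta nueva"))
```

Selection alone yields the phrases in source order ("i need", "a", "card", "new"); the reordering step is what moves the adjective in front of the noun. Exhaustive permutation search is only workable for very short inputs, which is one reason a compact finite-state model of reordering is attractive.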
Rapid Development of a Machine Translation System
Present-day machine translation systems have been built over a period of decades. These systems involve the manual creation of rules, which is both tedious and time-consuming. In recent times, corpus-based statistical translation has emerged as an alternative paradigm for building machine translation systems. In this paradigm, rules for translation are automatically learned from a corpus consisting of source- and target-language sentence pairs (a parallel corpus). Over the past few years, translation research at AT&T has focused on building different models in the statistical translation paradigm. We have demonstrated translation models for English-Spanish and English-Japanese and have shown them to perform well for spontaneous speech in domain-specific applications, such as "How May I Help You?".
Although the statistical translation paradigm has significantly reduced
the time to build a machine translation system, it relies heavily
on the availability of a parallel corpus. In project Anuvvad, we address
the issue of inducing and deriving parallel corpora with a view to further
decrease the time to build a machine translation system. We induce multiple
instances of parallel corpora from a monolingual corpus, by using off-the-shelf
machine translation systems. Each instance of the parallel corpus is viewed
as a candidate corpus translation. We then use algorithms to combine these
translation hypotheses into the "consensus" translation. We have investigated
various techniques that arrive at consensus translations using multi-sequence
alignment of the hypothesis translations. We have shown that our techniques
outperform off-the-shelf translation engines on two applications, a conference
registration system and a multilingual instant-messenger system (video
below).
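The idea of a consensus translation can be sketched in Python as follows. This is a simplified illustration, not the project's algorithm: it assumes the first hypothesis serves as a backbone, aligns each other hypothesis to it pairwise with the standard library's `difflib.SequenceMatcher` (a stand-in for true multi-sequence alignment), and takes a majority vote per backbone position. The function names and example sentences are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def align_to_backbone(backbone, hypothesis):
    """For each backbone position, return the aligned hypothesis word or None."""
    aligned = [None] * len(backbone)
    matcher = SequenceMatcher(a=backbone, b=hypothesis, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep matches and equal-length substitutions; this simplified
        # sketch ignores insertions, deletions, and uneven replacements.
        if tag == "equal" or (tag == "replace" and i2 - i1 == j2 - j1):
            for offset in range(i2 - i1):
                aligned[i1 + offset] = hypothesis[j1 + offset]
    return aligned


def consensus(hypotheses):
    """Majority vote per backbone position; the backbone word wins ties."""
    backbone = hypotheses[0].split()
    columns = [[word] for word in backbone]
    for hyp in hypotheses[1:]:
        for column, word in zip(columns, align_to_backbone(backbone, hyp.split())):
            if word is not None:
                column.append(word)
    return " ".join(Counter(col).most_common(1)[0][0] for col in columns)


# Hypothetical outputs of three translation engines for the same source sentence.
HYPOTHESES = [
    "i want to register for the meeting",
    "i want to register for the conference",
    "i wish to register for the conference",
]
print(consensus(HYPOTHESES))
```

Because the vote is taken over backbone positions only, the result always has the backbone's length; a fuller treatment would also let hypotheses vote on inserted and deleted words.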
Long Version (about 5 minutes)
Short Version (about 2 minutes)
El Hubbub (about 3 minutes)
Find out about Mandolin
Real-time Broadcast News Machine Translation
Older version
Real-time Broadcast News Machine Translation (about 12.5 minutes)