Spectra - Speech-to-speech translation
More and more devices talk back to you these days. Where once it was mostly telephone dialog systems that spoke to you (and understood you), an increasing number of personal devices--laptops, smartphones, GPS navigation systems, and game devices--talk to you, too.
How do they do that? What makes devices able to talk?
Text-to-speech (TTS) is the technology that makes it possible. A TTS application takes text and produces artificial, machine-made speech from it. TTS has been around for many years, though only in the past few years has synthesized speech reached a high level of naturalness. Better-sounding speech, combined with the explosive popularity of small mobile devices with even smaller screens, has increased consumer demand for TTS, especially since it frees people to multitask and to drive more safely while using their devices.
People with special needs also benefit from TTS. For people with low vision, TTS reads text from files, books, and websites, making information accessible. For people who can't speak, TTS gives them a voice to speak with. Stephen Hawking is a famous example (he prefers his own instantly recognizable version of TTS). Students learning a new language can improve pronunciation or listening skills with TTS.
Corporations also like TTS because the technology can be a way to provide information effectively over the telephone.
Natural Voices is AT&T's state-of-the-art TTS product. It starts with a database of high-quality recorded speech produced under optimum conditions with high-quality recording equipment. The individual sounds in the speech (called phonemes) are carefully labeled so that when a new word or sentence is required, the algorithms can select the best set of sounds to retrieve from the database and join them together to be spoken. Knowing how to do this effectively is hard, and much of our research is devoted to improving these algorithms to achieve even more natural-sounding TTS in the future.
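To give a feel for what "selecting the best set of sounds" can mean, here is a minimal sketch of one common way to frame the problem, usually called unit selection. It is not the Natural Voices implementation. The Unit record, the single pitch feature, and both cost functions are assumptions made up for illustration; a real system would use far richer features and much larger databases. The idea is to pick, for each phoneme needed, the recorded sound that both matches what is wanted (a target cost) and joins smoothly to its neighbors (a join cost), using dynamic programming to find the cheapest overall sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One labeled sound in a hypothetical recorded-speech database."""
    phoneme: str   # label assigned to the recorded sound
    pitch: float   # assumed feature: average pitch in Hz
    unit_id: int   # position of the sound in the original recording


def target_cost(unit, desired_pitch):
    # Assumed cost: how far the recorded sound is from the pitch we want.
    return abs(unit.pitch - desired_pitch)


def join_cost(prev, nxt):
    # Sounds that were adjacent in the original recording join seamlessly.
    if nxt.unit_id == prev.unit_id + 1:
        return 0.0
    # Otherwise, assume a pitch jump at the join is audible and penalize it.
    return abs(prev.pitch - nxt.pitch)


def select_units(targets, database):
    """Pick one unit per (phoneme, desired_pitch) target, minimizing
    the sum of target costs and join costs over the whole sequence."""
    if not targets:
        return []

    # Gather every recorded candidate for each requested phoneme.
    candidates = []
    for phoneme, _ in targets:
        found = [u for u in database if u.phoneme == phoneme]
        if not found:
            raise ValueError(f"no recorded unit for phoneme {phoneme!r}")
        candidates.append(found)

    # costs[k] = cheapest total cost of any path ending at candidate k.
    costs = [target_cost(u, targets[0][1]) for u in candidates[0]]
    back = []  # back[i - 1][k] = best predecessor of candidate k at step i

    for i in range(1, len(targets)):
        prev_cands = candidates[i - 1]
        new_costs, pointers = [], []
        for u in candidates[i]:
            best_k = min(
                range(len(prev_cands)),
                key=lambda k: costs[k] + join_cost(prev_cands[k], u),
            )
            total = (costs[best_k]
                     + join_cost(prev_cands[best_k], u)
                     + target_cost(u, targets[i][1]))
            new_costs.append(total)
            pointers.append(best_k)
        costs = new_costs
        back.append(pointers)

    # Trace the cheapest path back from the last position to the first.
    k = min(range(len(costs)), key=costs.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [candidates[i][k] for i, k in enumerate(path)]
```

Called with a list of (phoneme, desired pitch) pairs and a list of Unit records, select_units returns the chosen recordings in order, ready to be joined. The join cost is what makes the problem hard: the sound that best matches its target in isolation is not always the right choice if it clashes with its neighbors.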
To try AT&T Natural Voices™, go to the demo page and enter text. The words you type are transmitted to an AT&T server running Natural Voices so you can hear your words spoken. Natural Voices supports English (both US and UK versions), German, Spanish, and French.
Read more about the evolution of the Natural Voices technology in the article "Mathematics of . . . Artificial Speech" in Discover magazine.