
AT&T Natural Voices™ Text-to-Speech
More and more devices talk back to you. It used to be that mainly phone dialog systems spoke to you (and understood you); now a growing number of personal devices, including laptops, smartphones, GPS navigation systems, and game devices, talk to you too.
How do they do that? What makes devices able to talk?
Text-to-speech (TTS) is the technology that makes it possible. A TTS application takes text and produces artificial, machine-made speech from it. TTS has been around for many years, though only in the past few years has synthesized speech reached a high level of naturalness. Better-sounding speech, combined with the explosive popularity of small mobile devices with even smaller screens, has increased consumer demand for TTS, especially since it frees people to multitask and to drive more safely while using their devices.
People with special needs also benefit from TTS. For people with low vision, TTS reads text from files, books, and websites, making information accessible. For people who can't speak, TTS gives them a voice to speak with. Stephen Hawking is a famous example (he prefers his own instantly recognizable version of TTS). Students learning a new language can improve pronunciation or listening skills with TTS.
Corporations also like TTS because the technology can be a way to provide information effectively over the telephone.
Natural Voices is AT&T's state-of-the-art TTS product. It starts with a database of high-quality recorded speech produced under optimum conditions with high-quality recording equipment. The individual sounds in the speech (called phonemes) are carefully labeled so that when a new word or sentence is required, the algorithms can select the best set of sounds to retrieve from the database and join them together to be spoken. Knowing how to do this effectively is hard, and much of our research is devoted to improving these algorithms to achieve even more natural-sounding TTS in the future.
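To make the idea of "selecting the best set of sounds" concrete, here is a minimal sketch of one common way to frame the problem: pick one recorded unit per phoneme so that the total of a target cost (how well a unit fits what is wanted) and a join cost (how smoothly neighboring units connect) is as small as possible. This is an illustration only, not the Natural Voices algorithm. The `Unit` fields, the pitch-only cost functions, and the name `select_units` are all assumptions made for the example; a real system would use far richer features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One labeled snippet in a hypothetical speech database."""
    phoneme: str        # phoneme label attached to the recording
    unit_id: str        # made-up identifier for the snippet
    pitch_start: float  # pitch in Hz at the start of the snippet
    pitch_end: float    # pitch in Hz at the end of the snippet


def target_cost(unit, desired_pitch):
    # Assumed cost: distance between the unit's mean pitch and the desired pitch.
    return abs((unit.pitch_start + unit.pitch_end) / 2 - desired_pitch)


def join_cost(left, right):
    # Assumed cost: pitch jump at the boundary between two adjacent units.
    return abs(left.pitch_end - right.pitch_start)


def select_units(targets, database):
    """Return the lowest-cost unit sequence for targets.

    targets is a list of (phoneme, desired_pitch) pairs. The search is a
    dynamic program over the candidates for each position.
    """
    if not targets:
        return []

    # Candidate units for each position, matched by phoneme label.
    candidates = []
    for phoneme, _ in targets:
        matches = [u for u in database if u.phoneme == phoneme]
        if not matches:
            raise ValueError(f"no recorded unit for phoneme {phoneme!r}")
        candidates.append(matches)

    # costs[i][j]: cheapest total cost of a path ending in candidates[i][j].
    # back[i][j]: index of the previous candidate on that cheapest path.
    costs = [[target_cost(u, targets[0][1]) for u in candidates[0]]]
    back = [[None] * len(candidates[0])]

    for i in range(1, len(targets)):
        row_costs, row_back = [], []
        for unit in candidates[i]:
            prev_totals = [
                costs[i - 1][k] + join_cost(prev, unit)
                for k, prev in enumerate(candidates[i - 1])
            ]
            best_k = min(range(len(prev_totals)), key=prev_totals.__getitem__)
            row_costs.append(prev_totals[best_k] + target_cost(unit, targets[i][1]))
            row_back.append(best_k)
        costs.append(row_costs)
        back.append(row_back)

    # Trace back from the cheapest final candidate.
    j = min(range(len(costs[-1])), key=costs[-1].__getitem__)
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path
```

Given a small database with two recordings of each phoneme, calling `select_units([("k", 105.0), ("ae", 115.0), ("t", 118.0)], database)` returns one unit per phoneme, favoring recordings whose pitch both matches the request and lines up at the boundaries. The selected snippets would then be concatenated to produce audio, a step this sketch leaves out.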
To try AT&T Natural Voices™, go to the demo page and enter text. The words you type are transmitted to an AT&T server running Natural Voices so you can hear your words spoken. Natural Voices supports English (both US and UK versions), German, Spanish, and French.
Read more about the evolution of the Natural Voices technology in the article "Mathematics of . . . Artificial Speech" in Discover magazine.
Related Projects
AT&T Application Resource Optimizer (ARO) - For energy-efficient apps
CHI Scan (Computer Human Interaction Scan)
CoCITe – Coordinating Changes in Text
E4SS - ECharts for SIP Servlets
Scalable Ad Hoc Wireless Geocast
Graphviz System for Network Visualization
Information Visualization Research - Prototypes and Systems
Swift - Visualization of Communication Services at Scale
StratoSIP: SIP at a Very High Level
Content Augmenting Media (CAM)
Content Acquisition Processing, Monitoring, and Forensics for AT&T Services (CONSENT)
MIRACLE and the Content Analysis Engine (CAE)
Social TV - View and Contribute to Public Opinions about Your Content Live
Enhanced Indexing and Representation with Vision-Based Biometrics
Visual Semantics for Intuitive Mid-Level Representations
eClips - Personalized Content Clip Retrieval and Delivery
iMIRACLE - Content Retrieval on Mobile Devices with Speech
AT&T WATSON (SM) Speech Technologies
Wireless Demand Forecasting, Network Capacity Analysis, and Performance Optimization