Assistive Technology



Introduction

New technological advancements in speech, language, and media are changing the way people interact with devices and with one another. These same technologies have tremendous potential in building assistive technologies that enable users with disabilities to more easily use computers, communicate with others, browse the web, log in to secure websites, and navigate city streets.

Speech technologies are especially adaptable. Programs that convert between text and audio can turn emails and other text sources into audio for users with low vision, and turn audio into text so that users with impaired hearing can read voice mails and broadcast captions. The same speech interfaces that make mobile phones easier to use can be incorporated into almost any device, from laptops to TV remotes to game platforms. This further expands the range of tasks that users with disabilities (including those with limited motion) can perform for themselves, giving them independence, autonomy, and privacy.

AT&T Research is focusing on how best to adapt its speech and language projects for the assistive technology market, estimated at 41 million people in the US alone (US Census Bureau, 2006 American Community Survey). We are building assistive technology prototypes for our core markets (mobility, internet, and entertainment services); see some examples below. We are also pursuing basic research in collaboration with universities, companies, and non-profits that work with people with disabilities. If you are interested in working with us, get in touch with any of the researchers listed on this page!

Making technology work for those with disabilities improves technology for everyone, including the growing population of senior citizens. And by freeing users from staring at a screen or using a keyboard or other input device, assistive technologies enable hands-free control of devices and computers, allowing everyone to more safely multi-task and interact more freely with devices.


Enabling Technologies

Our assistive technology research and prototypes are built on our basic speech, language, and multimedia processing technologies.

Basic Research

Text-to-Speech Synthesis

AT&T researchers, along with the ASA Text-to-Speech (TTS) Technology working group (S3-WG91), are currently collecting data comparing the intelligibility of seven different synthetic speech systems at various speaking rates. The aim is to make TTS more usable by all, including users of mobile devices, children with learning disabilities, and people with visual disabilities (see paper) or hearing impairments.

To participate, click here or, if you are legally blind, here.  A preliminary report of the results of these experiments can be found here.

Prototypes

Mobile

The iWalk prototype gives primarily speech-based access to local business listings and walking directions. iWalk was designed primarily as an assistive technology. Features of iWalk include:

  • Runs on smart phones -- no special-purpose hardware required!
  • Accessibility is built into the graphical user interface, which is high-contrast, has large text, and will read out the text in any screen area if the user hovers over it.
  • Permits speech and text input.
  • Provides text and speech output.
  • Includes the same large listing database as used in Speak4It.

See a demo of iWalk here. You can also read a short paper about iWalk here, or an interview with one of the developers of iWalk here.

Spectra is an iPhone application for interactive speech-to-speech translation. Features of Spectra include:

  • User choice of input and output languages.
  • Speech and text input.
  • Speech and text output of translations.
  • Presentation of output locally, on the user's device, and remotely through the network to a distant conversational partner.

See a demo of Spectra here. Find out more about the underlying technology here. Spectra is available from the iTunes store.

Television and Multimedia

iMIRACLE is a prototype iPad application that lets users search for video content by station, genre, title, or content keywords. Users can browse retrieved content, and can watch it on the iPad or on connected televisions. iMIRACLE could be useful for users with hearing loss or physical dexterity disabilities, who cannot "fast forward" through a TV show to the segment they want to watch.

See a demo of iMIRACLE here. More information about iMIRACLE is available here.

The iRemote prototype is an electronic program guide designed to simplify searching the television guide: it lets users search for TV and movie listings by title, actor, genre, or keyword, and integrates with Windows Media Center and the Microsoft set-top box. AT&T Labs researchers have made an accessible version of iRemote with the following features:

  • Speech commands.  A user with limited physical dexterity can do searches, browse search results, play a show or save to DVR using speech commands.
  • Text-to-speech for listing summaries and details.  A user with visual disabilities can turn on speech output with a single voice command.
  • A high-contrast graphical user interface for the iPhone with basic remote control functionalities.

You can read a short paper about EPGAAC here.
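
The kind of listing search described for iRemote (by title, actor, genre, or keyword) can be illustrated with a minimal sketch. This is not the iRemote implementation: the `Listing` record, the `search` function, and the sample listings are all hypothetical, and a simple case-insensitive substring match stands in for whatever retrieval method the prototype actually uses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Listing:
    """Hypothetical program-guide entry (not the iRemote data model)."""
    title: str
    genre: str
    actors: Tuple[str, ...] = ()
    keywords: Tuple[str, ...] = ()


def search(listings: List[Listing], query: str) -> List[Listing]:
    """Return listings whose title, genre, actors, or keywords contain the query.

    Matching is a case-insensitive substring test; an empty query matches nothing.
    """
    q = query.casefold().strip()
    if not q:
        return []
    results = []
    for item in listings:
        fields = [item.title, item.genre, *item.actors, *item.keywords]
        if any(q in text.casefold() for text in fields):
            results.append(item)
    return results


# Invented sample data, for illustration only.
SAMPLE = [
    Listing("Office Hours", "Comedy", ("Pat Lee",), ("workplace",)),
    Listing("Deep Harbor", "Drama", ("Sam Ortiz", "Pat Lee"), ("detective", "coast")),
    Listing("Garden Hour", "Lifestyle", (), ("plants", "home")),
]

if __name__ == "__main__":
    for hit in search(SAMPLE, "pat lee"):
        print(hit.title)
```

In a speech-driven guide, the query string would presumably come from a speech recognizer rather than a keyboard; the search step itself would be unchanged.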

Education

The eReader prototype is another example of an existing technology being repurposed to make it more accessible, in this case specifically for people with visual disabilities. It is built on the Calibre open-source eReader software. Features include:

  • Speech-only and multimodal access to ebooks and RSS feeds. 
  • Speech commands and keyboard shortcuts.
  • Additional functionalities for people with print disabilities. For example, the user can look up a word or phrase by using a speech command.
  • Runs on many platforms: Calibre itself is cross-platform, and the speech functionalities are 'in the network'.

See a demo of the eReader prototype here.

The iPad-based StorEbook prototype is a different kind of eReader. It is designed to engage children with learning disabilities. It features:

  • Cross-modal feedback - text is highlighted as it is read to users.
  • Expressive speech - the text to be read is analyzed, and emotional and expressive features are incorporated into the synthesized speech.
  • Selectable voices - users can choose different synthetic voices for different characters in stories.

See a demo of the StorEbook prototype here.
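
The cross-modal feedback described above (highlighting text as it is read) can be sketched as follows. This is only an illustration, not the StorEbook implementation: it assumes the speech synthesizer reports a duration in seconds for each spoken word, and the function names `word_at` and `highlight` are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Optional


def word_at(durations: List[float], t: float) -> Optional[int]:
    """Return the index of the word being spoken at time t (seconds).

    durations[i] is the assumed spoken length of word i. Returns None
    before playback starts or after the last word has finished.
    """
    ends = list(accumulate(durations))  # cumulative end time of each word
    if t < 0 or not ends or t >= ends[-1]:
        return None
    # The first word whose end time is strictly greater than t is current.
    return bisect_right(ends, t)


def highlight(words: List[str], index: Optional[int]) -> str:
    """Render the text with square brackets around the current word."""
    return " ".join(f"[{w}]" if i == index else w for i, w in enumerate(words))


if __name__ == "__main__":
    words = ["The", "cat", "sat"]
    durations = [0.25, 0.25, 0.5]  # invented timings
    for t in (0.1, 0.3, 0.9):
        print(highlight(words, word_at(durations, t)))
```

A real reader would call something like `word_at` from its audio playback clock and update the on-screen highlight whenever the returned index changes.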

Security and Privacy

SAFE is a prototype for multi-factor authentication. Following a simple enrollment procedure, users download the SAFE application onto their mobile device. Then, instead of using a password, users use SAFE to log into any participating website or application. SAFE can help users with visual disabilities, who may find it hard to use web forms for authentication. SAFE features:

  • Mobile authentication - authenticate from any location
  • Multi-factor authentication - using something you know, something you have and something you are

See a demo of SAFE here.
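
The multi-factor idea listed above (something you know, something you have, and something you are) can be expressed as a simple policy check. The sketch below is not SAFE's design: the `Factor` enum and `is_authenticated` function are hypothetical, and it assumes that each factor has already been verified elsewhere (for example by a PIN check, a registered device, and a biometric match).

```python
from enum import Enum
from typing import FrozenSet, Iterable


class Factor(Enum):
    KNOWLEDGE = "something you know"
    POSSESSION = "something you have"
    INHERENCE = "something you are"


# Assumed policy: all three factor categories are required by default.
ALL_FACTORS: FrozenSet[Factor] = frozenset(Factor)


def is_authenticated(verified: Iterable[Factor],
                     required: FrozenSet[Factor] = ALL_FACTORS) -> bool:
    """Return True only if every required factor category was verified."""
    return required <= set(verified)


if __name__ == "__main__":
    print(is_authenticated({Factor.KNOWLEDGE, Factor.POSSESSION, Factor.INHERENCE}))
    print(is_authenticated({Factor.KNOWLEDGE}))
```

Passing a smaller `required` set would model a two-factor policy; which combinations a participating website accepts is a deployment decision that this sketch does not address.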

Partnerships

We are actively seeking partners in this research.  If you are:

  • A researcher at a university laboratory interested in assistive technology, please contact us!  We can discuss technology sharing and other possible collaborations.
  • A startup or small company interested in taking assistive technologies to market, please contact us!  We can discuss a range of possible collaborations.

Also, we periodically run evaluations of our basic technologies.  If you are interested in participating in a user study or evaluation, please get in touch!

Also at AT&T

AT&T values diversity in its workforce and customer base. In 2012, AT&T ranked No. 1 on CAREERS & the disABLED magazine's list of “Top 50 Employers” for people with disabilities.

Documents (presentations, white papers)
1047e-johnston    1047e-johnston.pdf (731k)
1069p-stent    1069p-stent.pdf (582k)
afp057-stent    afp057-stent.pdf (384k)

Multimedia (videos, demos, interviews)
  • MIRACLE video content analysis: Demonstration of the MIRACLE video content analysis engine.
  • AT&T SAFE: A demonstration of authentication using AT&T SAFE.
  • Spectra - Speech-to-speech translation: A demonstration of the Spectra speech-to-speech translation application.
  • StorEBook expressive e-reader: Taniya Mishra demonstrates the StorEBook expressive e-reader.
  • eReader for people with visual disabilities: Researcher Ben Stern introduces eReader, a speech-enabled e-reader for people with visual disabilities built on the open-source tool Calibre.
  • Demonstration of the iMIRACLE content-based multimedia retrieval system: Bernie Renger demonstrates the iMIRACLE content-based multimedia retrieval system on the iPad.
  • The iWalk navigation service for people with visual disabilities: ALFP fellow Shiri Azenkot and researcher Amanda Stent demo the initial prototype of iWalk, a local business search and navigation service for people with visual disabilities.


Project Members

Alistair Conkie

Michael Johnston

Taniya Mishra

Bernard Renger

Horst Schroeter

Amanda Stent

Benjamin Stern

Ann Syrdal

Jay Wilpon

Eric Zavesky

Related Projects

Smart Grid

Telehealth

Content-Based Copy Detection

Enhanced Indexing and Representation with Vision-Based Biometrics

Project Space

Omni Channel Analytics

AT&T Application Resource Optimizer (ARO) - For energy-efficient apps

CHI Scan (Computer Human Interaction Scan)

CoCITe – Coordinating Changes in Text

Connecting Your World

Darkstar

Daytona

E4SS - ECharts for SIP Servlets

Scalable Ad Hoc Wireless Geocast

AT&T 3D Lab

Graphviz System for Network Visualization

Information Visualization Research - Prototypes and Systems

Swift - Visualization of Communication Services at Scale

AT&T Natural Voices™ Text-to-Speech

Speech Mashup

Speech translation

StratoSIP: SIP at a Very High Level

Content Augmenting Media (CAM)

Content Acquisition Processing, Monitoring, and Forensics for AT&T Services (CONSENT)

MIRACLE and the Content Analysis Engine (CAE)

Social TV - View and Contribute to Public Opinions about Your Content Live

Visual API - Visual Intelligence for your Applications

Visual Semantics for Intuitive Mid-Level Representations

eClips - Personalized Content Clip Retrieval and Delivery

iMIRACLE - Content Retrieval on Mobile Devices with Speech

AT&T WATSON (SM) Speech Technologies

Wireless Demand Forecasting, Network Capacity Analysis, and Performance Optimization