Inside the Labs: A Summer Intern Making A Difference

by: Staff, Mon Oct 12 10:07:00 EDT 2009


This summer AT&T Labs Research hosted 27 interns who came to Florham Park to work with researchers on special projects. Shiri Azenkot was one of them.

A graduate of Pomona College with a degree in computer science and an interest in human-computer interaction, Shiri came to her internship through the AT&T Labs Fellowship Program (ALFP). She was one of this year's three ALFP fellows.

Shiri's particular focus is making technology accessible to people with disabilities, especially people with low vision (she is one). She understands firsthand how inaccessible software and websites can be, especially those with a heavy component of fancy graphics and small icons, which these days is pretty much all of them.

But she's also an engineer aware of the possible solutions, and in coming to AT&T Labs Research, she wanted to explore how technology could be used to create a navigation tool that enables low-vision users to walk to a destination. Her requirements were that the tool had to be easy to use and run on existing, easily available hardware. What she wanted to avoid were specially designed (expensive) devices that would be bulky or require special learning.

For its part, AT&T Research could offer a range of speech technologies, including speech recognition (WATSON) and text-to-speech (Natural Voices).

Since speech as an interface has obvious benefits for users who have trouble seeing, Shiri and AT&T Research were a good fit from the start. Initial conversations with her mentors Amanda Stent and Ben Stern quickly centered on having the user speak an address or business name into a smartphone--small, connected, and ubiquitous--and hear back the walking instructions.

Use of a smartphone was possible by incorporating a third AT&T Research project, the speech mashup, to relay spoken utterances over the cell network to the servers running WATSON speech recognizers and Natural Voices text-to-speech conversion, essentially turning WATSON and Natural Voices into web services. The iPhone--though not known as a friendly device for users with low vision--was chosen as the first platform, and the tool was given the name iWalk.
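The relay idea can be sketched in a few lines: the phone does not run speech recognition itself; it sends the audio to a server that invokes the recognizer and returns text. The function names below (`relay_utterance`, `recognize`) are hypothetical stand-ins for illustration, not the actual speech mashup API.

```python
def recognize(audio_bytes):
    """Stand-in for a server-side WATSON recognizer.

    A real recognizer decodes audio into text; this stub just
    pretends the bytes are the spoken words.
    """
    return audio_bytes.decode("utf-8")

def relay_utterance(audio_bytes, recognizer=recognize):
    """Relay a spoken utterance to the recognizer, as the speech
    mashup does over the cell network, and return the transcript
    to the client as a small web-service-style response."""
    transcript = recognizer(audio_bytes)
    return {"transcript": transcript}

response = relay_utterance(b"stan's pizza")
print(response["transcript"])  # stan's pizza
```

The point of the design is that the heavy lifting (recognition, synthesis) stays on the servers, so any connected handset, iPhone included, can act as a thin client.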

With the speech interface pieces quickly falling into place, Shiri could concentrate on the server-side programming and on finding the right mapping service to provide the walking route (eventually deciding on the free CloudMade service). Plus there were the real-world problems to contend with, such as a best-case GPS accuracy of only 20 feet (an important distance at intersections), though even 20-foot accuracy was not always attainable; sometimes there was no GPS signal at all.

But in a relatively short time, and in a manner that impressed her mentors, Shiri had iWalk up and running as follows: Using an iPhone, the user makes a general or specific query ("pizza" or "Stan's pizza") and gets back a list of options meeting the criteria, from which the user selects one; the tool then plots the route (using the iPhone's GPS coordinates) and speaks the route length and walking time. If the user says to continue, instructions for each leg of the route are spoken and reinforced by large-type screen text. The system monitors the progress of the user, speaking a new instruction as each leg or turn is completed.
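The interaction loop just described can be sketched as follows. All helper names here (`search_places`, `plan_route`, `speak`) are hypothetical stand-ins; the real tool used WATSON for recognition, Natural Voices for speech output, and CloudMade for routing.

```python
def search_places(query):
    """Stand-in for a place search ("pizza" -> nearby matches)."""
    places = {"pizza": ["Stan's pizza", "Luigi's pizza"]}
    return places.get(query, [])

def plan_route(destination):
    """Stand-in for CloudMade routing: legs as (instruction, feet) pairs."""
    return [("Walk north on Main St", 400),
            ("Turn right onto Oak Ave", 250),
            ("Destination is on your left", 50)]

def speak(text):
    """Stand-in for Natural Voices text-to-speech."""
    print(text)

def iwalk(query, choice=0, walking_ft_per_min=270):
    # 1. Query -> list of options; the user picks one.
    options = search_places(query)
    if not options:
        speak("No matches found")
        return None
    destination = options[choice]
    # 2. Plot the route, then speak its length and walking time.
    route = plan_route(destination)
    total_ft = sum(feet for _, feet in route)
    minutes = round(total_ft / walking_ft_per_min)
    speak(f"{total_ft} feet, about {minutes} minutes to {destination}")
    # 3. Speak each leg in turn (on the phone, also shown in large type,
    #    with the next instruction triggered as each leg completes).
    for instruction, _ in route:
        speak(instruction)
    return total_ft

iwalk("pizza")
```

The walking-speed constant is an assumed illustrative figure, not a value from the project.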

Regular feedback in the form of beeps or vibrations was thought necessary to give the user reassurance that the system was active and hadn't stopped working mid-route; feedback could also warn when the user walked off course.
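Off-course detection has to contend with the GPS error mentioned earlier: with positions accurate to 20 feet at best, the tool can only flag a walker as off course once their distance from the planned path clearly exceeds that margin. A minimal sketch of that check, using planar geometry and an assumed slack threshold (the names and numbers are hypothetical, not the project's actual values):

```python
import math

GPS_ERROR_FT = 20  # best-case GPS accuracy noted above

def point_to_segment_ft(p, a, b):
    """Planar distance (in feet) from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment, clamped to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def off_course(position, route_points, slack_ft=2 * GPS_ERROR_FT):
    """True if the position is farther than the slack from every route leg."""
    legs = zip(route_points, route_points[1:])
    return all(point_to_segment_ft(position, a, b) > slack_ft for a, b in legs)

route = [(0, 0), (0, 500), (300, 500)]  # route corners, in feet
print(off_course((15, 250), route))   # False: within GPS slack of the path
print(off_course((120, 250), route))  # True: clearly off the planned route
```

Doubling the GPS error as slack is one simple way to avoid nagging a user who is actually on the path; tuning that trade-off (reassurance versus annoyance) is exactly the kind of question the usability studies below would answer.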

The tool as it works now fits within the initial constraints. But what isn't known is how well the tool fits in with user behavior. What information is most important when choosing a route? Time or distance? (Both are given now.) Should the entire course be described at the outset to help the user decide whether to take the route? How much information is too much for the user to process at a time? Is constant feedback helpful, or annoying? What form of feedback is best--beeps or vibrations? or is each user's preference unique?

Answers to these questions will come only after usability studies, which have yet to be done. In addition, Shiri wants to incorporate landmarks and public transportation; both are hard problems, but both would substantially improve the tool, as would expanding the number of smartphone platforms.

Even without knowing the particulars of the interaction, the tool already promises wider applicability than it was originally designed for: You don't need to be visually impaired to use it. Anyone needing directions can benefit, including aging boomers whose eyes aren't what they were, and the younger set already conditioned to letting technology handle mundane tasks such as route mapping.

Shiri, now doing graduate work at the University of Washington, hopes to continue working on iWalk and finding the answers to the questions she and the tool have posed.
