
9505 Arboretum Blvd
Austin, TX
Enriching Domain-based Language Models Using Domain-independent WWW N-gram Corpus
Hisao Chang
Artificial Intelligence and Soft Computing Lecture Notes in Computer Science,
2012.
Springer Copyright
The definitive version was published in International Conference on Artificial Intelligence and Soft Computing, Volume 7268/2012, 2012-04-29, DOI 10.1007/978-3-642-29350-4_5
{This paper describes techniques developed to enhance the search coverage of domain-specific Language Models (LMs) by incorporating domain-independent language knowledge from the Google 1 trillion n-grams corpora created from general World Wide Web (WWW) documents. The purpose of our study is to explore a Natural Language (NL) based multimodal search interface for TV-oriented online Electronic Programming Guides (EPG) supporting both typed and spoken queries. The proposed method uses a two-pass procedure that combines the domain knowledge derived from an EPG-specific training corpus with the Google n-gram corpora, used as a domain-independent phrase dictionary with built-in usage counts from web page authors. The enhanced LMs achieve an absolute improvement of 23% in model coverage (recall accuracy) without reducing the precision accuracy measured by Word Error Rate (WER) on a test set of in-domain spoken query utterances.}

Voice-Enabled Social TV
Bernard Renger, Junlan Feng, Ovidiu Dan, Hisao Chang, Luciano Barbosa
WWW2011,
2011.
ACM Copyright
The definitive version was published in WWW 2011, 2011-03-28
{Until today, the TV viewing experience has been very unsocial compared to the World Wide Web. In this demo, we will present a Voice-enabled Social TV system, VoiSTV, which allows users to access Twitter through the TV using voice. With this application, a user can receive and send Twitter messages (tweets) through the TV while watching TV. Users can input tweets to be sent using spoken language. Beyond accessibility, VoiSTV also provides users with metadata about TV shows, such as trends, hot topics, and the popularity of the show, as well as the aggregated sentiment of show-related tweets.}
Topics Inference by Weighted Mutual Information Measures Computed from Structured Corpus
Hisao Chang
16th International Conference on Applications of Natural Language to Information Systems,
2011.
Springer-Verlag Copyright
The definitive version was published in 16th International Conference on Applications of Natural Language to Information Systems (Springer, LNCS), Volume LNCS 6716, 2011-06-27
{This paper proposes a new topic inference framework that is built on the scalability and adaptability of mutual information (MI) techniques. The framework is designed to explore a more robust language model (LM) for general topic-oriented search terms in the domain of electronic programming guide (EPG) for broadcast TV programs. The topic inference system selects the most relevant topics from a search term, based on a simplified MI-based classifier trained from a highly structured XML-based text corpus, which is derived from continuously updated EPG data feeds. The proposed framework is evaluated against a set of EPG-specific queries from a large user population collected from a real web-based IR system. The MI-based topic induction system is able to achieve 98 percent accuracy in recall measurement and 82 percent accuracy in precision measurement on the test set.}
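The idea of scoring topics for a search term with mutual information can be illustrated with a minimal sketch. The abstract does not give the paper's weighted MI measure, so plain pointwise mutual information (PMI) is used here as a stand-in. The function names, the toy corpus, and the choice to score unseen word/topic pairs as 0 are all assumptions made for illustration.

```python
import math
from collections import Counter

def train_pmi(docs):
    """Build a PMI(word, topic) table from (topic, text) pairs.

    Assumption: plain PMI stands in for the paper's weighted MI measure.
    """
    word_topic = Counter()
    word_count = Counter()
    topic_count = Counter()
    total = 0
    for topic, text in docs:
        for word in text.lower().split():
            word_topic[(word, topic)] += 1
            word_count[word] += 1
            topic_count[topic] += 1
            total += 1
    pmi = {}
    for (word, topic), n in word_topic.items():
        # PMI = log( P(word, topic) / (P(word) * P(topic)) )
        pmi[(word, topic)] = math.log(
            n * total / (word_count[word] * topic_count[topic])
        )
    return pmi, sorted(topic_count)

def infer_topics(query, pmi, topics, top_k=1):
    """Rank topics by the summed PMI of the query words."""
    scores = {t: 0.0 for t in topics}
    for word in query.lower().split():
        for t in topics:
            # Unseen (word, topic) pairs contribute 0 (an assumption).
            scores[t] += pmi.get((word, t), 0.0)
    ranked = sorted(topics, key=lambda t: scores[t], reverse=True)
    return ranked[:top_k]

# Hypothetical toy corpus standing in for structured EPG data.
TOY_DOCS = [
    ("sports", "football game live tonight"),
    ("sports", "basketball playoff game"),
    ("news", "evening news weather report"),
    ("movies", "classic western movie"),
]

pmi_table, topic_list = train_pmi(TOY_DOCS)
print(infer_topics("basketball game", pmi_table, topic_list))
```

In a real system the training pairs would come from the structured EPG feed rather than a hand-written list, and the scoring would follow the weighting described in the paper.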
System And Method To Search A Media Content Database Based On Voice Input Data,
Tue Jan 22 14:43:55 EST 2013
A computer implemented method includes initiating a call from an interactive voice response (IVR) system to a first device associated with a user in response to a user request. The computer implemented method includes receiving voice input data at the IVR system via the call. The computer implemented method also includes performing a search of a media content database based at least partially on the voice input data. The computer implemented method further includes sending search results identifying media content items based on the search of the media content database to a second device associated with the user.
Automated Demographic Analysis By Analyzing Voice Activity,
Tue Oct 30 16:12:08 EDT 2012
A method of generating demographic information relating to an individual is provided. The method includes monitoring an environment for a voice activity of an individual and detecting the voice activity of the individual. The method further includes analyzing the detected voice activity of the individual and determining, based on the detected voice activity of the individual, a demographic descriptor of the individual.
Methods And Apparatus To Present A Video Program To A Visually Impaired Person,
Tue Jul 24 16:11:07 EDT 2012
Methods and apparatus to present a video program to a visually impaired person are disclosed. An example method comprises receiving a video stream and an associated audio stream of a video program, detecting a portion of the video program that is not readily consumable by a visually impaired person, obtaining text associated with the portion of the video program, converting the text to a second audio stream, and combining the second audio stream with the associated audio stream.
System And Method For Automatically Transcribing Voicemail,
Tue Jun 12 16:10:44 EDT 2012
Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for automatically transcribing voicemail. The method includes receiving a plurality of voicemail messages from callers, identifying for each voicemail message in the plurality of voicemail messages a first frequency with which the respective caller leaves voicemails, identifying for each voicemail message in the plurality of voicemail messages a second frequency with which a user requests transcription of each voicemail, assigning a priority ranking to each voicemail message in the plurality of voicemail messages based on the respective first frequency and the respective second frequency, and transcribing untranscribed voicemail messages with a highest priority ranking. The method can include establishing a priority ranking threshold and repeatedly transcribing a next highest ranking untranscribed voicemail message until no further untranscribed voicemail messages remain above the priority ranking threshold.
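The ranking-and-threshold loop in this abstract can be sketched as follows. The abstract does not say how the two frequencies are combined into a priority ranking, so the simple sum below is an assumption. The `Voicemail` class, its field names, and the placeholder for the actual transcription step are likewise hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Voicemail:
    msg_id: str
    caller_frequency: float    # how often this caller leaves voicemails
    request_frequency: float   # how often the user requests transcription
    transcribed: bool = False

def priority(vm):
    # Assumption: the two frequencies are combined by a plain sum.
    return vm.caller_frequency + vm.request_frequency

def transcribe_by_priority(voicemails, threshold):
    """Transcribe untranscribed messages above the threshold, highest first.

    Returns the message ids in the order they were processed.
    """
    pending = [
        vm for vm in voicemails
        if not vm.transcribed and priority(vm) > threshold
    ]
    order = []
    for vm in sorted(pending, key=priority, reverse=True):
        vm.transcribed = True  # placeholder for a real transcription call
        order.append(vm.msg_id)
    return order

inbox = [
    Voicemail("a", 0.2, 0.1),
    Voicemail("b", 0.9, 0.8),
    Voicemail("c", 0.5, 0.6),
    Voicemail("d", 0.9, 0.9, transcribed=True),
]
print(transcribe_by_priority(inbox, threshold=0.5))
```

Messages at or below the threshold are left untranscribed, matching the stopping condition in the abstract.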
Method And System For An Automated Departure Strategy,
Tue Jan 03 16:08:46 EST 2012
A method and system for an automated departure strategy from an automated system includes a track engine and an error engine. The track engine allows for the tracking and storing of one or more utterances spoken by a caller in response to one or more prompts. In addition, the track engine classifies each of the utterances with a confidence level. The error engine determines when one of the utterances initiates an error condition and transfers the caller to an operator on the occurrence of the error condition. In addition to transferring the caller to the operator, the error engine plays to the operator an error utterance, causing the error condition, and a preceding utterance, preceding the error utterance. Furthermore, the error engine populates an operator screen with information provided by the caller in the utterances for utterances classified with a high level of confidence.
Voice XML And Rule Engine Based Switchboard For Interactive Voice Response (IVR) Services,
Tue Jan 03 16:08:45 EST 2012
Call routing systems and methods are provided. A particular routing method comprises decoding a message based on an incoming call to determine whether a live agent is required. When a live agent is not required, a destination interactive voice response (IVR) application is identified, a determination of whether the destination IVR application is VoiceXML capable is made, and the incoming call is sent to the destination IVR application when the destination IVR application is VoiceXML capable. When the destination IVR application is not VoiceXML capable, a determination of whether the destination IVR application is capable of supporting an external data interface is made, and incoming call session data is routed to the destination IVR application when the destination IVR application is capable of supporting the external data interface. An audio file including one or more dual tone multi-frequency (DTMF) commands based on the incoming call session data is constructed and sent to the destination IVR application when the destination IVR application is not VoiceXML capable and is not capable of supporting an external data interface.
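The branching described in this abstract reads as a small decision tree, sketched below under stated assumptions. The `IvrApp` class, the `route_call` function, and the returned action labels are hypothetical names. The abstract only covers the case where no live agent is required, so the live-agent branch is an assumption.

```python
from dataclasses import dataclass

@dataclass
class IvrApp:
    name: str
    voicexml_capable: bool
    external_data_interface: bool

def route_call(needs_live_agent, destination):
    """Return a label for the routing action chosen for an incoming call."""
    if needs_live_agent:
        # Assumption: the abstract does not detail this branch.
        return "transfer_to_live_agent"
    if destination.voicexml_capable:
        # The call itself is sent to the VoiceXML-capable application.
        return "send_call_to_ivr"
    if destination.external_data_interface:
        # Session data is routed through the external data interface.
        return "route_session_data"
    # Neither capability: build an audio file of DTMF commands from the
    # session data and send that to the application.
    return "send_dtmf_audio"

legacy = IvrApp("legacy_billing", voicexml_capable=False,
                external_data_interface=False)
print(route_call(False, legacy))
```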
Directory Assistant Dialog With Configuration Switches To Switch From Automated Speech Recognition To Operator-assisted Dialog,
Tue Mar 22 16:04:42 EDT 2011
A method of providing a caller with a directory assistance dialog. The dialog is configurable, at any level of the dialog, from an automated speech recognition (ASR) dialog to an operator-assisted (OP) dialog. The dialog is handed off to an operator if any level of the speech recognition dialog fails. Also, a configuration switch may be set to cause the dialog to be handed off to an operator even if a response at some level of the ASR dialog is successfully recognized.
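A minimal sketch of the hand-off logic follows. It assumes each dialog level has an optional switch that forces a hand-off after a successful recognition. The function name, the level names, and the use of `None` to signal a failed recognition are all hypothetical.

```python
def run_dialog(levels, recognize, force_operator):
    """Walk the dialog levels in order and decide who finishes the call.

    levels: ordered level names, e.g. ["city", "listing"]
    recognize: callable taking a level name, returning text or None on failure
    force_operator: dict mapping level name to a bool configuration switch
    """
    collected = {}
    for level in levels:
        result = recognize(level)
        if result is None:
            # Speech recognition failed at this level: hand off.
            return ("operator", collected)
        collected[level] = result
        if force_operator.get(level, False):
            # Switch set: hand off even though recognition succeeded.
            return ("operator", collected)
    return ("automated", collected)

answers = {"city": "Austin", "listing": "Pizza"}
print(run_dialog(["city", "listing"], answers.get, {"city": True}))
```

Passing whatever was collected so far along with the hand-off is a design choice of this sketch, not something the abstract states.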
Automated Directory Assistance System For A Hybrid TDM/VoIP Network,
Tue Feb 01 16:04:27 EST 2011
An automated directory assistance platform architecture is provided for at least partial automatic processing of 411 calls from TDM-based telephone networks and from VoIP networks. The architecture includes three layers. One layer is a telephony network interface that accepts information from both TDM and VoIP based DA networks. The telephony layer sequesters the other two layers from the complexities of interacting with different source networks. Another layer is a VoiceXML-based IVR dialog engine that directs information received from the telephony interface. The third layer is an App Server Layer that processes information received from the dialog engine by retrieving information from an internet-accessible database. Calls that cannot be handled completely by automation are handed off to a live operator working in an all-IP environment.
System And Method Of Providing Multimedia Communication Services,
Tue Jan 18 16:04:23 EST 2011
A system and method of providing multimedia communication services is disclosed. In a particular embodiment, the method includes receiving contextual information including a subscriber identification associated with a wireline communication device at an intelligent service switch (ISS) of an integrated wireline-wireless (IWW) network from a network edge device, where the network edge device has detected a service request at the wireline communication device. The method also includes determining at least one multimedia communication service based on the contextual information and at least one service filter associated with the subscriber identification.
Network Based Voice Activated Auto-Attendant Service With B2B Connectors,
Tue Dec 07 15:05:12 EST 2010
A network-based voice activated auto-attendant service is disclosed. In a particular embodiment, a data processor is provided that can construct an enterprise voice directory by executing instructions to encrypt eXtensible Markup Language (XML)-based files using an encryption key issued by a voice activated auto-attendant service provider network to form encrypted XML-based files. The instructions are further to store the encrypted XML-based files in a manner that is accessible to the voice activated auto-attendant service provider network, and to create the enterprise voice directory based on the encrypted XML-based files. The enterprise voice directory is configured to provide run-time access to the voice activated auto-attendant service provider network.
System And Method For Identifying Telephone Callers,
Tue Feb 02 15:03:22 EST 2010
A method of processing calls received at an interactive voice response (IVR) server is provided and includes receiving a telephone call at the IVR. Caller identity data that is associated with the telephone call is received and a customer profile that includes a list of individual names associated with the caller identity data is received. Each of the individual names is mapped to a speech recognition grammar pattern. Further, a caller of the telephone call is prompted to speak their name. A spoken name from the caller is received and recorded. Moreover, the spoken name is converted into a speech recognition grammar pattern. Thereafter, the speech recognition grammar pattern associated with the spoken name is compared to each of the speech recognition grammar patterns associated with the individual names retrieved from the customer profile.
Bio-phonetic Multi-phrase Speaker Identity Verification,
Tue Jul 28 16:07:43 EDT 2009
Systems and methods for bio-phonetic multi-phrase speaker identity verification are disclosed. Generally, a speaker identity verification engine generates a dynamic phrase including at least one dynamically-generated word. The speaker identity verification engine prompts a user to speak the dynamic phrase and receives a dynamic phrase utterance. The speaker identity verification engine extracts at least one voice characteristic from the dynamic phrase utterance and compares the at least one voice characteristic with a voice profile to generate a score. The speaker identity verification engine then determines whether to accept a speaker identity claim based on the score.
Connecting Your World,
The need to be connected is greater than ever, and AT&T Researchers are creating new ways for people to connect with one another and with their environments, whether it's their home, office, or car.
Interview with Harry
Easy_Remote