
180 Park Ave - Building 103
Florham Park, NJ
Subject matter expert in speaker recognition, signal and speech processing
My main research interest is robust, text-independent speaker identification and verification. This includes deriving long-term acoustic features, such as syllable-, word-, and sentence-level pitch and formant features, and fusing them with short-term acoustic features.
I am also working on combining web and other data sources to build geo-centric language and voice models for better local business search.
Method of using a natural language interface to retrieve information from one or more data resources
Tue Jan 09 13:31:03 EST 2001
A method of using at least one natural language query to retrieve information from one or more data resources, and of performing a requested action using the retrieved information, is disclosed. At least one natural language query directed to retrieving particular information is received. At least one object is extracted from the natural language query. The relationships among the extracted objects are determined. A semantic representation is created from the extracted objects. The semantic representation is compared to a knowledge structure, which comprises one or more grammars extracted from a plurality of data resources. The semantic representation is matched to the grammars. A database query is generated based on the matched objects. The query is applied to one or more of the data resources and information is retrieved. The requested action is then performed using the retrieved information.
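The steps in this abstract can be illustrated with a small Python sketch. It is not the patented method: the `listings` table, the sample rows, and the helper names `build_grammar`, `extract_objects`, and `to_sql` are all hypothetical, and the "grammar" is reduced to a word-to-column lookup built from the data itself.

```python
import sqlite3

def build_grammar(conn):
    """Derive a toy grammar from the data resource: known value -> column name."""
    grammar = {}
    for column in ("category", "city"):  # fixed, trusted column names
        for (value,) in conn.execute(f"SELECT DISTINCT {column} FROM listings"):
            grammar[value.lower()] = column
    return grammar

def extract_objects(query, grammar):
    """Build a semantic representation: column -> value for each recognized word."""
    tokens = query.lower().replace("?", "").split()
    return {grammar[t]: t for t in tokens if t in grammar}

def to_sql(semantics):
    """Generate a parameterized database query from the matched objects."""
    where = " AND ".join(f"LOWER({slot}) = ?" for slot in semantics)
    sql = "SELECT name FROM listings" + (f" WHERE {where}" if where else "")
    return sql, list(semantics.values())

# Hypothetical data resource held in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (name TEXT, category TEXT, city TEXT)")
conn.executemany("INSERT INTO listings VALUES (?, ?, ?)", [
    ("Luigi's", "pizza", "Boston"),
    ("Wok Hay", "noodles", "Boston"),
    ("Slice House", "pizza", "Newark"),
])

grammar = build_grammar(conn)
semantics = extract_objects("Where can I get pizza in Boston?", grammar)
sql, params = to_sql(semantics)
results = [row[0] for row in conn.execute(sql, params)]
print(results)
```

A real system would need a proper parser and relationship analysis between objects; the sketch only shows how grammars drawn from the data can turn recognized words into a database query.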
Interface for a voice-activated connection system
Tue Oct 24 13:31:04 EDT 2000
A method and apparatus are disclosed for defining an interface for a system that uses voice commands to connect a first user to a second user over a network. The interface receives a request to define the interface for a particular user. The interface receives a first information item and searches at least one database for a second information item indexed by the first. Another search is performed for a third information item indexed by the second. This process continues until the interface has gathered enough information items to construct a natural language grammar. The interface then uses this grammar to parse the user's commands for future communications connections.
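The chained lookups described here can be sketched in a few lines of Python. Everything in the example is an assumption made for illustration: the three dictionaries stand in for the databases, and the command phrasings and the names `build_call_grammar` and `parse_command` are hypothetical.

```python
# Hypothetical databases; each is indexed by an item found in the previous one.
EMPLOYEE_TO_DEPT = {"alice": "speech-research"}
DEPT_TO_MEMBERS = {"speech-research": ["bob", "carol"]}
MEMBER_TO_EXTENSION = {"bob": "x1201", "carol": "x1202"}

def build_call_grammar(user):
    """Follow the chain user -> department -> members -> extensions
    and return a phrase -> extension grammar for that user."""
    dept = EMPLOYEE_TO_DEPT[user]          # second item, indexed by the first
    members = DEPT_TO_MEMBERS[dept]        # third item, indexed by the second
    grammar = {}
    for name in members:
        for phrase in (f"call {name}", f"connect me to {name}"):
            grammar[phrase] = MEMBER_TO_EXTENSION[name]
    return grammar

def parse_command(command, grammar):
    """Return the extension for a recognized command, or None if it is unknown."""
    return grammar.get(command.lower().strip())

grammar = build_call_grammar("alice")
print(parse_command("Call Bob", grammar))
print(parse_command("call dave", grammar))
```

Because the grammar is built per user, it only covers the people that user is likely to call, which keeps the set of phrases the parser has to distinguish small.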
Unsupervised HMM adaptation based on speech-silence discrimination
Tue Jun 13 13:31:05 EDT 2000
An unsupervised, discriminative, sentence-level HMM adaptation based on speech-silence classification is presented. Silence and speech regions are determined using either a speech end-pointer or the segmentation obtained from the recognizer in a first pass. A discriminative training procedure, using generalized probabilistic descent (GPD) or any other discriminative training algorithm in conjunction with the HMM-based recognizer, is then used to increase the discrimination between silence and speech.
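The idea can be illustrated with a heavily simplified Python sketch. It is not the patented procedure: each HMM is replaced by a single unit-variance Gaussian over frame energy, the end-pointer is a fixed energy threshold, and the energies, step size, and function names are all hypothetical. The update itself follows the usual GPD pattern of a gradient step on a sigmoid of the misclassification measure.

```python
import math

def label_frames(energies, threshold):
    """Crude end-pointer: frames at or above the threshold count as speech."""
    return ["speech" if e >= threshold else "silence" for e in energies]

def log_likelihood(x, mean):
    """Log-likelihood of x under a unit-variance Gaussian, up to a constant."""
    return -0.5 * (x - mean) ** 2

def gpd_adapt(energies, labels, means, step=0.1, gamma=1.0, epochs=20):
    """Apply GPD-style updates to the two class means and return the new means."""
    means = dict(means)  # leave the caller's models untouched
    for _ in range(epochs):
        for x, correct in zip(energies, labels):
            rival = "silence" if correct == "speech" else "speech"
            # Misclassification measure: positive when the rival model scores higher.
            d = -log_likelihood(x, means[correct]) + log_likelihood(x, means[rival])
            loss = 1.0 / (1.0 + math.exp(-gamma * d))   # smoothed error count
            slope = gamma * loss * (1.0 - loss)          # derivative of loss w.r.t. d
            grad_correct = -(x - means[correct])         # derivative of d w.r.t. correct mean
            grad_rival = x - means[rival]                # derivative of d w.r.t. rival mean
            means[correct] -= step * slope * grad_correct
            means[rival] -= step * slope * grad_rival
    return means

# Hypothetical per-frame energies for one utterance.
energies = [0.2, 0.3, 0.1, 2.9, 3.4, 3.1, 0.2, 2.8]
labels = label_frames(energies, threshold=1.5)
initial = {"silence": 1.0, "speech": 2.0}
adapted = gpd_adapt(energies, labels, initial)
print(initial, adapted)
```

Each update pulls the correct model toward the frame and pushes the competing model away, so the two means drift apart. No transcription is needed because the labels come from the end-pointer, which is what makes the adaptation unsupervised.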