
200 S Laurel Ave - Bldg A
Middletown, NJ
http://www.research.att.com/~haffner/
Subject matter expert in Machine Learning
Making Sense of Customer Tickets in Cellular Networks
Yu Jin, Nicholas Duffield, Alexandre Gerber, Patrick Haffner, Wen Hsu, Guy Jacobson, Shobha Venkataraman, Zhi-Li Zhang, Subhabrata Sen
in Proc. IEEE INFOCOM Mini-Conference,
2011.
{Abstract: Effective management of large-scale cellular data
networks is critical to meet customer demands and expectations.
Customer calls for technical support provide a direct indication of
the issues and problems customers encounter. In this paper we
study customer tickets (free-text recordings and classifications
by customer support agents) collected at a large cellular network
provider, with two inter-related goals: i) to characterize and
understand the major factors that lead customers to call
and seek support; and ii) to use such customer tickets to
help identify potential network problems. For this purpose, we
develop a novel statistical approach to model customer call rates
that accounts for customer-side factors (e.g., user tenure and
handset types) as well as geo-locations. We show that most calls
are due to customer-side factors and can be well captured by the
model. Furthermore, we demonstrate that location-specific
deviations from the model provide a good indicator of potential
network-side issues. The latter is corroborated by detailed
analysis of customer tickets and other independent data sources
(non-ticket customer feedback and network performance data).}

Large-scale App-based Reporting of Customer Problems in Cellular Networks: Potential and Limitations
Yu Jin, Nicholas Duffield, Alexandre Gerber, Patrick Haffner, Wen Hsu, Guy Jacobson, Subhabrata Sen, Shobha Venkataraman, Zhi-Li Zhang
in Proc. ACM SIGCOMM Workshop on Measurements Up the STack (W-MUST),
2011.
ACM Copyright
(c) ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM SIGCOMM Workshop on Measurements Up the STack (W-MUST), 2011-08-19.
{Multidimensional distributions are often used in data mining to describe and summarize different features of large datasets. It is natural to look for distinct classes in such datasets by clustering the data. A common approach entails the use of methods like k-means clustering. However, the k-means method inherently relies on the Euclidean metric in the embedded space and does not account for additional topology underlying the distribution.
In this paper, we propose using Earth Mover Distance (EMD) to compare multidimensional distributions. For an n-bin histogram, the EMD is based on a solution to the transportation problem with time complexity O(n^3 log n). To mitigate the high computational cost of EMD, we propose an approximation that reduces the cost to linear time.
Other notions of distance, such as the information-theoretic Kullback-Leibler divergence and the statistical χ² distance, account only for the correspondence between bins with the same index, do not use information across bins, and are sensitive to bin size. A cross-bin distance measure like EMD is not affected by binning differences and meaningfully matches the perceptual notion of “nearness”.
Our technique is simple, efficient and practical for clustering distributions. We demonstrate the use of EMD on a practical application of clustering over 400,000 anonymous mobility usage patterns which are defined as distributions over a manifold. EMD allows us to represent inherent relationships in this space. We show that EMD allows us to successfully cluster even sparse signatures and we compare the results with other clustering methods. Given the large size of our dataset, a fast approximation is crucial for this application.}
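The contrast between bin-by-bin and cross-bin distances can be illustrated with a minimal sketch. It assumes the simplest setting, one-dimensional histograms with equal total mass and unit spacing between bins, where the EMD has a well-known closed form: the sum of absolute differences between the two running totals. This is not the approximation proposed in the paper, which the abstract does not specify; the function names emd_1d and chi2 and the three toy histograms are illustrative only.

```python
from itertools import accumulate

def emd_1d(p, q):
    """EMD between two 1-D histograms with equal mass and unit bin spacing.

    Uses the closed form sum(|CDF_p - CDF_q|), which runs in linear time.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(p) - sum(q)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    # Running sum of per-bin differences = difference of the two CDFs.
    cdf_gap = accumulate(a - b for a, b in zip(p, q))
    return sum(abs(g) for g in cdf_gap)

def chi2(p, q):
    """Bin-by-bin chi-squared distance (written here without a 1/2 factor)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)

# All mass in one bin; 'near' shifts it by one bin, 'far' by three bins.
base = [1.0, 0.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0, 0.0]
far = [0.0, 0.0, 0.0, 1.0]

print(emd_1d(base, near), emd_1d(base, far))  # 1.0 3.0
print(chi2(base, near), chi2(base, far))      # 2.0 2.0
```

The χ² distance gives both shifted histograms the same score because it only compares bins with the same index, while the EMD grows with the size of the shift.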

Combining Frame and Segment Level Processing via Temporal Pooling for Phonetic Classification
Sumit Chopra, Patrick Haffner, Dimitrios Dimitriadis
12th Annual Conference of the International Speech Communication Association,
2011.
International Speech Communication Association Copyright
The definitive version was published in 12th Annual Conference of the International Speech Communication Association, 2011-08-27.
{We propose a simple, yet novel, multi-layer model for the problem of phonetic classification. Our model combines the frame-level transformation of the acoustic signal with the segment-level transformation via a temporal pooling architecture to compute class-conditional probabilities of phones. Without the use of any phonetic knowledge, our model achieves state-of-the-art performance on the TIMIT phone classification task. The flexibility of our model allows us to mix a variety of pooling architectures, leading to further significant performance improvements.}
NEVERMIND, the Problem Is Already Fixed: Proactively Detecting and Troubleshooting Customer DSL Problems
Yu Jin, Nicholas Duffield, Alexandre Gerber, Patrick Haffner, Subhabrata Sen, Zhi-Li Zhang
in Proc. of ACM CoNext,
2010.
ACM Copyright
(c) ACM, 2010. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM CoNext 2010, 2010-11-30.
{Traditional DSL troubleshooting solutions are reactive, relying
mainly on customers to report problems, and tend to
be labor-intensive, time-consuming, and prone to incorrect resolutions;
overall, they can contribute to increased customer
dissatisfaction. In this paper, we propose a proactive approach
to facilitate troubleshooting customer edge problems
and reduce customer tickets. Our system consists of: i) a
ticket predictor which predicts future customer tickets; and
ii) a trouble locator which helps technicians accelerate the
troubleshooting process during field dispatches. Both components
infer future tickets and trouble locations based on
existing sparse line measurements, and the inference models
are constructed automatically using supervised machine
learning techniques. We propose several novel techniques to
address the operational constraints in DSL networks and to
enhance the accuracy of NEVERMIND. Extensive evaluations
using an entire year's worth of customer ticket and measurement
data from a large network show that our method
can predict thousands of future customer tickets per week
with high accuracy and significantly reduce the time
and effort for diagnosing these tickets. This is beneficial as it
has the effect of both reducing the number of customer care
calls and improving customer satisfaction.}
Method And System For Classifying Image Elements,
Tue May 14 17:26:21 EDT 2013
A method, system, and machine-readable medium for classifying an image element as one of a plurality of categories, including assigning the image element based on a ratio between an unoccluded perimeter of the image element and an occluded perimeter of the image element and coding the image element according to a coding scheme associated with the category to which the image element is classified. Exemplary applications include image compression, where categories include image foreground and background layers.
Method And System For Classifying Image Elements,
Tue Aug 14 16:11:26 EDT 2012
A method, system, and machine-readable medium for classifying an image element as one of a plurality of categories, including assigning the image element based on a ratio between an unoccluded perimeter of the image element and an occluded perimeter of the image element and coding the image element according to a coding scheme associated with the category to which the image element is classified. Exemplary applications include image compression, where categories include image foreground and background layers.
System And Method Of Customizing Animated Entities For Use In A Multi-Media Communication Application,
Tue Feb 14 16:09:22 EST 2012
In an embodiment, a method is provided for creating a personal animated entity for delivering a multi-media message from a sender to a recipient. An image file from the sender may be received by a server. The image file may include an image of an entity. The sender may be requested to provide input with respect to facial features of the image of the entity in preparation for animating the image of the entity. After the sender provides the input with respect to the facial features of the image of the entity, the image of the entity may be presented as a personal animated entity to the sender to preview. Upon approval of the preview from the sender, the image of the entity may be presented as a sender-selectable personal animated entity for delivering the multi-media message to the recipient.
Multi-Class Classification Learning On Several Processors,
Tue Jul 19 16:05:44 EDT 2011
The time taken to learn a model from training examples is often unacceptable. For instance, training language understanding models with Adaboost or SVMs can take weeks or longer based on numerous training examples. Parallelization through the use of multiple processors may improve learning speed. The disclosure describes effective systems for distributed multiclass classification learning on several processors. These systems are applicable to multiclass models where the training process may be split into training of independent binary classifiers.
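The idea of splitting multiclass training into independent binary classifiers can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented system: it uses a plain perceptron in place of Adaboost or SVMs, a one-vs-rest decomposition, and Python's standard concurrent.futures pool as the dispatcher. All function names and the toy data are hypothetical. The executor class is a parameter because a thread pool is used by default only to keep the sketch portable; ProcessPoolExecutor could be passed instead to spread the work over several processors.

```python
from concurrent.futures import ThreadPoolExecutor

def train_binary_perceptron(task):
    """Train one 'class vs. rest' perceptron; it needs no data from other classes' models."""
    label, examples, epochs = task
    dim = len(examples[0][0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for features, y in examples:
            target = 1 if y == label else -1
            score = bias + sum(w * x for w, x in zip(weights, features))
            if target * score <= 0:  # misclassified: standard perceptron update
                weights = [w + target * x for w, x in zip(weights, features)]
                bias += target
    return label, (weights, bias)

def train_one_vs_rest(examples, epochs=20, executor_cls=ThreadPoolExecutor):
    """Dispatch one independent binary training job per class label."""
    labels = sorted({y for _, y in examples})
    tasks = [(label, examples, epochs) for label in labels]
    with executor_cls() as pool:
        return dict(pool.map(train_binary_perceptron, tasks))

def predict(models, features):
    """Pick the class whose binary model gives the highest score."""
    def score(model):
        weights, bias = model
        return bias + sum(w * x for w, x in zip(weights, features))
    return max(models, key=lambda label: score(models[label]))

# Toy data: three linearly separable classes.
data = [([1.0, 0.0, 0.0], "a"), ([0.0, 1.0, 0.0], "b"), ([0.0, 0.0, 1.0], "c")]
models = train_one_vs_rest(data)
print([predict(models, x) for x, _ in data])  # ['a', 'b', 'c']
```

Because each binary job reads the training set but writes only its own model, the jobs need no coordination beyond collecting the results at the end.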
Method And System For Classifying Image Elements,
Tue Jul 19 16:05:42 EDT 2011
A method, system, and machine-readable medium for classifying an image element as one of a plurality of categories, including assigning the image element based on a ratio between an unoccluded perimeter of the image element and an occluded perimeter of the image element and coding the image element according to a coding scheme associated with the category to which the image element is classified. Exemplary applications include image compression, where categories include image foreground and background layers.
Method And Apparatus For Providing Fast Kernel Learning On Sparse Data,
Tue Jun 28 16:05:34 EDT 2011
A method and apparatus based on transposition to speed up learning computations on sparse data are disclosed. For example, the method receives a support vector comprising at least one feature represented by one non-zero entry. The method then identifies at least one column within a matrix with non-zero entries, wherein the at least one column is identified in accordance with the at least one feature of the support vector. The method then performs kernel computations using successive list merging on the at least one identified column of the matrix and the support vector to derive a result vector, wherein the result vector is used in a data learning function.
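The general idea of working on a transposed (column-indexed) sparse matrix can be sketched as below. This is an assumption-laden illustration, not the patented procedure: it computes linear-kernel values (dot products) between one sparse support vector and every row of a data matrix, visiting only the columns that match the support vector's non-zero features, and it accumulates into a dense result list rather than reproducing the "successive list merging" step. The names transpose_to_columns and dot_products and the toy data are hypothetical.

```python
from collections import defaultdict

def transpose_to_columns(rows):
    """Convert sparse rows ({feature: value}) into feature -> [(row_index, value)]."""
    columns = defaultdict(list)
    for i, row in enumerate(rows):
        for feature, value in row.items():
            columns[feature].append((i, value))
    return columns

def dot_products(columns, n_rows, support_vector):
    """Dot product of one sparse support vector with every row of the matrix.

    Only the columns for the support vector's non-zero features are touched,
    so the cost depends on the non-zeros in those columns, not on matrix size.
    """
    result = [0.0] * n_rows
    for feature, sv_value in support_vector.items():
        for i, value in columns.get(feature, ()):
            result[i] += sv_value * value
    return result

# Three sparse training rows over six possible features.
rows = [{0: 1.0, 3: 2.0}, {1: 4.0}, {3: 0.5, 5: 1.0}]
columns = transpose_to_columns(rows)

support_vector = {3: 2.0, 5: 1.0}
print(dot_products(columns, len(rows), support_vector))  # [4.0, 0.0, 2.0]
```

Row 1 shares no feature with the support vector, so its columns are never visited and its entry stays at zero.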
System And Method For Automatic Generation Of A Natural Language Understanding Model,
Tue Apr 26 16:05:03 EDT 2011
A system and method is provided for rapidly generating a new spoken dialog application. In one embodiment, a user experience person labels the transcribed data (e.g., 3000 utterances) using a set of interactive tools. The labeled data is then stored in a processed data database. During the labeling process, the user experience person not only groups utterances in various call type categories, but also flags (e.g., 100-200) specific utterances as positive and negative examples for use in an annotation guide. The labeled data in the processed data database can also be used to generate an initial natural language understanding (NLU) model.
System And Method Of Customizing Animated Entities For Use In A Multi-Media Communication Application,
Tue Apr 12 16:04:52 EDT 2011
In an embodiment, a method is provided for creating a personal animated entity for delivering a multi-media message from a sender to a recipient. An image file from the sender may be received by a server. The image file may include an image of an entity. The sender may be requested to provide input with respect to facial features of the image of the entity in preparation for animating the image of the entity. After the sender provides the input with respect to the facial features of the image of the entity, the image of the entity may be presented as a personal animated entity to the sender to preview. Upon approval of the preview from the sender, the image of the entity may be presented as a sender-selectable personal animated entity for delivering the multi-media message to the recipient.
On-Demand Language Translation For Television Programs,
Tue Oct 05 15:04:54 EDT 2010
In an embodiment, a method of providing an on demand translation service is provided. A subscriber may be charged a reduced fee or no fee for use of the on demand translation service in exchange for displaying commercial messages to the subscriber, the commercial messages being selected based on subscriber information. A multimedia signal including information in a source language may be received. The information may be obtained as text in the source language from the multimedia signal. The text may be translated from the source language to a target language. Translated information, based on the translated text, may be transmitted to a processing device for presentation to the subscriber. The received multimedia signal may be sent to a multimedia device for viewing.
Sequence Classification For Machine Translation,
Tue Aug 24 15:04:30 EDT 2010
Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.
Systems And Methods For Monitoring Speech Data Labelers,
Tue May 04 15:03:49 EDT 2010
Systems and methods for using an annotation guide to label utterances and speech data with a call type. A method embodiment monitors labelers of speech data by presenting via a processor a test utterance to a labeler, receiving input from the labeler that selects a particular call type from a list of call types and determining via the processor if the labeler labeled the test utterance correctly. Based on the determining step, the method performs at least one of the following: revising the annotation guide, retraining the labeler or altering the test utterance.
On-Demand Language Translation For Television Programs,
Tue May 04 15:03:47 EDT 2010
A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal. The video signal including the translated information in the target language may be sent to a display device.
Apparatus And Method Of Customizing Animated Entities For Use In Multi-media Communication Application,
Tue Mar 02 15:03:35 EST 2010
A method of creating an animated entity for delivering a multi-media message from a sender to a recipient comprises receiving from the sender an image file to a server, the image file having associated sender-assigned name, gender, category and indexing information. The server presents to the sender the image file and a group of generic face model templates. After the sender selects one of the generic face model templates, the server presents the image file and the selected model template to the sender and requests the sender to mark features on the image file. After the sender marks the image file, the server presents to the sender a preview of at least one expression associated with the marked image file. If the user does not accept the image file after the preview, the server presents again the image file and selected model template for the sender to redo or add marked features on the image file. If the user accepts the image file after the preview, the server presents the image file as an optional animated entity when the sender chooses an animated entity to deliver a multi-media message.
Method and Apparatus For Providing Fast Kernel Learning On Sparse Data,
Tue Feb 16 15:03:34 EST 2010
A method and apparatus based on transposition to speed up learning computations on sparse data are disclosed. For example, the method receives a support vector comprising at least one feature represented by one non-zero entry. The method then identifies at least one column within a matrix with non-zero entries, wherein the at least one column is identified in accordance with the at least one feature of the support vector. The method then performs kernel computations using successive list merging on the at least one identified column of the matrix and the support vector to derive a result vector, wherein the result vector is used in a data learning function.
Method And System For Classifying Image Elements,
Tue Feb 02 15:03:23 EST 2010
A method, system, and machine-readable medium for classifying an image element as one of a plurality of categories, including assigning the image element based on a ratio between an unoccluded perimeter of the image element and an occluded perimeter of the image element and coding the image element according to a coding scheme associated with the category to which the image element is classified. Exemplary applications include image compression, where categories include image foreground and background layers.
System And Method Of Customizing Animated Entities For Use In A Multi-Media Communication Application,
Tue Oct 27 16:08:07 EDT 2009
In an embodiment, a method is provided for creating a personal animated entity for delivering a multi-media message from a sender to a recipient. An image file from the sender may be received by a server. The image file may include an image of an entity. The sender may be requested to provide input with respect to facial features of the image of the entity in preparation for animating the image of the entity. After the sender provides the input with respect to the facial features of the image of the entity, the image of the entity may be presented as a personal animated entity to the sender to preview. Upon approval of the preview from the sender, the image of the entity may be presented as a sender-selectable personal animated entity for delivering the multi-media message to the recipient.
Systems And Methods For Generating An Annotation Guide,
Tue Jul 28 16:07:46 EDT 2009
Systems and methods for generating an annotation guide. Speech data is organized and presented to a user. After the user selects some of the utterances in the speech data, the selected utterances are included in a class and/or call type. Additional utterances that belong to the class and/or call type can be found in the speech data using relevance feedback, data mining, data clustering, support vector machines, and the like. After a call type is complete, it is committed to the annotation guide. After all call types are completed, the annotation guide is generated.
Methods To Distribute Multiclass Classification Learning On Several Processors,
Tue Jun 23 16:07:30 EDT 2009
The time taken to learn a model from training examples is often unacceptable. For instance, training language understanding models with Adaboost or SVMs can take weeks or longer based on numerous training examples. Parallelization through the use of multiple processors may improve learning speed. The invention describes effective methods to distribute multiclass classification learning on several processors. These methods are applicable to multiclass models where the training process may be split into training of independent binary classifiers.
System and method of customizing animated entities for use in a multi-media communication application,
Tue May 27 18:12:52 EDT 2008
In an embodiment, a method is provided for creating a personal animated entity for delivering a multi-media message from a sender to a recipient. An image file from the sender may be received by a server. The image file may include an image of an entity. The sender may be requested to provide input with respect to facial features of the image of the entity in preparation for animating the image of the entity. After the sender provides the input with respect to the facial features of the image of the entity, the image of the entity may be presented as a personal animated entity to the sender to preview. Upon approval of the preview from the sender, the image of the entity may be presented as a sender-selectable personal animated entity for delivering the multi-media message to the recipient.
Systems and methods for monitoring speech data labelers,
Tue Oct 09 18:12:17 EDT 2007
Systems and methods for monitoring labelers of speech data. To test or train labelers, a labeler is presented with utterances that have already been identified as belonging to a particular class or call type. The labeler is asked to assign a call type to the utterances. The performance of the labeler is measured by comparing the call types assigned by the labeler with the existing call types of the utterances. The performance of a labeler can also be monitored as the labeler labels speech data by occasionally having the labeler label an utterance that is already labeled and by storing the results.
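The performance measurement described above, comparing the call types a labeler assigns with the existing call types, can be sketched as a simple agreement rate. The scoring rule (plain accuracy over the utterances that have an existing label), the function name labeler_accuracy, and the utterance IDs and call types are all assumptions made for illustration; the patent abstract does not specify a metric.

```python
def labeler_accuracy(assigned, reference):
    """Fraction of test utterances whose assigned call type matches the existing one.

    assigned:  {utterance_id: call type chosen by the labeler}
    reference: {utterance_id: call type already on record}
    Only utterances present in both mappings are scored.
    """
    scored = [u for u in assigned if u in reference]
    if not scored:
        raise ValueError("no test utterances with an existing call type")
    correct = sum(assigned[u] == reference[u] for u in scored)
    return correct / len(scored)

# Hypothetical pre-labeled test utterances and one labeler's answers.
reference = {"u1": "billing", "u2": "cancel", "u3": "billing", "u4": "upgrade"}
assigned = {"u1": "billing", "u2": "billing", "u3": "billing", "u4": "upgrade"}

print(labeler_accuracy(assigned, reference))  # 0.75
```

Storing this score each time a pre-labeled utterance is slipped into the labeler's queue would give the running record of performance that the abstract mentions.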
Systems and methods for generating an annotation guide,
Tue May 15 18:12:01 EDT 2007
Systems and methods for generating an annotation guide. Speech data is organized and presented to a user. After the user selects some of the utterances in the speech data, the selected utterances are included in a class and/or call type. Additional utterances that belong to the class and/or call type can be found in the speech data using relevance feedback, data mining, data clustering, support vector machines, and the like. After a call type is complete, it is committed to the annotation guide. After all call types are completed, the annotation guide is generated.
System and method of customizing animated entities for use in a multi-media communication application,
Tue Aug 15 18:11:29 EDT 2006
A method of creating a personal animated entity for delivering a multi-media message from a sender to a recipient is disclosed. The method comprises receiving from the sender an image file at a server, the image file having an entity and a background image. The server presents to the sender the image file and requests the sender to mark features on the image file. After the sender marks the image file, the server presents to the sender the image file as an optional animated entity when the sender chooses an animated entity to deliver a multi-media message. If the sender selects the image file for delivering the multi-media message, the server delivers the multi-media message using the personal animated entity in the context of the background image of the image file. Extrapolation is used to fill in background voids created by the movement of the personal animated entity in the context of the background image.
Method And System For Classifying Image Elements,
Tue May 31 18:10:23 EDT 2005
A method, system, and machine-readable medium for classifying an image element as one of a plurality of categories, including assigning the image element based on a ratio between an unoccluded perimeter of the image element and an occluded perimeter of the image element and coding the image element according to a coding scheme associated with the category to which the image element is classified. Exemplary applications include image compression, where categories include image foreground and background layers.