
180 Park Ave - Building 103
Florham Park, NJ
Two of a Kind or The Ratings Game? Adaptive Pairwise Preferences and Latent Factor Models
Suhrid Balakrishnan, Sumit Chopra
Frontiers of Computer Science Journal,
2012.
[PDF]
[BIB]
Higher Education Press Copyright
The definitive version was published in Frontiers of Computer Science Journal, Issue 2, 2012-04-01, http://journal.hep.com.cn/computer/
{Latent factor models have become a workhorse for a large number of recommender systems. While these systems are built using ratings data, which is typically assumed static, the ability to incorporate different kinds of subsequent user feedback is an important asset. For instance, the user might want to provide additional information to the system in order to improve his personal recommendations. To this end, we examine a novel scheme for efficiently learning (or refining) user parameters from such feedback. We propose a scheme where users are presented with a sequence of pairwise preference questions: "Do you prefer item A over B?" User parameters are updated based on their response, and subsequent questions are chosen adaptively after incorporating the feedback. We operate in a Bayesian framework and the choice of questions is based on an information gain criterion. We validate the scheme on the Netflix movie ratings data set and a proprietary television viewership data set. A user study and automated experiments validate our findings.}

TapPrints: Your Finger Taps have Fingerprints
Emiliano Miluzzo, Alexander Varshavsky, Suhrid Balakrishnan, Romit Roy Choudhury
The 10th International Conference on Mobile Systems, Applications and Services (MobiSys 2012),
2012.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2012. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in The 10th International Conference on Mobile Systems, Applications and Services (MobiSys 2012), 2012-06-26.
{This paper shows that the location of screen taps on modern smartphones and tablets can be identified from accelerometer and gyroscope readings. Our findings have serious implications, as we demonstrate that an attacker can launch a background process on commodity smartphones and tablets, and silently monitor the user's inputs, such as keyboard presses and icon taps. While precise tap detection is non-trivial, requiring machine learning algorithms to identify fingerprints of closely spaced keys, sensitive sensors on modern devices aid the process. We present TapPrints, a framework for inferring the location of taps on mobile device touchscreens using motion sensor data combined with machine learning analysis. By running tests on two different off-the-shelf smartphones and a tablet computer, we show that identifying tap locations on the screen and inferring English letters can be done with up to 90% and 80% accuracy, respectively. By optimizing the core tap detection capability with additional information, such as contextual priors, we are able to further magnify the core threat.}

Computational Television Advertising
Suhrid Balakrishnan, Sumit Chopra, David Applegate, Simon Urbanek
IEEE International Conference on Data Mining,
2012.
[PDF]
[BIB]
IEEE Copyright
This version of the work is reprinted here with permission of IEEE for your personal use. Not for redistribution. The definitive version was published in IEEE International Conference on Data Mining, 2012-12-12.
{Ever wonder why that Kia ad ran during Iron Chef? While advertising on television is still a robust business, providing a fascinating mix of marketing, branding, predictive modeling and measurements, it is at risk with the recent emergence of online television. Traditional methods used to generate advertising campaigns on television do not come close, in terms of efficiency, to the highly sophisticated computational techniques being used in the online world. This paper is an attempt to recast the process of television advertising media campaign generation in a computational framework. We describe efficient mathematical approaches to solve the task of finding optimal campaigns for specific target audiences. We highlight the efficacy of our proposed methods and compare them, using two case studies, against campaigns generated by traditional methods.}
Collaborative Ranking
Suhrid Balakrishnan, Sumit Chopra
Fifth International Conference on Web Search and Data Mining,
2012.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Fifth International Conference on Web Search and Data Mining, 2012-02-08.
{Typical recommender systems use the root mean squared error (RMSE)
between the predicted and actual ratings as the evaluation metric.
We argue that RMSE is not an optimal choice for this task,
especially when we will only recommend a few (top) items to any user.
Instead, we propose using a ranking metric, namely normalized discounted
cumulative gain (NDCG), as a better evaluation metric for this task.
Borrowing ideas from the learning to rank community for web search,
we propose novel models which approximately optimize NDCG for the recommendation task.
Our models are essentially variations on matrix factorization models where we
additionally learn the features associated with the users and the items for the ranking task.
Experimental results on a number of standard collaborative filtering data sets validate our claims.
The results also show the accuracy and efficiency of our models and the benefits
of learning features for ranking. }
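
The contrast between RMSE and NDCG drawn in the abstract above can be illustrated with a small sketch. This is not the authors' code or their models. It assumes the common exponential-gain form of DCG (gain 2^rel - 1, discount log2(rank + 1)), and the ratings and predictions are made-up toy values. It shows a predictor with lower RMSE that nevertheless ranks the best item second, and so scores worse on NDCG.

```python
import math


def dcg_at_k(relevances, k):
    """DCG over the first k relevances, given in ranked order.

    Assumes the exponential-gain variant: (2^rel - 1) / log2(rank + 1),
    with ranks starting at 1.
    """
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(true_ratings, predicted_scores, k):
    """NDCG@k: DCG of the predicted ordering divided by the ideal DCG."""
    order = sorted(range(len(true_ratings)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_ratings[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(true_ratings, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def rmse(true_ratings, predicted_scores):
    """Root mean squared error between true ratings and predictions."""
    n = len(true_ratings)
    return math.sqrt(sum((t - p) ** 2
                         for t, p in zip(true_ratings, predicted_scores)) / n)


# Toy data (hypothetical): three items rated 5, 3 and 1 by one user.
true = [5, 3, 1]
pred_a = [3.9, 4.0, 1.0]  # close in value, but puts the 3-star item first
pred_b = [2.5, 1.5, 0.5]  # far off in value, but orders the items correctly

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(name, "RMSE:", round(rmse(true, pred), 3),
          "NDCG@1:", round(ndcg_at_k(true, pred, 1), 3))
```

Predictor A has the lower RMSE but the lower NDCG@1, which is the situation the abstract argues matters when only a few top items are shown to the user.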

Combining Predictors for Recommending Music: the False Positives' approach to KDD Cup track 2
Suhrid Balakrishnan, Rensheng Wang, Carlos Scheidegger, Angus Maclellan, Yifan Hu, Aaron Archer, David Applegate, Shankar Krishnan, Guang Ma, Siu Au
KDD CUP 2011 workshop,
2011.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in KDD CUP 2011 workshop, 2011-08-21.
{We describe our solution for the KDD Cup 2011 track 2 challenge. Our solution relies heavily on ensembling together diverse individual models for the prediction task, and achieved a final leaderboard misclassification rate of 3.8863%. This paper provides details on both the modeling and ensemble creation steps.}
Two of a Kind or The Ratings Game? Adaptive Pairwise Preferences and Latent Factor Models
Sumit Chopra, Suhrid Balakrishnan
IEEE ICDM conference, 2010,
2010.
[BIB]
{Latent factor models have become a workhorse for a large number of recommender systems. While these systems are built using ratings data, which is typically assumed static, the ability to incorporate different kinds of subsequent user feedback is an important asset. For instance, the user might want to provide additional information to the system in order to improve his personal recommendations. To this end, we examine a novel scheme for efficiently learning (or refining) user parameters from such feedback. We propose a scheme where users are presented with a sequence of pairwise preference questions: "Do you prefer item A over B?" User parameters are updated based on their response, and subsequent questions are chosen adaptively after incorporating the feedback. We operate in a Bayesian framework and the choice of questions is based on an information gain criterion. We validate the scheme on the Netflix movie ratings data set and a proprietary television viewership data set. A user study and automated experiments validate our findings.}

The Business Next Door: Click-Through Rate Modeling for Local Search
Suhrid Balakrishnan, Sumit Chopra, Dan Melamed
NIPS 2010 Workshop: Machine Learning in Online Advertising,
2010.
[PDF]
[BIB]
MIT Press, Neural Information Processing Systems (NIPS) Copyright
The definitive version was published in NIPS 2010 Workshop: Machine Learning in Online Advertising, 2010-12-10, http://nips.cc/
{Computational advertising has recently received a tremendous amount of attention from the business and academic communities. While great advances have been made in modeling click-through rate in well-studied settings like sponsored search and context match, local search has received relatively little attention. The geographic nature of local search and the associated local browsing create interesting research challenges and opportunities. We consider a novel application of a relational regression model to local search. The model is attractive in that it allows us to explicitly control and represent geographic and category-based neighborhood-style constraints on the samples that result in superior click-through rate estimates. Further, the relational regression model we fit allows us to estimate an interpretable inherent 'quality' of a business listing, which we demonstrate reveals interesting latent information about listings and is also useful for further analysis.}

Feature-Rich Continuous Language Models for Speech Recognition
Sumit Chopra, Piotr Mirowski, Suhrid Balakrishnan, Srinivas Bangalore
IEEE Workshop on Spoken Language Technology,
2010.
[BIB]
{State-of-the-art probabilistic models of text such as n-grams require an exponential number of examples as the size of the context grows, a problem that is due to the discrete word representation. We propose to solve this problem by learning a continuous-valued and low-dimensional mapping of words, and base our predictions for the probabilities of the target word on non-linear dynamics of the latent space representation of the words in the context window. We build on neural network-based language models; by expressing them as energy-based models, we can further enrich the models with additional inputs such as part-of-speech tags, topic information and graphs of word similarity. We demonstrate a significantly lower perplexity on different text corpora, as well as an improved word accuracy rate on speech recognition tasks, as compared to Kneser-Ney back-off n-gram-based language models.}
Reinforcement Learning for Dialog Management using Least-Squares Policy Iteration and Fast Feature Selection
Lihong Li, Suhrid Balakrishnan, Jason Williams
Proc Interspeech, Brighton, United Kingdom,
2009.
[PDF]
[BIB]
Estimating probability of correctness for ASR N-Best lists
Jason Williams, Suhrid Balakrishnan
Proc SIGDIAL, London, United Kingdom,
2009.
[PDF]
[BIB]
System And Method For Automatically Generating A Dialog Manager,
Tue Apr 30 17:26:11 EDT 2013
Disclosed herein are systems, methods, and computer-readable storage media for automatically generating a dialog manager for use in a spoken dialog system. A system practicing the method receives a set of user interactions having features, identifies an initial policy, evaluates all of the features in a linear evaluation step of the algorithm to identify a set of most important features, performs a cubic policy improvement step on the identified set of most important features, repeats the previous two steps one or more times, and generates a dialog manager for use in a spoken dialog system based on the resulting policy and/or set of most important features. Evaluating all of the features can include estimating a weight for each feature which indicates how much each feature contributes to at least one of the identified policies. The system can ignore features not in the set of most important features.