
Two of a Kind or The Ratings Game? Adaptive Pairwise Preferences and Latent Factor Models
Suhrid Balakrishnan, Sumit Chopra
Frontiers of Computer Science Journal,
2012.
[PDF]
[BIB]
Higher Education Press Copyright
The definitive version was published in Frontiers of Computer Science Journal, Issue 2, 2012-04-01, http://journal.hep.com.cn/computer/
{Latent factor models have become a workhorse for a large number of recommender systems. While these systems are built using ratings data, which is typically assumed static, the ability to incorporate different kinds of subsequent user feedback is an important asset. For instance, the user might want to provide additional information to the system in order to improve his personal recommendations. To this end, we examine a novel scheme for efficiently learning (or refining) user parameters from such feedback. We propose a scheme where users are presented with a sequence of pairwise preference questions: "Do you prefer item A over B?". User parameters are updated based on their response, and subsequent questions are chosen adaptively after incorporating the feedback. We operate in a Bayesian framework and the choice of questions is based on an information gain criterion. We validate the scheme on the Netflix movie ratings data set and a proprietary television viewership data set. A user study and automated experiments validate our findings.}
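The sketch below illustrates the kind of information-gain question selection the abstract describes. It is a minimal illustration, not the paper's implementation: it assumes a sample-based approximation of the posterior over a user's latent factor vector, pre-learned item factors, and a logistic (Bradley-Terry style) response model; all names and dimensions are made up.

import numpy as np

def response_prob(user_samples, item_a, item_b):
    """P(user prefers A over B) per posterior sample, via a logistic link
    on the difference of predicted scores."""
    diff = user_samples @ (item_a - item_b)          # (n_samples,)
    return 1.0 / (1.0 + np.exp(-diff))

def entropy(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def information_gain(user_samples, item_a, item_b):
    """Mutual information between the binary answer and the user vector:
    entropy of the mean response minus the mean per-sample response entropy."""
    p = response_prob(user_samples, item_a, item_b)
    return entropy(p.mean()) - entropy(p).mean()

def pick_next_question(user_samples, item_factors, candidate_pairs):
    """Return the candidate (a, b) item-index pair with the largest expected gain."""
    gains = [information_gain(user_samples, item_factors[a], item_factors[b])
             for a, b in candidate_pairs]
    return candidate_pairs[int(np.argmax(gains))]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    user_samples = rng.normal(size=(500, 8))         # hypothetical posterior samples over user factors
    item_factors = rng.normal(size=(50, 8))          # hypothetical learned item factors
    pairs = [(i, j) for i in range(10) for j in range(i + 1, 10)]
    print("ask about pair:", pick_next_question(user_samples, item_factors, pairs))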

Computational Television Advertising
Suhrid Balakrishnan, Sumit Chopra, David Applegate, Simon Urbanek
IEEE International Conference on Data Mining,
2012.
[PDF]
[BIB]
IEEE Copyright
This version of the work is reprinted here with permission of IEEE for your personal use. Not for redistribution. The definitive version was published in the 2012 IEEE International Conference on Data Mining, 2012-12-12.
{Ever wonder why that Kia ad ran during Iron Chef? While advertising on television is still a robust business, providing a fascinating mix of marketing, branding, predictive modeling and measurements, it is at risk with the recent emergence of online television. Traditional methods used to generate advertising campaigns on television do not come close, in terms of efficiency, to the highly sophisticated computational techniques used in the online world. This paper is an attempt to recast the process of generating television advertising media campaigns in a computational framework. We describe efficient mathematical approaches to the task of finding optimal campaigns for specific target audiences. We highlight the efficacy of our proposed methods and, in two case studies, compare them against campaigns generated by traditional methods.}
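As a toy illustration only (not the optimization formulation in the paper), the sketch below greedily assembles a campaign by choosing ad spots with the best target-audience impressions per dollar under a fixed budget; the spot names, costs, and audience figures are invented.

from dataclasses import dataclass

@dataclass
class Spot:
    name: str
    cost: float                 # price of airing once
    target_impressions: float   # expected impressions within the target audience

def greedy_campaign(spots, budget):
    """Pick spots in decreasing impressions-per-dollar order until the budget runs out."""
    chosen, spent = [], 0.0
    for s in sorted(spots, key=lambda s: s.target_impressions / s.cost, reverse=True):
        if spent + s.cost <= budget:
            chosen.append(s)
            spent += s.cost
    return chosen, spent

if __name__ == "__main__":
    spots = [Spot("Iron Chef 8pm", 12000, 90000),
             Spot("Late Night Movie", 4000, 20000),
             Spot("Morning News", 6000, 45000)]
    plan, spent = greedy_campaign(spots, budget=15000)
    print([s.name for s in plan], spent)
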
Collaborative Ranking
Suhrid Balakrishnan, Sumit Chopra
Fifth International Conference on Web Search and Data Mining,
2012.
[PDF]
[BIB]
ACM Copyright
(c) ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in the Fifth International Conference on Web Search and Data Mining, 2012-02-08.
{Typical recommender systems use the root mean squared error (RMSE) between the predicted and actual ratings as the evaluation metric. We argue that RMSE is not an optimal choice for this task, especially when only a few (top) items will be recommended to any user. Instead, we propose using a ranking metric, namely normalized discounted cumulative gain (NDCG), as a better evaluation metric for this task. Borrowing ideas from the learning-to-rank community for web search, we propose novel models that approximately optimize NDCG for the recommendation task. Our models are essentially variations on matrix factorization models in which we additionally learn the features associated with the users and the items for the ranking task. Experimental results on a number of standard collaborative filtering data sets validate our claims. The results also show the accuracy and efficiency of our models and the benefits of learning features for ranking.}
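The following sketch shows the generic NDCG@k evaluation the abstract argues for, assuming graded relevance equal to the held-out rating; it is a standard formulation for illustration, not necessarily the exact variant or models used in the paper.

import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of a ranked list, truncated at position k."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))   # positions 1..k -> log2(2..k+1)
    return float(np.sum((2 ** rel - 1) / discounts))

def ndcg_at_k(predicted_scores, true_ratings, k=10):
    """Rank items by predicted score, then normalize DCG by the ideal ordering."""
    order = np.argsort(predicted_scores)[::-1]
    dcg = dcg_at_k(np.asarray(true_ratings)[order], k)
    ideal = dcg_at_k(np.sort(true_ratings)[::-1], k)
    return dcg / ideal if ideal > 0 else 0.0

if __name__ == "__main__":
    scores = [0.9, 0.2, 0.7, 0.4]       # a model's predicted scores for a user's items
    ratings = [5, 1, 3, 4]              # held-out ratings for the same items
    print(round(ndcg_at_k(scores, ratings, k=3), 4))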

Non-Linear Tagging Models with Localist and Distributed Word Representations
Sumit Chopra, Srinivas Bangalore
The 36th International Conference on Acoustics, Speech and Signal Processing,
2011.
[BIB]
{Distributed representations of words are attractive since they provide a means for measuring word similarity. However, most approaches to learning distributed representations are divorced from the task context. In this paper, we describe a model that learns distributed representations of words in order to optimize task performance. We investigate this model for part-of-speech tagging and supertagging tasks and demonstrate its superior accuracy over localist models, especially for rare words. We also show that adding non-linearity in the model aids in improved accuracy for complex tasks such as supertagging. }
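Below is a minimal forward-pass sketch of the kind of tagger the abstract describes: each word in a context window is mapped to a learned low-dimensional embedding, the concatenation passes through a non-linearity, and a softmax produces tag probabilities. All dimensions and parameters here are random placeholders and the training loop is omitted; this is illustrative, not the paper's model.

import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, window, hidden_dim, n_tags = 1000, 32, 3, 64, 45

E = rng.normal(scale=0.1, size=(vocab_size, embed_dim))           # distributed word representations
W1 = rng.normal(scale=0.1, size=(window * embed_dim, hidden_dim))
b1 = np.zeros(hidden_dim)
W2 = rng.normal(scale=0.1, size=(hidden_dim, n_tags))
b2 = np.zeros(n_tags)

def tag_probs(word_ids):
    """word_ids: indices of the words in the context window (length = window)."""
    x = E[word_ids].reshape(-1)            # concatenate the window's embeddings
    h = np.tanh(x @ W1 + b1)               # the non-linearity highlighted in the abstract
    logits = h @ W2 + b2
    z = np.exp(logits - logits.max())
    return z / z.sum()

print(tag_probs([12, 407, 3]).shape)       # -> (45,) distribution over tags
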
Combining Frame and Segment Level Processing via Temporal Pooling for Phonetic Classification
Sumit Chopra, Patrick Haffner, Dimitrios Dimitriadis
12th Annual Conference of the International Speech Communication Association,
2011.
[PDF]
[BIB]
International Speech Communication Association Copyright
The definitive version was published in the 12th Annual Conference of the International Speech Communication Association, 2011-08-27.
{We propose a simple, yet novel, multi-layer model for the problem of phonetic classification. Our model combines the frame level transformation of the acoustic signal with the segment level transformation via a temporal pooling architecture to compute class conditional probabilities of phones. Without the use of any phonetic knowledge, our model achieved the state-of-the-art performance on the TIMIT phone classification task. The flexibility of our model allows us to mix a variety of pooling architectures, leading to further significant performance improvements.}
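The sketch below illustrates the general frame-then-segment idea: a shared frame-level transformation is applied to every acoustic frame in a phone segment, the outputs are pooled over time, and a classifier acts on the pooled vector. Parameters are random placeholders; this is not the paper's trained architecture.

import numpy as np

rng = np.random.default_rng(0)
frame_dim, hidden_dim, n_phones = 39, 128, 48              # e.g. MFCC-sized frames

Wf = rng.normal(scale=0.1, size=(frame_dim, hidden_dim))   # frame-level transformation
Wc = rng.normal(scale=0.1, size=(hidden_dim, n_phones))    # segment-level classifier

def classify_segment(frames, pooling="mean"):
    """frames: (n_frames, frame_dim) array for one phone segment."""
    h = np.tanh(frames @ Wf)                                # per-frame transformation
    pooled = h.mean(axis=0) if pooling == "mean" else h.max(axis=0)  # temporal pooling
    logits = pooled @ Wc
    z = np.exp(logits - logits.max())
    return z / z.sum()                                      # class conditional probabilities

segment = rng.normal(size=(17, frame_dim))                  # a 17-frame segment
print(classify_segment(segment).argmax())
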
Two of a Kind or The Ratings Game? Adaptive Pairwise Preferences and Latent Factor Models
Sumit Chopra, Suhrid Balakrishnan
IEEE International Conference on Data Mining,
2010.
[BIB]
{Latent factor models have become a workhorse for a large number of recommender systems. While these systems are built using ratings data, which is typically assumed static, the ability to incorporate different kinds of subsequent user feedback is an important asset. For instance, the user might want to provide additional information to the system in order to improve his personal recommendations. To this end, we examine a novel scheme for efficiently learning (or refining) user parameters from such feedback. We propose a scheme where users are presented with a sequence of pairwise preference questions: "Do you prefer item A over B?". User parameters are updated based on their response, and subsequent questions are chosen adaptively after incorporating the feedback. We operate in a Bayesian framework and the choice of questions is based on an information gain criterion. We validate the scheme on the Netflix movie ratings data set and a proprietary television viewership data set. A user study and automated experiments validate our findings.}

The Business Next Door: Click-Through Rate Modeling for Local Search
Suhrid Balakrishnan, Sumit Chopra, Dan Melamed
NIPS 2010 Workshop: Machine Learning in Online Advertising,
2010.
[PDF]
[BIB]
MIT Press, Neural Information Processing Systems (NIPS) Copyright
The definitive version was published in the NIPS 2010 Workshop: Machine Learning in Online Advertising, 2010-12-10, http://nips.cc/
{Computational advertising has received a tremendous amount of attention from the business and academic communities recently. While great advances have been made in modeling click-through rate in well-studied settings like sponsored search and context match, local search has received relatively less attention. The geographic nature of local search and the associated local browsing present interesting research challenges and opportunities. We consider a novel application of a relational regression model to local search. The model is attractive in that it allows us to explicitly control and represent geographic and category-based neighborhood-style constraints on the samples, which results in superior click-through rate estimates. Further, the relational regression model we fit allows us to estimate an interpretable inherent 'quality' of a business listing, which we demonstrate reveals interesting latent information about listings and is also useful for further analysis.}
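A small sketch of the flavor of relational regression described above: each listing's click-through rate is modeled as features times weights plus a per-listing "quality" term, and a graph penalty pulls the quality of neighboring listings (same area or category) toward each other. This is an illustrative least-squares formulation with made-up data, not the exact model in the paper.

import numpy as np

def fit_relational_regression(X, y, edges, lam_q=1.0, lam_w=0.1):
    """X: (n, d) listing features; y: (n,) observed CTRs;
    edges: list of (i, j) neighbor pairs; returns (weights, per-listing quality)."""
    n, d = X.shape
    # Graph Laplacian for the neighborhood smoothness penalty on q.
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, i] += 1; L[j, j] += 1
        L[i, j] -= 1; L[j, i] -= 1
    # Joint normal equations for [w; q] minimizing
    #   ||y - Xw - q||^2 + lam_w ||w||^2 + lam_q q^T L q
    A = np.block([[X.T @ X + lam_w * np.eye(d), X.T],
                  [X,                           np.eye(n) + lam_q * L]])
    b = np.concatenate([X.T @ y, y])
    sol = np.linalg.solve(A, b)
    return sol[:d], sol[d:]

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3)); y = rng.uniform(0.01, 0.1, size=6)
w, q = fit_relational_regression(X, y, edges=[(0, 1), (1, 2), (3, 4)])
print(q.round(3))                      # inferred per-listing quality terms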

Feature-Rich Continuous Language Models for Speech Recognition
Sumit Chopra, Piotr Mirowski, Suhrid Balakrishnan, Srinivas Bangalore
IEEE Workshop on Spoken Language Technology,
2010.
[BIB]
{State-of-the-art probabilistic models of text such as n-grams require an exponential number of examples as the size of the context grows, a problem that is due to the discrete word representation. We propose to solve this problem by learning a continuous-valued and low-dimensional mapping of words, and base our predictions for the probabilities of the target word on non-linear dynamics of the latent space representation of the words in the context window. We build on neural network-based language models; by expressing them as energy-based models, we can further enrich them with additional inputs such as part-of-speech tags, topic information and graphs of word similarity. We demonstrate significantly lower perplexity on different text corpora, as well as improved word accuracy rate on speech recognition tasks, as compared to Kneser-Ney back-off n-gram language models.}
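The sketch below illustrates a generic continuous-space language model of the type described above: context words are mapped to learned embeddings, a non-linear layer predicts a representation of the next word, and a softmax over output embeddings yields word probabilities; an extra input (part-of-speech tags) is simply concatenated as an additional feature. All parameters are random placeholders and this is not the paper's energy-based model or training procedure.

import numpy as np

rng = np.random.default_rng(0)
vocab, dim, ctx, n_pos = 5000, 64, 4, 40

E_in = rng.normal(scale=0.1, size=(vocab, dim))            # input word embeddings
E_pos = rng.normal(scale=0.1, size=(n_pos, 8))             # extra feature: POS-tag embeddings
W = rng.normal(scale=0.1, size=(ctx * (dim + 8), dim))
E_out = rng.normal(scale=0.1, size=(vocab, dim))           # output word embeddings

def next_word_probs(word_ids, pos_ids):
    """word_ids, pos_ids: the ctx previous words and their POS tags."""
    x = np.concatenate([np.concatenate([E_in[w], E_pos[p]])
                        for w, p in zip(word_ids, pos_ids)])
    h = np.tanh(x @ W)                                      # predicted next-word representation
    logits = E_out @ h                                      # one score (negative energy) per word
    z = np.exp(logits - logits.max())
    return z / z.sum()

p = next_word_probs([10, 42, 7, 99], [3, 12, 3, 5])
print(p.shape, float(p.sum()))                              # (5000,) 1.0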