Algorithmically matching items to users in a given context is essential to the success and profitability of large-scale recommender systems in areas such as content optimization, computational advertising, web search, shopping, and movie recommendation. A key statistical problem underlying such systems is estimating the response rates of rare events (e.g., click rates and buy rates) when users interact with items. This is a very high-dimensional estimation problem because the data arise from interactions among several heavy-tailed categorical variables. In this talk, I will discuss statistical techniques based on large-scale multi-level hierarchical models, some of which have been deployed and are successfully recommending articles and ads to users on Yahoo! websites. The methods described are based on reduced-rank logistic regression, probabilistic matrix factorization, supervised Latent Dirichlet Allocation, and multi-hierarchy smoothing.
Deepak Agarwal is a statistician at Yahoo! who is interested in developing statistical and machine learning methods to enhance the performance of large-scale recommender systems. Deepak and his collaborators significantly improved article recommendation on several Yahoo! websites, most notably the Yahoo! front page. He also works closely with teams in computational advertising, yet another large-scale recommender system. He serves as associate editor for the Journal of the American Statistical Association and has received four best paper awards.