DOI: 10.1145/2020408.2020433
research-article

Scalable distributed inference of dynamic user interests for behavioral targeting

Published: 21 August 2011

Abstract

Historical user activity is key for building user profiles to predict user behavior and affinities in many web applications, such as targeting of online advertising, content personalization, and social recommendations. User profiles are temporal, and changes in a user's activity patterns are particularly useful for improved prediction and recommendation. For instance, an increased interest in car-related web pages may well suggest that the user is shopping for a new vehicle.
In this paper we present a comprehensive statistical framework for user profiling based on topic models, which is able to capture such effects in a fully unsupervised fashion. Our method models the topical interests of a user dynamically: both the user's association with the topics and the topics themselves are allowed to vary over time, ensuring that the profiles remain current.
We describe a streaming, distributed inference algorithm which is able to handle tens of millions of users. Our results show that our model contributes towards improved behavioral targeting of display advertising relative to baseline models that do not incorporate topical and/or temporal dependencies. As a side effect, our model yields human-understandable results which can be used in an intuitive fashion by advertisers.

Published In

KDD '11: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
August 2011
1446 pages
ISBN: 9781450308137
DOI: 10.1145/2020408

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. computational advertising
2. distributed inference
3. large-scale
4. online inference
5. user modeling

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)
