DOI: 10.1145/2623330.2623663

Leveraging user libraries to bootstrap collaborative filtering

Published: 24 August 2014

Abstract

We introduce a novel graphical model, the collaborative score topic model (CSTM), for personal recommendations of textual documents. CSTM's chief novelty lies in its learned model of individual libraries, or sets of documents, associated with each user. Overall, CSTM is a joint directed probabilistic model of user-item scores (ratings), and the textual side information in the user libraries and the items. Creating a generative description of scores and the text allows CSTM to perform well in a wide variety of data regimes, smoothly combining the side information with observed ratings as the number of ratings available for a given user ranges from none to many. Experiments on real-world datasets demonstrate CSTM's performance. We further demonstrate its utility in an application for personal recommendations of posters which we deployed at the NIPS 2013 conference.
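
The abstract's idea of moving smoothly from text-only to rating-driven recommendations can be illustrated with a small sketch. This is not CSTM, which is a joint probabilistic model whose details are not given on this page. It is a simple interpolation under stated assumptions: the function names, the pseudo-count `k`, the use of cosine similarity between bag-of-words vectors, and the premise that both scores lie on a common [0, 1] scale are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def blended_score(library_words, item_words, rating_based_score, n_ratings, k=5.0):
    """Interpolate between a text-only score and a rating-based score.

    k is a hypothetical pseudo-count (not from the paper): with n_ratings == 0
    the text score is used alone, and as n_ratings grows the rating-based
    score dominates.
    """
    text_score = cosine(Counter(library_words), Counter(item_words))
    weight = n_ratings / (n_ratings + k)
    return weight * rating_based_score + (1.0 - weight) * text_score


# Toy data: words from a user's library and from a candidate item.
library = "topic model variational inference topic".split()
item = "topic model recommendation".split()

cold = blended_score(library, item, rating_based_score=0.9, n_ratings=0)
warm = blended_score(library, item, rating_based_score=0.9, n_ratings=95)
```

With no ratings, `cold` equals the text similarity alone. With 95 ratings and `k = 5`, `warm` places 95% of the weight on the rating-based score, so a user with no rating history still receives a usable recommendation from library text.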

Supplementary Material

MP4 File (p173-sidebyside.mp4)

Published In

KDD '14: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2014
2028 pages
ISBN:9781450329569
DOI:10.1145/2623330
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cold start
  2. collaborative filtering
  3. document recommendations
  4. side information
  5. topic modeling

Qualifiers

  • Research-article

Conference

KDD '14

Acceptance Rates

KDD '14 Paper Acceptance Rate 151 of 1,036 submissions, 15%;
Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
