DOI: 10.1145/3383313.3418486

Tuning Word2vec for Large Scale Recommendation Systems

Published: 22 September 2020

Abstract

Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP) and is now applied in multiple domains, including recommender systems, forecasting, and network analysis. As Word2vec is often used off the shelf, we address the question of whether the default hyperparameters are suitable for recommender systems. The answer is emphatically no. In this paper, we first elucidate the importance of hyperparameter optimization and show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters. However, unconstrained optimization leads to hyperparameter settings that are very expensive and not feasible for large-scale recommendation tasks. To address this, we demonstrate a 138% average improvement in hit rate with runtime budget-constrained hyperparameter optimization. Furthermore, to make hyperparameter optimization applicable to large-scale recommendation problems where the target dataset is too large to search over, we investigate generalizing hyperparameter settings from samples. We show that applying constrained hyperparameter optimization using only a 10% sample of the data still yields a 91% average improvement in hit rate over the default parameters when applied to the full datasets. Finally, we apply hyperparameters learned using our method of constrained optimization on a sample to the Who To Follow recommendation service at Twitter and are able to increase follow rates by 15%.
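
The workflow summarized in the abstract (search on a sample, reject configurations that exceed a runtime budget, compare hit rate against a baseline) can be illustrated with a minimal sketch. This is not the authors' implementation: plain random search stands in for whatever optimizer the paper uses, and `train_and_eval`, `sample_fraction`, the search-space names, and all default values are hypothetical placeholders chosen for illustration.

```python
import random
import time


def hit_rate_at_k(recommendations, held_out, k=10):
    """Fraction of test cases whose held-out item appears in the top-k list."""
    if not held_out:
        return 0.0
    hits = sum(1 for recs, target in zip(recommendations, held_out)
               if target in recs[:k])
    return hits / len(held_out)


def relative_improvement(tuned, baseline):
    """Percent improvement of `tuned` over `baseline` (221.0 means +221%)."""
    return 100.0 * (tuned - baseline) / baseline


def sample_fraction(sessions, fraction=0.1, seed=0):
    """Uniformly sample a fraction of the sessions to search over (assumed scheme)."""
    rng = random.Random(seed)
    n = max(1, int(len(sessions) * fraction))
    return rng.sample(sessions, n)


def constrained_random_search(train_and_eval, space, budget_seconds,
                              n_trials=20, seed=0):
    """Random search that discards configurations whose training exceeds the budget.

    `train_and_eval(params)` is a hypothetical callback that trains a model and
    returns its hit rate; `space` maps each hyperparameter name to candidate values.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        start = time.perf_counter()
        score = train_and_eval(params)
        elapsed = time.perf_counter() - start
        if elapsed > budget_seconds:
            continue  # violates the runtime constraint, so it cannot be selected
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In use, `train_and_eval` would train Word2vec on the output of `sample_fraction` and score it with `hit_rate_at_k`; the selected parameters would then be used to retrain on the full dataset, and `relative_improvement` would compare the resulting hit rate with the one obtained under default parameters.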

Published In

RecSys '20: Proceedings of the 14th ACM Conference on Recommender Systems
September 2020
796 pages
ISBN: 9781450375832
DOI: 10.1145/3383313
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Embeddings
  2. Hyperparameter Optimization
  3. Neural Networks
  4. Recommender System Evaluation

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Conference

RecSys '20: Fourteenth ACM Conference on Recommender Systems
September 22 - 26, 2020
Virtual Event, Brazil

Acceptance Rates

Overall Acceptance Rate 254 of 1,295 submissions, 20%
