extended-abstract

Tuning Word2vec for Large Scale Recommendation Systems

Authors:

Benjamin P. Chamberlain,

Emanuele Rossi,

Suvash Sedhain,

Michael M. Bronstein

RecSys '20: Proceedings of the 14th ACM Conference on Recommender Systems

Pages 732 - 737

https://doi.org/10.1145/3383313.3418486

Published: 22 September 2020

Abstract

Word2vec is a powerful machine learning tool that emerged from Natural Language Processing (NLP) and is now applied in multiple domains, including recommender systems, forecasting, and network analysis. As Word2vec is often used off the shelf, we address the question of whether the default hyperparameters are suitable for recommender systems. The answer is emphatically no. In this paper, we first elucidate the importance of hyperparameter optimization and show that unconstrained optimization yields an average 221% improvement in hit rate over the default parameters. However, unconstrained optimization leads to hyperparameter settings that are very expensive and not feasible for large-scale recommendation tasks. To address this, we demonstrate a 138% average improvement in hit rate with runtime budget-constrained hyperparameter optimization. Furthermore, to make hyperparameter optimization applicable to large-scale recommendation problems where the target dataset is too large to search over, we investigate generalizing hyperparameter settings from samples. We show that applying constrained hyperparameter optimization using only a 10% sample of the data still yields a 91% average improvement in hit rate over the default parameters when applied to the full datasets. Finally, we apply hyperparameters learned using our method of constrained optimization on a sample to the Who To Follow recommendation service at Twitter and are able to increase follow rates by 15%.
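
To make the quantities in the abstract concrete, the following Python sketch shows one way to compute a hit rate, express a gain over a baseline as a percentage, draw a 10% sample, and run a search that rejects configurations exceeding a runtime budget. It is an illustration under stated assumptions, not the authors' implementation: plain random search stands in for whatever optimizer the paper uses, and the names `hit_rate_at_k`, `relative_improvement`, `sample_sessions`, `constrained_random_search`, and the `evaluate` callback (assumed to return a hit rate and a runtime for a configuration) are all hypothetical.

```python
import random


def hit_rate_at_k(recommended, held_out, k=10):
    """Fraction of users whose held-out item appears in their top-k list.

    recommended: dict mapping user -> ranked list of items (hypothetical format).
    held_out:    dict mapping user -> the single item withheld for evaluation.
    """
    if not held_out:
        return 0.0
    hits = sum(
        1
        for user, item in held_out.items()
        if item in recommended.get(user, [])[:k]
    )
    return hits / len(held_out)


def relative_improvement(tuned, default):
    """Percentage improvement of `tuned` over `default` (200.0 means +200%)."""
    return 100.0 * (tuned - default) / default


def sample_sessions(sessions, fraction=0.1, seed=0):
    """Draw a random subset of a non-empty list of sessions without replacement."""
    rng = random.Random(seed)
    size = max(1, int(len(sessions) * fraction))
    return rng.sample(sessions, size)


def constrained_random_search(evaluate, space, runtime_budget, n_trials=20, seed=0):
    """Random search that discards configurations exceeding a runtime budget.

    evaluate(config) is an assumed callback returning (hit_rate, runtime).
    space maps each hyperparameter name to a list of candidate values.
    Returns (best_config, best_score); best_config is None if nothing fits.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score, runtime = evaluate(config)
        if runtime > runtime_budget:
            continue  # over budget: not a feasible setting
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

In this sketch, `evaluate` would train a model on the output of `sample_sessions` and score it with `hit_rate_at_k`; the selected configuration would then be retrained on the full dataset, mirroring the sample-then-transfer idea described in the abstract.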

Published In

RecSys '20: Proceedings of the 14th ACM Conference on Recommender Systems

September 2020

796 pages

ISBN: 9781450375832

DOI: 10.1145/3383313

Copyright © 2020 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Extended-abstract
Research
Refereed limited

Conference

RecSys '20

RecSys '20: Fourteenth ACM Conference on Recommender Systems

September 22 - 26, 2020

Virtual Event, Brazil

Acceptance Rates

Overall Acceptance Rate 254 of 1,295 submissions, 20%
