research-article

CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models

Authors:

Bin Cui

Proceedings of the ACM on Management of Data, Volume 2, Issue 1

Article No.: 51, Pages 1 - 28

https://doi.org/10.1145/3639306

Published: 26 March 2024

Abstract

The growing memory demands of embedding tables in Deep Learning Recommendation Models (DLRMs) pose great challenges for model training and deployment. Existing embedding compression solutions cannot simultaneously meet three key design requirements: memory efficiency, low latency, and adaptability to dynamic data distributions. This paper presents CAFE, a Compact, Adaptive, and Fast Embedding compression framework that addresses all three requirements. The design philosophy of CAFE is to dynamically allocate more memory to important features (called hot features) and less memory to unimportant ones. In CAFE, we propose a fast and lightweight sketch data structure, named HotSketch, to capture feature importance and report hot features in real time. Each reported hot feature is assigned a unique embedding. Non-hot features share embeddings: multiple features map to one embedding through the hash embedding technique. Guided by our design philosophy, we further propose a multi-level hash embedding framework to optimize the embedding tables of non-hot features. We theoretically analyze the accuracy of HotSketch and analyze model convergence against deviation. Extensive experiments show that CAFE significantly outperforms existing embedding compression methods, yielding 3.92% and 3.68% higher testing AUC on the Criteo Kaggle and CriteoTB datasets, respectively, at a compression ratio of 10000x. The source code of CAFE is available on GitHub (https://github.com/HugoZHL/CAFE).
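The split between hot and non-hot features described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name HotColdEmbedding, its methods, and all parameters are hypothetical, an exact counter stands in for HotSketch (whose internal design the abstract does not describe), importance is approximated by occurrence counts, and the multi-level hash embedding is omitted. It only shows the idea of giving top-scoring features their own rows while the remaining features share rows through a hash.

```python
import random
import zlib
from collections import Counter


class HotColdEmbedding:
    """Toy table: exclusive rows for hot features, hashed shared rows for the rest."""

    def __init__(self, num_hot, num_shared, dim, seed=0):
        rng = random.Random(seed)
        self.num_hot = num_hot
        self.num_shared = num_shared
        # One row per hot slot, plus a small pool shared by all non-hot features.
        self.hot_rows = [[rng.gauss(0.0, 0.01) for _ in range(dim)] for _ in range(num_hot)]
        self.shared_rows = [[rng.gauss(0.0, 0.01) for _ in range(dim)] for _ in range(num_shared)]
        # Stand-in importance tracker: exact per-feature scores (NOT HotSketch).
        self.scores = Counter()
        self.hot_slot = {}  # feature id -> index into hot_rows

    def observe(self, feature_id, importance=1.0):
        """Accumulate an importance score (assumed here: one unit per occurrence)."""
        self.scores[feature_id] += importance

    def refresh_hot_set(self):
        """Promote the current top-scoring features, recycling slots of demoted ones."""
        top = {fid for fid, _ in self.scores.most_common(self.num_hot)}
        kept = {fid: slot for fid, slot in self.hot_slot.items() if fid in top}
        used = set(kept.values())
        free = [slot for slot in range(self.num_hot) if slot not in used]
        for fid in top - set(kept):
            kept[fid] = free.pop()  # a real system would also re-initialize this row
        self.hot_slot = kept

    def lookup(self, feature_id):
        slot = self.hot_slot.get(feature_id)
        if slot is not None:
            return self.hot_rows[slot]  # unique embedding for a hot feature
        # Deterministic hash, so colliding non-hot features share one row.
        bucket = zlib.crc32(str(feature_id).encode()) % self.num_shared
        return self.shared_rows[bucket]


table = HotColdEmbedding(num_hot=2, num_shared=4, dim=3)
for fid in [7, 7, 7, 9, 9, 1, 2, 3]:
    table.observe(fid)
table.refresh_hot_set()
print(sorted(table.hot_slot))  # feature ids currently treated as hot
```

In this toy setup the table stores num_hot + num_shared rows regardless of how many distinct feature ids appear, which is where the memory saving comes from. Calling refresh_hot_set periodically lets the hot set follow a shifting data distribution.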

Cited By

Liu Z, Zhang H, Chen B, Jiang Z, Zhao Y, Tao Y, Yang T, Cui B. (2025). CAFE+: Towards Compact, Adaptive, and Fast Embedding for Large-scale Online Recommendation Models. ACM Transactions on Information Systems. Online publication date: 21-Jan-2025.
https://doi.org/10.1145/3713072
Huo P, Devulapally A, Maruf H, Park M, Nair K, Arunachalam M, Akbulut G, Kandemir M, Narayanan V. (2024). PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO), 612-626. Online publication date: 2-Nov-2024.
https://doi.org/10.1109/MICRO61859.2024.00052
Liu Z, Dong F, Liu C, Deng X, Yang T, Zhao Y, Li J, Cui B, Zhang G. (2024). WavingSketch: an unbiased and generic sketch for finding top-k items in data streams. The VLDB Journal, 33:5, 1697-1722. Online publication date: 29-Jul-2024.
https://doi.org/10.1007/s00778-024-00869-6

Published In

Proceedings of the ACM on Management of Data Volume 2, Issue 1

SIGMOD

February 2024

1874 pages

EISSN: 2836-6573

DOI: 10.1145/3654807

Editor:
Divyakant Agrawal
UC Santa Barbara, United States

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States
