DOI: 10.1145/3488560.3498403
research-article

Scope-aware Re-ranking with Gated Attention in Feed

Published: 15 February 2022 Publication History

Abstract

Modern recommendation systems introduce a re-ranking stage to optimize the entire list directly. This paper focuses on the design of a re-ranking framework for feeds that models the mutual influence between items and further promotes user engagement. On mobile devices, users browse the feed almost entirely top-down and rarely compare items back and forth. In addition, users often compare an item with its adjacent items based on partial observations. Given these distinct behavior patterns, the modeling of mutual influence between items should be carefully designed. Existing re-ranking models encode the mutual influence between items with sequential encoding methods. However, previous works may be unsatisfactory because they ignore connections between items at different scopes. In this paper, we first discuss the impacts and consequences of Unidirectivity and Locality, then report corresponding solutions in industrial applications. We propose a novel framework based on empirical evidence from user analysis. To address the above problems, we design a Scope-aware Re-ranking with Gated Attention model (SRGA) to emulate the user behavior patterns from two aspects: 1) we emphasize the influence along the user's common browsing direction; 2) we strengthen the impact of pivotal adjacent items within the user's visual window. Specifically, we design a global scope attention to encode inter-item patterns unidirectionally from top to bottom. In addition, we devise a local scope attention that slides over the recommendation list to underline interactions among neighboring items. Furthermore, we design a learned gate mechanism to dynamically aggregate the information from the local and global scope attention. Extensive offline experiments and online A/B testing demonstrate the benefits of our framework. The proposed SRGA model achieves the best performance on offline metrics compared with state-of-the-art re-ranking methods. Further, empirical results on live traffic validate that our recommender system, equipped with SRGA in the re-ranking stage, improves significantly in user engagement.
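The three components named in the abstract (a top-down global scope attention, a sliding local scope attention, and a gate that mixes them) can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain dot-product self-attention without learned query/key/value projections, a symmetric window of hypothetical size `window`, and a scalar sigmoid gate per item with hypothetical parameters `gate_w` and `gate_b`. The function names `masked_attention` and `gated_scope_attention` are likewise made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax; positions scored -inf receive weight 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention(items, allowed):
    """Dot-product self-attention over item vectors.

    allowed(i, j) returns True when item i may attend to item j.
    Assumption: no learned projections; items act as queries, keys and values.
    """
    d = len(items[0])
    out = []
    for i, q in enumerate(items):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            if allowed(i, j) else float("-inf")
            for j, k in enumerate(items)
        ]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, items))
                    for c in range(d)])
    return out


def gated_scope_attention(items, window=1, gate_w=None, gate_b=0.0):
    """Fuse a top-down (global) and a windowed (local) attention with a gate.

    Assumption: the gate is a scalar sigmoid per item computed from the
    concatenated global and local outputs; gate_w and gate_b stand in for
    parameters that would normally be learned.
    """
    d = len(items[0])
    gate_w = gate_w or [0.0] * (2 * d)
    # Global scope: item i only sees itself and the items ranked above it.
    global_out = masked_attention(items, lambda i, j: j <= i)
    # Local scope: item i only sees neighbours within `window` positions.
    local_out = masked_attention(items, lambda i, j: abs(i - j) <= window)
    fused = []
    for g_vec, l_vec in zip(global_out, local_out):
        z = sum(w * x for w, x in zip(gate_w, g_vec + l_vec)) + gate_b
        gate = 1.0 / (1.0 + math.exp(-z))
        fused.append([gate * g + (1.0 - gate) * l
                      for g, l in zip(g_vec, l_vec)])
    return fused


# Usage: four items in ranked order, each a 2-dimensional embedding.
feed = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
for row in gated_scope_attention(feed, window=1):
    print([round(x, 3) for x in row])
```

With the global mask, changing a lower-ranked item leaves the representation of every item above it untouched, which is one simple way to express the top-down browsing assumption. The local mask plays the same role for the visual window.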

Supplementary Material

MP4 File (WSDM22-fp213.mp4)
Modern recommendation systems introduce a re-ranking stage to optimize the entire list directly. This paper focuses on the design of a re-ranking framework for feeds that models the mutual influence between items and further promotes user engagement. On mobile devices, users browse the feed almost entirely top-down and rarely compare items back and forth. In addition, users often compare an item with its adjacent items based on partial observations. Given these distinct behavior patterns, the modeling of mutual influence between items should be carefully designed. Existing re-ranking models encode the mutual influence between items with sequential encoding methods. However, previous works may be unsatisfactory because they ignore connections between items at different scopes. Here, we discuss the impacts and consequences of Unidirectivity and Locality, then report corresponding solutions in industrial applications. We propose a novel re-ranking method to tackle the problem.

Published In

WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
February 2022
1690 pages
ISBN:9781450391320
DOI:10.1145/3488560
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. learning to rank
  2. re-ranking
  3. recommender system

Qualifiers

  • Research-article

Conference

WSDM '22

Acceptance Rates

Overall Acceptance Rate 498 of 2,863 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 77
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 25 Jan 2025
