DOI: 10.1609/aaai.v38i18.30045

Federated contextual cascading bandits with asynchronous communication and heterogeneous users

Published: 07 January 2025

Abstract

We study the problem of federated contextual combinatorial cascading bandits, where |U| agents collaborate under the coordination of a central server to provide tailored recommendations to the |U| corresponding users. Existing works consider either a synchronous framework, necessitating full agent participation and global synchronization, or assume user homogeneity with identical behaviors. We overcome these limitations by considering (1) federated agents operating in an asynchronous communication paradigm, where no mandatory synchronization is required and each agent communicates with the server independently, and (2) heterogeneous user behaviors, where users can be stratified into J ≤ |U| latent user clusters, each exhibiting distinct preferences. For this setting, we propose a UCB-type algorithm with delicate communication protocols. Through theoretical analysis, we establish sublinear regret bounds on par with those achieved in the synchronous framework, while incurring only logarithmic communication costs. Empirical evaluation on synthetic and real-world datasets validates our algorithm's superior performance in terms of regret and communication cost.
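To make the setup above concrete, the sketch below shows one plausible shape such an algorithm could take: each agent ranks items by upper confidence bounds computed from shared linear-bandit statistics, processes cascade feedback (positions above the clicked item are observed non-clicks), and communicates with the server only when a determinant-ratio test fires, so no global synchronization is ever required. This is a minimal illustration under stated assumptions, not the paper's actual protocol: the class names (Server, Agent), the specific trigger rule, and all parameter values are hypothetical, and the latent user clusters are omitted for brevity (a full version would maintain per-cluster statistics).

```python
import numpy as np

class Server:
    """Central server holding aggregated sufficient statistics (hypothetical design)."""

    def __init__(self, d):
        self.V = np.eye(d)    # aggregated Gram matrix (identity regularization)
        self.b = np.zeros(d)  # aggregated feature-weighted rewards

    def upload(self, dV, db):
        self.V += dV
        self.b += db

    def download(self):
        return self.V.copy(), self.b.copy()

class Agent:
    """One federated agent running a UCB-style cascading bandit (illustrative only)."""

    def __init__(self, d, server, alpha=1.0, gamma=0.5):
        self.server = server
        self.alpha = alpha                 # exploration coefficient (assumption)
        self.gamma = gamma                 # communication threshold (assumption)
        self.V, self.b = server.download()
        self.dV = np.zeros((d, d))         # local statistics not yet uploaded
        self.db = np.zeros(d)

    def recommend(self, X, K):
        """Rank candidate items (rows of X) by UCB score; return top-K positions."""
        Vinv = np.linalg.inv(self.V + self.dV)
        theta = Vinv @ (self.b + self.db)
        # x^T theta + alpha * sqrt(x^T V^{-1} x) for every candidate item x
        ucb = X @ theta + self.alpha * np.sqrt(np.einsum("ij,jk,ik->i", X, Vinv, X))
        return np.argsort(-ucb)[:K]

    def update(self, X, ranked, click):
        """Cascade feedback: positions before the click are observed non-clicks."""
        last = len(ranked) if click is None else click + 1
        for pos in range(last):
            x = X[ranked[pos]]
            r = 1.0 if pos == click else 0.0
            self.dV += np.outer(x, x)
            self.db += r * x
        # Asynchronous trigger: upload/download only when local information has
        # grown enough (determinant-ratio test), independently of other agents.
        if np.linalg.det(self.V + self.dV) > (1 + self.gamma) * np.linalg.det(self.V):
            self.server.upload(self.dV, self.db)
            self.V, self.b = self.server.download()
            self.dV[:] = 0.0
            self.db[:] = 0.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 5
    server = Server(d)
    agents = [Agent(d, server) for _ in range(3)]
    for t in range(300):
        agent = agents[t % 3]                      # agents act independently
        X = rng.normal(size=(20, d))               # 20 candidate items this round
        ranked = agent.recommend(X, K=4)
        click = 0 if rng.random() < 0.3 else None  # toy feedback for the demo
        agent.update(X, ranked, click)
```

Determinant-ratio triggers of this kind are a standard device in asynchronous federated linear bandits for keeping communication logarithmic in the horizon, which is consistent with the logarithmic communication cost the abstract claims; whether this paper uses exactly that test is an assumption here.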



Published In

AAAI'24/IAAI'24/EAAI'24: Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence
February 2024
23861 pages
ISBN: 978-1-57735-887-9

Sponsors

  • Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press


