DOI: 10.1145/3583780.3615043
Research article | Open access

Safe-NORA: Safe Reinforcement Learning-based Mobile Network Resource Allocation for Diverse User Demands

Published: 21 October 2023 Publication History

Abstract

As mobile communication technologies advance, mobile networks become increasingly complex and user requirements increasingly diverse. To satisfy these diverse demands while improving overall network performance, the limited wireless resources must be allocated efficiently and dynamically, taking into account both the magnitude of each user's demand and the user's location relative to the base stations. We decompose the problem into four constrained subproblems and solve them with a safe reinforcement learning method. In addition, we design a reward mechanism that encourages agent cooperation in distributed training environments. We evaluate our method in a simulated scenario with thousands of users and hundreds of base stations. Experimental results show that our method satisfies over 95% of user demands while maximizing overall system throughput.
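The constrained-optimization idea underlying safe reinforcement learning of this kind can be illustrated with a minimal primal-dual sketch: a Lagrange multiplier penalizes constraint (demand-violation) cost in the reward, and is itself adjusted by dual ascent. This is not the paper's Safe-NORA algorithm; the toy policy, reward, and cost values below are purely illustrative assumptions.

```python
import random

def penalized_reward(reward, cost, lam, cost_limit):
    # Lagrangian shaping: reward minus the multiplier-weighted
    # constraint violation (cost above the allowed limit).
    return reward - lam * (cost - cost_limit)

def dual_ascent(lam, avg_cost, cost_limit, lr=0.05):
    # Raise the multiplier when the constraint is violated on average,
    # lower it (never below zero) when there is slack.
    return max(0.0, lam + lr * (avg_cost - cost_limit))

def train(episodes=200, cost_limit=0.2, seed=0):
    rng = random.Random(seed)
    lam = 0.0
    # Toy one-parameter policy: probability of the "aggressive" action,
    # which yields higher throughput but a higher violation cost.
    p = 0.5
    for _ in range(episodes):
        costs = []
        for _ in range(32):
            aggressive = rng.random() < p
            reward = 1.0 if aggressive else 0.5   # illustrative throughput
            cost = 0.6 if aggressive else 0.05    # illustrative violation cost
            costs.append(cost)
            # Crude policy improvement: nudge p toward actions whose
            # penalized reward is positive, away from negative ones.
            shaped = penalized_reward(reward, cost, lam, cost_limit)
            p += 0.01 * (1 if aggressive else -1) * shaped
            p = min(max(p, 0.01), 0.99)
        lam = dual_ascent(lam, sum(costs) / len(costs), cost_limit)
    return p, lam
```

Running `train()` shows the expected primal-dual dynamics: the unconstrained policy first drifts toward the high-throughput aggressive action, the multiplier grows while the cost limit is violated, and the penalty eventually pushes the policy back into the feasible region.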




      Published In

      CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
      October 2023
      5508 pages
      ISBN:9798400701245
      DOI:10.1145/3583780
      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

1. mobile networks
2. multi-agent
3. resource allocation
4. safe reinforcement learning


      Conference

      CIKM '23

      Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions (22%)


