DOI: 10.1145/3583780.3615043
Research article | Open access

Safe-NORA: Safe Reinforcement Learning-based Mobile Network Resource Allocation for Diverse User Demands

Published: 21 October 2023 Publication History

Abstract

As mobile communication technologies advance, mobile networks become increasingly complex and user requirements increasingly diverse. To satisfy these diverse demands while improving overall network performance, the limited wireless resources must be allocated efficiently and dynamically, taking into account both the magnitude of each user's demand and the user's location relative to the base stations. We decompose the problem into four constrained subproblems and solve them with a safe reinforcement learning method. In addition, we design a reward mechanism that encourages agent cooperation in distributed training environments. We evaluate our method in a simulated scenario with thousands of users and hundreds of base stations. Experimental results show that our method satisfies over 95% of user demands while maximizing overall system throughput.
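The constrained-optimization idea underlying safe reinforcement learning of this kind can be illustrated with a minimal primal-dual sketch: a Lagrange multiplier penalizes constraint (demand-violation) cost in the reward, and is itself adjusted by dual ascent. This is not the paper's Safe-NORA algorithm; the toy policy, reward, and cost values below are purely illustrative assumptions.

```python
import random

def penalized_reward(reward, cost, lam, cost_limit):
    # Lagrangian shaping: reward minus the multiplier-weighted
    # constraint violation (cost above the allowed limit).
    return reward - lam * (cost - cost_limit)

def dual_ascent(lam, avg_cost, cost_limit, lr=0.05):
    # Raise the multiplier when the constraint is violated on average,
    # lower it (never below zero) when there is slack.
    return max(0.0, lam + lr * (avg_cost - cost_limit))

def train(episodes=200, cost_limit=0.2, seed=0):
    rng = random.Random(seed)
    lam = 0.0
    # Toy one-parameter policy: probability of the "aggressive" action,
    # which yields higher throughput but a higher violation cost.
    p = 0.5
    for _ in range(episodes):
        costs = []
        for _ in range(32):
            aggressive = rng.random() < p
            reward = 1.0 if aggressive else 0.5   # illustrative throughput
            cost = 0.6 if aggressive else 0.05    # illustrative violation cost
            costs.append(cost)
            # Crude policy improvement: nudge p toward actions whose
            # penalized reward is positive, away from negative ones.
            shaped = penalized_reward(reward, cost, lam, cost_limit)
            p += 0.01 * (1 if aggressive else -1) * shaped
            p = min(max(p, 0.01), 0.99)
        lam = dual_ascent(lam, sum(costs) / len(costs), cost_limit)
    return p, lam
```

Running `train()` shows the expected primal-dual dynamics: the unconstrained policy first drifts toward the high-throughput aggressive action, the multiplier grows while the cost limit is violated, and the penalty eventually pushes the policy back into the feasible region.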




      Published In

      CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
      October 2023
      5508 pages
      ISBN:9798400701245
      DOI:10.1145/3583780
      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

1. mobile networks
2. multi-agent
3. resource allocation
4. safe reinforcement learning


      Conference

      CIKM '23

      Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions (22%)


