Improve Robustness of Reinforcement Learning against Observation Perturbations via $l_\infty$ Lipschitz Policy Networks

Nie, Buqing; Ji, Jingtian; Fu, Yangqing; Gao, Yue

Computer Science > Machine Learning

arXiv:2312.08751 (cs)

[Submitted on 14 Dec 2023]

Title:Improve Robustness of Reinforcement Learning against Observation Perturbations via $l_\infty$ Lipschitz Policy Networks

Authors:Buqing Nie, Jingtian Ji, Yangqing Fu, Yue Gao

View PDF HTML (experimental)

Abstract:Deep Reinforcement Learning (DRL) has achieved remarkable advances in sequential decision tasks. However, recent works have revealed that DRL agents are susceptible to slight perturbations in observations. This vulnerability raises concerns regarding the effectiveness and robustness of deploying such agents in real-world applications. In this work, we propose a novel robust reinforcement learning method called SortRL, which improves the robustness of DRL policies against observation perturbations from the perspective of the network architecture. We employ a novel architecture for the policy network that incorporates global $l_\infty$ Lipschitz continuity and provide a convenient method to enhance policy robustness based on the output margin. Besides, a training framework is designed for SortRL, which solves given tasks while maintaining robustness against $l_\infty$ bounded perturbations on the observations. Several experiments are conducted to evaluate the effectiveness of our method, including classic control tasks and video games. The results demonstrate that SortRL achieves state-of-the-art robustness performance against different perturbation strength.

Comments:	Accepted paper on AAAI2024
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2312.08751 [cs.LG]
	(or arXiv:2312.08751v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2312.08751

Submission history

From: Buqing Nie [view email]
[v1] Thu, 14 Dec 2023 08:57:22 UTC (79 KB)

Computer Science > Machine Learning

Title:Improve Robustness of Reinforcement Learning against Observation Perturbations via $l_\infty$ Lipschitz Policy Networks

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Machine Learning

Title:Improve Robustness of Reinforcement Learning against Observation Perturbations via $l_\infty$ Lipschitz Policy Networks

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators