Efficient Diffusion Policies for Offline Reinforcement Learning

Kang, Bingyi; Ma, Xiao; Du, Chao; Pang, Tianyu; Yan, Shuicheng

arXiv:2305.20081 (cs)

[Submitted on 31 May 2023 (v1), last revised 26 Oct 2023 (this version, v2)]

Abstract: Offline reinforcement learning (RL) aims to learn optimal policies from offline datasets, where the parameterization of policies is crucial but often overlooked. Recently, Diffusion-QL significantly boosted the performance of offline RL by representing a policy with a diffusion model, whose success relies on a parametrized Markov chain with hundreds of steps for sampling. However, Diffusion-QL suffers from two critical limitations. 1) It is computationally inefficient to forward and backward through the whole Markov chain during training. 2) It is incompatible with maximum likelihood-based RL algorithms (e.g., policy gradient methods) as the likelihood of diffusion models is intractable. Therefore, we propose efficient diffusion policy (EDP) to overcome these two challenges. EDP approximately constructs actions from corrupted ones at training to avoid running the sampling chain. We conduct extensive experiments on the D4RL benchmark. The results show that EDP can reduce the diffusion policy training time from 5 days to 5 hours on gym-locomotion tasks. Moreover, we show that EDP is compatible with various offline RL algorithms (TD3, CRR, and IQL) and achieves new state-of-the-art on D4RL by large margins over previous methods. Our code is available at this https URL.
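
The idea of constructing an action from a corrupted one, rather than running the full sampling chain, can be illustrated with a minimal sketch. It assumes the standard DDPM forward process, in which a noisy action is a known linear mix of the clean action and Gaussian noise, so a single noise prediction can be inverted in closed form. The linear beta schedule, the function names, and the use of the true noise in place of a trained noise-prediction network are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def make_alpha_bar(num_steps=100, beta_min=1e-4, beta_max=2e-2):
    """Cumulative products of (1 - beta_k) for an assumed linear beta schedule."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    return np.cumprod(1.0 - betas)


def corrupt(action, k, noise, alpha_bar):
    """DDPM forward process: mix the clean action with Gaussian noise at step k."""
    return np.sqrt(alpha_bar[k]) * action + np.sqrt(1.0 - alpha_bar[k]) * noise


def approximate_action(noisy_action, k, predicted_noise, alpha_bar):
    """Invert the forward process in one step using a noise prediction.

    With a trained network, predicted_noise would come from the model and
    the result would only approximate the clean action.
    """
    scaled_noise = np.sqrt(1.0 - alpha_bar[k]) * predicted_noise
    return (noisy_action - scaled_noise) / np.sqrt(alpha_bar[k])


rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()

# Hypothetical batch of 4 actions with 6 dimensions each.
action = rng.uniform(-1.0, 1.0, size=(4, 6))
noise = rng.standard_normal(action.shape)
k = 50

noisy = corrupt(action, k, noise, alpha_bar)

# Stand-in for a trained noise predictor: an oracle that returns the true noise.
recovered = approximate_action(noisy, k, noise, alpha_bar)
```

With the oracle noise the recovery is exact up to floating-point error; with a learned predictor the reconstruction is approximate, but it costs one network evaluation instead of a pass through every step of the chain.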

Comments:	Accepted by NeurIPS 2023
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2305.20081 [cs.LG]
	(or arXiv:2305.20081v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2305.20081

Submission history

From: Bingyi Kang
[v1] Wed, 31 May 2023 17:55:21 UTC (127 KB)
[v2] Thu, 26 Oct 2023 12:25:02 UTC (128 KB)
