DIAR: Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation

Park, Jaehyun; Kim, Yunho; Kim, Sejin; Lee, Byung-Jun; Kim, Sundong

arXiv:2410.11338 (cs)

[Submitted on 15 Oct 2024]

Abstract: We propose a novel offline reinforcement learning (offline RL) approach, introducing the Diffusion-model-guided Implicit Q-learning with Adaptive Revaluation (DIAR) framework. We address two key challenges in offline RL: out-of-distribution samples and long-horizon problems. We leverage diffusion models to learn state-action sequence distributions and incorporate value functions for more balanced and adaptive decision-making. DIAR introduces an Adaptive Revaluation mechanism that dynamically adjusts decision lengths by comparing current and future state values, enabling flexible long-term decision-making. Furthermore, we address Q-value overestimation by combining Q-network learning with a value function guided by a diffusion model. The diffusion model generates diverse latent trajectories, enhancing policy robustness and generalization. As demonstrated in tasks like Maze2D, AntMaze, and Kitchen, DIAR consistently outperforms state-of-the-art algorithms in long-horizon, sparse-reward environments.
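
The abstract describes Adaptive Revaluation only at a high level: decision lengths are adjusted by comparing current and future state values. One way to read that comparison is sketched below, assuming that a lower predicted future value triggers an early re-plan. The function name `should_revaluate`, the `margin` parameter, and the toy value function are illustrative assumptions and are not taken from the paper.

```python
import math
from typing import Callable, Sequence

State = Sequence[float]


def should_revaluate(
    value_fn: Callable[[State], float],
    current_state: State,
    predicted_future_state: State,
    margin: float = 0.0,
) -> bool:
    """Return True when the current state is valued above the predicted future state.

    Assumption: a True result means the current plan is not improving the
    value estimate, so the agent should cut the decision short and re-plan.
    `margin` is a hypothetical tolerance, not a parameter from the paper.
    """
    return value_fn(current_state) > value_fn(predicted_future_state) + margin


def toy_value(state: State) -> float:
    """Toy stand-in for a learned value function: negative distance to a goal."""
    goal = (1.0, 1.0)
    return -math.dist(state, goal)


# The plan moves away from the goal, so the check asks for a re-plan.
moving_away = should_revaluate(toy_value, (0.5, 0.5), (0.0, 0.0))
# The plan moves toward the goal, so the current decision is kept.
moving_closer = should_revaluate(toy_value, (0.0, 0.0), (0.5, 0.5))
```

In this sketch the value function is a hand-written distance heuristic. In the framework described above it would be the learned value function, and the predicted future state would come from the diffusion-generated trajectory.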

Comments:	Preprint, under review. Comments welcome
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Cite as:	arXiv:2410.11338 [cs.LG]
	(or arXiv:2410.11338v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.11338

Submission history

From: Sundong Kim
[v1] Tue, 15 Oct 2024 07:09:56 UTC (5,814 KB)
