DiffAIL: Diffusion Adversarial Imitation Learning

Authors

  • Bingzheng Wang, Shandong University
  • Guoqiang Wu, Shandong University
  • Teng Pang, Shandong University
  • Yan Zhang, Shandong University
  • Yilong Yin, Shandong University

DOI:

https://doi.org/10.1609/aaai.v38i14.29470

Keywords:

ML: Reinforcement Learning, ML: Imitation Learning & Inverse Reinforcement Learning

Abstract

Imitation learning aims to sidestep the difficulty of defining reward functions in real-world decision-making tasks. The currently popular approach is the Adversarial Imitation Learning (AIL) framework, which matches the expert's state-action occupancy measure to obtain a surrogate reward for forward reinforcement learning. However, the traditional discriminator is a simple binary classifier that does not learn an accurate distribution, so it may fail to recognize expert-level state-action pairs induced by the policy's interaction with the environment. To address this issue, we propose a method named Diffusion Adversarial Imitation Learning (DiffAIL), which introduces the diffusion model into the AIL framework. Specifically, DiffAIL models state-action pairs with an unconditional diffusion model and uses the diffusion loss as part of the discriminator's learning objective, which enables the discriminator to better capture expert demonstrations and improves generalization. Experimentally, the results show that our method achieves state-of-the-art performance and significantly surpasses the expert demonstrations on two benchmark tasks, covering both the standard state-action setting and the state-only setting.
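The core idea in the abstract, using a diffusion (denoising) loss as the discriminator's signal so that low loss on a state-action pair indicates it is expert-like, can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the DDPM-style noise schedule, the `dummy_denoiser` placeholder (a real system would train a neural network on expert pairs), and the mapping `D = exp(-loss)` with AIL-style reward `r = -log(1 - D)` are all simplifying assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed DDPM-style linear noise schedule with T = 100 steps.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def diffusion_loss(x0, denoiser, t):
    """Denoising loss at timestep t for a flattened (state, action) vector x0.

    Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.
    The loss is the MSE between the true noise eps and the denoiser's prediction.
    """
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    eps_hat = denoiser(xt, t)
    return float(np.mean((eps - eps_hat) ** 2))

def surrogate_reward(x0, denoiser, n_samples=8):
    """Turn the averaged diffusion loss into a discriminator-style confidence
    D in (0, 1) and an AIL-style surrogate reward r = -log(1 - D).

    Low denoising loss on expert-like pairs pushes D toward 1 (high reward).
    """
    losses = [diffusion_loss(x0, denoiser, int(rng.integers(T)))
              for _ in range(n_samples)]
    d = np.exp(-np.mean(losses))          # expert-like -> low loss -> D near 1
    d = np.clip(d, 1e-6, 1.0 - 1e-6)      # numerical safety for the log
    return -np.log(1.0 - d)

# Hypothetical placeholder denoiser; in practice this is a trained network.
dummy_denoiser = lambda xt, t: np.zeros_like(xt)
x = rng.standard_normal(6)                # e.g. 4-dim state + 2-dim action
print(surrogate_reward(x, dummy_denoiser))
```

The key design point mirrored here is that the discriminator's output is derived from how well the diffusion model denoises a pair, rather than from a plain binary classifier head, which is what lets the learned distribution itself judge whether a pair is expert-like.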

Published

2024-03-24

How to Cite

Wang, B., Wu, G., Pang, T., Zhang, Y., & Yin, Y. (2024). DiffAIL: Diffusion Adversarial Imitation Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(14), 15447-15455. https://doi.org/10.1609/aaai.v38i14.29470

Section

AAAI Technical Track on Machine Learning V