Diffusion-Reward Adversarial Imitation Learning
Chun-Mao Lai, Hsiang-Chun Wang, Ping-Chun Hsieh, Yu-Chiang Frank Wang, Min-Hung Chen, Shao-Hua Sun

TL;DR
DRAIL integrates a diffusion model into GAIL's discriminator and derives rewards from its output, aiming for more robust and smoother rewards and better data efficiency in imitation learning across navigation, manipulation, and locomotion tasks.
Contribution
DRAIL integrates diffusion models into GAIL to improve reward quality and training stability, an approach presented as novel in imitation learning.
Findings
DRAIL outperforms prior methods in navigation, manipulation, and locomotion tasks.
DRAIL demonstrates higher data efficiency and reward robustness.
Reward visualizations show smoother and more reliable reward functions.
Abstract
Imitation learning aims to learn a policy from observing expert demonstrations without access to reward signals from environments. Generative adversarial imitation learning (GAIL) formulates imitation learning as adversarial learning, employing a generator policy that learns to imitate expert behaviors and a discriminator that learns to distinguish expert demonstrations from agent trajectories. Despite its encouraging results, GAIL training is often brittle and unstable. Inspired by the recent dominance of diffusion models in generative modeling, we propose Diffusion-Reward Adversarial Imitation Learning (DRAIL), which integrates a diffusion model into GAIL, aiming to yield more robust and smoother rewards for policy learning. Specifically, we propose a diffusion discriminative classifier to construct an enhanced discriminator, and design diffusion rewards based on the classifier's output…
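The adversarial setup described in the abstract can be illustrated with a minimal sketch of a generic GAIL-style discriminator objective and reward. This is not the DRAIL implementation: the function names are hypothetical, and the discriminator is reduced to a raw logit per state-action pair. In DRAIL that output would come from the diffusion discriminative classifier, whose details the truncated abstract does not give.

```python
import math


def sigmoid(x: float) -> float:
    """Map a discriminator logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def discriminator_loss(expert_logits, agent_logits):
    """Binary cross-entropy with expert pairs labeled 1 and agent pairs labeled 0.

    Both arguments are lists of logits (hypothetical discriminator outputs),
    one per state-action pair.
    """
    expert_term = sum(math.log(sigmoid(z)) for z in expert_logits) / len(expert_logits)
    agent_term = sum(math.log(1.0 - sigmoid(z)) for z in agent_logits) / len(agent_logits)
    return -(expert_term + agent_term)


def adversarial_reward(logit: float) -> float:
    """One common adversarial reward form, log D - log(1 - D).

    This is an assumed, generic choice; it grows as the discriminator
    becomes more convinced the pair looks like expert data.
    """
    d = sigmoid(logit)
    return math.log(d) - math.log(1.0 - d)


# An undecided discriminator (logit 0) gives zero reward; an agent pair it
# scores as expert-like (positive logit) earns a positive reward.
print(discriminator_loss([0.0], [0.0]))
print(adversarial_reward(0.0), adversarial_reward(2.0))
```

With this reward form the value equals the logit itself, so the policy is rewarded in proportion to how expert-like the discriminator judges its behavior. A noisy or overconfident discriminator therefore translates directly into a noisy reward, which is the brittleness the abstract attributes to GAIL.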
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
Methods: Generative Adversarial Imitation Learning · Diffusion
