Diffusion-Reward Adversarial Imitation Learning

Chun-Mao Lai; Hsiang-Chun Wang; Ping-Chun Hsieh; Yu-Chiang Frank Wang; Min-Hung Chen; Shao-Hua Sun

arXiv:2405.16194·cs.LG·November 27, 2024

TL;DR

DRAIL adds a diffusion-model-based discriminator and reward function to GAIL, aiming for more robust and smoother rewards and better data efficiency in imitation learning across navigation, manipulation, and locomotion tasks.

Contribution

DRAIL integrates a diffusion model into GAIL to improve reward quality and training stability, a novel approach in imitation learning.

Findings

01 DRAIL outperforms prior methods in navigation, manipulation, and locomotion tasks.

02 DRAIL demonstrates higher data efficiency and reward robustness.

03 Visualizations of the learned rewards show smoother and more reliable reward functions.

Abstract

Imitation learning aims to learn a policy from observing expert demonstrations without access to reward signals from environments. Generative adversarial imitation learning (GAIL) formulates imitation learning as adversarial learning, employing a generator policy learning to imitate expert behaviors and discriminator learning to distinguish the expert demonstrations from agent trajectories. Despite its encouraging results, GAIL training is often brittle and unstable. Inspired by the recent dominance of diffusion models in generative modeling, we propose Diffusion-Reward Adversarial Imitation Learning (DRAIL), which integrates a diffusion model into GAIL, aiming to yield more robust and smoother rewards for policy learning. Specifically, we propose a diffusion discriminative classifier to construct an enhanced discriminator, and design diffusion rewards based on the classifier's output…
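As a rough illustration of the GAIL setup the abstract describes, the sketch below shows a generic adversarial imitation objective in plain Python: a discriminator trained with binary cross-entropy to separate expert state-action pairs from agent ones, and a reward derived from its output. This is the standard GAIL-style formulation, not DRAIL's diffusion discriminative classifier or diffusion reward, whose details are cut off in the abstract above. The function names (`softplus`, `discriminator_loss`, `gail_reward`) and the choice of the reward form -log(1 - D(s, a)) are assumptions made for illustration.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(expert_logits, agent_logits):
    """Binary cross-entropy with expert pairs labeled 1 and agent pairs labeled 0.

    Each logit z is the discriminator's raw score for one (state, action)
    pair, so that D(s, a) = sigmoid(z). (Hypothetical interface.)
    """
    # -log D(s, a) = softplus(-z) for expert pairs
    expert_term = sum(softplus(-z) for z in expert_logits) / len(expert_logits)
    # -log(1 - D(s, a)) = softplus(z) for agent pairs
    agent_term = sum(softplus(z) for z in agent_logits) / len(agent_logits)
    return expert_term + agent_term


def gail_reward(logit: float) -> float:
    """One common GAIL-style reward, -log(1 - D(s, a)), computed from the logit."""
    return softplus(logit)


if __name__ == "__main__":
    # An undecided discriminator (all logits 0) versus a confident, correct one.
    print(discriminator_loss([0.0, 0.0], [0.0, 0.0]))
    print(discriminator_loss([5.0, 4.0], [-5.0, -4.0]))
    # Pairs the discriminator scores as more expert-like earn a higher reward.
    print(gail_reward(-2.0), gail_reward(2.0))
```

In this sketch the policy would be trained with any reinforcement learning algorithm on `gail_reward`, while the discriminator is updated to lower `discriminator_loss`. According to the abstract, DRAIL replaces the discriminator with one built on a diffusion model and designs its rewards from that classifier's output.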

Videos

Diffusion-Reward Adversarial Imitation Learning · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications

Methods: Generative Adversarial Imitation Learning · Diffusion