DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning

Xuemin Hu; Shen Li; Yingfen Xu; Bo Tang; Long Chen

arXiv:2406.09089·cs.LG·June 14, 2024

Open Access

TL;DR

DiffPoGAN is an offline reinforcement learning method that uses a diffusion model as the policy generator inside a GAN framework. It targets two weaknesses of earlier GAN-based methods, insufficient constraints on policy exploration and inaccurate representation of the behavior policy, and is reported to outperform state-of-the-art offline RL methods on D4RL datasets.

Contribution

The paper proposes DiffPoGAN, an offline RL method that integrates diffusion models with GANs so that the policy generates diverse actions while a discriminator constrains its exploration.

Findings

01. Outperforms state-of-the-art offline RL methods on D4RL datasets.

02. Effectively constrains policy exploration with discriminator-based regularization.

03. Generates diverse action distributions using diffusion models.

Abstract

Offline reinforcement learning (RL) can learn optimal policies from pre-collected offline datasets without interacting with the environment, but the agent's sampled actions often cannot cover the action distribution under a given state, which leads to extrapolation error. Recent works address this issue by employing generative adversarial networks (GANs). However, these methods often suffer from insufficient constraints on policy exploration and inaccurate representation of behavior policies. Moreover, the generator in GANs fails to fool the discriminator while also maximizing the expected returns of a policy. Inspired by diffusion models, generative models with powerful feature expressiveness, we propose a new offline RL method named Diffusion Policies with Generative Adversarial Networks (DiffPoGAN). In this approach, the diffusion model serves as the policy generator to generate…
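As a rough illustration of the two ingredients the abstract names, a diffusion model acting as the policy generator and a discriminator that regularizes it, here is a minimal pure-Python sketch. It is not the paper's implementation: the linear noise schedule, the `toy_noise_predictor` (which stands in for a trained network), the `behavior_action` toy policy, the `toy_discriminator`, and the `-log D(s, a)` generator term are all assumptions chosen so the example runs without any training. For simplicity the toy action has the same dimension as the state.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (a common DDPM default; assumed, not from the paper)."""
    if num_steps == 1:
        return [beta_end]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def behavior_action(state):
    """Hypothetical behavior policy: the action the toy dataset takes in `state`."""
    return [math.tanh(s) for s in state]


def toy_noise_predictor(state, noisy_action, alpha_bar):
    """Stand-in for a learned network eps_theta(a_t, s, t).

    It inverts a_t = sqrt(alpha_bar) * a_0 + sqrt(1 - alpha_bar) * eps using the
    known toy target a_0, so no training is needed for this illustration.
    """
    a0 = behavior_action(state)
    scale = math.sqrt(1.0 - alpha_bar)
    return [(a - math.sqrt(alpha_bar) * g) / scale for a, g in zip(noisy_action, a0)]


def sample_action(state, num_steps=10, rng=None):
    """Generate an action by running the reverse (denoising) chain from pure noise."""
    rng = rng or random.Random(0)
    betas = linear_betas(num_steps)
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)

    action = [rng.gauss(0.0, 1.0) for _ in state]  # a_T ~ N(0, I)
    for t in reversed(range(num_steps)):
        beta, alpha_bar = betas[t], alpha_bars[t]
        eps = toy_noise_predictor(state, action, alpha_bar)
        coef = beta / math.sqrt(1.0 - alpha_bar)
        mean = [(a - coef * e) / math.sqrt(1.0 - beta) for a, e in zip(action, eps)]
        # Noise is added at every step except the last one.
        sigma = math.sqrt(beta) if t > 0 else 0.0
        action = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
    return action


def toy_discriminator(state, action):
    """Hypothetical D(s, a) in (0, 1]: near 1 when `action` matches the dataset action."""
    a0 = behavior_action(state)
    sq_dist = sum((a - g) ** 2 for a, g in zip(action, a0))
    return math.exp(-sq_dist)


def generator_regularizer(state, action, eps=1e-8):
    """Non-saturating GAN generator term -log D(s, a); small for in-distribution actions."""
    return -math.log(toy_discriminator(state, action) + eps)
```

Calling `sample_action([0.3, -1.2])` denoises a random vector into an action, and `generator_regularizer` then penalizes actions the toy discriminator judges to be far from the dataset. In a real method this term would presumably be weighted against a return-maximizing objective; how DiffPoGAN combines them is not stated in the truncated abstract above.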

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics

Methods: Diffusion