DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning
Xuemin Hu, Shen Li, Yingfen Xu, Bo Tang, Long Chen

TL;DR
DiffPoGAN uses a diffusion model as the policy generator inside a GAN framework for offline reinforcement learning. It targets two weaknesses of earlier GAN-based methods, insufficient constraints on policy exploration and inaccurate representation of the behavior policy, and is reported to outperform prior methods on D4RL datasets.
Contribution
The paper proposes DiffPoGAN, an offline RL method that integrates diffusion models with GANs to generate more diverse actions and to constrain policy exploration more tightly.
Findings
Outperforms state-of-the-art offline RL methods on D4RL datasets.
Effectively constrains policy exploration with discriminator-based regularization.
Generates diverse action distributions using diffusion models.
Abstract
Offline reinforcement learning (RL) can learn optimal policies from pre-collected offline datasets without interacting with the environment, but the agent's sampled actions often cannot cover the action distribution under a given state, which leads to extrapolation error. Recent works address this issue by employing generative adversarial networks (GANs). However, these methods often suffer from insufficient constraints on policy exploration and inaccurate representation of behavior policies. Moreover, the generator in GANs fails to fool the discriminator while also maximizing the expected return of a policy. Inspired by diffusion models, a class of generative models with powerful feature expressiveness, we propose a new offline RL method named Diffusion Policies with Generative Adversarial Networks (DiffPoGAN). In this approach, the diffusion model serves as the policy generator to generate…
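The abstract is truncated before it states the training objective, so the following is only a minimal sketch of what a discriminator-based regularizer on a policy loss could look like in general. It is not the paper's formulation. The function names, the `weight` parameter, and the choice of a negative log-probability penalty are all assumptions made for illustration; the inputs stand in for critic values Q(s, a) and discriminator outputs D(s, a) in (0, 1] for actions produced by the policy.

```python
import math

# Hypothetical illustration, NOT the objective from the DiffPoGAN paper.
# d_probs: discriminator outputs D(s, a) for policy actions, where a value
# near 1 means "looks like an action from the offline dataset".

def discriminator_regularizer(d_probs, eps=1e-8):
    """Mean negative log-probability; grows as actions look less in-distribution."""
    clipped = [min(max(p, eps), 1.0) for p in d_probs]
    return -sum(math.log(p) for p in clipped) / len(clipped)

def regularized_policy_loss(q_values, d_probs, weight=1.0):
    """Trade off expected return (maximize Q) against the discriminator penalty."""
    q_term = -sum(q_values) / len(q_values)  # minimizing -Q maximizes return
    return q_term + weight * discriminator_regularizer(d_probs)

if __name__ == "__main__":
    # Actions the discriminator rates as in-distribution incur a smaller penalty.
    in_dist = regularized_policy_loss([2.0, 4.0], [0.9, 0.95], weight=0.5)
    out_dist = regularized_policy_loss([2.0, 4.0], [0.1, 0.05], weight=0.5)
    print(in_dist < out_dist)
```

In this sketch the `weight` parameter controls how strongly the policy is pulled toward actions the discriminator considers to be from the dataset. A larger value trades expected return for more conservative behavior.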
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
Methods: Diffusion
