DIME: Diffusion-Based Maximum Entropy Reinforcement Learning

Onur Celik; Zechu Li; Denis Blessing; Ge Li; Daniel Palenicek; Jan Peters; Georgia Chalvatzaki; Gerhard Neumann

arXiv:2502.02316 · cs.LG · June 11, 2025

Open Access

TL;DR

DIME introduces a diffusion-based maximum entropy reinforcement learning framework that enhances policy expressiveness and exploration, achieving superior performance on complex control tasks with reduced computational demands.

Contribution

The paper develops a novel diffusion-based MaxEnt-RL method with a provably convergent policy iteration scheme, overcoming entropy intractability and improving high-dimensional control performance.

Findings

01 Outperforms existing diffusion-based RL methods on challenging benchmarks.

02 Achieves competitive results with state-of-the-art non-diffusion RL methods.

03 Requires fewer algorithmic choices and less computation, simplifying implementation.

Abstract

Maximum entropy reinforcement learning (MaxEnt-RL) has become the standard approach to RL due to its beneficial exploration properties. Traditionally, policies are parameterized using Gaussian distributions, which significantly limits their representational capacity. Diffusion-based policies offer a more expressive alternative, yet integrating them into MaxEnt-RL poses challenges, primarily due to the intractability of computing their marginal entropy. To overcome this, we propose Diffusion-Based Maximum Entropy RL (DIME). DIME leverages recent advances in approximate inference with diffusion models to derive a lower bound on the maximum entropy objective. Additionally, we propose a policy iteration scheme that provably converges to the optimal diffusion policy. Our method enables the use of expressive diffusion-based policies while retaining the principled exploration benefits of…
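To make the entropy point concrete, the sketch below shows why Gaussian policies are convenient in MaxEnt-RL: their entropy has a closed form, which a sample-based estimate of -E[log π(a)] reproduces. This is only an illustration of the Gaussian baseline the abstract contrasts against, not DIME itself; the function names and the example means and standard deviations are assumptions made for this sketch. For a diffusion policy, the marginal density over actions is not available in closed form, which is the intractability the abstract refers to.

```python
import math
import random


def gaussian_entropy(stds):
    """Closed-form differential entropy (in nats) of a diagonal Gaussian."""
    return sum(0.5 * math.log(2.0 * math.pi * math.e * s * s) for s in stds)


def gaussian_log_prob(action, means, stds):
    """Log-density of a diagonal Gaussian evaluated at `action`."""
    return sum(
        -0.5 * ((a - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for a, m, s in zip(action, means, stds)
    )


def monte_carlo_entropy(means, stds, n_samples=20000, seed=0):
    """Estimate the entropy as -E[log pi(a)] using samples from the policy."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        action = [rng.gauss(m, s) for m, s in zip(means, stds)]
        total -= gaussian_log_prob(action, means, stds)
    return total / n_samples


# Hypothetical 2-D action distribution, chosen only for illustration.
means = [0.0, 1.0]
stds = [0.5, 2.0]

exact = gaussian_entropy(stds)
estimate = monte_carlo_entropy(means, stds)
print(f"closed form: {exact:.4f}  Monte Carlo: {estimate:.4f}")
```

The Monte Carlo estimate works here only because log π(a) can be evaluated exactly for a Gaussian; when that log-density is itself unavailable, as for the marginal of a diffusion policy, a bound such as the one the abstract describes is needed instead.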

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural Networks and Applications · Smart Grid Energy Management

Methods: Distance to Modelled Embedding · Diffusion