Decision Flow Policy Optimization

Jifeng Hu; Sili Huang; Siyuan Guo; Zhaogeng Liu; Li Shen; Lichao Sun; Hechang Chen; Yi Chang; Dacheng Tao

arXiv:2505.20350 · cs.LG · May 28, 2025

TL;DR

This paper introduces Decision Flow, a unified framework that integrates multi-modal action distribution modeling with policy optimization in reinforcement learning, leading to improved performance in offline RL tasks.

Contribution

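The paper proposes Decision Flow, a method that combines flow-based generative models with policy optimization in a single framework, so that multi-modal action distribution fitting and policy improvement are optimized together rather than in separate stages.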

Findings

01

Achieves or matches state-of-the-art performance in offline RL environments.

02

Effectively models complex multi-modal action distributions.

03

Demonstrates superior control in continuous action spaces.

Abstract

In recent years, generative models have shown remarkable capabilities across diverse fields, including images, videos, language, and decision-making. By applying powerful generative models such as flow-based models to reinforcement learning, we can effectively model complex multi-modal action distributions and achieve superior robotic control in continuous action spaces, surpassing the limitations of single-modal action distributions with traditional Gaussian-based policies. Previous methods usually adopt the generative models as behavior models to fit state-conditioned action distributions from datasets, with policy optimization conducted separately through additional policies using value-based sample weighting or gradient-based updates. However, this separation prevents the simultaneous optimization of multi-modal distribution fitting and policy improvement, ultimately hindering the…
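
The contrast the abstract draws between Gaussian policies and flow-based models can be illustrated with a toy sketch. This is not the paper's Decision Flow algorithm: it assumes a one-dimensional action dataset with two modes, a linear interpolation path between noise and data, and it swaps the learned velocity network for the exact closed-form velocity field of that path over a finite dataset. The names `gaussian_policy_fit`, `flow_velocity`, and `sample_flow` are hypothetical and exist only for this example.

```python
import numpy as np


def gaussian_policy_fit(actions):
    """Maximum-likelihood fit of a single Gaussian: returns (mean, std)."""
    return float(actions.mean()), float(actions.std())


def flow_velocity(x, t, data):
    """Exact marginal velocity for the path x_t = (1 - t) * x0 + t * x1,
    assuming x0 ~ N(0, 1) and x1 drawn uniformly from `data`."""
    sigma = 1.0 - t
    # Posterior weight of each data point a_i given x_t = x,
    # using p(x_t | x1 = a_i) = N(x; t * a_i, sigma^2).
    logits = -((x[:, None] - t * data[None, :]) ** 2) / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    # The conditional velocity is (a_i - x) / (1 - t); average it under w.
    return (w @ data - x) / sigma


def sample_flow(data, n, steps=100, t_end=0.99, seed=0):
    """Euler integration of the flow ODE from noise towards the data."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    ts = np.linspace(0.0, t_end, steps + 1)  # stop short of t = 1
    for t0, t1 in zip(ts[:-1], ts[1:]):
        x = x + (t1 - t0) * flow_velocity(x, t0, data)
    return x


# Toy bimodal "action" dataset: modes at -1 and +1 (an assumption for the demo).
rng = np.random.default_rng(0)
actions = rng.choice([-1.0, 1.0], size=500) + 0.05 * rng.standard_normal(500)

# A single Gaussian places its mean between the modes, where no action lies.
mu, std = gaussian_policy_fit(actions)
data_near_gaussian_mean = float(np.mean(np.abs(actions - mu) < 0.3))

# The flow sampler transports noise onto the two modes instead.
samples = sample_flow(actions, n=1000)
near_mode_fraction = float(np.mean(np.abs(np.abs(samples) - 1.0) < 0.3))

print("Gaussian fit (mean, std):", mu, std)
print("Fraction of dataset near the Gaussian mean:", data_near_gaussian_mean)
print("Fraction of flow samples near a mode:", near_mode_fraction)
```

In a learned setting the closed-form `flow_velocity` would be replaced by a state-conditioned network trained by regression on the conditional velocity. The sketch only shows why a flow can represent both modes while a single Gaussian averages them.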

Taxonomy

Topics: Big Data and Business Intelligence