FlowQ: Energy-Guided Flow Policies for Offline Reinforcement Learning
Marvin Alles, Nutan Chen, Patrick van der Smagt, Botond Cseke

TL;DR
FlowQ introduces an energy-guided flow matching method for offline reinforcement learning that improves training efficiency and performance without requiring guidance during inference.
Contribution
The paper presents energy-guided flow matching, an approach that incorporates guidance into the training of flow models, and introduces FlowQ, an offline RL algorithm whose training time stays constant regardless of the number of flow sampling steps.
Findings
Achieves competitive performance in offline RL tasks.
Training time remains constant regardless of flow sampling steps.
Effectively models multi-modal action distributions.
Abstract
The use of guidance to steer sampling toward desired outcomes has been widely explored within diffusion models, especially in applications such as image and trajectory generation. However, incorporating guidance during training remains relatively underexplored. In this work, we introduce energy-guided flow matching, a novel approach that enhances the training of flow models and eliminates the need for guidance at inference time. We learn a conditional velocity field corresponding to the flow policy by approximating an energy-guided probability path as a Gaussian path. Learning guided trajectories is appealing for tasks where the target distribution is defined by a combination of data and an energy function, as in reinforcement learning. Diffusion-based policies have recently attracted attention for their expressive power and ability to capture multi-modal action distributions.…
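For readers new to flow policies, the sketch below shows plain conditional flow matching on a straight-line Gaussian path, followed by Euler sampling with a configurable number of steps. It is not the paper's energy-guided objective and includes no Q-function or energy term. The function names (`linear_path_sample`, `flow_matching_loss`, `euler_sample`) and the `velocity_fn(x, t)` interface are assumptions made for illustration only.

```python
import numpy as np


def linear_path_sample(x0, x1, t):
    """Return a point on the straight-line path from x0 to x1 and its target velocity.

    x0: noise samples, shape (batch, dim)
    x1: data samples (e.g. dataset actions), shape (batch, dim)
    t:  times in [0, 1], shape (batch,)
    """
    t = np.asarray(t, dtype=float).reshape(-1, 1)
    x_t = (1.0 - t) * x0 + t * x1   # interpolate between noise and data
    u_t = x1 - x0                   # velocity of the straight-line path
    return x_t, u_t


def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between the predicted and the target velocity."""
    t = np.asarray(t, dtype=float).reshape(-1)
    x_t, u_t = linear_path_sample(x0, x1, t)
    pred = velocity_fn(x_t, t)
    return float(np.mean(np.sum((pred - u_t) ** 2, axis=-1)))


def euler_sample(velocity_fn, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = np.full(x.shape[0], k * dt)
        x = x + dt * velocity_fn(x, t)
    return x
```

In this generic setup the loss is evaluated at a single random time per sample, so its cost does not depend on `num_steps`; only sampling does. How FlowQ folds the energy function into training is described in the paper itself and is not reproduced here.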
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Reinforcement Learning in Robotics · Human Pose and Action Recognition
Methods: Softmax · Attention Is All You Need · Diffusion
