Aligning Flow Map Policies with Optimal Q-Guidance

Christos Ziakas, Alessandra Russo, and Avishek Joey Bose

arXiv:2605.12416·cs.LG·May 13, 2026

TL;DR

This paper introduces flow map policies, a class of generative policies that produce actions in one or a few steps rather than many, and applies them to offline-to-online reinforcement learning with a Q-guided adaptation target.

Contribution

The paper proposes flow map policies for rapid action sampling, derives a new Q-guidance learning target, and develops a stochastic sampler for iterative inference in offline-to-online RL.

Findings

01. FMQ outperforms previous methods with a 21.3% success rate improvement.

02. Flow map policies can take arbitrary-size jumps during action generation, reducing inference latency.

03. The approach achieves state-of-the-art results across 12 robotic tasks.

Abstract

Generative policies based on expressive model classes, such as diffusion and flow matching, are well-suited to complex control problems with highly multimodal action distributions. Their expressivity, however, comes at a significant inference cost: generating each action typically requires simulating many steps of the generative process, compounding latency across sequential decision-making rollouts. We introduce flow map policies, a novel class of generative policies designed for fast action generation by learning to take arbitrary-size jumps (including one-step jumps) across the generative dynamics of existing flow-based policies. We instantiate flow map policies for offline-to-online reinforcement learning (RL) and formulate online adaptation as a trust-region optimization problem that improves the critic's Q-value while remaining close to the offline policy. We theoretically derive…
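
To make the idea of "jumps across the generative dynamics" concrete, here is a minimal sketch on a toy problem, not the authors' implementation. It assumes a hand-picked linear velocity field whose flow map is known in closed form; in the paper's setting the flow map would be a learned network conditioned on the state. The names `TARGET`, `RATE`, `velocity`, `euler_sample`, and `flow_map` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy setup: the flow contracts noise toward a fixed "action".
TARGET = np.array([0.5, -0.2])
RATE = 3.0


def velocity(x, t):
    """Toy velocity field v(x, t) = RATE * (TARGET - x); t is unused here."""
    return RATE * (TARGET - x)


def euler_sample(x0, n_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with n_steps Euler steps."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        x = x + dt * velocity(x, i * dt)
    return x


def flow_map(x, s, t):
    """Exact jump from time s to time t along the toy ODE (closed form)."""
    x = np.asarray(x, dtype=float)
    return TARGET + (x - TARGET) * np.exp(-RATE * (t - s))


noise = np.array([1.0, 1.0])

# One evaluation of the flow map replaces the whole integration loop.
one_jump = flow_map(noise, 0.0, 1.0)

# Jumps compose: going 0 -> 0.4 -> 1 lands at the same point as 0 -> 1.
two_jumps = flow_map(flow_map(noise, 0.0, 0.4), 0.4, 1.0)

# Step-by-step integration needs many steps to match the jump.
many_euler = euler_sample(noise, 1000)
few_euler = euler_sample(noise, 2)

err_many = np.abs(many_euler - one_jump).max()
err_few = np.abs(few_euler - one_jump).max()
print("one jump:", one_jump)
print("Euler error with 1000 steps:", err_many)
print("Euler error with 2 steps:", err_few)
```

In this toy case a single call to `flow_map` matches what the Euler sampler only approaches with many velocity evaluations, and intermediate jumps compose consistently. That is the property that lets a jump-based policy trade the number of steps against compute at inference time.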

Code & Models

Models

christoszi/flow-map-policies (Hugging Face model)