Meta Flow Maps enable scalable reward alignment

Peter Potaptchik; Adhi Saravanan; Abbas Mammadov; Alvaro Prat; Michael S. Albergo; Yee Whye Teh

arXiv:2601.14430·stat.ML·January 22, 2026

Open Access

TL;DR

Meta Flow Maps (MFMs) provide a scalable, efficient method for reward alignment in generative models by enabling stochastic posterior sampling, reducing computational costs in steering and fine-tuning tasks.

Contribution

We introduce Meta Flow Maps, a novel framework extending flow models to stochastic regimes, allowing efficient posterior sampling and improved reward alignment in generative models.

Findings

1. Steered-MFM outperforms baseline on ImageNet across multiple rewards.
2. MFMs enable inference-time steering without inner rollouts.
3. MFMs facilitate unbiased, off-policy fine-tuning for general rewards.

Abstract

Controlling generative models is computationally expensive. This is because optimal alignment with a reward function, whether via inference-time steering or fine-tuning, requires estimating the value function. This task demands access to the conditional posterior $p_{1|t}(x_1 \mid x_t)$, the distribution of clean data $x_1$ consistent with an intermediate state $x_t$, a requirement that typically compels methods to resort to costly trajectory simulations. To address this bottleneck, we introduce Meta Flow Maps (MFMs), a framework extending consistency models and flow maps into the stochastic regime. MFMs are trained to perform stochastic one-step posterior sampling, generating arbitrarily many i.i.d. draws of clean data $x_1$ from any intermediate state. Crucially, these samples provide a differentiable reparametrization that unlocks efficient value function estimation. We leverage this…
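
To make the link between one-step posterior samples and value function estimation concrete, here is a minimal sketch, not the paper's implementation. It assumes the soft value function $V_t(x_t) = \log \mathbb{E}_{x_1 \sim p_{1|t}(\cdot \mid x_t)}[\exp r(x_1)]$ that is common in KL-regularized alignment at unit temperature; the abstract does not state this form. The function `toy_posterior_sampler` is a hypothetical stand-in for a trained MFM: it uses a one-dimensional Gaussian toy problem where the posterior is known in closed form, so the Monte Carlo estimate can be checked against the exact value. All names and parameter values are illustrative.

```python
import numpy as np


def toy_posterior_sampler(x_t, t, eps):
    """Hypothetical stand-in for a trained one-step posterior sampler.

    Toy setting (an assumption for illustration): x_0 ~ N(0, 1), x_1 ~ N(0, 1),
    independent, with x_t = (1 - t) * x_0 + t * x_1. Then p(x_1 | x_t) is
    Gaussian, and a draw is a deterministic function of x_t and noise eps.
    """
    denom = (1.0 - t) ** 2 + t ** 2
    mean = t * x_t / denom
    std = (1.0 - t) / np.sqrt(denom)
    return mean + std * eps


def estimate_value(x_t, t, reward, n_samples, rng):
    """Monte Carlo estimate of log E[exp(r(x_1)) | x_t] from i.i.d. draws."""
    eps = rng.standard_normal(n_samples)
    x1 = toy_posterior_sampler(x_t, t, eps)  # one call per draw, no rollout
    r = reward(x1)
    r_max = r.max()  # subtract the max for a numerically stable log-mean-exp
    return r_max + np.log(np.mean(np.exp(r - r_max)))


rng = np.random.default_rng(0)
a = 0.5           # slope of the illustrative linear reward r(x) = a * x
x_t, t = 0.3, 0.6

v_mc = estimate_value(x_t, t, lambda x: a * x, 200_000, rng)

# Closed form for a Gaussian posterior and linear reward:
# log E[exp(a * x_1)] = a * mean + 0.5 * a**2 * variance.
denom = (1.0 - t) ** 2 + t ** 2
v_exact = a * t * x_t / denom + 0.5 * a ** 2 * (1.0 - t) ** 2 / denom

print(f"Monte Carlo: {v_mc:.4f}  exact: {v_exact:.4f}")
```

Because each draw is written as a deterministic function of $x_t$ and independent noise, the same estimator could be differentiated with respect to $x_t$ in an autodiff framework, which is the role the abstract assigns to the reparametrization.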

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare · Domain Adaptation and Few-Shot Learning