Test-time scaling of diffusions with flow maps

Amirmojtaba Sabour; Michael S. Albergo; Carles Domingo-Enrich; Nicholas M. Boffi; Sanja Fidler; Karsten Kreis; Eric Vanden-Eijnden

arXiv:2511.22688·cs.LG·December 1, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces Flow Map Trajectory Tilting (FMTT), a test-time method that improves diffusion model samples by using a flow map directly to optimize user-defined rewards, enabling better reward ascent and complex image editing.

Contribution

The paper proposes FMTT, an algorithm that uses flow maps for reward optimization in diffusion models at test time and that, per the abstract, provably achieves better reward ascent than standard test-time methods based on the reward gradient.

Findings

01

FMTT improves reward ascent over standard methods.

02

The approach enables exact sampling and reward maximization.

03

It facilitates complex image editing with vision-language models.

Abstract

A common recipe to improve diffusion models at test-time so that samples score highly against a user-specified reward is to introduce the gradient of the reward into the dynamics of the diffusion itself. This procedure is often ill posed, as user-specified rewards are usually only well defined on the data distribution at the end of generation. While common workarounds to this problem are to use a denoiser to estimate what a sample would have been at the end of generation, we propose a simple solution to this problem by working directly with a flow map. By exploiting a relationship between the flow map and velocity field governing the instantaneous transport, we construct an algorithm, Flow Map Trajectory Tilting (FMTT), which provably performs better ascent on the reward than standard test-time methods involving the gradient of the reward. The approach can be used to either perform…
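The contrast the abstract draws between a denoiser estimate and a flow-map look-ahead can be illustrated with a toy example. The sketch below is not the paper's FMTT algorithm; it assumes a one-dimensional setting chosen purely for illustration (data equal to +1 or -1 with equal probability, standard Gaussian noise, the linear interpolant x_t = (1 - t) x_0 + t x_1, and a hypothetical reward peaked at +1). In that setting the posterior-mean denoiser and the velocity field have closed forms, and the flow map is approximated by Euler integration. All function names are hypothetical.

```python
import numpy as np

def denoiser(x, t):
    """Posterior mean E[x_1 | x_t = x] for the assumed toy model:
    x_1 = +1 or -1 with equal probability, x_0 ~ N(0, 1),
    x_t = (1 - t) * x_0 + t * x_1."""
    return np.tanh(t * x / (1.0 - t) ** 2)

def velocity(x, t):
    """Velocity b_t(x) = E[x_1 - x_0 | x_t = x] = (E[x_1 | x_t = x] - x) / (1 - t)."""
    return (denoiser(x, t) - x) / (1.0 - t)

def flow_map(x, s, n_steps=200):
    """Approximate the flow map X_{s,1}(x) by Euler steps on dx/dt = b_t(x).
    The grid stops one step short of t = 1, so (1 - t) is never zero."""
    h = (1.0 - s) / n_steps
    for n in range(n_steps):
        t = s + n * h
        x = x + h * velocity(x, t)
    return x

def reward(x):
    """Hypothetical reward that is only meaningful on clean data; peaks at x = +1."""
    return -(x - 1.0) ** 2

# Early in generation (s close to 0) the two look-ahead estimates disagree.
s, x_s = 0.1, 0.3
mean_guess = denoiser(x_s, s)   # posterior mean: an average over both modes
endpoint = flow_map(x_s, s)     # flow-map look-ahead: a point on the data support

print("denoiser look-ahead:", mean_guess, "reward:", reward(mean_guess))
print("flow-map look-ahead:", endpoint, "reward:", reward(endpoint))
```

In this toy model the posterior mean at small s sits between the two modes, where the reward is not evaluated on a plausible data point, while the integrated flow lands on one of the modes. Whether the same picture carries over to learned flow maps in high dimensions depends on approximation error that this sketch does not model.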

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 4 · Confidence 2

Strengths

- Important problem: reward conditioning in diffusion models and related generative models is a fundamental question with far-reaching implications for various fields like image and video generation, robotics and scientific applications.
- Non-trivial, theoretically principled technique: the construction (and comprehension) of FMTT demands a fair amount of sophistication in stochastic analysis, PDEs and Monte Carlo methods. As such, most researchers in the field would not have been able to arri

Weaknesses

- Unclear motivation for why flow maps are more appropriate than diffusion models for reward guidance: in section 2.2, the authors claim that passing a noisy latent $x_s$ through a denoiser $D$ before computing the reward is inappropriate due to the denoiser providing little information early on in the dynamics. It is not a priori clear to me why using the flow map $X_{s,1}$ instead would not share the same problem for small $s$. It would be good if the authors could clarify this.
- The presen

Reviewer 02 · Rating 10 · Confidence 3

Strengths

This paper seems to present a well-reasoned proof that flow maps can be used to apply rewards to diffusion models with look-ahead, so that the rewards are more accurate than if they were applied to the noisy samples near t=0. While I did not check all the technical details, I did not find any errors in the math, and it makes sense. The experimental results are excellent, and show multiple creative uses of the rewards, especially in ways that are out of distribution for previous benchmarks. Barri

Weaknesses

I did not find any major weaknesses. I only have a few suggestions for improving the clarity slightly:
- Fig. 2 needs labelling of the time axis.
- Nabla is overloaded, both as an operator and stand-alone symbol (composed with dot product); although it is common notation, due to the overloading it would be safer to define it in all cases.
- Fig. 5: It would be more convincing to present a scatter plot of thermodynamic lengths vs. reward across samples, instead of just average bar plots, which

Reviewer 03 · Rating 0 · Confidence 4

Strengths

I think the idea is straightforward to follow, and the proofs are correct and easy to follow (I checked all the derivations in Section 2 apart from Propositions 2.1 and 2.2).

Weaknesses

1. The method is mostly an application of known SMC / importance weighting ideas to reward-guided diffusion, swapping in a learned flow map to get a better guess of the final clean sample. The paper over-markets this as fundamentally new.
2. The theoretical section restates standard Jarzynski/SMC logic but does not analyze estimator variance, practical degeneracy, or approximation error in the learned flow map, which are the actual hard problems.
3. The paper leans on buzzwords (“test-time sc

Taxonomy

Topics: Advanced Neuroimaging Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Medical Image Segmentation Techniques