On Variance Reduction in Learning Mean Flows
Juanwu Lu, Ziran Wang

TL;DR
This paper analyzes the instability in MeanFlow training for generative models, deriving an optimal coefficient for variance reduction that improves sample quality and FID scores.
Contribution
It establishes a theoretical understanding of the variance issues in MeanFlow training and proposes an optimal coefficient to improve stability and performance.
Findings
Optimal coefficient reduces variance and improves sample quality by up to 54%.
Controlled coefficient sweep confirms the bias-variance trade-off predicted by theory.
FID is minimized at a different coefficient than the one that minimizes variance, revealing a landscape mismatch.
Abstract
One-step generative modeling has emerged as a leading approach to amortize the inference cost of diffusion and flow-matching models. Among distillation-free methods, MeanFlow training is notoriously unstable, with non-decreasing loss and unbounded gradient variance. In this work, we establish a theory that attributes this pathology to a misuse of the conditional velocity field: it plays two distinct statistical roles in the loss, both as an unbiased regression target and as a Monte Carlo control variate inside a Jacobian-vector product, with the original loss assigning the wrong coefficient to the latter. We derive the optimal coefficient in closed form, and show that a family of fixes in concurrent works corresponds to different practical realizations of the same optimum. A controlled sweep of this coefficient on two-dimensional benchmarks and on a latent Diffusion Transformer recovers…
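For readers unfamiliar with the term, the sketch below illustrates the textbook notion of a Monte Carlo control variate and its variance-minimizing coefficient, c* = Cov(f, g) / Var(g). It is not the paper's MeanFlow loss or its closed-form result: the integrand exp(U), the control variate U, and the helper name `optimal_cv_coefficient` are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def optimal_cv_coefficient(f, g):
    """Textbook variance-minimizing coefficient c* = Cov(f, g) / Var(g).

    Hypothetical helper for illustration; not taken from the paper.
    """
    cov = np.cov(f, g, ddof=1)  # 2x2 sample covariance matrix of (f, g)
    return cov[0, 1] / cov[1, 1]


rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, size=100_000)

f = np.exp(u)  # toy integrand (assumption); its true mean is e - 1
g = u          # toy control variate (assumption) with known mean 0.5

c_star = optimal_cv_coefficient(f, g)

# Control-variate estimator samples: subtract c * (g - E[g]).
# Because E[g] is known exactly here, the mean is unchanged for any c.
corrected = f - c_star * (g - 0.5)

var_plain = f.var(ddof=1)
var_cv = corrected.var(ddof=1)

# Sweep the coefficient to see how per-sample variance changes away from c*.
sweep = {c: (f - c * (g - 0.5)).var(ddof=1) for c in (0.0, 1.0, c_star, 2.5)}

print(f"c* = {c_star:.3f}")
print(f"variance without control variate: {var_plain:.5f}")
print(f"variance with control variate:    {var_cv:.5f}")
```

In this toy setting the estimator stays unbiased for every coefficient and only the variance changes, so a wrong coefficient costs variance rather than accuracy. The bias-variance trade-off reported in the Findings belongs to the paper's own setting and is not reproduced by this sketch.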
