Gradient Variance Reveals Failure Modes in Flow-Based Generative Models
Teodora Reu, Sixtine Dromigny, Michael Bronstein, Francisco Vargas

TL;DR
Flow-based generative models trained with straight-path objectives can memorize arbitrary training pairings under deterministic training, a failure mode driven by low gradient variance; adding small noise during training mitigates it.
Contribution
The paper analyzes how low gradient variance causes memorization in flow models and demonstrates that noise injection can restore proper generalization.
Findings
Deterministic training leads to memorization of training pairs.
Adding small noise during training improves generalization.
Memorization occurs even when interpolant lines intersect.
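To make "interpolant lines intersect" concrete, here is a minimal sketch in one dimension. The two training pairs and the helper names `interpolant` and `crossing_time` are illustrative assumptions, not taken from the paper. The two pairs swap endpoints, so their straight lines meet at the same point at the same time while carrying opposite velocities.

```python
import numpy as np

def interpolant(x0, x1, t):
    """Point at time t on the straight line from x0 (t=0) to x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def crossing_time(pair_a, pair_b):
    """Time t in [0, 1] at which two 1-D interpolant lines meet, or None."""
    (a0, a1), (b0, b1) = pair_a, pair_b
    # Solve (1 - t) * a0 + t * a1 == (1 - t) * b0 + t * b1 for t.
    denom = (a1 - a0) - (b1 - b0)
    if denom == 0:
        return None  # equal velocities: the lines never cross
    t = (b0 - a0) / denom
    return t if 0.0 <= t <= 1.0 else None

# Hypothetical training pairs (source, target) with swapped endpoints.
pair_a = (-1.0, 1.0)
pair_b = (1.0, -1.0)

t_cross = crossing_time(pair_a, pair_b)
x_cross_a = interpolant(*pair_a, t_cross)
x_cross_b = interpolant(*pair_b, t_cross)
velocity_a = pair_a[1] - pair_a[0]  # straight-path velocity of pair a
velocity_b = pair_b[1] - pair_b[0]  # straight-path velocity of pair b
```

At the crossing, both lines pass through the same point in space and time but with different velocity targets, which is the situation the finding refers to.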
Abstract
Rectified Flows learn ODE vector fields whose trajectories are straight between source and target distributions, enabling near one-step inference. We show that this straight-path objective conceals fundamental failure modes: under deterministic training, low gradient variance drives memorization of arbitrary training pairings, even when interpolant lines between pairs intersect. To analyze this mechanism, we study Gaussian-to-Gaussian transport and use the loss gradient variance across stochastic and deterministic regimes to characterize which vector fields optimization favors in each setting. We then show that, in a setting where all interpolating lines intersect, applying Rectified Flow yields the same specific pairings at inference as during training. More generally, we prove that a memorizing vector field exists even when training interpolants intersect, and that optimizing the…
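As a rough illustration of the straight-path objective described in the abstract, the sketch below builds linear interpolants between paired samples and scores a candidate vector field against the displacement of each pair. It is a NumPy toy under stated assumptions, not the authors' implementation: the function names, the translation pairing `x1 = x0 + shift`, and the two hand-written fields are all hypothetical.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Points on the straight line from x0 (t=0) to x1 (t=1); t has shape (n,)."""
    t = t[:, None]
    return (1.0 - t) * x0 + t * x1

def straight_path_loss(velocity_fn, x0, x1, t):
    """Mean squared error between the predicted velocity and the displacement x1 - x0."""
    x_t = interpolate(x0, x1, t)
    residual = velocity_fn(x_t, t) - (x1 - x0)
    return float(np.mean(np.sum(residual ** 2, axis=1)))

rng = np.random.default_rng(0)
shift = np.array([3.0, -1.0])
x0 = rng.normal(size=(8, 2))   # source samples from a standard Gaussian
x1 = x0 + shift                # assumed pairing: each source point is translated
t = rng.uniform(size=8)        # one interpolation time per pair

# Two hand-written candidate fields (illustrative only).
constant_field = lambda x, t: np.broadcast_to(shift, x.shape)
zero_field = lambda x, t: np.zeros_like(x)

loss_constant = straight_path_loss(constant_field, x0, x1, t)
loss_zero = straight_path_loss(zero_field, x0, x1, t)
```

With this particular pairing every pair has the same displacement, so the constant field fits the targets essentially exactly, while the zero field pays the squared norm of `shift` on every pair.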
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference
