TL;DR
Weighted Conditional Flow Matching (W-CFM) introduces a weighting scheme inspired by entropic optimal transport that straightens the flow paths of continuous normalizing flows, achieving comparable or better sample quality at the computational cost of vanilla CFM.
Contribution
W-CFM modifies the classical CFM loss with a Gibbs kernel weighting, linking it to entropic OT and avoiding the batch-size limitations of mini-batch OT methods in flow training.
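As a rough illustration of what weighting the CFM loss with a Gibbs kernel could look like, here is a minimal NumPy sketch. It is not the authors' implementation. It rests on several assumptions: the cost is the squared Euclidean distance, the conditional path is the linear interpolation between the two samples, and the weights are left unnormalized. The names `gibbs_weights`, `weighted_cfm_loss`, and `v_fn` are hypothetical.

```python
import numpy as np

def gibbs_weights(x0, x1, eps):
    """Gibbs kernel weight exp(-c(x0, x1) / eps) for each pair.

    Assumption: c is the squared Euclidean distance.
    """
    cost = np.sum((x0 - x1) ** 2, axis=1)
    return np.exp(-cost / eps)

def weighted_cfm_loss(v_fn, x0, x1, t, eps):
    """Weighted flow-matching loss on independently sampled pairs (x0, x1).

    Assumptions: linear interpolation path x_t = (1 - t) x0 + t x1 with
    target velocity x1 - x0, and unnormalized Gibbs weights.
    v_fn(t, x) is any callable returning a velocity with the shape of x.
    """
    xt = (1.0 - t)[:, None] * x0 + t[:, None] * x1
    target = x1 - x0
    per_pair = np.sum((v_fn(t, xt) - target) ** 2, axis=1)
    return np.mean(gibbs_weights(x0, x1, eps) * per_pair)
```

Under these assumptions, a very large `eps` drives every weight toward 1 and the expression reduces to the unweighted CFM loss, while a small `eps` concentrates the loss on pairs that are already close to each other.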
Findings
W-CFM achieves comparable or better sample quality than baselines.
W-CFM maintains the computational efficiency of vanilla CFM.
W-CFM overcomes batch size bottlenecks in flow training.
Abstract
Conditional flow matching (CFM) has emerged as a powerful framework for training continuous normalizing flows due to its computational efficiency and effectiveness. However, standard CFM often produces paths that deviate significantly from straight-line interpolations between prior and target distributions, making generation slower and less accurate due to the need for fine discretization at inference. Recent methods enhance CFM performance by inducing shorter and straighter trajectories but typically rely on computationally expensive mini-batch optimal transport (OT). Drawing insights from entropic optimal transport (EOT), we propose Weighted Conditional Flow Matching (W-CFM), a novel approach that modifies the classical CFM loss by weighting each training pair with a Gibbs kernel. We show that this weighting recovers the entropic OT coupling up to some bias in the marginals,…
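The "bias in the marginals" mentioned in the abstract can be illustrated on a toy discrete problem. The sketch below is only an illustration under stated assumptions, not the paper's construction: the two distributions live on three points, the cost is the squared distance, and the independent coupling is reweighted by the Gibbs kernel and then normalized. The function name `reweighted_marginals` and all numbers are hypothetical.

```python
import numpy as np

def reweighted_marginals(mu, nu, cost, eps):
    """Marginals of the coupling proportional to mu(x) nu(y) exp(-cost(x, y) / eps)."""
    kernel = np.exp(-cost / eps)                # Gibbs kernel on the support grid
    plan = mu[:, None] * nu[None, :] * kernel   # reweighted independent coupling
    plan = plan / plan.sum()                    # normalize to a probability measure
    return plan.sum(axis=1), plan.sum(axis=0)

# Toy supports and probability vectors (illustrative values only).
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([0.0, 1.0, 2.0])
mu = np.array([0.5, 0.3, 0.2])
nu = np.array([0.2, 0.3, 0.5])
cost = (xs[:, None] - ys[None, :]) ** 2        # assumed squared-distance cost

mu_small, nu_small = reweighted_marginals(mu, nu, cost, eps=1.0)
mu_large, nu_large = reweighted_marginals(mu, nu, cost, eps=1e6)
```

In this toy setting, `mu_small` differs visibly from `mu`, which is the tilting effect, whereas `mu_large` is nearly identical to `mu` because the kernel is almost constant when `eps` is large.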
Peer Reviews
Decision·Submitted to ICLR 2026
- The authors propose a method for training a CFM model that approximates an entropic optimal transport plan for training data pairs, without the extra compute required by OT-CFM/EOT-CFM.
- The authors test their approach on several data types and analyze different aspects of CFM models.
- The method delivers decent results on image data, outperforming other CFM approaches in both fidelity and diversity.
The main weakness of the method is **“tilted” marginals**:
- There are no clearly identified cases in which the marginals are not tilted (except for the very limited case described in Proposition 2). In contrast, the OT-CFM/EOT-CFM methods are asymptotically unbiased: as the batch size grows to infinity, they recover the ground-truth OT or EOT plan. W-CFM has no parameter that guarantees the absence of tilting.
- In that light the claims of authors that distributions are not “tilted” signific
The authors propose a method that approximates EOT-CFM by simply reweighting pairs sampled from an independent plan, thus avoiding the need for mini-batch OT computations. They provide a theoretical justification for its equivalence to OT-CFM in the large-batch limit. However, this reweighting results in tilted marginals, i.e., $\tilde{\mu}_\varepsilon(dx)=\frac{\exp(-\phi_\varepsilon(x))}{Z^1_\varepsilon}\mu(dx)$ (eq. 10), where $\mu(dx)$ is the original marginal, so the authors analyze the condi
### **Terminology**
- In the statement of Theorem 1, there are $\phi, \psi$, whereas in Equation (2) they appear as $\phi_\varepsilon, \psi_\varepsilon$. Moreover, Theorem 4.2 from [1] states that $\phi, \psi$ are called _EOT potentials_, not _Schrödinger potentials_, and Remark 4.3 therein notes that there is already some inconsistency in the naming.
### **Theory**
- As I understand it, $\mathcal{L}_{\text{W-CFM}}$ can be considered an accurate approximation of $\mathcal{L}_{\text{EOT-CFM}
1. The topic of this paper is essential. Making the FM path straighter is key to reducing NFE and accelerating the sampling process while maintaining high generation quality. Reflow is one way, but it suffers from two-stage training and requires traversing the entire reverse process. W-CFM is one-stage and does not necessarily traverse the entire reverse process, which is interesting.
However, this paper raises some major concerns:
1. The motivation for introducing the Gibbs kernel to achieve a straighter path is unconvincing. First, the paper does not offer any theoretical justification for this motivation. Second, the experimental results do not support it either. For example, Fig. 2 shows that the path of OT-CFM is straighter than that of W-CFM. The image generation tasks also show that W-CFM does not achieve this goal. Under the same NFE, if
