On Flow Matching KL Divergence

Maojiang Su; Jerry Yao-Chieh Hu; Sophia Pi; Han Liu

arXiv:2511.05480·cs.LG·November 10, 2025

TL;DR

This paper provides a theoretical analysis of flow matching methods, establishing bounds on KL divergence and demonstrating near-optimal efficiency in estimating smooth distributions, supported by numerical experiments.

Contribution

The paper derives a non-asymptotic KL divergence bound for flow matching and uses it to obtain statistical convergence rates, showing that the statistical efficiency of flow matching is comparable to that of diffusion models under the TV distance.

Findings

01

KL divergence is bounded by $A_{1} ϵ + A_{2} ϵ^{2}$ (linear and quadratic in $ϵ$) when the $L_{2}$ flow-matching loss is at most $ϵ^{2}$

02

Flow matching achieves near-minimax optimal efficiency for smooth distributions

03

Numerical results support theoretical bounds and efficiency claims

Abstract

We derive a deterministic, non-asymptotic upper bound on the Kullback-Leibler (KL) divergence of the flow-matching distribution approximation. In particular, if the $L_{2}$ flow-matching loss is bounded by $ϵ^{2} > 0$, then the KL divergence between the true data distribution and the estimated distribution is bounded by $A_{1} ϵ + A_{2} ϵ^{2}$. Here, the constants $A_{1}$ and $A_{2}$ depend only on the regularities of the data and velocity fields. Consequently, this bound implies statistical convergence rates of Flow Matching Transformers under the Total Variation (TV) distance. We show that flow matching achieves nearly minimax-optimal efficiency in estimating smooth distributions. Our results make the statistical efficiency of flow matching comparable to that of diffusion models under the TV distance. Numerical studies on synthetic and learned velocities corroborate our theory.
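As a rough illustration of how the bound in the abstract behaves, the sketch below evaluates $A_{1} ϵ + A_{2} ϵ^{2}$ for a few loss levels and converts the result to a TV bound. The conversion uses Pinsker's inequality, $\mathrm{TV} \le \sqrt{\mathrm{KL}/2}$, which is one standard route from KL to TV; whether the paper uses exactly this step is an assumption here. The function names and the values chosen for `A1` and `A2` are hypothetical placeholders, not constants from the paper.

```python
import math


def kl_upper_bound(eps: float, a1: float, a2: float) -> float:
    """Evaluate A1*eps + A2*eps^2, where eps^2 bounds the L2 flow-matching loss."""
    if eps < 0 or a1 < 0 or a2 < 0:
        raise ValueError("eps, a1 and a2 must be non-negative")
    return a1 * eps + a2 * eps ** 2


def tv_upper_bound(kl: float) -> float:
    """Pinsker's inequality: TV <= sqrt(KL / 2), capped at 1 because TV lies in [0, 1]."""
    if kl < 0:
        raise ValueError("kl must be non-negative")
    return min(1.0, math.sqrt(kl / 2.0))


if __name__ == "__main__":
    # Hypothetical constants: in the paper A1 and A2 depend on the
    # regularity of the data and velocity fields; these values are made up.
    A1, A2 = 1.0, 1.0
    for loss in (1e-2, 1e-4, 1e-6):
        eps = math.sqrt(loss)  # the loss is bounded by eps^2
        kl = kl_upper_bound(eps, A1, A2)
        tv = tv_upper_bound(kl)
        print(f"loss <= {loss:.0e}  eps = {eps:.0e}  KL <= {kl:.3e}  TV <= {tv:.3e}")
```

For small $ϵ$ the linear term $A_{1} ϵ$ dominates, so under these assumptions the KL bound shrinks like the square root of the loss bound rather than like the loss itself.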

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Statistical Methods and Inference