Convergence Analysis of Flow Matching in Latent Space with Transformers
Yuling Jiao, Yanming Lai, Yang Wang, Bokai Yan

TL;DR
This paper provides theoretical convergence guarantees for flow matching in latent space using transformers, showing that the distribution of generated samples converges to the target distribution in Wasserstein-2 distance under mild and practical assumptions.
Contribution
The paper develops a convergence analysis for ODE-based generative models that use transformers in latent space, including an error analysis of the estimated ODE flow and an approximation result for transformer networks.
Findings
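The distribution of samples generated via the estimated ODE flow converges to the target distribution in Wasserstein-2 distance
Transformer networks with Lipschitz continuity can effectively approximate arbitrary smooth functions
The convergence result holds under mild and practical assumptions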
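For reference, the Wasserstein-2 distance named in the findings is the standard optimal-transport metric between two probability distributions. The symbols below (\mu, \nu, \Pi) are notation chosen here for illustration and are not taken from the paper:

```latex
W_2(\mu, \nu) = \left( \inf_{\gamma \in \Pi(\mu, \nu)} \int \lVert x - y \rVert^2 \, \mathrm{d}\gamma(x, y) \right)^{1/2}
```

Here \Pi(\mu, \nu) denotes the set of all couplings of \mu and \nu, that is, joint distributions whose marginals are \mu and \nu.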
Abstract
We present theoretical convergence guarantees for ODE-based generative models, specifically flow matching. We use a pre-trained autoencoder network to map high-dimensional original inputs to a low-dimensional latent space, where a transformer network is trained to predict the velocity field of the transformation from a standard normal distribution to the target latent distribution. Our error analysis demonstrates the effectiveness of this approach, showing that the distribution of samples generated via estimated ODE flow converges to the target distribution in the Wasserstein-2 distance under mild and practical assumptions. Furthermore, we show that arbitrary smooth functions can be effectively approximated by transformer networks with Lipschitz continuity, which may be of independent interest.
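The sampling step described in the abstract, pushing standard normal noise through an ODE driven by a velocity field, can be sketched in a toy setting. This is a minimal illustration under stated assumptions and not the paper's method: the latent space is one-dimensional, the target latent distribution is a Gaussian N(m, s^2), the interpolation path is assumed to be linear, X_t = (1 - t) Z + t X_1, and the learned transformer is replaced by the closed-form velocity field that this path induces. All function names are hypothetical.

```python
import numpy as np

def velocity(x, t, m, s):
    """Closed-form velocity for the assumed path X_t = (1 - t) Z + t X_1,
    with Z ~ N(0, 1) and X_1 ~ N(m, s^2) independent.
    Stands in for the trained transformer velocity network."""
    var_t = (1.0 - t) ** 2 + (t * s) ** 2          # variance of X_t
    slope = (t * s ** 2 - (1.0 - t)) / var_t       # Cov(X_1 - Z, X_t) / Var(X_t)
    return m + slope * (x - t * m)                 # E[X_1 - Z | X_t = x]

def sample_euler(z, m, s, n_steps=200):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with forward Euler."""
    x = z.copy()
    h = 1.0 / n_steps
    for k in range(n_steps):
        x = x + h * velocity(x, k * h, m, s)
    return x

def gaussian_w2(mean_a, std_a, mean_b, std_b):
    """Wasserstein-2 distance between two 1-D Gaussians (closed form)."""
    return float(np.sqrt((mean_a - mean_b) ** 2 + (std_a - std_b) ** 2))

rng = np.random.default_rng(0)
m, s = 2.0, 0.5                                    # toy target latent distribution
z = rng.standard_normal(20000)                     # samples from the standard normal prior
x1 = sample_euler(z, m, s)                         # generated latent samples

# The flow is affine in x here, so the generated samples are Gaussian and the
# closed-form Gaussian W2 on their empirical mean and std is a fair error measure.
w2 = gaussian_w2(x1.mean(), x1.std(), m, s)
print(f"mean={x1.mean():.3f} std={x1.std():.3f} W2 to target={w2:.4f}")
```

In the setting the abstract describes, the generated latent samples would then be mapped back to the original space by the decoder of the pre-trained autoencoder, and the velocity field would be a transformer fitted from data, so the error would also include estimation and discretization terms of the kind the paper's analysis bounds.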
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis
