Support-Conditioned Flow Matching Is Kernel Smoothing
Daniel Matsui Smola

TL;DR
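This paper links cross-attention conditioning in generative models to kernel smoothing: under the Gaussian optimal-transport path, the exact velocity field induced by a finite support set is a Nadaraya–Watson kernel smoother whose bandwidth shrinks over flow time, which predicts the regimes where learned conditioning helps.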
Contribution
It connects support-conditioned flow matching to classical kernel theory, giving a theoretical framework for understanding cross-attention conditioning in generative models.
Findings
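The exact velocity field induced by a finite support set is a Nadaraya–Watson kernel smoother.
The kernel bandwidth decreases over flow time, from broad averaging at early steps to nearest-neighbor at late steps.
Learned conditioning improves in the three regimes where the theory predicts failure: high dimension, mismatch between the isotropic kernel and the data geometry, and insufficient support.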
Abstract
Generative models are often conditioned on a small set of examples via cross-attention. Under the Gaussian optimal-transport path, we show that the exact velocity field induced by a finite support set is a Nadaraya–Watson kernel smoother whose bandwidth decreases with flow time, from broad averaging at early steps to nearest-neighbor at late steps. A single Gaussian-kernel attention head exactly computes this field, connecting cross-attention conditioning to classical kernel theory. The theory predicts three failure regimes: nearest-neighbor collapse of the kernel at high dimension, mismatch between the isotropic kernel and the data geometry, and insufficient support for nonparametric estimation. Experiments on Gaussian mixtures, spherical shells, and DINOv2 ImageNet features confirm that learned conditioning improves in precisely these regimes, and that IP-Adapter's cross-attention…
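The abstract's central claim can be illustrated with a minimal NumPy sketch. The parameterisation below is an assumption, since this summary does not spell it out: it uses the common flow-matching convention x_t = (1 - t) x_0 + t x_1 with x_0 ~ N(0, I) and x_1 drawn uniformly from the support set. Under that convention the softmax weights over support points form a Gaussian kernel in x / t with bandwidth (1 - t) / t, which is the same computation as one attention head with squared-distance logits. The names `nw_weights` and `support_velocity` are hypothetical and are not taken from the paper.

```python
import numpy as np

def nw_weights(x, support, t):
    """Nadaraya-Watson weights over support points at flow time t in (0, 1).

    Assumed path: x_t = (1 - t) * x_0 + t * x_1, x_0 ~ N(0, I),
    x_1 uniform over the rows of `support`.
    """
    h = (1.0 - t) / t                  # kernel bandwidth, shrinks as t -> 1
    q = x / t                          # query, rescaled to the data scale
    # Gaussian-kernel attention logits: -||q - k_i||^2 / (2 h^2), keys = support
    logits = -np.sum((q - support) ** 2, axis=1) / (2.0 * h ** 2)
    logits -= logits.max()             # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()                 # softmax over support points

def support_velocity(x, support, t):
    """Marginal velocity under the assumed path: (E[x_1 | x_t = x] - x) / (1 - t)."""
    w = nw_weights(x, support, t)
    x1_hat = w @ support               # kernel-smoothed estimate of the endpoint
    return (x1_hat - x) / (1.0 - t)

support = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0]])

# Early in the flow the bandwidth is large, so the weights are close to uniform.
print(nw_weights(np.array([0.5, 0.1]), support, t=1e-3))

# Late in the flow the bandwidth is small, so nearly all weight sits on the
# support point nearest to x / t.
print(nw_weights(np.array([0.95, 0.05]), support, t=0.99))
```

Evaluating `nw_weights` at a few values of t between these two extremes shows the transition from broad averaging to nearest-neighbor selection that the abstract describes.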
