Small Transformers Compute Universal Metric Embeddings
Anastasis Kratsios, Valentin Debarnot, Ivan Dokmanić

TL;DR
This paper demonstrates that small neural networks called probabilistic transformers can effectively embed data from arbitrary metric spaces into Gaussian mixture representations with low distortion, avoiding the curse of dimensionality.
Contribution
It provides theoretical guarantees for probabilistic transformers to embed metric space data with low distortion, including bi-Hölder and bi-Lipschitz bounds, applicable to various structured datasets.
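For readers unfamiliar with the terminology, the block below states one common formulation of the bi-Hölder and bi-Lipschitz conditions for a map $f$ between metric spaces. The symbols $c$, $C$, $\alpha$, $\beta$, $d_X$, and $d_Y$ are notation assumed here for illustration; the paper's exact constants and exponents may be stated differently.

```latex
% Assumed notation: f maps (X, d_X) into (Y, d_Y).
% f is bi-H\"older if there are constants 0 < c \le C and exponents \alpha, \beta > 0 with
\[
  c \, d_X(x, y)^{\alpha}
  \;\le\; d_Y\bigl(f(x), f(y)\bigr)
  \;\le\; C \, d_X(x, y)^{\beta}
  \qquad \text{for all } x, y \in X .
\]
% The bi-Lipschitz case is \alpha = \beta = 1; the ratio C / c is then
% the distortion of f, and C / c = 1 means distances are preserved up to scale.
```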
Findings
Probabilistic transformers achieve low-distortion embeddings of metric data.
Embedding guarantees extend to Riemannian manifolds, metric trees, and graphs.
Transformers can embed into Gaussian mixtures with arbitrarily small distortion.
Abstract
We study representations of data from an arbitrary metric space $\mathcal{X}$ in the space of univariate Gaussian mixtures with a transport metric (Delon and Desolneux 2020). We derive embedding guarantees for feature maps implemented by small neural networks called probabilistic transformers. Our guarantees are of memorization type: we prove that a probabilistic transformer of depth about $n\log(n)$ and width about $n^2$ can bi-Hölder embed any $n$-point dataset from $\mathcal{X}$ with low metric distortion, thus avoiding the curse of dimensionality. We further derive probabilistic bi-Lipschitz guarantees, which trade off the amount of distortion and the probability that a randomly chosen pair of points embeds with that distortion. If $\mathcal{X}$'s geometry is sufficiently regular, we obtain stronger, bi-Lipschitz guarantees for all points in the dataset. As applications,…
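To make "metric distortion" concrete, here is a minimal Python sketch that measures the empirical bi-Lipschitz constants of a map on a finite dataset. It is not the paper's construction: the function names (`lipschitz_bounds`, `distortion`, `w2_gaussian`) and the toy embedding are hypothetical, and the target space is restricted to single univariate Gaussians (one-component mixtures), where the 2-Wasserstein distance has the closed form $\sqrt{(m_1 - m_2)^2 + (s_1 - s_2)^2}$.

```python
import math
from itertools import combinations


def w2_gaussian(g1, g2):
    """2-Wasserstein distance between univariate Gaussians given as (mean, std)."""
    (m1, s1), (m2, s2) = g1, g2
    return math.hypot(m1 - m2, s1 - s2)


def lipschitz_bounds(points, d_source, embed, d_target):
    """Return (c, C), the tightest constants such that
    c * d_source(x, y) <= d_target(embed(x), embed(y)) <= C * d_source(x, y)
    holds for every pair of distinct points in the finite dataset."""
    ratios = []
    for x, y in combinations(points, 2):
        dx = d_source(x, y)
        if dx == 0:
            raise ValueError("points must be pairwise distinct in the source metric")
        ratios.append(d_target(embed(x), embed(y)) / dx)
    return min(ratios), max(ratios)


def distortion(points, d_source, embed, d_target):
    """Empirical distortion C / c; infinite if two points collapse to one image."""
    c, C = lipschitz_bounds(points, d_source, embed, d_target)
    return math.inf if c == 0 else C / c


# Toy example (hypothetical): points on the real line, each sent to a
# Gaussian with mean 2x and unit standard deviation.
points = [0.0, 1.0, 3.0]
line_metric = lambda x, y: abs(x - y)
toy_embed = lambda x: (2.0 * x, 1.0)

print(lipschitz_bounds(points, line_metric, toy_embed, w2_gaussian))  # (2.0, 2.0)
print(distortion(points, line_metric, toy_embed, w2_gaussian))        # 1.0
```

The toy map scales every distance by exactly 2, so both constants coincide and the distortion is 1. Handling mixtures with several components would require the mixture transport metric of Delon and Desolneux in place of `w2_gaussian`, which this sketch does not implement.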
Taxonomy
Topics: Medical Imaging and Analysis · AI in cancer detection · Generative Adversarial Networks and Image Synthesis
