Kernel Distribution Embeddings: Universal Kernels, Characteristic Kernels and Kernel Metrics on Distributions
Carl-Johann Simon-Gabriel, Bernhard Schölkopf

TL;DR
This paper investigates the properties of kernel mean embeddings, focusing on conditions for injectivity, the types of measures that can be embedded, and how the induced metrics relate to other topologies, unifying and extending existing results.
Contribution
The paper provides a unified framework for understanding when kernel mean embeddings are injective and how they extend from measures to Schwartz distributions, and it characterizes the kernels whose induced metrics are compatible with weak convergence.
Findings
Characterizes when kernel embeddings are injective.
Extends kernel mean embeddings from measures to Schwartz distributions (generalised measures).
Shows that characteristic kernels metrize weak convergence under certain conditions.
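For orientation, the standard definitions behind these findings can be written out as follows. This is a sketch in conventional notation, not quoted from the paper: the symbols \Phi_k (embedding map), H_k (the RKHS of the kernel k), and d_k (induced semi-metric) are assumed names, and the integrals are taken to exist, for example for a bounded measurable kernel and finite measures.

```latex
% Kernel mean embedding of a measure \mu into the RKHS H_k
\Phi_k(\mu) = \int k(\cdot, x)\, \mathrm{d}\mu(x) \in H_k

% Induced semi-metric between two embedded measures
d_k(\mu, \nu) = \bigl\| \Phi_k(\mu) - \Phi_k(\nu) \bigr\|_{H_k}

% Expansion via the reproducing property
d_k(\mu, \nu)^2
  = \iint k(x, x')\, \mathrm{d}\mu(x)\, \mathrm{d}\mu(x')
  + \iint k(y, y')\, \mathrm{d}\nu(y)\, \mathrm{d}\nu(y')
  - 2 \iint k(x, y)\, \mathrm{d}\mu(x)\, \mathrm{d}\nu(y)

% k is called characteristic when \Phi_k is injective on probability measures:
d_k(P, Q) = 0 \;\Longrightarrow\; P = Q
```

Injectivity of \Phi_k on a given set of measures is exactly the condition under which d_k is a metric on that set rather than only a semi-metric.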
Abstract
Kernel mean embeddings have recently attracted the attention of the machine learning community. They map measures $\mu$ from some set $M$ to functions in a reproducing kernel Hilbert space (RKHS) with kernel $k$. The RKHS distance of two mapped measures is a semi-metric $d_k$ over $M$. We study three questions. (I) For a given kernel, what sets $M$ can be embedded? (II) When is the embedding injective over $M$ (in which case $d_k$ is a metric)? (III) How does the $d_k$-induced topology compare to other topologies on $M$? The existing machine learning literature has addressed these questions in cases where $M$ is (a subset of) the finite regular Borel measures. We unify, improve and generalise those results. Our approach naturally leads to continuous and possibly even injective embeddings of (Schwartz-) distributions, i.e., generalised measures, but the reader is free to focus on…
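As a concrete illustration of the RKHS distance described in the abstract, the sketch below computes it between the empirical measures of two samples. It is a minimal example under stated assumptions, not code from the paper: the Gaussian kernel, the bandwidth value, the one-dimensional samples, and the function names `gaussian_kernel`, `mean_kernel`, and `kernel_distance` are all illustrative choices.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel on the real line (an assumed, illustrative kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average of k(x, y) over all pairs, i.e. the RKHS inner product
    of the embeddings of the two empirical measures."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def kernel_distance(xs, ys, bandwidth=1.0):
    """RKHS distance between the embedded empirical measures of xs and ys."""
    squared = (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


rng = random.Random(0)
sample_p = [rng.gauss(0.0, 1.0) for _ in range(200)]       # draws from N(0, 1)
sample_p_again = [rng.gauss(0.0, 1.0) for _ in range(200)]  # fresh draws from N(0, 1)
sample_q = [rng.gauss(2.0, 1.0) for _ in range(200)]        # draws from N(2, 1)

d_same = kernel_distance(sample_p, sample_p_again)
d_shifted = kernel_distance(sample_p, sample_q)
print(f"same distribution:    {d_same:.4f}")
print(f"shifted distribution: {d_shifted:.4f}")
```

The distance between two samples from the same distribution should come out much smaller than the distance to the shifted sample. Whether a vanishing distance forces the two underlying measures to coincide is the injectivity question (II) of the abstract.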
Taxonomy
Topics: Image and Signal Denoising Methods · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
