Structure preservation via the Wasserstein distance

Daniel Bartl; Shahar Mendelson

arXiv:2209.07058 · math.ST · January 13, 2025 · 1 citation


TL;DR

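Given $m$ independent copies of a random vector $X \in \mathbb{R}^{d}$, with high probability the sorted values of every empirical marginal $(\langle X_{i}, \theta \rangle)_{i=1}^{m}$ lie within $c (d/m)^{1/4}$, in a normalized $\ell_2$ sense, of the corresponding quantile averages of the true marginal $\langle X, \theta \rangle$, uniformly over all directions $\theta \in S^{d-1}$.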

Contribution

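It gives a sharp high-probability estimate on the worst-case Wasserstein distance, over all directions, between a one-dimensional marginal of a random vector and its empirical counterpart, and derives from it the optimal $(d/m)^{1/4}$ rate for the rearranged coordinates.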

Findings

01

High-probability Wasserstein distance bound for marginals

02

Error rate of (d/m)^{1/4} is proven to be optimal

03

Result holds under minimal assumptions on the random vector
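
The link between a Wasserstein bound (finding 01) and a bound on sorted coordinates can be sketched with standard one-dimensional optimal transport facts. This is an illustrative derivation, not taken from the paper, and it assumes the Wasserstein distance in question is of order 2. Here $\mu$ is a law on $\mathbb{R}$ with quantile function $F^{-1}$, $\mu_m$ is the empirical measure of points $x_1, \dots, x_m$, $x^\sharp$ is their non-decreasing rearrangement, and $\lambda_i = m \int_{(\frac{i-1}{m}, \frac{i}{m}]} F^{-1}(u) \, du$. Since the quantile function of $\mu_m$ equals $x_i^\sharp$ on $(\frac{i-1}{m}, \frac{i}{m}]$,

\[
W_2^2(\mu_m, \mu) = \sum_{i=1}^m \int_{(\frac{i-1}{m}, \frac{i}{m}]} \left| x_i^\sharp - F^{-1}(u) \right|^2 du \geq \frac{1}{m} \sum_{i=1}^m \left| x_i^\sharp - \lambda_i \right|^2,
\]

where the inequality is Jensen's inequality applied on each interval of length $1/m$. Under these assumptions, any bound on $W_2(\mu_m, \mu)$ that is uniform over directions carries over to the rearranged coordinates.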

Abstract

We show that under minimal assumptions on a random vector $X \in \mathbb{R}^{d}$ and with high probability, given $m$ independent copies of $X$, the coordinate distribution of each vector $(\langle X_{i}, \theta \rangle)_{i=1}^{m}$ is dictated by the distribution of the true marginal $\langle X, \theta \rangle$. Specifically, we show that with high probability, \[\sup_{\theta \in S^{d-1}} \left( \frac{1}{m}\sum_{i=1}^m \left|\langle X_i,\theta \rangle^\sharp - \lambda^\theta_i \right|^2 \right)^{1/2} \leq c \left( \frac{d}{m} \right)^{1/4},\] where $\lambda_{i}^{\theta} = m \int_{(\frac{i-1}{m}, \frac{i}{m}]} F_{\langle X, \theta \rangle}^{-1}(u) \, du$ and $a^{\sharp}$ denotes the monotone non-decreasing rearrangement of $a$. Moreover, this estimate is optimal. The proof follows from a sharp estimate on the worst Wasserstein distance between a marginal of $X$ and its empirical counterpart,…
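
To make the quantity in the displayed inequality concrete, here is a minimal Python sketch for one special case: $X$ standard Gaussian in $\mathbb{R}^d$, so that every marginal $\langle X, \theta \rangle$ is $N(0,1)$ and $\lambda_i^\theta$ has a closed form that does not depend on $\theta$. The function names, the Gaussian choice, and the replacement of the supremum by a maximum over finitely many random directions are all assumptions made for illustration; the maximum only gives a lower estimate of the supremum, and nothing here reproduces the paper's proof or constants.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal N(0, 1)


def gaussian_quantile_averages(m):
    """lambda_i = m * integral of Phi^{-1}(u) over ((i-1)/m, i/m] for N(0, 1).

    Uses the identity: integral_a^b Phi^{-1}(u) du = phi(Phi^{-1}(a)) - phi(Phi^{-1}(b)).
    """
    def phi_at_quantile(u):
        # phi(Phi^{-1}(u)) tends to 0 as u -> 0 or u -> 1.
        if u <= 0.0 or u >= 1.0:
            return 0.0
        return _STD.pdf(_STD.inv_cdf(u))

    edges = [phi_at_quantile(i / m) for i in range(m + 1)]
    return [m * (edges[i - 1] - edges[i]) for i in range(1, m + 1)]


def random_direction(d, rng):
    """Uniform random point on the unit sphere S^{d-1} (normalized Gaussian)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def rearrangement_error(sample, theta, lam):
    """((1/m) * sum_i |<X_i, theta>^sharp - lambda_i|^2)^(1/2) for one direction."""
    proj = sorted(sum(x * t for x, t in zip(row, theta)) for row in sample)
    m = len(proj)
    return math.sqrt(sum((p - l) ** 2 for p, l in zip(proj, lam)) / m)


def worst_error_over_directions(d, m, n_directions, seed=0):
    """Max of the error over finitely many random directions.

    This is only a lower estimate of the supremum over the whole sphere.
    """
    rng = random.Random(seed)
    sample = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    lam = gaussian_quantile_averages(m)
    return max(
        rearrangement_error(sample, random_direction(d, rng), lam)
        for _ in range(n_directions)
    )
```

Calling, for instance, `worst_error_over_directions(d=5, m=2000, n_directions=50)` and comparing the result with `(5 / 2000) ** 0.25` gives a rough feel for the scale of the bound, keeping in mind that the constant $c$ is unspecified and only finitely many directions are examined.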


Taxonomy

Topics: Geometric Analysis and Curvature Flows · Point processes and geometric inequalities · Random Matrices and Applications