Two-sample Test using Projected Wasserstein Distance
Jie Wang, Rui Gao, Yao Xie

TL;DR
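This paper introduces a projected Wasserstein distance for two-sample testing. It counters the loss of test power that the Wasserstein distance suffers in high dimensions by searching for the low-dimensional linear projection that maximizes the Wasserstein distance between the projected distributions, and it backs the approach with a finite-sample convergence analysis and practical algorithms.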
Contribution
It proposes a novel projected Wasserstein distance that couples optimal projection with Wasserstein metrics to improve high-dimensional two-sample testing.
Findings
A characterization of the finite-sample convergence rate of the proposed metric, stated in terms of integral probability metrics (IPMs).
Practical algorithms for computing the projected Wasserstein distance.
Numerical examples that validate the theoretical results.
Abstract
We develop a projected Wasserstein distance for the two-sample test, a fundamental problem in statistics and machine learning: given two sets of samples, determine whether they are drawn from the same distribution. In particular, we aim to circumvent the curse of dimensionality in the Wasserstein distance: when the dimension is high, its testing power diminishes, which is inherently due to the slow concentration of Wasserstein metrics in high-dimensional spaces. A key contribution is to couple the distance with an optimal projection: we find the low-dimensional linear mapping that maximizes the Wasserstein distance between the projected probability distributions. We characterize the finite-sample convergence rate on integral probability metrics (IPMs) and present practical algorithms for computing this metric. Numerical examples validate our theoretical results.
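To make the idea of "maximizing the Wasserstein distance over linear projections" concrete, here is a minimal sketch, not the authors' algorithm. It makes several simplifying assumptions that do not come from the paper: the projection is one-dimensional (a unit vector), the maximization is replaced by a plain random search over candidate directions, both samples have the same size so that the 1D Wasserstein-1 distance reduces to a mean absolute difference of sorted values, and the test is calibrated with a generic permutation procedure. All function names (`w1_1d`, `projected_w1`, `random_directions`, `permutation_test`) are hypothetical.

```python
import numpy as np


def w1_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1D empirical samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))


def random_directions(k, dim, rng):
    """Draw k random unit vectors in R^dim (assumed search set, not from the paper)."""
    u = rng.standard_normal((k, dim))
    return u / np.linalg.norm(u, axis=1, keepdims=True)


def projected_w1(x, y, directions):
    """Largest 1D W1 distance over the given unit directions.

    Random search stands in for the optimization over projections;
    x and y must have the same number of rows.
    """
    px = np.sort(x @ directions.T, axis=0)  # shape (n, k), sorted per direction
    py = np.sort(y @ directions.T, axis=0)
    return float(np.mean(np.abs(px - py), axis=0).max())


def permutation_test(x, y, n_dirs=200, n_perm=200, seed=0):
    """Permutation p-value for the projected statistic (generic calibration)."""
    rng = np.random.default_rng(seed)
    dirs = random_directions(n_dirs, x.shape[1], rng)
    observed = projected_w1(x, y, dirs)
    pooled = np.vstack([x, y])
    n = x.shape[0]
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        stat = projected_w1(pooled[idx[:n]], pooled[idx[n:]], dirs)
        count += int(stat >= observed)
    # Add-one correction keeps the p-value strictly positive.
    return observed, (count + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((100, 20))
    y = rng.standard_normal((100, 20)) + 0.5  # mean shift in every coordinate
    stat, p_value = permutation_test(x, y)
    print(f"statistic={stat:.3f}, p-value={p_value:.3f}")
```

The candidate directions are drawn once and reused for every permutation, so the statistic is the same fixed function of the data under each relabeling, which is what the permutation argument requires. A gradient-based or manifold optimizer over projections of dimension greater than one would be closer in spirit to the paper's setting but is beyond this sketch.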
