Two-sample Test with Kernel Projected Wasserstein Distance
Jie Wang, Rui Gao, Yao Xie

TL;DR
This paper introduces a kernel projected Wasserstein distance for two-sample testing. It handles high-dimensional data by finding the nonlinear projection that maximizes the distance between the projected distributions, and it comes with practical algorithms and non-asymptotic uncertainty quantification for the empirical estimates.
Contribution
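It proposes a kernel projected Wasserstein distance that circumvents the curse of dimensionality more efficiently than existing projected Wasserstein distances, and it provides algorithms for computing the distance together with non-asymptotic uncertainty quantification of the empirical estimates.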
Findings
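Distinguishes distributions in high dimensions by maximizing the distance between nonlinearly projected samples.
Provides practical algorithms for computing the distance, with non-asymptotic uncertainty quantification of the empirical estimates.
Numerical examples validate the theoretical results and show good performance.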
Abstract
We develop a kernel projected Wasserstein distance for the two-sample test, an essential building block in statistics and machine learning: given two sets of samples, determine whether they come from the same distribution. The method operates by finding the nonlinear mapping in the data space that maximizes the distance between the projected distributions. In contrast to existing work on projected Wasserstein distance, the proposed method circumvents the curse of dimensionality more efficiently. We present practical algorithms for computing this distance function, together with non-asymptotic uncertainty quantification of the empirical estimates. Numerical examples validate our theoretical results and demonstrate good performance of the proposed method.
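To make the idea in the abstract concrete, the following is a minimal Python sketch of a two-sample test built on a projected Wasserstein statistic. It is not the authors' algorithm. As stated assumptions, it approximates a Gaussian-kernel feature map with random Fourier features, replaces the paper's optimization over nonlinear mappings with a crude search over random unit directions in that feature space, and calibrates the statistic with a standard permutation test. All function names and parameters (rff_features, projected_w1, permutation_test, bandwidth, n_directions) are hypothetical and chosen for illustration only.

```python
import numpy as np


def rff_features(x, freqs, phases):
    """Random Fourier features, an approximation to a Gaussian-kernel feature map."""
    return np.sqrt(2.0 / freqs.shape[1]) * np.cos(x @ freqs + phases)


def projected_w1(x, y, freqs, phases, directions):
    """Largest 1D Wasserstein-1 distance over the candidate directions.

    Assumes x and y have the same number of rows, so the 1D distance is
    the mean absolute difference between sorted projected samples.
    """
    px = rff_features(x, freqs, phases) @ directions.T  # shape (n, k)
    py = rff_features(y, freqs, phases) @ directions.T
    per_direction = np.mean(
        np.abs(np.sort(px, axis=0) - np.sort(py, axis=0)), axis=0
    )
    return per_direction.max()


def permutation_test(x, y, n_features=64, n_directions=50,
                     n_perm=200, bandwidth=1.0, seed=0):
    """Return (statistic, p_value) for H0: x and y share one distribution."""
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    rng = np.random.default_rng(seed)
    n, d = x.shape

    # Assumption: features and directions are drawn once, independently of
    # the data, so the same statistic is evaluated on every permutation.
    freqs = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))
    phases = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    directions = rng.normal(size=(n_directions, n_features))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    stat = projected_w1(x, y, freqs, phases, directions)

    # Permutation null: reshuffle the pooled sample and recompute.
    pooled = np.vstack([x, y])
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(2 * n)
        s = projected_w1(pooled[idx[:n]], pooled[idx[n:]],
                         freqs, phases, directions)
        exceed += int(s >= stat)
    p_value = (exceed + 1) / (n_perm + 1)
    return stat, p_value
```

A call such as permutation_test(x, y, bandwidth=2.0) returns the statistic and a permutation p-value; the null hypothesis is rejected when the p-value falls below the chosen significance level. Random search over directions is used here only to keep the sketch short; the paper describes dedicated algorithms for the maximization.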
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models · Probabilistic and Robust Engineering Design
