Wasserstein Identity Testing
Shichuan Deng, Wenzheng Li, Xuan Wu

TL;DR
This paper introduces Wasserstein identity testing, an approach to distribution testing in metric spaces that avoids the sample blow-up of $L_1$-distance testing on large or continuous supports. It gives nearly optimal worst-case sample complexity, and nearly instance-optimal sample complexity for distributions satisfying the Doubling Condition.
Contribution
It proposes the Wasserstein identity testing framework and establishes nearly optimal sample complexities, advancing distribution testing in large or continuous support spaces.
Findings
Nearly optimal worst-case sample complexity achieved.
Nearly instance-optimal sample complexity for distributions satisfying the Doubling Condition.
Addresses limitations of $L_1$-distance testing in large or continuous supports.
Abstract
Uniformity testing and the more general identity testing are well-studied problems in distributional property testing. Most previous work focuses on testing under $L_1$-distance. However, when the support is very large or even continuous, testing under $L_1$-distance may require a huge (even infinite) number of samples. Motivated by such issues, we consider identity testing in Wasserstein distance (a.k.a. transportation distance and earthmover distance) on a metric space (discrete or continuous). In this paper, we propose the Wasserstein identity testing problem (Identity Testing in Wasserstein distance). We obtain nearly optimal worst-case sample complexity for the problem. Moreover, for a large class of probability distributions satisfying the so-called "Doubling Condition", we provide nearly instance-optimal sample complexity.
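To make the motivation in the abstract concrete, the following is a minimal illustrative sketch and not the paper's testing algorithm. It compares $L_1$ distance with 1-Wasserstein distance for two distributions on a finely discretized unit interval. The function names, the choice of `n`, and the even/odd construction are assumptions made purely for illustration. It relies on the standard fact that, on the real line, the 1-Wasserstein distance equals the integral of the absolute difference of the two CDFs.

```python
from itertools import accumulate


def l1_distance(p, q):
    """L1 distance between two probability vectors on the same support."""
    return sum(abs(a - b) for a, b in zip(p, q))


def wasserstein_1d(points, p, q):
    """1-Wasserstein distance between p and q on sorted points of the real line.

    Uses the 1-D identity W1 = integral of |F_p - F_q|, which for a discrete
    support is a sum of |CDF gap| times the spacing to the next point.
    """
    cdf_gap = accumulate(a - b for a, b in zip(p, q))
    spacings = (x1 - x0 for x0, x1 in zip(points, points[1:]))
    return sum(abs(c) * s for c, s in zip(cdf_gap, spacings))


# Hypothetical example: n equally spaced points on [0, 1].
n = 1000
points = [i / (n - 1) for i in range(n)]
# p puts all its mass on even-indexed points, q on odd-indexed points.
p = [2 / n if i % 2 == 0 else 0.0 for i in range(n)]
q = [2 / n if i % 2 == 1 else 0.0 for i in range(n)]

print("L1 distance:", l1_distance(p, q))
print("W1 distance:", wasserstein_1d(points, p, q))
```

In this construction the two supports are disjoint, so the $L_1$ distance takes its maximum value of 2. Each unit of mass only has to move to a neighboring point, so the Wasserstein distance is about $1/(n-1)$ and shrinks as the grid is refined. This is the kind of gap that makes a metric-aware distance attractive on large or continuous supports.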
Taxonomy
Topics: Complexity and Algorithms in Graphs · Markov Chains and Monte Carlo Methods · Limits and Structures in Graph Theory
