Optimal Algorithms for Testing Closeness of Discrete Distributions
Siu-On Chan, Ilias Diakonikolas, Gregory Valiant, Paul Valiant

TL;DR
This paper introduces simple, sample-optimal algorithms for testing whether two discrete distributions are identical or at least $\varepsilon$-far apart in $\ell_1$ or $\ell_2$ distance, matching information-theoretic lower bounds up to constant factors.
Contribution
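The paper presents new, simple closeness testers whose sample complexity is optimal up to constant factors in both the domain size $n$ and the distance parameter $\varepsilon$, closing the logarithmic-in-$n$ and polynomial-in-$\varepsilon$ gap left by the earlier algorithms of Batu et al.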
Findings
Sample complexity for $\ell_1$ testing is $\Theta(\max\{n^{2/3}/\varepsilon^{4/3}, n^{1/2}/\varepsilon^{2}\})$.
New testers are simple, practical, and match information-theoretic lower bounds.
Achieves optimal dependence on $n$ and $\varepsilon$ in sample complexity.
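To get a feel for the $\ell_1$ bound above, the following minimal sketch evaluates the two terms inside the maximum and reports which one dominates. The function names are hypothetical, and the hidden constant in the $\Theta(\cdot)$ is omitted, so the numbers indicate scaling only, not an exact sample count.

```python
import math


def l1_bound_terms(n: int, eps: float) -> tuple[float, float]:
    """Return the two terms n^(2/3)/eps^(4/3) and n^(1/2)/eps^2 (constants omitted)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if not (0.0 < eps <= 2.0):
        # The l1 distance between two distributions never exceeds 2.
        raise ValueError("eps must lie in (0, 2]")
    first = n ** (2.0 / 3.0) / eps ** (4.0 / 3.0)
    second = math.sqrt(n) / eps ** 2
    return first, second


def l1_sample_bound(n: int, eps: float) -> float:
    """Return max of the two terms, i.e. the bound up to a constant factor."""
    return max(l1_bound_terms(n, eps))


def dominant_term(n: int, eps: float) -> str:
    """Name the larger of the two terms for the given n and eps."""
    first, second = l1_bound_terms(n, eps)
    return "n^(2/3)/eps^(4/3)" if first >= second else "n^(1/2)/eps^2"


if __name__ == "__main__":
    for n, eps in [(10**6, 0.5), (10**4, 0.01)]:
        print(n, eps, round(l1_sample_bound(n, eps)), dominant_term(n, eps))
```

Setting the two terms equal gives $\varepsilon = n^{-1/4}$: for larger $\varepsilon$ the $n^{2/3}/\varepsilon^{4/3}$ term dominates, and for smaller $\varepsilon$ the $n^{1/2}/\varepsilon^{2}$ term does.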
Abstract
We study the question of closeness testing for two discrete distributions. More precisely, given samples from two distributions $p$ and $q$ over an $n$-element set, we wish to distinguish whether $p = q$ versus $p$ is at least $\varepsilon$-far from $q$, in either $\ell_1$ or $\ell_2$ distance. Batu et al. gave the first sub-linear time algorithms for these problems, which matched the lower bounds of Valiant up to a logarithmic factor in $n$, and a polynomial factor of $\varepsilon$. In this work, we present simple (and new) testers for both the $\ell_1$ and $\ell_2$ settings, with sample complexity that is information-theoretically optimal, to constant factors, both in the dependence on $n$, and the dependence on $\varepsilon$; for the $\ell_1$ testing problem we establish that the sample complexity is $\Theta(\max\{n^{2/3}/\varepsilon^{4/3}, n^{1/2}/\varepsilon^{2}\})$.
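The abstract does not spell out the testers themselves. As an illustration only, the sketch below computes one statistic commonly used for $\ell_1$ closeness testing of this kind, $Z = \sum_i \frac{(X_i - Y_i)^2 - X_i - Y_i}{X_i + Y_i}$, where $X_i$ and $Y_i$ count occurrences of element $i$ in the two samples. That this matches the paper's tester in every detail is an assumption. The function names, the fixed (rather than Poisson-distributed) sample size, and the `threshold` parameter are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def closeness_statistic(xs, ys):
    """Sum ((X_i - Y_i)^2 - X_i - Y_i) / (X_i + Y_i) over elements seen in either sample."""
    cx, cy = Counter(xs), Counter(ys)
    z = 0.0
    for i in set(cx) | set(cy):
        a, b = cx[i], cy[i]  # Counter returns 0 for unseen elements
        z += ((a - b) ** 2 - a - b) / (a + b)
    return z


def draw(weights, m, rng):
    """Draw m i.i.d. samples from the distribution given by `weights` over range(len(weights))."""
    return rng.choices(range(len(weights)), weights=weights, k=m)


def looks_close(p, q, m, threshold, rng):
    """Accept (return True) when the statistic is at most a caller-chosen threshold.

    `threshold` is a hypothetical tuning parameter, not a constant from the paper.
    """
    return closeness_statistic(draw(p, m, rng), draw(q, m, rng)) <= threshold


if __name__ == "__main__":
    rng = random.Random(0)
    n, m = 50, 2000
    p = [1.0 / n] * n                         # uniform on 50 elements
    q = [2.0 / n] * (n // 2) + [0.0] * (n // 2)  # uniform on the first 25
    print("same:", closeness_statistic(draw(p, m, rng), draw(p, m, rng)))
    print("far: ", closeness_statistic(draw(p, m, rng), draw(q, m, rng)))
```

When both samples come from the same distribution the statistic stays near zero, while distributions that are far apart push it to large positive values, which is what makes a threshold test possible.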
Taxonomy
Topics: Complexity and Algorithms in Graphs · Machine Learning and Algorithms · Cryptography and Data Security
