The Price of Tolerance in Distribution Testing

Clément L. Canonne; Ayush Jain; Gautam Kamath; Jerry Li

arXiv:2106.13414·cs.DS·November 10, 2021

Abstract

We revisit the problem of tolerant distribution testing. That is, given samples from an unknown distribution $p$ over $\{1, \dots, n\}$, is it $\varepsilon_1$-close to or $\varepsilon_2$-far from a reference distribution $q$ (in total variation distance)? Despite significant interest over the past decade, this problem is well understood only in the extreme cases. In the noiseless setting (i.e., $\varepsilon_1 = 0$) the sample complexity is $\Theta(\sqrt{n})$, strongly sublinear in the domain size. At the other end of the spectrum, when $\varepsilon_1 = \varepsilon_2/2$, the sample complexity jumps to the barely sublinear $\Theta(n/\log n)$. However, very little is known about the intermediate regime. We fully characterize the price of tolerance in distribution testing as a function of $n$, $\varepsilon_1$, $\varepsilon_2$, up to a single $\log n$ factor. Specifically, we show the sample…
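
To make the problem statement concrete, the following is a minimal Python sketch of the decision task described in the abstract. It is not the paper's algorithm: it uses a naive plug-in estimate (the empirical distribution of the samples) and thresholds its total variation distance to $q$ at the midpoint of $\varepsilon_1$ and $\varepsilon_2$. Such a plug-in approach generally needs a number of samples roughly linear in $n$, which is exactly the cost that sublinear testers avoid. The function names and the 0-indexed domain $\{0, \dots, n-1\}$ are assumptions made for illustration.

```python
from collections import Counter


def tv_distance(p, q):
    """Total variation distance between two distributions given as
    equal-length lists of probabilities: half the L1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def empirical_distribution(samples, n):
    """Empirical distribution of samples drawn from {0, ..., n-1}
    (0-indexed here for convenience; the abstract uses {1, ..., n})."""
    counts = Counter(samples)
    m = len(samples)
    return [counts.get(i, 0) / m for i in range(n)]


def naive_tolerant_test(samples, q, eps1, eps2):
    """Hypothetical baseline, NOT the paper's tester.

    Returns True ("close") if the plug-in estimate of the distance
    between p and q is at most the midpoint of eps1 and eps2, and
    False ("far") otherwise.
    """
    p_hat = empirical_distribution(samples, len(q))
    return tv_distance(p_hat, q) <= (eps1 + eps2) / 2
```

For example, with `q = [0.5, 0.5]`, `eps1 = 0.1`, and `eps2 = 0.5`, the samples `[0, 0, 1, 1]` give an empirical distance of 0 and are declared close, while `[0, 0, 0, 0]` gives an empirical distance of 0.5 and is declared far.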

Taxonomy

Topics: Machine Learning and Algorithms · Complexity and Algorithms in Graphs · Adversarial Robustness in Machine Learning