Testing Distribution Identity Efficiently

Krzysztof Onak

arXiv:0910.3243 · cs.DS · October 20, 2009 · 2 cites

TL;DR

This paper reduces the running time of the distribution identity tester of Batu et al. (FOCS 2001) from O(n) + O~(sqrt(n) * poly(1/epsilon)) to O~(sqrt(n) * poly(1/epsilon)), where n is the domain size, without increasing the sample complexity.

Contribution

The author modifies the distribution identity tester of Batu et al. (FOCS 2001) to achieve a near square-root running time without increasing the sample complexity.

Findings

1. Achieved a running time of O~(sqrt(n) * poly(1/epsilon))
2. Maintained the same sample complexity as the tester of Batu et al.
3. Removed the additive O(n) term from the running time of the earlier tester

Abstract

We consider the problem of testing distribution identity. Given a sequence of independent samples from an unknown distribution on a domain of size n, the goal is to check if the unknown distribution approximately equals a known distribution on the same domain. While Batu, Fortnow, Fischer, Kumar, Rubinfeld, and White (FOCS 2001) proved that the sample complexity of the problem is O~(sqrt(n) * poly(1/epsilon)), the running time of their tester is much higher: O(n) + O~(sqrt(n) * poly(1/epsilon)). We modify their tester to achieve a running time of O~(sqrt(n) * poly(1/epsilon)).
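
To make the problem statement concrete, here is a minimal Python sketch of a naive plug-in baseline. It is not the tester of Batu et al. and it is not the modified tester from this paper; it only illustrates the input (samples from an unknown distribution, a known distribution q, and a distance parameter epsilon) and the accept/reject output. The function names, the use of L1 distance, and the epsilon/2 threshold are assumptions made for illustration.

```python
import random
from collections import Counter


def empirical_l1_distance(samples, q):
    """L1 distance between the empirical distribution of `samples` and `q`.

    `q` is a list of probabilities indexed by domain element 0..n-1.
    """
    m = len(samples)
    counts = Counter(samples)
    # This loop reads all n entries of q, so the baseline takes time linear in n.
    return sum(abs(counts.get(i, 0) / m - q_i) for i, q_i in enumerate(q))


def naive_identity_test(samples, q, epsilon):
    """Accept (True) iff the empirical distribution is within epsilon/2 of q.

    Hypothetical baseline for illustration only; the threshold is an assumption.
    """
    return empirical_l1_distance(samples, q) <= epsilon / 2


def draw(p, m, rng):
    """Draw m independent samples from the distribution p using rng."""
    return rng.choices(range(len(p)), weights=p, k=m)


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10
    q = [1.0 / n] * n                 # known distribution: uniform on n elements
    far = [0.2] * 5 + [0.0] * 5       # L1 distance 1.0 from q
    print(naive_identity_test(draw(q, 20000, rng), q, epsilon=0.2))
    print(naive_identity_test(draw(far, 20000, rng), q, epsilon=0.2))
```

This baseline reads every entry of q and needs a number of samples that grows linearly with n to be reliable, which is the kind of cost that sublinear testers such as the one discussed in the abstract are designed to avoid.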

Taxonomy

Topics: Complexity and Algorithms in Graphs · Cryptography and Data Security · Privacy-Preserving Technologies in Data