Testing Identity of Distributions under Kolmogorov Distance in Polylogarithmic Space
Christian Janos Lebeda, Jakub Tětek

TL;DR
This paper presents a space-efficient streaming algorithm for identity testing of continuous distributions under the Kolmogorov distance, significantly reducing space complexity while maintaining optimal sample complexity.
Contribution
It introduces an algorithm that tests distribution identity in polylogarithmic space in the streaming model, improving on previous linear-space methods.
Findings
Uses $O(\log^4 \frac{1}{\varepsilon})$ space for testing
Achieves asymptotically optimal sample complexity
Contrasts with discrete case where space reduction is impossible
Abstract
Suppose we have a sample from a distribution $D$ and we want to test whether $D = D^*$ for a fixed distribution $D^*$. Specifically, we want to reject with constant probability if the distance of $D$ from $D^*$ is $\geq \varepsilon$ in a given metric. In the case of continuous distributions, this has been studied thoroughly in the statistics literature. Namely, for the well-studied Kolmogorov metric a test is known that uses the optimal $O(1/\varepsilon^2)$ samples. However, this test naively also uses space $O(1/\varepsilon^2)$, and previous work improved this to $O(1/\varepsilon)$. In this paper, we show that much less space suffices -- we give an algorithm that uses space $O(\log^4 \varepsilon^{-1})$ in the streaming setting while also using an asymptotically optimal number of samples. This is in contrast with the standard total variation distance on discrete distributions for…
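For orientation, the classical baseline the abstract refers to can be sketched as a one-sample Kolmogorov-Smirnov style test that stores every sample. This is not the paper's streaming algorithm, only the naive test whose space usage it improves. The function names, the acceptance threshold `eps / 2`, and the constant 8 in the sample count are illustrative assumptions; the constant follows from the Dvoretzky-Kiefer-Wolfowitz inequality $\Pr[\sup_x |F_n(x) - F(x)| > t] \le 2e^{-2nt^2}$ with $t = \varepsilon/2$.

```python
import math
import random


def samples_needed(eps):
    # Illustrative constant (an assumption): n = 8 / eps^2 makes the DKW
    # failure bound 2 * exp(-2 * n * (eps / 2)^2) equal to 2 * exp(-4).
    return math.ceil(8 / eps ** 2)


def kolmogorov_distance_to_cdf(samples, cdf):
    """Return sup_x |F_n(x) - F(x)| for the empirical CDF F_n of `samples`
    and a continuous reference CDF F given as the callable `cdf`."""
    xs = sorted(samples)  # keeps all n samples: the naive, linear-in-n space step
    n = len(xs)
    dist = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # F_n jumps from i/n to (i+1)/n at x, so check both sides of the jump.
        dist = max(dist, (i + 1) / n - f, f - i / n)
    return dist


def naive_identity_test(samples, cdf, eps):
    """Accept (return True) if the empirical distance to `cdf` is below eps/2."""
    return kolmogorov_distance_to_cdf(samples, cdf) < eps / 2


if __name__ == "__main__":
    eps = 0.2
    n = samples_needed(eps)
    uniform_cdf = lambda x: min(1.0, max(0.0, x))
    same = [random.random() for _ in range(n)]      # drawn from the reference
    far = [random.random() ** 2 for _ in range(n)]  # CDF sqrt(x), distance 0.25
    print(naive_identity_test(same, uniform_cdf, eps))
    print(naive_identity_test(far, uniform_cdf, eps))
```

The sort is what forces this sketch to hold all $n = O(1/\varepsilon^2)$ samples in memory; the paper's contribution is to avoid that while keeping the sample count asymptotically the same.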
Taxonomy
Topics: Probability and Statistical Research · Bayesian Methods and Mixture Models · Advanced Statistical Methods and Models
