Concentration Bounds for the Collision Estimator
Maciej Skorski

TL;DR
This paper establishes strong concentration bounds for the collision estimator, a key tool in uniformity testing and entropy estimation, demonstrating its reliability without additional boosting techniques.
Contribution
It provides the first high-probability concentration bounds for the collision estimator, extending beyond variance to higher moments using novel techniques.
Findings
Achieves high-probability guarantees without boosting
Bounds higher moments of the estimator
Enhances understanding of collision estimator's reliability
Abstract
We prove a strong concentration result about the natural collision estimator, which counts the number of collisions that occur within an iid sample. This estimator is at the heart of algorithms used for uniformity testing and entropy assessment. While prior works were limited to bounding only the variance, we use elegant techniques of independent interest to bound higher moments and conclude concentration properties. As an immediate corollary, we show that the estimator achieves high-probability guarantees on its own, with no need for boosting (aka the median/majority trick).
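To make the object of study concrete, here is a minimal sketch of a collision estimator in its commonly used normalized form: the number of colliding pairs in the sample divided by the total number of pairs, which is an unbiased estimate of the collision probability (the sum of squared symbol probabilities). The function name `collision_estimate` and the choice of normalization are assumptions for illustration; the abstract does not fix a specific formula.

```python
from collections import Counter
from math import comb


def collision_estimate(sample):
    """Fraction of pairs (i < j) in `sample` whose values coincide.

    Hypothetical helper for illustration: the normalization by the
    number of pairs is an assumption, not taken from the paper.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two samples to form a pair")
    # A symbol seen c times contributes c-choose-2 colliding pairs.
    counts = Counter(sample)
    collisions = sum(comb(c, 2) for c in counts.values())
    return collisions / comb(n, 2)
```

For a uniform distribution over m symbols the expected value of this quantity is 1/m, which is why collision counts are a natural statistic for uniformity testing.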
Taxonomy
Topics: Machine Learning and Algorithms · Optimization and Search Problems · Markov Chains and Monte Carlo Methods
