Concentration Bounds for the Collision Estimator

Maciej Skorski

arXiv:2006.07366 · cs.IT · June 26, 2020

Open Access

TL;DR

This paper establishes strong concentration bounds for the collision estimator, a key tool in uniformity testing and entropy estimation, demonstrating its reliability without additional boosting techniques.

Contribution

It provides the first high-probability concentration bounds for the collision estimator, extending beyond variance to higher moments using novel techniques.

Findings

01 Achieves high-probability guarantees without boosting

02 Bounds higher moments of the estimator

03 Enhances understanding of collision estimator's reliability

Abstract

We prove a strong concentration result about the natural collision estimator, which counts the number of collisions that occur within an iid sample. This estimator is at the heart of algorithms used for uniformity testing and entropy assessment. While prior works were limited to bounding the variance, we use elegant techniques of independent interest to bound higher moments and derive concentration properties. As an immediate corollary, we show that the estimator achieves a high-probability guarantee on its own, so there is no need for boosting (aka the median/majority trick).
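To make the object of study concrete, here is a minimal Python sketch of a collision estimator: it counts the pairs of equal values in a sample and divides by the total number of pairs. The function name `collision_estimate`, the normalization by the number of pairs, and the uniform test distribution over 100 symbols are assumptions chosen for illustration; the paper may state the estimator in a different but equivalent form (for example, as an unnormalized count).

```python
from collections import Counter
from math import comb
import random


def collision_estimate(sample):
    """Return the fraction of unordered pairs in `sample` that collide.

    Hypothetical helper for illustration. For an iid sample this fraction
    is an unbiased estimate of the collision probability sum_i p_i^2.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two samples to form a pair")
    counts = Counter(sample)
    # A value seen c times contributes C(c, 2) colliding pairs.
    colliding_pairs = sum(comb(c, 2) for c in counts.values())
    return colliding_pairs / comb(n, 2)


# Illustrative run (assumed setup): uniform distribution over 100 symbols,
# whose true collision probability is 1/100.
rng = random.Random(0)
sample = [rng.randrange(100) for _ in range(2000)]
estimate = collision_estimate(sample)
print(f"estimate = {estimate:.5f}, true value = {1 / 100:.5f}")
```

A single call is used here, with no median or majority vote over repeated runs, which matches the setting the abstract describes: the concentration result concerns the estimator computed once on one sample.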

Taxonomy

Topics: Machine Learning and Algorithms · Optimization and Search Problems · Markov Chains and Monte Carlo Methods