Competitive Distribution Estimation
Alon Orlitsky, Ananda Theertha Suresh

TL;DR
This paper introduces a new framework for distribution estimation that measures how well estimators perform relative to limited oracle knowledge, achieving near-optimal regret bounds with efficient algorithms.
Contribution
It proposes a novel competitive regret framework under natural oracle limitations and provides an efficient estimator with near-optimal regret bounds.
Findings
Under both oracle limitations, the competitive regret reduces to $\min(k/n, \tilde{\mathcal{O}}(1/\sqrt{n}))$, where $n$ is the sample size and $k$ the alphabet size.
The proposed estimator runs in linear time and achieves near-optimal regret bounds.
Simulations demonstrate the effectiveness of the competitive estimators.
Abstract
Estimating an unknown distribution from its samples is a fundamental problem in statistics. The common, min-max, formulation of this goal considers the performance of the best estimator over all distributions in a class. It shows that with $n$ samples, distributions over $k$ symbols can be learned to a KL divergence that decreases to zero with the sample size $n$, but grows unboundedly with the alphabet size $k$. Min-max performance can be viewed as regret relative to an oracle that knows the underlying distribution. We consider two natural and modest limits on the oracle's power. One where it knows the underlying distribution only up to symbol permutations, and the other where it knows the exact distribution but is restricted to use natural estimators that assign the same probability to symbols that appeared equally many times in the sample. We show that in both cases the…
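To make the notion of a natural estimator concrete, the sketch below uses the standard add-constant (add-beta) estimator as an illustration. It is not the estimator proposed in the paper. The function names `add_constant_estimator`, `is_natural`, and `kl_divergence`, the value `beta=0.5`, and the toy data are all assumptions chosen for this example.

```python
import math
from collections import Counter


def add_constant_estimator(sample, alphabet, beta=0.5):
    """Add-beta estimate: (count + beta) / (n + beta * k) for every symbol.

    Illustrative only; this is a textbook estimator, not the paper's.
    """
    counts = Counter(sample)
    n = len(sample)
    k = len(alphabet)
    return {x: (counts[x] + beta) / (n + beta * k) for x in alphabet}


def is_natural(estimate, sample, alphabet, tol=1e-12):
    """Return True if symbols with equal sample counts get equal probability."""
    counts = Counter(sample)
    by_count = {}
    for x in alphabet:
        by_count.setdefault(counts[x], []).append(estimate[x])
    return all(max(v) - min(v) <= tol for v in by_count.values())


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats; assumes q[x] > 0 wherever p[x] > 0."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


# Toy setup (hypothetical): k = 4 symbols, n = 4 samples.
alphabet = ["a", "b", "c", "d"]
sample = ["a", "a", "b", "c"]
p = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}

q = add_constant_estimator(sample, alphabet)
loss = kl_divergence(p, q)
```

Here `b` and `c` each appear once, so any natural estimator must give them the same probability, and `is_natural(q, sample, alphabet)` holds. The true distribution `p` assigns them different probabilities, so `is_natural(p, sample, alphabet)` is false: an oracle restricted to natural estimators cannot simply output `p`.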
Taxonomy
Topics: Machine Learning and Algorithms · Imbalanced Data Classification Techniques · Advanced Bandit Algorithms Research
