Competitive Distribution Estimation

Alon Orlitsky; Ananda Theertha Suresh

arXiv:1503.07940 · cs.IT · March 30, 2015

TL;DR

This paper introduces a framework for distribution estimation that measures an estimator's regret relative to an oracle with limited knowledge of the underlying distribution, and gives an efficient estimator whose regret is near-optimal.

Contribution

It proposes a novel competitive regret framework under natural oracle limitations and provides an efficient estimator with near-optimal regret bounds.
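
One way to write the competitive regret down is sketched here. The notation ($q_{X^n}$, $r_n$, $\Delta_k$, $\mathbb{P}$, $\mathcal{Q}^{\mathrm{nat}}$) is assumed for illustration and may differ from the paper's. Let $r_n(q,p)$ be the expected KL loss of an estimator $q$ after $n$ samples from $p$, and let $\Delta_k$ be the set of distributions over $k$ symbols. The min-max loss compares against an oracle that knows $p$ exactly; the two limited oracles know only the permutation class $P$ containing $p$, or are restricted to natural estimators:

$$
\begin{aligned}
r_n(q,p) &= \mathbb{E}_{X^n \sim p^n}\, D\!\left(p \,\|\, q_{X^n}\right), \qquad
r_n(\Delta_k) = \min_{q} \max_{p \in \Delta_k} r_n(q,p), \\
r_n^{\mathbb{P}}(q) &= \max_{P \in \mathbb{P}} \Big( \max_{p \in P} r_n(q,p) \;-\; \min_{q'} \max_{p' \in P} r_n(q',p') \Big), \\
r_n^{\mathrm{nat}}(q) &= \max_{p \in \Delta_k} \Big( r_n(q,p) \;-\; \min_{q' \in \mathcal{Q}^{\mathrm{nat}}} r_n(q',p) \Big).
\end{aligned}
$$

Here $\mathbb{P}$ is assumed to be the partition of $\Delta_k$ into classes of distributions that differ only by a permutation of the symbols, and $\mathcal{Q}^{\mathrm{nat}}$ the set of natural estimators.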

Findings

01

Competitive regret reduces to $\min(k/n, \tilde{\mathcal{O}}(1/\sqrt{n}))$ for natural and permutation-invariant estimators.

02

The proposed estimator runs in linear time and achieves near-optimal regret bounds.

03

Simulations demonstrate the effectiveness of the competitive estimators.

Abstract

Estimating an unknown distribution from its samples is a fundamental problem in statistics. The common, min-max, formulation of this goal considers the performance of the best estimator over all distributions in a class. It shows that with $n$ samples, distributions over $k$ symbols can be learned to a KL divergence that decreases to zero with the sample size $n$, but grows unboundedly with the alphabet size $k$. Min-max performance can be viewed as regret relative to an oracle that knows the underlying distribution. We consider two natural and modest limits on the oracle's power: one where it knows the underlying distribution only up to symbol permutations, and the other where it knows the exact distribution but is restricted to use natural estimators that assign the same probability to symbols that appeared equally many times in the sample. We show that in both cases the…
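
To make the notion of a natural estimator concrete, here is a minimal Python sketch. It uses an add-constant estimator purely as an illustration; this is an assumption and not the estimator proposed in the paper. The function names, the example distribution `p`, and the parameter `beta` are all hypothetical. The sketch checks that symbols with equal counts receive equal probability and computes the KL divergence used as the loss.

```python
import math
from collections import Counter


def add_constant_estimate(sample, alphabet, beta=0.5):
    """Add-beta estimator (illustrative): probability proportional to count + beta."""
    counts = Counter(sample)
    total = len(sample) + beta * len(alphabet)
    return {x: (counts[x] + beta) / total for x in alphabet}


def is_natural(estimate, sample, alphabet):
    """Return True if symbols that appeared equally often got equal probability."""
    counts = Counter(sample)
    by_count = {}
    for x in alphabet:
        by_count.setdefault(counts[x], []).append(estimate[x])
    return all(math.isclose(min(v), max(v)) for v in by_count.values())


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats; symbols with p[x] == 0 contribute 0."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)


# Hypothetical example: a distribution over k = 4 symbols and n = 5 samples.
alphabet = ["a", "b", "c", "d"]
p = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
sample = ["a", "b", "a", "b", "c"]

q = add_constant_estimate(sample, alphabet)
natural = is_natural(q, sample, alphabet)  # "a" and "b" both appeared twice
loss = kl_divergence(p, q)
```

In this example "a" and "b" each appear twice, so any natural estimator must give them the same probability even though the true probabilities differ. That constraint is what restricts the second oracle.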

Taxonomy

Topics: Machine Learning and Algorithms · Imbalanced Data Classification Techniques · Advanced Bandit Algorithms Research