Canonical Estimation in a Rare-Events Regime
Mesrob I. Ohannessian, Vincent Y. F. Tan, Munther A. Dahleh

TL;DR
This paper introduces a unified methodology for consistent statistical inference in a rare-events regime, yielding consistent estimators of quantities such as entropy, probabilities, alphabet size, and the range of the probabilities in large alphabet settings.
Contribution
It develops a general framework for constructing consistent estimators in rare-events regimes, extending previous work to a broader class of estimation problems.
Findings
Consistent estimation of alphabet size and probability range demonstrated.
Two concrete constructions: pseudo-empirical measure and mixture model estimation.
Framework applicable to a wide range of canonical estimation problems.
Abstract
We propose a general methodology for performing statistical inference within a 'rare-events regime' that was recently suggested by Wagner, Viswanath and Kulkarni. Our approach allows one to easily establish consistent estimators for a very large class of canonical estimation problems, in a large alphabet setting. These include the problems studied in the original paper, such as entropy and probability estimation, in addition to many other interesting ones. We particularly illustrate this approach by consistently estimating the size of the alphabet and the range of the probabilities. We start by proposing an abstract methodology based on constructing a probability measure with the desired asymptotic properties. We then demonstrate two concrete constructions by casting the Good-Turing estimator as a pseudo-empirical measure, and by using the theory of mixture model estimation.
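For readers unfamiliar with the Good-Turing estimator mentioned in the abstract, the following is a minimal sketch of its classical form, not the paper's pseudo-empirical measure construction. It estimates the total probability of all symbols seen exactly k times as (k + 1) N_{k+1} / n, where N_j is the number of distinct symbols seen exactly j times and n is the sample size. The function name `good_turing_mass` and the toy sample are illustrative assumptions.

```python
from collections import Counter


def good_turing_mass(sample, k):
    """Classical Good-Turing estimate of the total probability of all
    symbols that appear exactly k times in the sample (k = 0 gives the
    missing mass, i.e. the probability of unseen symbols)."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)                  # symbol -> occurrences
    freq_of_freq = Counter(counts.values())   # j -> number of symbols seen j times
    return (k + 1) * freq_of_freq.get(k + 1, 0) / n


# Toy sample (hypothetical data): a=5, b=2, r=2, c=1, d=1, so n = 11.
sample = list("abracadabra")
missing_mass = good_turing_mass(sample, 0)    # 1 * N_1 / n = 2 / 11
singleton_mass = good_turing_mass(sample, 1)  # 2 * N_2 / n = 4 / 11
```

In the rare-events regime each symbol appears only a bounded number of times on average, so these low-count statistics carry most of the usable information.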
