Strong Consistency of the Good-Turing Estimator
Aaron B. Wagner, Pramod Viswanath, and Sanjeev R. Kulkarni

TL;DR
This paper proves that the Good-Turing estimator is strongly consistent for the total probability of all symbols that appear with a given frequency in an i.i.d. string. The result covers the regime in which the block length is large yet no symbol appears frequently, and it holds under a natural convergence assumption on the sequence of underlying distributions.
Contribution
It establishes the strong consistency of the Good-Turing total probability estimator in a setting where the distribution changes with the block length, under a natural convergence assumption on the sequence of distributions.
Findings
Total probabilities converge to a deterministic limit.
Good-Turing estimator is strongly consistent.
Results apply when the distribution changes with the block length, so that the string is long yet no symbol appears frequently.
Abstract
We consider the problem of estimating the total probability of all symbols that appear with a given frequency in a string of i.i.d. random variables with unknown distribution. We focus on the regime in which the block length is large yet no symbol appears frequently in the string. This is accomplished by allowing the distribution to change with the block length. Under a natural convergence assumption on the sequence of underlying distributions, we show that the total probabilities converge to a deterministic limit, which we characterize. We then show that the Good-Turing total probability estimator is strongly consistent.
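To make the estimated quantity concrete, here is a minimal sketch of the standard textbook form of the Good-Turing total probability estimate: for a string of length n, the total probability of all symbols that appear exactly k times is estimated by (k + 1) N_{k+1} / n, where N_{k+1} is the number of distinct symbols appearing exactly k + 1 times. The function names below are hypothetical, and the paper's own notation and normalization may differ.

```python
from collections import Counter


def good_turing_total_probability(sample, k):
    """Good-Turing estimate of the total probability of all symbols
    that appear exactly k times in `sample` (textbook form; assumed,
    not taken from the paper's notation)."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    symbol_counts = Counter(sample)                    # symbol -> occurrences
    count_of_counts = Counter(symbol_counts.values())  # r -> number of symbols seen r times
    return (k + 1) * count_of_counts[k + 1] / n


def true_total_probability(sample, k, distribution):
    """Actual total probability of the symbols seen exactly k times,
    given the (normally unknown) distribution as a dict symbol -> prob.
    Only usable in simulations, where the distribution is known."""
    symbol_counts = Counter(sample)
    return sum(p for symbol, p in distribution.items()
               if symbol_counts[symbol] == k)


if __name__ == "__main__":
    text = "abracadabra"  # a:5, b:2, r:2, c:1, d:1, so n = 11
    for k in range(5):
        print(k, good_turing_total_probability(text, k))
```

With k = 0 this reduces to the familiar missing-mass estimate N_1 / n, the fraction of the string made up of symbols seen exactly once. In a simulation, comparing the two functions on long strings drawn from distributions with many low-probability symbols gives an informal picture of the regime the abstract describes.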
