Strong Consistency of the Good-Turing Estimator

Aaron B. Wagner, Pramod Viswanath, and Sanjeev R. Kulkarni

arXiv:cs/0607014·cs.IT·November 15, 2016

TL;DR

This paper proves that the Good-Turing estimator of the total probability of all symbols appearing with a given frequency in an i.i.d. string is strongly consistent. The result holds in a regime where the block length is large yet no symbol appears frequently, under a natural convergence assumption on the sequence of underlying distributions.

Contribution

It establishes the strong consistency of the Good-Turing total probability estimator when the block length is large and the underlying distribution changes with the block length, under a natural convergence assumption on the sequence of distributions.

Findings

01

Total probabilities converge to a deterministic limit.

02

The Good-Turing total probability estimator is strongly consistent.

03

The results hold when the distribution changes with the block length, so that no symbol appears frequently in the string.

Abstract

We consider the problem of estimating the total probability of all symbols that appear with a given frequency in a string of i.i.d. random variables with unknown distribution. We focus on the regime in which the block length is large yet no symbol appears frequently in the string. This is accomplished by allowing the distribution to change with the block length. Under a natural convergence assumption on the sequence of underlying distributions, we show that the total probabilities converge to a deterministic limit, which we characterize. We then show that the Good-Turing total probability estimator is strongly consistent.
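To make the estimated quantity concrete, here is a minimal Python sketch of the Good-Turing total probability estimator in its standard textbook form: the total probability of all symbols appearing exactly k times in a string of length n is estimated by (k + 1) N_{k+1} / n, where N_{k+1} is the number of distinct symbols appearing exactly k + 1 times. The abstract does not spell out this formula, so the form used here, along with the function name and the example string, is an assumption for illustration rather than the authors' notation.

```python
from collections import Counter


def good_turing_total_probability(sample, k):
    """Estimate the total probability of all symbols seen exactly k times.

    Uses the standard Good-Turing form (k + 1) * N_{k+1} / n, where n is the
    sample length and N_{k+1} is the number of distinct symbols that appear
    exactly k + 1 times. The name and signature are hypothetical.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")

    # How many times each distinct symbol occurs in the sample.
    symbol_counts = Counter(sample)
    # N_j: how many distinct symbols occur exactly j times.
    count_of_counts = Counter(symbol_counts.values())

    # A Counter returns 0 for missing keys, so absent frequencies give 0.
    return (k + 1) * count_of_counts[k + 1] / n


if __name__ == "__main__":
    text = "abracadabra"  # illustrative string, not data from the paper
    for k in range(3):
        print(k, good_turing_total_probability(text, k))
```

In "abracadabra" (n = 11) the symbols c and d each appear once, so the estimate for k = 0, the total probability of unseen symbols, is 2/11. Note that this toy string is not in the regime the paper studies, where the block length is large and no symbol appears frequently.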
