# Exact Good-Turing characterization of the two-parameter Poisson-Dirichlet superpopulation model

**Authors:** Annalisa Cerquetti

arXiv: 1901.09665 · 2019-01-29

## TL;DR

This paper shows that, for any finite sample size, Bayesian nonparametric estimation of discovery probabilities under a two-parameter Poisson-Dirichlet superpopulation corresponds to Good-Turing exact estimation. This strengthens an earlier equivalence that held only for large samples and for the approximated Good-Turing estimator.

## Contribution

It establishes a finite-sample correspondence between Good-Turing exact estimation and Bayesian nonparametric estimation under the two-parameter Poisson-Dirichlet model, improving on the prior large-sample result.

## Key findings

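- For any finite sample size, Bayesian nonparametric estimation of the discovery probabilities under a two-parameter Poisson-Dirichlet superpopulation corresponds to Good-Turing exact estimation.
- The previously established equivalence held only for large samples and concerned the approximated Good-Turing estimator, via a smoothing rule based on the two-parameter Poisson-Dirichlet model.
- Under a general superpopulation hypothesis, the Good-Turing solution can be interpreted as a Bayesian nonparametric estimator under partial information.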

## Abstract

Large sample size equivalence between the celebrated *approximated* Good-Turing estimator of the probability to discover a species already observed a certain number of times (Good, 1953) and the modern Bayesian nonparametric counterpart has been recently established by virtue of a particular smoothing rule based on the two-parameter Poisson-Dirichlet model. Here we improve on this result showing that, for any finite sample size, when the population frequencies are assumed to be selected from a superpopulation with two-parameter Poisson-Dirichlet distribution, then Bayesian nonparametric estimation of the discovery probabilities corresponds to Good-Turing *exact* estimation. Moreover under general superpopulation hypothesis the Good-Turing solution admits an interpretation as a modern Bayesian nonparametric estimator under partial information.
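
To make the two kinds of estimator in the abstract concrete, the following is a minimal sketch rather than the paper's own derivation. It assumes the textbook forms of both quantities, which are not reproduced in this summary: the approximated Good-Turing estimate $(l+1)\,m_{l+1}/n$, and the two-parameter Poisson-Dirichlet predictive probabilities $(\theta + k\alpha)/(\theta + n)$ for a new species and $m_l\,(l-\alpha)/(\theta + n)$ for a species already seen $l$ times. Here $n$ is the sample size, $k$ the number of distinct species, and $m_l$ the number of species observed exactly $l$ times, with $0 \le \alpha < 1$ and $\theta > -\alpha$. The function names are hypothetical, and the *exact* Good-Turing estimator that the paper characterizes is not implemented here.

```python
from collections import Counter
from fractions import Fraction


def frequency_counts(species_counts):
    """Map each frequency l to m_l, the number of species seen exactly l times."""
    return Counter(species_counts)


def turing_estimate(l, species_counts):
    """Approximated Good-Turing estimate (assumed textbook form):
    probability that the next observation belongs to a species seen l times,
    estimated as (l + 1) * m_{l+1} / n."""
    n = sum(species_counts)
    m = frequency_counts(species_counts)
    return Fraction((l + 1) * m[l + 1], n)


def pd_discovery_probability(l, species_counts, alpha, theta):
    """Predictive probability under a two-parameter Poisson-Dirichlet model
    (assumed standard predictive rule, 0 <= alpha < 1, theta > -alpha)."""
    n = sum(species_counts)
    k = len(species_counts)  # number of distinct species observed
    m = frequency_counts(species_counts)
    if l == 0:
        # Probability of discovering a species not yet observed.
        return (theta + k * alpha) / (theta + n)
    # Probability of re-observing any species seen exactly l times.
    return m[l] * (l - alpha) / (theta + n)


if __name__ == "__main__":
    # Hypothetical sample: five species with these observed counts.
    counts = [5, 3, 1, 1, 2]
    alpha, theta = Fraction(1, 2), Fraction(1)
    for l in range(max(counts) + 1):
        print(l, turing_estimate(l, counts),
              pd_discovery_probability(l, counts, alpha, theta))
```

Both sets of probabilities sum to one over $l = 0, 1, \dots$, which is a quick sanity check on any implementation. Exact rational arithmetic via `fractions.Fraction` is a choice made here only so that the check holds without floating-point tolerance.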

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.09665/full.md

---
Source: https://tomesphere.com/paper/1901.09665