Bayesian Nonparametric Inference for "Species-sampling" Problems
Cecilia Balocchi, Stefano Favaro, Zacharie Naulet

TL;DR
This paper reviews Bayesian nonparametric methods for species-sampling problems under the Pitman--Yor process prior. It improves the computation and interpretability of existing posterior results, analyzes the consistency of parameter estimation, and discusses generalizations in biological and related sciences.
Contribution
The paper introduces novel posterior representations for species-sampling problems (SSPs) under the Pitman--Yor process and analyzes the Bayesian consistency of parameter estimation methods.
Findings
Posterior representations are simplified using Binomial and Hypergeometric distributions.
Bayesian consistency is established for estimation of the discount parameter.
The scale parameter cannot be estimated consistently, so estimates of it should be treated with caution.
Abstract
Given an observed sample from a population of individuals belonging to species, "species-sampling" problems (SSPs) call for estimating some features of the unknown species composition of additional unobservable samples from the same population. Within SSPs, the problems of estimating coverage probabilities, the number of unseen species and coverages of prevalences have emerged in the past three decades for being the subject of numerous methodological and applied works, mostly in biological sciences but also in statistical machine learning, electrical engineering, theoretical computer science, information theory and forensic statistics. In this paper, we focus on these popular SSPs, and present an overview of their Bayesian nonparametric (BNP) analysis under the Pitman--Yor process (PYP) prior. While reviewing the literature, we improve on computation and interpretability of existing…
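To make the setting concrete, the following is a minimal sketch of the standard Pitman--Yor predictive (sequential sampling) rule, not code from the paper. It assumes a discount parameter `alpha` in [0, 1) and a scale parameter `theta` > -alpha; the function names `sample_pyp_counts` and `prob_new_species` are hypothetical and chosen only for illustration. Given n individuals grouped into k species, the next individual belongs to a new species with probability (theta + k*alpha) / (theta + n), which is also the usual estimate of the probability of discovering an unseen species.

```python
import random

def sample_pyp_counts(n, alpha, theta, seed=0):
    """Draw n individuals one at a time from the Pitman--Yor predictive rule.

    Returns a list of species counts (one entry per distinct species).
    Assumes 0 <= alpha < 1 and theta > -alpha.
    """
    rng = random.Random(seed)
    counts = []
    for i in range(n):
        k = len(counts)
        # The first individual always starts a new species; afterwards a new
        # species appears with probability (theta + k*alpha) / (theta + i).
        if k == 0 or rng.random() < (theta + k * alpha) / (theta + i):
            counts.append(1)
        else:
            # An existing species j is chosen with weight counts[j] - alpha.
            weights = [c - alpha for c in counts]
            j = rng.choices(range(k), weights=weights)[0]
            counts[j] += 1
    return counts

def prob_new_species(counts, alpha, theta):
    """Probability that the next individual belongs to a species not yet seen."""
    n = sum(counts)
    k = len(counts)
    return (theta + k * alpha) / (theta + n)
```

For example, with observed counts `[2, 1]`, `alpha = 0.5` and `theta = 1.0`, `prob_new_species` returns (1 + 2*0.5) / (1 + 3) = 0.5. In practice `alpha` and `theta` would be estimated from the data, and the findings above suggest that the estimate of `theta` deserves particular caution.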
Taxonomy
Topics: Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference · Census and Population Estimation
