Estimating the observable population size from biased samples: a new approach to population estimation with capture heterogeneity
James E. Johndrow, Kristian Lum, Daniel Manrique-Vallier

TL;DR
This paper introduces a method for estimating population size under heterogeneous capture probabilities. Instead of targeting the total population, it estimates the number of individuals whose capture probability exceeds a chosen threshold, which gives lower-risk estimates.
Contribution
It reformulates population estimation under capture heterogeneity as a latent length-biased nonparametric density estimation problem on the unit interval, and proposes a threshold-based estimand that can be estimated more reliably than the total population size.
Findings
Threshold-based estimators have lower risk than total population estimators.
The method performs well in simulations and real data applications.
Estimators of the total population size have high, and sometimes unbounded, risk when the capture-probability density has significant mass near zero.
Abstract
Capture-recapture methods aim to estimate the size of a closed population on the basis of multiple incomplete enumerations of individuals. In many applications, the individual probability of being recorded is heterogeneous in the population. Previous studies have suggested that it is not possible to reliably estimate the total population size when capture heterogeneity exists. Here we approach population estimation in the presence of capture heterogeneity as a latent length biased nonparametric density estimation problem on the unit interval. We show that in this setting it is generally impossible to estimate the density on the entire unit interval in finite samples, and that estimators of the population size have high and sometimes unbounded risk when the density has significant mass near zero. As an alternative, we propose estimating the population of individuals with capture…
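The difficulty described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator: it assumes capture probabilities drawn from a Beta(0.5, 2) distribution (which places considerable mass near zero), two lists, and the classical Chapman two-list estimator as a stand-in for a total-population estimator. The population size, the Beta parameters, the 0.1 threshold, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np

N_TOTAL = 10_000     # assumed true population size (illustrative)
N_LISTS = 2          # assumed number of capture occasions
THRESHOLD = 0.1      # assumed capture-probability threshold (illustrative)


def simulate_captures(n_total, n_lists, a, b, rng):
    """Draw one capture probability per individual from Beta(a, b),
    then simulate independent captures on each list."""
    p = rng.beta(a, b, size=n_total)
    captures = rng.random((n_total, n_lists)) < p[:, None]
    return p, captures


def chapman_estimate(captures):
    """Chapman's bias-corrected two-list estimator, which assumes
    homogeneous capture probabilities. Uses the first two lists."""
    n1 = captures[:, 0].sum()
    n2 = captures[:, 1].sum()
    m = (captures[:, 0] & captures[:, 1]).sum()  # seen on both lists
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


rng = np.random.default_rng(0)
p, captures = simulate_captures(N_TOTAL, N_LISTS, 0.5, 2.0, rng)

observed = captures.any(axis=1)        # individuals seen at least once
n_hat = chapman_estimate(captures)     # estimate of the total population
n_above = int((p >= THRESHOLD).sum())  # individuals above the threshold

print(f"true total:              {N_TOTAL}")
print(f"observed at least once:  {observed.sum()}")
print(f"Chapman estimate:        {n_hat:.0f}")
print(f"true count with p >= {THRESHOLD}: {n_above}")
```

Under these assumptions the homogeneous-capture estimator falls well short of the true total, because individuals with capture probability near zero leave almost no trace in the data. The count of individuals above the threshold is a different, smaller target, and it is the kind of quantity the paper proposes to estimate instead.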
Taxonomy
Topics: Census and Population Estimation · Data-Driven Disease Surveillance · Animal Ecology and Behavior Studies
