Universal features of surname distribution in a subsample of a growing population
Yosef E. Maruvka, Nadav M. Shnerb, David A. Kessler

TL;DR
This paper investigates surname distribution in a growing population, deriving theoretical models for family size statistics and applying them to U.S. Census data, revealing significant discrepancies in growth rate estimates.
Contribution
It introduces a novel theoretical framework for analyzing family size distributions in subsamples of growing populations, accounting for mutation and growth dynamics.
Findings
Derived the family size distribution for subsamples from the full population distribution.
Showed that the subsample distribution is shifted left relative to the full-population distribution, which tends to eliminate the small families and raises the mean family size.
Applied the model to U.S. Census data, revealing a misestimation of population growth rate.
Abstract
We examine the problem of family size statistics (the number of individuals carrying the same surname, or the same DNA sequence) in a given size subsample of an exponentially growing population. We approach the problem from two directions. In the first, we construct the family size distribution for the subsample from the stable distribution for the full population. This latter distribution is calculated for an arbitrary growth process in the limit of slow growth, and is seen to depend only on the average and variance of the number of children per individual, as well as the mutation rate. The distribution for the subsample is shifted left with respect to the original distribution, tending to eliminate the part of the original distribution reflecting the small families, and thus increasing the mean family size. From the subsample distribution, various bulk quantities such as the average…
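The effect of subsampling on family sizes can be illustrated with a small simulation. This is a minimal sketch and not the authors' derivation: it assumes uniform random sampling of individuals without replacement (so each family's count in the sample is hypergeometric), and the family sizes in `full_population`, the helper name `subsample_family_sizes`, and the sample size are all made up for illustration.

```python
import random
from collections import Counter


def subsample_family_sizes(family_sizes, sample_size, seed=0):
    """Draw `sample_size` individuals uniformly without replacement and
    return the sizes of the families that appear in the sample.

    Hypothetical helper for illustration only; `family_sizes` is a list
    with one entry (number of individuals) per surname.
    """
    # Expand the population into one family label per individual.
    individuals = [
        family for family, size in enumerate(family_sizes) for _ in range(size)
    ]
    rng = random.Random(seed)
    sampled = rng.sample(individuals, sample_size)
    # Count how many sampled individuals carry each surname.
    return sorted(Counter(sampled).values(), reverse=True)


# Invented toy population: many small families, a few large ones.
full_population = [1] * 500 + [2] * 120 + [5] * 40 + [20] * 10 + [100] * 3
sample = subsample_family_sizes(full_population, sample_size=200)

families_full = len(full_population)
families_sample = len(sample)
print("families in full population:", families_full)
print("families seen in subsample: ", families_sample)
print("largest family in subsample:", sample[0])
```

Every family can only keep or lose members when the population is subsampled, so each family's size moves left, and families that lose all their members disappear from the sample entirely. Small families are the most likely to vanish, which is the qualitative effect the abstract describes.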
Taxonomy
Topics: Genetic diversity and population structure · Evolution and Genetic Dynamics · Forensic and Genetic Research
