We are not alone ! (at least, most of us). Homonymy in large scale social groups
Arthur Charpentier, Baptiste Coulmont

TL;DR
This paper estimates the prevalence of homonyms in large social groups using a generalized birthday paradox approach. Identity collisions are frequent in societies like France and the US, but in smaller groups only a few individuals have a homonym.
Contribution
It introduces a novel method for estimating homonym frequency in large populations, based on the distribution of first and last names in a subset of the population and a generalization of the birthday paradox.
Findings
Most individuals in large societies have at least one homonym.
Identity collisions are common in large populations such as France and the US.
Homonyms are rare in small groups, affecting only a few individuals.
Abstract
This article provides an estimate of the proportion of homonyms in large-scale groups, based on the distribution of first names and last names in a subset of these groups. The estimation relies on a generalization of the "birthday paradox problem". The main result is that, in societies such as France or the United States, identity collisions (based on first + last names) are frequent: the large majority of the population has at least one homonym. In smaller settings, collisions are much less frequent: even if a small group of a few thousand people contains at least one pair of homonyms, only a few individuals have a homonym.
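The distinction the abstract draws, between a group containing at least one pair of homonyms and an individual having a homonym, can be illustrated with a minimal Python sketch. This is not the authors' estimator. It assumes each person's full name is drawn independently from a known distribution `probs`. The function names `share_with_homonym` and `prob_any_collision` are hypothetical, and the second one uses the standard Poisson approximation to the generalized birthday problem.

```python
import math


def share_with_homonym(probs, n):
    """Expected share of people in a group of size n who have at least one homonym.

    Assumes names are drawn independently with probabilities `probs`.
    A person named i has no homonym with probability (1 - p_i) ** (n - 1).
    """
    return sum(p * (1.0 - (1.0 - p) ** (n - 1)) for p in probs)


def prob_any_collision(probs, n):
    """Poisson approximation of P(at least one pair of homonyms in the group).

    The expected number of colliding pairs is C(n, 2) * sum(p_i ** 2).
    """
    pairs = n * (n - 1) / 2
    return 1.0 - math.exp(-pairs * sum(p * p for p in probs))


# Hypothetical toy case: 365 equally likely "names" (the classic birthday setting).
toy = [1.0 / 365] * 365
print(prob_any_collision(toy, 23))   # roughly 0.50: a collision somewhere is likely
print(share_with_homonym(toy, 23))   # roughly 0.06: few individuals are affected
```

In this toy setting a group of 23 has about an even chance of containing a matching pair, yet only about 6% of its members are expected to have a match. That mirrors the abstract's remark about small groups. As `n` grows large relative to the number of names, `share_with_homonym` approaches 1.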
Taxonomy
Topics: Names, Identity, and Discrimination Research · Authorship Attribution and Profiling · Language and Culture
