We are not alone ! (at least, most of us). Homonymy in large scale social groups

Arthur Charpentier; Baptiste Coulmont

arXiv:1707.07607 · stat.OT · July 25, 2017 · 1 citation

TL;DR

This paper estimates the prevalence of homonyms in large social groups using a generalization of the birthday paradox. Identity collisions turn out to be frequent in societies such as France and the US, while in smaller groups only a few individuals have a homonym.

Contribution

It introduces a novel method for estimating homonym frequency in large populations, based on the distribution of first and last names in a subset of the population and on a generalization of the birthday paradox.

Findings

01 Most individuals in large societies have at least one homonym.

02 Identity collisions are common in large populations such as France and the US.

03 Homonyms are rare in small groups, affecting only a few individuals.

Abstract

This article presents an estimate of the proportion of homonyms in large-scale groups, based on the distribution of first names and last names in a subset of these groups. The estimate relies on a generalization of the "birthday paradox problem". The main result is that, in societies such as France or the United States, identity collisions (based on first + last names) are frequent: the large majority of the population has at least one homonym. In smaller settings, homonyms are much less frequent: even if a small group of a few thousand people contains at least one pair of homonyms, only a few individuals have a homonym.
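The abstract separates two quantities: the chance that a group contains at least one pair of homonyms, and the share of individuals who have a homonym. The sketch below illustrates that difference under simplifying assumptions that are not taken from the paper: full names are drawn independently from a known probability vector, and the pair probability uses a standard Poisson approximation. The function names and the Zipf-like toy distribution are hypothetical and are not the authors' estimator or data.

```python
import math


def share_with_homonym(probs, n):
    """Expected share of people with at least one homonym in a group of size n.

    Assumes names are drawn independently from `probs` (an illustrative
    assumption). A person bearing a name of probability p has a homonym
    unless none of the other n - 1 people drew that name.
    """
    return sum(p * (1.0 - (1.0 - p) ** (n - 1)) for p in probs)


def prob_any_collision(probs, n):
    """Poisson approximation to P(at least one pair of homonyms).

    The expected number of matching pairs is C(n, 2) * sum(p_i^2).
    """
    pairs = n * (n - 1) / 2
    return 1.0 - math.exp(-pairs * sum(p * p for p in probs))


def zipf_probs(k, s=1.0):
    """Hypothetical Zipf-like distribution over k distinct full names."""
    weights = [1.0 / (rank ** s) for rank in range(1, k + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Classic birthday setting: 365 equally likely "names", 23 people.
    uniform = [1.0 / 365] * 365
    print(prob_any_collision(uniform, 23))
    print(share_with_homonym(uniform, 23))

    # Toy skewed name distribution (made-up parameters, not real data).
    names = zipf_probs(100_000)
    for size in (1_000, 100_000):
        print(size, prob_any_collision(names, size), share_with_homonym(names, size))
```

In the classic setting of 365 equally likely values and 23 people, the pair probability is close to one half while the expected share of people with a match is only a few percent. This is the same pattern the abstract describes for small groups.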

Code & Models

Repositories

freakonometrics/homonym (official)

Taxonomy

Topics: Names, Identity, and Discrimination Research · Authorship Attribution and Profiling · Language and Culture