The Ten Thousand Kims
Seung Ki Baek, Petter Minnhagen, Beom Jun Kim

TL;DR
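This paper models the distribution of Korean family names with a simple null model, the random group formation (RGF) model. The model predicts how name frequencies change over time and yields an estimate of historical population size; the authors speculate that the results reflect an inherent social stability in Korean culture.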
Contribution
It applies the random group formation (RGF) model to describe Korean family name distributions and predict how they change over time, including the prevalence of the name Kim.
Findings
The RGF model accurately predicts name distribution changes over time.
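Among married women entering the family books in a given year, the number named Kim is directly proportional to the total number of married women, with the same proportionality constant for all years.
The model gives an estimate of the Korean population around 500 AD, which included about ten thousand Kims.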
Abstract
In the Korean culture the family members are recorded in special family books. This makes it possible to follow the distribution of Korean family names far back in history. It is here shown that these name distributions are well described by a simple null model, the random group formation (RGF) model. This model makes it possible to predict how the name distributions change and these predictions are shown to be borne out. In particular, the RGF model predicts that, for married women entering a collection of family books in a certain year, the occurrence of the most common family name "Kim" should be directly proportional to the total number of married women, with the same proportionality constant for all the years. This prediction is also borne out to a high degree. We speculate that it reflects some inherent social stability in the Korean culture. In addition, we obtain an estimate of the…
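The proportionality prediction in the abstract can be illustrated with a small sketch. It is not the authors' analysis and does not implement the RGF model. It only shows one way to check whether the count of one name scales linearly with a total, using a least-squares fit through the origin. The function names and the yearly counts below are hypothetical, made up for illustration, and are not data from the paper.

```python
def fit_proportionality(totals, kims):
    """Least-squares constant c for kims ~ c * totals (line through the origin)."""
    num = sum(t * k for t, k in zip(totals, kims))
    den = sum(t * t for t in totals)
    return num / den


def max_relative_deviation(totals, kims, c):
    """Largest relative gap between observed counts and the fitted line c * total."""
    return max(abs(k - c * t) / (c * t) for t, k in zip(totals, kims))


# Hypothetical yearly counts (NOT from the paper): total married women
# entering the books in a year, and how many of them are named Kim.
totals = [100, 200, 400, 800]
kims = [21, 39, 82, 158]

c = fit_proportionality(totals, kims)
dev = max_relative_deviation(totals, kims, c)
print(f"fitted constant: {c:.3f}, max relative deviation: {dev:.3f}")
```

If the same constant holds for every year, the deviation stays small however much the totals vary, which is the pattern the abstract reports for Kim.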
