Inferring cultural regions from correlation networks of given baby names
Mateusz Pomorski, Malgorzata J. Krawczyk, Krzysztof Kulakowski, Jaroslaw Kwapien, Marcel Ausloos

TL;DR
This study analyzes baby name correlations across US states from 1910 to 2010, revealing stable community structures that align with traditional regions and uncovering regional differences in name diversity and distribution patterns.
Contribution
It introduces a network-based method to infer cultural regions from baby name correlations and compares regional naming patterns over a century.
Findings
Community structure in the network stays stable until about 1980 and approximately matches the usual US geopolitical regions (North East, South, Midwest + West).
Name distribution follows Zipf's law with regional variations in the exponent.
Southern states have a narrower pool of popular names.
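To make the Zipf-law finding concrete, here is a minimal sketch of one common way to estimate the exponent: an ordinary least-squares fit of log frequency against log rank. The fitting procedure, the function name `zipf_exponent`, and the synthetic counts are assumptions for illustration; the summary does not say how the authors fitted the exponent. A larger exponent means frequency falls off faster with rank, so the top few names carry a larger share of births, which is one way a narrower pool of popular names would show up.

```python
import math


def zipf_exponent(counts):
    """Estimate alpha in f ~ r^(-alpha) by least squares on log f vs. log r.

    `counts` holds name counts for one state and one year (any order).
    This is an illustrative estimator, not necessarily the authors' method.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is -alpha, so flip the sign.
    return -sxy / sxx


# Synthetic (made-up) top-100 profiles following exact power laws.
steep = [1000.0 * rank ** -1.2 for rank in range(1, 101)]
flat = [1000.0 * rank ** -0.8 for rank in range(1, 101)]

alpha_steep = zipf_exponent(steep)
alpha_flat = zipf_exponent(flat)
print(alpha_steep, alpha_flat)
```

On real counts the fit would be noisy, and a maximum-likelihood estimator may be preferable to a log-log regression; the sketch only illustrates what the exponent measures.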
Abstract
We report investigations on the statistical characteristics of the baby names given between 1910 and 2010 in the United States of America. For each year, the 100 most frequent names in the USA are sorted out. For these names, the correlations between the names profiles are calculated for all pairs of states (minus Hawaii and Alaska). The correlations are used to form a weighted network which is found to vary mildly in time. In fact, the structure of communities in the network remains quite stable till about 1980. The goal is that the calculated structure approximately reproduces the usually accepted geopolitical regions: the North East, the South, and the "Midwest + West" as the third one. Furthermore, the dataset reveals that the name distribution satisfies the Zipf law, separately for each state and each year, i.e. the name frequency $f \propto r^{-\alpha}$, where r is the name rank.…
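The pipeline described in the abstract (name profiles per state, pairwise correlations, weighted network, groups of states) can be sketched as follows. This is a toy illustration under stated assumptions: the states `A` to `D` and their profiles are invented, Pearson correlation is assumed as the correlation measure, and the grouping step is a simple threshold plus connected components, used here as a crude stand-in for whatever community-detection method the paper applies.

```python
import math
from itertools import combinations


def pearson(u, v):
    """Pearson correlation between two equal-length name profiles."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def correlation_network(profiles):
    """Weighted network: one edge per pair of states, weight = correlation."""
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


def threshold_groups(profiles, weights, threshold):
    """Stand-in for community detection: keep edges with weight >= threshold
    and return the connected components of the remaining graph."""
    neighbours = {state: set() for state in profiles}
    for (a, b), weight in weights.items():
        if weight >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)
    groups, seen = [], set()
    for state in sorted(profiles):
        if state in seen:
            continue
        stack, group = [state], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical data: each profile is the share of births going to the same
# five names (in the same order) in one year. Real profiles would use the
# 100 most frequent names nationwide.
profiles = {
    "A": [0.30, 0.25, 0.20, 0.15, 0.10],
    "B": [0.28, 0.27, 0.19, 0.16, 0.10],
    "C": [0.10, 0.15, 0.20, 0.25, 0.30],
    "D": [0.12, 0.14, 0.19, 0.26, 0.29],
}

weights = correlation_network(profiles)
groups = threshold_groups(profiles, weights, threshold=0.5)
print(groups)
```

Repeating this for each year and comparing the resulting groups is one way to check how stable the structure is over time; a modularity-based method would be a more standard choice than thresholding for the grouping step.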
