Pantheon 1.0, a manually verified dataset of globally famous biographies
Amy Zhao Yu, Shahar Ronen, Kevin Hu, Tiffany Lu, César A. Hidalgo

TL;DR
The paper introduces Pantheon 1.0, a comprehensive, manually verified dataset of globally recognized biographies from Wikipedia, enriched with demographic, occupational, and popularity metrics, to analyze historical impact.
Contribution
It provides a new, detailed dataset of 11,341 biographies with verified demographic and popularity data, enabling better analysis of individual impact across history and cultures.
Findings
The two popularity measures (L and HPI) correlate with external measures of individual accomplishment.
The dataset includes the 11,341 biographies that appear in more than 25 language editions of Wikipedia.
Measures of global popularity serve as proxies for historical impact.
Abstract
We present the Pantheon 1.0 dataset: a manually verified dataset of individuals who have transcended linguistic, temporal, and geographic boundaries. The Pantheon 1.0 dataset includes the 11,341 biographies present in more than 25 languages in Wikipedia and is enriched with: (i) manually verified demographic information (place and date of birth, gender), (ii) a taxonomy of occupations classifying each biography at three levels of aggregation, and (iii) two measures of global popularity: the number of languages in which a biography is present in Wikipedia (L), and the Historical Popularity Index (HPI), a metric that combines information on L, time since birth, and page-views (2008-2013). We compare the Pantheon 1.0 dataset to data from the 2003 book, Human Accomplishments, and also to external measures of accomplishment in individual games and sports: Tennis, Swimming, Car Racing,…
