Mapping the Past: Geographically Linking an Early 20th Century Swedish Encyclopedia with Wikidata
Axel Ahlin, Alfred Myrne, Pierre Nugues

TL;DR
This paper extracts and links geographic entries from a 20th-century Swedish encyclopedia to Wikidata, revealing historical geographic distributions and entry selection patterns.
Contribution
It introduces a method for extracting, categorizing, and geographically linking encyclopedia entries to Wikidata, enabling historical geographic analysis.
Findings
Higher density of locations in Sweden, Germany, UK
Approximately 22% of entries are locations
Nearly 18,000 valid geographic coordinates extracted
Abstract
In this paper, we describe the extraction of all the location entries from a prominent Swedish encyclopedia from the early 20th century, the \textit{Nordisk Familjebok} `Nordic Family Book.' We focused on the second edition called \textit{Uggleupplagan}, which comprises 38 volumes and over 182,000 articles. This makes it one of the most extensive Swedish encyclopedias. Using a classifier, we first determined the category of the entries. We found that approximately 22 percent of them were locations. We applied a named entity recognition to these entries and we linked them to Wikidata. Wikidata enabled us to extract their precise geographic locations resulting in almost 18,000 valid coordinates. We then analyzed the distribution of these locations and the entry selection process. It showed a higher density within Sweden, Germany, and the United Kingdom. The paper sheds light on the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDigital Humanities and Scholarship
