EDDA-Coordinata: An Annotated Dataset of Historical Geographic Coordinates
Ludovic Moncla, Pierre Nugues, Thierry Joliveau, Katherine McDonough

TL;DR
This paper presents EDDA-Coordinata, a dataset and model pipeline for extracting and normalizing geographic coordinates from 18th-century texts, demonstrating high accuracy and cross-domain applicability.
Contribution
The authors created a gold standard dataset and trained transformer models for geographic coordinate retrieval from historical texts, improving extraction accuracy and cross-lingual, cross-domain generalization.
Findings
The models achieved an 86% exact match (EM) score on in-domain data.
On out-of-domain texts, EM scores remained between 61% and 77%.
These results indicate that the dataset is effective for training coordinate extraction models.
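The summary does not say how the EM score is computed. As a minimal sketch, assuming EM is the usual exact-match rate (the share of predicted coordinate strings that are identical to the gold annotation), it could be calculated as follows. The function name and the sample strings are hypothetical and are not taken from the dataset.

```python
def exact_match(predictions, references):
    """Return the fraction of predictions identical to their reference.

    Assumption: strings are compared after trimming surrounding whitespace;
    the paper may apply a different normalization before comparing.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical normalized coordinate strings, invented for illustration only.
refs = ["48.85, 2.35", "45.76, 4.83", "43.30, 5.37", "50.63, 3.06"]
preds = ["48.85, 2.35", "45.76, 4.83", "43.30, 5.40", "50.63, 3.06"]

em = exact_match(preds, refs)
print(f"EM = {em:.2f}")
```

Under this definition a prediction that is off by a single digit counts as a miss, so EM is a strict measure for coordinate extraction.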
Abstract
This paper introduces a dataset of enriched geographic coordinates retrieved from Diderot and d'Alembert's eighteenth-century Encyclopédie. Automatically recovering geographic coordinates from historical texts is a complex task, as they are expressed in a variety of ways and with varying levels of precision. To improve retrieval of coordinates from similar digitized early modern texts, we have created a gold standard dataset, trained models, published the resulting inferred and normalized coordinate data, and experimented with applying these models to new texts. From 74,000 total articles in each of the digitized versions of the Encyclopédie from ARTFL and ENCCRE, we examined 15,278 geographical entries, manually identifying 4,798 containing coordinates, and 10,480 with descriptive but non-numerical references. Leveraging our gold standard annotations, we trained transformer-based models to…
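To illustrate what normalizing a coordinate can involve, here is a minimal rule-based sketch that converts a degrees-and-minutes mention into decimal degrees. It is not the authors' method, which relies on transformer-based models. The abbreviation pattern ("lat." and "long." followed by dot-separated numbers), the sample sentence, and all function names are assumptions made for illustration, and the sketch ignores hemisphere markers and the differing prime meridians used in historical sources.

```python
import re

# Hypothetical pattern: "lat. 48. 45." or "long. 20." (degrees, then optional minutes).
PATTERN = re.compile(r"\b(lat|long)\.\s*(\d{1,3})\.(?:\s*(\d{1,2})\.)?", re.IGNORECASE)


def to_decimal(degrees, minutes=0, seconds=0):
    """Convert degrees, minutes, and seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


def extract(text):
    """Return a dict such as {'lat': 48.75} for each mention found in text."""
    result = {}
    for kind, deg, minute in PATTERN.findall(text):
        # A missing minutes field comes back as an empty string; treat it as 0.
        value = to_decimal(int(deg), int(minute or 0))
        result[kind.lower()] = round(value, 4)
    return result


# Invented sample sentence, not quoted from the Encyclopédie.
sample = "Long. 20. 30. lat. 48. 45."
coords = extract(sample)
print(coords)
```

A fixed pattern like this only covers one way of writing coordinates, which is consistent with the abstract's point that the variety of expressions makes automatic recovery difficult.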
Taxonomy
Topics: Geographic Information Systems Studies · Historical Astronomy and Related Studies · Historical Geography and Cartography
