TL;DR
This paper introduces an unsupervised model for toponym resolution that effectively disambiguates geographic mentions in text without requiring training data, outperforming some commercial tools and approaching supervised methods.
Contribution
The paper proposes a novel unsupervised approach that leverages document geographic scope and inter-toponym relationships to resolve locations without training data.
Findings
Outperforms unsupervised techniques and some commercial tools on multiple datasets.
Close performance to supervised methods, especially with many unseen toponyms.
Effective in scenarios lacking annotated training data.
Abstract
Toponym Resolution, the task of assigning a location mention in a document to a geographic referent (i.e., latitude/longitude), plays a pivotal role in analyzing location-aware content. However, the ambiguities of natural language and a huge number of possible interpretations for toponyms constitute insurmountable hurdles for this task. In this paper, we study the problem of toponym resolution with no additional information other than a gazetteer and no training data. We demonstrate that a dearth of large enough annotated data makes supervised methods less capable of generalizing. Our proposed method estimates the geographic scope of documents and leverages the connections between nearby place names as evidence to resolve toponyms. We explore the interactions between multiple interpretations of mentions and the relationships between different toponyms in a document to build a model that…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
