Geolocating News about Extreme Climate Events: A Comparative Analysis of Off-the-Shelf Tools for Toponym Identification in German
Brielen Madureira, Mariana Madruga de Brito, Andreas Niekler

TL;DR
This study compares three off-the-shelf NER tools on German news articles and assesses how the choice of tool affects both the geolocation of extreme climate events and the conclusions drawn from media analysis.
Contribution
It describes and quantifies the differences between the outputs of Flair, Spacy, and Stanza on German news articles, and evaluates the tools extrinsically using three methods for determining the country where an event took place.
Findings
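The three NER tools extract different sets of toponyms from the same German news articles.
These differences propagate into downstream tasks and can yield distinct decisions about a document's geographical focus.
As a result, tool choice can change conclusions about countries' prominence in German media.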
Abstract
Determining the geolocation of extreme climate events and disasters in texts is a common problem in climate impact and adaptation research. Named-entity recognition (NER) tools are typically used to identify a pool of toponyms that serve as candidate event locations. In this study, we conduct a comparative analysis of three off-the-shelf NER tools, namely Flair, Spacy and Stanza. We describe and quantify differences between their outputs for German news articles and evaluate them extrinsically based on three methods to determine the country where events took place. We show how their contrasts are propagated into downstream tasks and can yield distinct decisions about a document's geographical focus, which, in turn, can impact conclusions about countries' prominence in German media.
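To make the propagation effect concrete, here is a minimal sketch of how differing toponym pools can lead to different country decisions for the same article. Everything in it is an illustrative assumption rather than the authors' setup: the toponym lists are invented stand-ins for each tool's output, the gazetteer is a toy lookup table, Jaccard overlap is just one possible way to quantify disagreement, and the "most frequent country" rule is only one plausible method (the abstract does not name the three methods used in the paper).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toponym pools for ONE article. In practice these would be the
# location entities returned by each NER tool; the values here are invented.
TOOL_OUTPUTS = {
    "flair": ["Ahrtal", "Deutschland", "Belgien", "Deutschland"],
    "spacy": ["Deutschland", "Belgien", "Belgien", "Lüttich"],
    "stanza": ["Ahrtal", "Deutschland", "Deutschland"],
}

# Hypothetical toy gazetteer mapping a toponym to an ISO country code.
GAZETTEER = {"Ahrtal": "DE", "Deutschland": "DE", "Belgien": "BE", "Lüttich": "BE"}


def jaccard(a, b):
    """Overlap between two toponym pools, ignoring how often each is mentioned."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def most_frequent_country(toponyms, gazetteer):
    """Assumed decision rule: pick the country with the most toponym mentions."""
    counts = Counter(gazetteer[t] for t in toponyms if t in gazetteer)
    return counts.most_common(1)[0][0] if counts else None


# Pairwise disagreement between tools on the same article.
pairwise_overlap = {
    (x, y): jaccard(TOOL_OUTPUTS[x], TOOL_OUTPUTS[y])
    for x, y in combinations(TOOL_OUTPUTS, 2)
}

# Country decision per tool under the assumed rule.
country_by_tool = {
    tool: most_frequent_country(toponyms, GAZETTEER)
    for tool, toponyms in TOOL_OUTPUTS.items()
}

for pair, score in pairwise_overlap.items():
    print(pair, round(score, 2))
print(country_by_tool)
```

With these invented inputs, two of the pools point to one country and the third to another, even though all three share at least one toponym. Aggregated over a corpus, this kind of disagreement is what could shift an estimate of how prominent each country is in the coverage.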
