Multilingual News Location Detection using an Entity-Based Siamese Network with Semi-Supervised Contrastive Learning and Knowledge Base
Víctor Suárez-Paniagua, Steven Derby, Tri Kurniawan Wijaya

TL;DR
This paper introduces a novel entity-based Siamese network with semi-supervised contrastive learning and a knowledge base for multilingual news location detection, especially effective when locations are not explicitly mentioned.
Contribution
It presents a new system combining entity linking, contrastive learning, and knowledge bases, along with a multilingual dataset for news location detection.
Findings
Outperforms baseline models in location detection accuracy.
Effectively infers implicit locations using knowledge bases.
Provides a publicly available multilingual dataset for future research.
Abstract
Early detection of relevant locations in a piece of news is especially important in extreme events such as environmental disasters, armed conflicts, disease outbreaks, or political turmoil. This detection also helps recommender systems promote relevant news based on user locations. When the relevant locations are not mentioned explicitly in the text, state-of-the-art methods typically fail to recognize them because these methods rely on syntactic recognition. In contrast, by incorporating a knowledge base and connecting entities with their locations, our system successfully infers the relevant locations even when they are not mentioned explicitly in the text. To evaluate the effectiveness of our approach, and due to the lack of datasets in this area, we also contribute to the research community with a gold-standard multilingual news-location dataset, NewsLOC.…
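The idea of connecting entities with their locations through a knowledge base can be illustrated with a minimal sketch. This is not the authors' system, which the abstract describes as an entity-based Siamese network trained with semi-supervised contrastive learning. The sketch replaces that model with simple vote counting, assumes entity linking has already been done, and uses a hypothetical toy knowledge base (`TOY_KB`) and a hypothetical helper (`infer_locations`).

```python
from collections import Counter

# Hypothetical toy knowledge base mapping a linked entity to one location.
# A real knowledge base would be far larger and multilingual.
TOY_KB = {
    "Eiffel Tower": "France",
    "Louvre": "France",
    "Bundestag": "Germany",
}


def infer_locations(linked_entities, kb=TOY_KB):
    """Rank candidate locations by how many linked entities point to them.

    Entities missing from the knowledge base are ignored. This vote count
    is an illustrative stand-in, not the learned model from the paper.
    """
    votes = Counter(kb[entity] for entity in linked_entities if entity in kb)
    return [location for location, _ in votes.most_common()]


# The article never names a country, yet its entities imply one.
ranked = infer_locations(["Eiffel Tower", "Louvre", "Bundestag"])
print(ranked)
```

Even this crude lookup shows why a knowledge base helps when no location is named in the text: the entities themselves carry the location signal.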
Taxonomy
Topics: Topic Modeling · Text and Document Classification Technologies · Natural Language Processing Techniques
Methods: fail · Balanced Selection
