Statistical and Neural Methods for Cross-lingual Entity Label Mapping in Knowledge Graphs
Gabriel Amaral, Mārcis Pinnis, Inguna Skadiņa, Odinaldo Rodrigues, Elena Simperl

TL;DR
This paper explores the use of word and sentence alignment techniques to improve cross-lingual entity label matching in Wikidata, significantly enhancing multilingual consistency for applications like machine translation.
Contribution
It introduces a novel approach combining alignment methods and matching algorithms to significantly improve cross-lingual label mapping in knowledge bases.
Findings
Methods based on sentence embeddings outperform all others in label matching, even across different scripts.
Mapping between Wikidata's main labels improves by up to 20 points in F1-score.
Techniques enhance multilingual knowledge base consistency.
Abstract
Knowledge bases such as Wikidata amass vast amounts of named entity information, such as multilingual labels, which can be extremely useful for various multilingual and cross-lingual applications. However, such labels are not guaranteed to match across languages from an information consistency standpoint, greatly compromising their usefulness for fields such as machine translation. In this work, we investigate the application of word and sentence alignment techniques coupled with a matching algorithm to align cross-lingual entity labels extracted from Wikidata in 10 languages. Our results indicate that mapping between Wikidata's main labels stands to be considerably improved (up to 20 points in F1-score) by any of the employed methods. We show how methods relying on sentence embeddings outperform all others, even across different scripts. We believe the application of such techniques…
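The general idea of scoring label pairs and then applying a matching algorithm can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes character trigram cosine similarity as a stand-in for the word and sentence alignment methods, and a greedy one-to-one matcher as a stand-in for the paper's matching algorithm. The function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count character n-grams of a padded, lower-cased label."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def match_labels(source, target, threshold=0.3):
    """Greedy one-to-one matching (assumed strategy): best-scoring pairs first."""
    vec_s = [char_ngrams(s) for s in source]
    vec_t = [char_ngrams(t) for t in target]
    scored = sorted(
        ((cosine(vs, vt), i, j)
         for i, vs in enumerate(vec_s)
         for j, vt in enumerate(vec_t)),
        reverse=True,
    )
    used_s, used_t, pairs = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break  # remaining pairs are too dissimilar to count as a match
        if i not in used_s and j not in used_t:
            used_s.add(i)
            used_t.add(j)
            pairs.append((source[i], target[j], round(score, 3)))
    return pairs


# Illustrative labels only, not data from the paper.
english = ["Douglas Adams", "London"]
portuguese = ["Londres", "Douglas Adams"]
for src, tgt, score in match_labels(english, portuguese):
    print(src, "->", tgt, score)
```

Surface similarity of this kind only works when both languages share a script. The abstract reports that sentence embedding methods remain effective across different scripts, so a realistic version would replace `char_ngrams` and `cosine` with similarities computed from a multilingual embedding model.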
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Graph Neural Networks
Methods: Balanced Selection · ALIGN
