Multilingual vs Crosslingual Retrieval of Fact-Checked Claims: A Tale of Two Approaches

Alan Ramponi; Marco Rovera; Robert Moro; Sara Tonelli

arXiv:2505.22118 · cs.CL · September 23, 2025

TL;DR

This paper compares multilingual and crosslingual retrieval methods for fact-checked claims, highlighting the effectiveness of LLM-based re-ranking and negative sampling strategies across 47 languages.

Contribution

It introduces strategies to enhance crosslingual claim retrieval and demonstrates that crosslingual and multilingual setups have distinct characteristics.

Findings

1. LLM-based re-ranking yields the best retrieval performance.

2. Negative example sampling improves supervised retrieval.

3. Crosslingual retrieval has unique challenges compared to multilingual retrieval.

Abstract

Retrieval of previously fact-checked claims is a well-established task whose automation can assist professional fact-checkers in the initial steps of information verification. Previous works have mostly tackled the task monolingually, i.e., with both the input and the retrieved claims in the same language. However, especially for languages with limited availability of fact-checks and in the case of global narratives, such as pandemics, wars, or international politics, it is crucial to be able to retrieve claims across languages. In this work, we examine strategies to improve multilingual and crosslingual performance, namely selection of negative examples (in the supervised setting) and re-ranking (in the unsupervised setting). We evaluate all approaches on a dataset containing posts and claims in 47 languages (283 language combinations). We observe that the best results are obtained by…

Taxonomy

Topics: Artificial Intelligence in Law · Linguistics and Terminology Studies · Multi-Agent Systems and Negotiation