MultiCaption: Detecting disinformation using multilingual visual claims

Rafael Martins Frade; Rrubaa Panchendrarajan; Arkaitz Zubiaga

arXiv:2601.11220 · cs.CL · January 19, 2026

Open Access

TL;DR

MultiCaption is a multilingual dataset of visual claims built to improve disinformation detection. Experiments on it show that task-specific fine-tuning and multilingual training both matter for effective fact-checking across languages and media.

Contribution

The paper presents MultiCaption, a novel multilingual visual claims dataset, and evaluates transformer-based models, highlighting challenges and benefits of multilingual and multimodal fact-checking.

Findings

01. MultiCaption contains 11,088 claims in 64 languages.

02. Transformer models require task-specific fine-tuning for strong performance.

03. Multilingual training improves fact-checking without machine translation.

Abstract

Online disinformation poses an escalating threat to society, driven increasingly by the rapid spread of misleading content across both multimedia and multilingual platforms. While automated fact-checking methods have advanced in recent years, their effectiveness remains constrained by the scarcity of datasets that reflect these real-world complexities. To address this gap, we first present MultiCaption, a new dataset specifically designed for detecting contradictions in visual claims. Pairs of claims referring to the same image or video were labeled through multiple strategies to determine whether they contradict each other. The resulting dataset comprises 11,088 visual claims in 64 languages, offering a unique resource for building and evaluating misinformation-detection systems in truly multimodal and multilingual environments. We then provide comprehensive experiments using…
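The abstract describes the task as deciding whether two claims about the same image or video contradict each other. A minimal Python sketch of how one such labeled pair might be represented is given here for illustration. The class name `ClaimPair`, every field name, the helper functions, and the two toy examples are assumptions made for this sketch; none of them is taken from the released dataset.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimPair:
    """Two claims about the same image or video (hypothetical schema)."""
    media_id: str      # identifier of the shared image or video (assumed field)
    claim_a: str
    claim_b: str
    lang_a: str        # language code of claim_a, e.g. "en" (assumed field)
    lang_b: str
    contradicts: bool  # True if the two claims contradict each other


def label_counts(pairs):
    """Count contradicting vs. non-contradicting pairs."""
    return Counter(p.contradicts for p in pairs)


def cross_lingual(pairs):
    """Keep only pairs whose two claims are written in different languages."""
    return [p for p in pairs if p.lang_a != p.lang_b]


# Invented toy examples, not drawn from MultiCaption.
pairs = [
    ClaimPair("img-1", "Photo shows a flood in 2023.",
              "Photo shows a flood in 2015.", "en", "en", True),
    ClaimPair("img-1", "Photo shows a flood in 2023.",
              "La foto muestra una inundación en 2023.", "en", "es", False),
]

print(label_counts(pairs))
print(len(cross_lingual(pairs)))
```

Keeping the language of each claim alongside the label makes it straightforward to separate monolingual from cross-lingual pairs, which is the kind of split a multilingual evaluation would need.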

Taxonomy

Topics: Misinformation and Its Impacts · Multimodal Machine Learning Applications · Ethics and Social Impacts of AI