"AGI" team at SHROOM-CAP: Data-Centric Approach to Multilingual Hallucination Detection using XLM-RoBERTa

Harsh Rathva; Pruthwik Mishra; Shrikant Malviya

arXiv:2511.18301 · cs.CL · November 25, 2025

"AGI" team at SHROOM-CAP: Data-Centric Approach to Multilingual Hallucination Detection using XLM-RoBERTa

Harsh Rathva, Pruthwik Mishra, Shrikant Malviya

PDF

Open Access 1 Models

TL;DR

This paper presents a data-centric approach using balanced multilingual datasets and fine-tuned XLM-RoBERTa-Large to detect hallucinations in scientific texts across nine languages, emphasizing data quality over model complexity.

Contribution

The study introduces a comprehensive, balanced multilingual dataset and demonstrates that data curation can outperform architectural innovations in hallucination detection, especially for low-resource languages.

Findings

01. Achieved second place in Gujarati, a zero-shot language, with a Factuality F1 of 0.5107.

02. Created a balanced training dataset of 124,821 samples, 172x larger than the original SHROOM training data.

03. Systematic data curation improves multilingual hallucination detection performance.

Abstract

The detection of hallucinations in multilingual scientific text generated by Large Language Models (LLMs) presents significant challenges for reliable AI systems. This paper describes our submission to the SHROOM-CAP 2025 shared task on scientific hallucination detection across 9 languages. Unlike most approaches, which focus primarily on model architecture, we adopted a data-centric strategy that addresses the critical issue of training data scarcity and imbalance. We unify and balance five existing datasets to create a comprehensive training corpus of 124,821 samples (50% correct, 50% hallucinated), a 172x increase over the original SHROOM training data. Our approach, which fine-tunes XLM-RoBERTa-Large (560 million parameters) on this enhanced dataset, achieves competitive performance across all languages, including 2nd place in Gujarati (a zero-shot language) with…
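The "unify and balance" step can be illustrated with a small sketch. The abstract does not say how the balancing was done, so the downsampling strategy here is an assumption, and the function name `balance_binary`, the `label` field, and the toy corpora are all hypothetical rather than taken from the paper.

```python
import random
from collections import Counter


def balance_binary(samples, label_key="label", seed=0):
    """Downsample the majority class so both labels are equally frequent.

    Assumption: downsampling is only one possible way to balance a corpus;
    it is not confirmed as the authors' procedure.
    """
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[label_key], []).append(sample)
    if len(by_label) != 2:
        raise ValueError("expected exactly two labels")

    # Every class is cut down to the size of the smaller class.
    n = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy corpora: label 0 = correct, label 1 = hallucinated.
corpus_a = [{"text": f"a{i}", "label": 0} for i in range(30)]
corpus_b = [{"text": f"b{i}", "label": 1} for i in range(10)]
corpus_c = [{"text": f"c{i}", "label": 0} for i in range(5)]

unified = corpus_a + corpus_b + corpus_c
balanced = balance_binary(unified)
counts = Counter(sample["label"] for sample in balanced)
print(counts[0], counts[1])
```

Downsampling discards majority-class examples. Upsampling or augmenting the minority class would keep them at the cost of repeated or synthetic samples, and the abstract does not indicate which trade-off was chosen.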

Code & Models

Models

Haxxsh/XLMRHallucinationDetectorSHROOMCAP (Hugging Face model, 1 download)

Taxonomy

Topics: Mental Health via Writing · Misinformation and Its Impacts · Topic Modeling