TUM-MiKaNi at SemEval-2025 Task 3: Towards Multilingual and Knowledge-Aware Non-factual Hallucination Identification
Miriam Anschütz, Ekaterina Gikalo, Niklas Herbster, Georg Groh

TL;DR
This paper presents a multilingual system for detecting hallucinations in LLM outputs. It combines retrieval-based fact verification with hallucination pattern identification, reaches top-10 results in eight languages, and supports languages beyond those covered by the shared task.
Contribution
It introduces a multilingual hallucination detection pipeline that integrates retrieval-based fact verification against Wikipedia with a BERT-based model fine-tuned to identify common hallucination patterns, and it extends beyond the languages covered in the shared task.
Findings
Achieves top-10 results in eight languages, including English.
Supports languages beyond the fourteen covered by the shared task.
Reaches competitive results across all task languages with the combined retrieval and pattern-based approach.
Abstract
Hallucinations are one of the major problems of LLMs, hindering their trustworthiness and deployment to wider use cases. However, most of the research on hallucinations focuses on English data, neglecting the multilingual nature of LLMs. This paper describes our submission to the SemEval-2025 Task-3 - Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes. We propose a two-part pipeline that combines retrieval-based fact verification against Wikipedia with a BERT-based system fine-tuned to identify common hallucination patterns. Our system achieves competitive results across all languages, reaching top-10 results in eight languages, including English. Moreover, it supports multiple languages beyond the fourteen covered by the shared task. This multilingual hallucination identifier can help to improve LLM outputs and their usefulness in…
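The abstract does not say how the outputs of the two pipeline parts are combined. As a minimal sketch, assuming each part returns suspected hallucinations as half-open character spans (start, end) over the LLM output, one simple option is to take the union of both span sets and merge overlaps. The names `merge_spans` and `combine_detectors` are hypothetical and are not taken from the authors' system.

```python
from typing import List, Tuple

# Half-open character offsets [start, end) into the LLM output (an assumption).
Span = Tuple[int, int]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Merge overlapping or touching spans into a sorted, disjoint list."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous span: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def combine_detectors(fact_spans: List[Span], pattern_spans: List[Span]) -> List[Span]:
    """Union of the spans flagged by the two (hypothetical) detectors."""
    return merge_spans(list(fact_spans) + list(pattern_spans))


if __name__ == "__main__":
    fact_spans = [(0, 5), (10, 15)]      # e.g. from fact verification
    pattern_spans = [(3, 8), (20, 25)]   # e.g. from the pattern model
    print(combine_detectors(fact_spans, pattern_spans))
```

A union favors recall: a span is flagged if either part flags it. Taking the intersection instead would favor precision. Which trade-off the actual system makes is not stated in this summary.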
Taxonomy
Topics: Topic Modeling · Mental Health via Writing · Data Quality and Management
