CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models

Yongheng Zhang; Xu Liu; Ruoxi Zhou; Qiguang Chen; Hao Fei; Wenpeng Lu; Libo Qin

arXiv:2505.19108 · cs.CL · May 27, 2025

TL;DR

This paper introduces CCHall, a benchmark for evaluating how well large language models detect hallucinations in joint cross-lingual and cross-modal scenarios. Prior studies cover only one of the two scenarios at a time.

Contribution

The paper presents CCHall, the first benchmark to jointly assess cross-lingual and cross-modal hallucinations in LLMs, and provides comprehensive evaluation results highlighting current model limitations.

Findings

01 Current LLMs struggle with joint cross-lingual and cross-modal hallucinations.

02 CCHall serves as a valuable resource for future LLM evaluation.

03 The evaluation on CCHall covers both mainstream open-source and closed-source LLMs.

Abstract

Investigating hallucination issues in large language models (LLMs) within cross-lingual and cross-modal scenarios can greatly advance their large-scale deployment in real-world applications. Nevertheless, current studies are limited to a single scenario, either cross-lingual or cross-modal, leaving a gap in the exploration of hallucinations in joint cross-lingual and cross-modal scenarios. Motivated by this, we introduce a novel joint Cross-lingual and Cross-modal Hallucinations benchmark (CCHall) to fill this gap. Specifically, CCHall simultaneously incorporates both cross-lingual and cross-modal hallucination scenarios, which can be used to assess the cross-lingual and cross-modal capabilities of LLMs. Furthermore, we conduct a comprehensive evaluation on CCHall, exploring both mainstream open-source and closed-source LLMs. The experimental results highlight that current LLMs…

Code & Models

Repositories

brz911/cchall · Official

Datasets

67L11/CCHall
dataset · 13 dl

Videos

CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models · underline

Taxonomy

Topics: Text Readability and Simplification · Epilepsy research and treatment · Mental Health via Writing