VOPE: Revisiting Hallucination of Vision-Language Models in Voluntary Imagination Task

Xingming Long; Jie Zhang; Shiguang Shan; Xilin Chen

arXiv:2511.13420 · cs.CV · November 18, 2025

TL;DR

This paper introduces VOPE, a new evaluation method for assessing hallucinations in vision-language models during voluntary imagination tasks, revealing high hallucination rates and limited mitigation effectiveness.

Contribution

The paper proposes VOPE, a novel presence evaluation method designed specifically for voluntary imagination tasks in Large Vision-Language Models (LVLMs), and highlights the need for new hallucination mitigation strategies.

Findings

01. Most LVLMs hallucinate heavily during voluntary imagination.

02. Performance in presence evaluation is poor on imagined objects.

03. Existing mitigation methods have limited effect in these tasks.

Abstract

Most research on hallucinations in Large Vision-Language Models (LVLMs) focuses on factual description tasks that prohibit any output absent from the image. However, little attention has been paid to hallucinations in voluntary imagination tasks, e.g., story writing, where the models are expected to generate novel content beyond the given image. In these tasks, it is inappropriate to simply regard such imagined novel content as hallucinations. To address this limitation, we introduce Voluntary-imagined Object Presence Evaluation (VOPE), a novel method to assess LVLMs' hallucinations in voluntary imagination tasks via presence evaluation. Specifically, VOPE poses recheck-based questions to evaluate how an LVLM interprets the presence of the imagined objects in its own response. The consistency between the model's interpretation and the object's presence in the image is then used to…
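The recheck idea in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the question wording, the `ask_model` callable, the `RecheckResult` type, and the `consistency_rate` summary are all hypothetical, and the abstract is truncated before it says how the consistency signal is actually used. The sketch only assumes that each object mentioned in the model's response has a known ground-truth presence label for the image.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RecheckResult:
    """Outcome of one recheck question (hypothetical structure)."""
    obj: str
    model_says_present: bool   # the model's own interpretation
    actually_present: bool     # ground-truth presence in the image

    @property
    def consistent(self) -> bool:
        # Consistent when the model's interpretation matches the image.
        return self.model_says_present == self.actually_present


def recheck_objects(
    objects: List[str],
    ground_truth: Dict[str, bool],
    ask_model: Callable[[str], str],
) -> List[RecheckResult]:
    """Ask the model about each object it mentioned in its own response.

    `ask_model` is a placeholder for a call to an LVLM with the image
    and the earlier response in context; the prompt text is an assumption.
    """
    results = []
    for obj in objects:
        question = (
            f"You mentioned '{obj}' in your response. "
            "Is it actually present in the image? Answer yes or no."
        )
        answer = ask_model(question).strip().lower()
        results.append(
            RecheckResult(obj, answer.startswith("yes"), ground_truth[obj])
        )
    return results


def consistency_rate(results: List[RecheckResult]) -> float:
    """Fraction of consistent rechecks (an illustrative summary only)."""
    if not results:
        raise ValueError("no recheck results to summarize")
    return sum(r.consistent for r in results) / len(results)
```

Under these assumptions, an imagined object that the model correctly reports as absent from the image counts as consistent, while an imagined object the model insists is present counts as inconsistent. How the paper turns this into a hallucination verdict is not recoverable from the truncated abstract.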

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Hallucinations in medical conditions · Face Recognition and Perception