Tone Matters: The Impact of Linguistic Tone on Hallucination in VLMs

Weihao Hong; Zhiyuan Jiang; Bingyu Shen; Xinlei Guan; Yangyi Feng; Meng Xu; Boyang Li

arXiv:2601.06460 · cs.CV · January 13, 2026

TL;DR

This paper examines how different prompt styles influence hallucination behaviors in vision-language models, revealing that increased prompt coercion does not always lead to more hallucinations and highlighting model-specific limitations.

Contribution

Introduces Ghost-100, a procedurally generated synthetic dataset for controlled analysis of absence-based hallucinations, and a structured framework for evaluating how prompt pressure affects VLMs.

Findings

01. Hallucination rates vary non-monotonically with prompt intensity.

02. Models are more sensitive to semantic hostility than structural coercion.

03. Current safety measures are more effective against semantic threats than structural coercion.
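
The first finding can be checked against one's own evaluation logs with a few lines of code. The sketch below is illustrative only: it assumes each model response has been labeled with an integer prompt-intensity level and a boolean hallucination judgment, which may not match the paper's actual scale or scoring. The function names and the toy records are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def hallucination_rate_by_level(records):
    """Return {intensity_level: fraction of responses judged hallucinated}.

    `records` is an iterable of (intensity_level, hallucinated) pairs.
    This record format is an assumption, not the paper's format.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for level, hallucinated in records:
        totals[level] += 1
        hits[level] += int(hallucinated)
    return {level: hits[level] / totals[level] for level in sorted(totals)}


def is_monotonic_non_decreasing(rates):
    """True if the rate never drops as the intensity level increases."""
    values = [rates[level] for level in sorted(rates)]
    return all(a <= b for a, b in zip(values, values[1:]))


# Made-up toy records for illustration; NOT results from the paper.
toy_records = [
    (1, False), (1, False), (1, True), (1, False),
    (2, True), (2, True), (2, False), (2, True),
    (3, True), (3, False), (3, False), (3, True),
]

rates = hallucination_rate_by_level(toy_records)
print(rates)
print("monotonic:", is_monotonic_non_decreasing(rates))
```

In this toy data the rate rises from level 1 to level 2 and then falls at level 3, so the monotonicity check fails. That is the kind of pattern a non-monotonic relationship between prompt intensity and hallucination rate would produce.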

Abstract

Vision-Language Models (VLMs) are increasingly used in safety-critical applications that require reliable visual grounding. However, these models often hallucinate details that are not present in the image to satisfy user prompts. While recent datasets and benchmarks have been introduced to evaluate systematic hallucinations in VLMs, many hallucination behaviors remain insufficiently characterized. In particular, prior work primarily focuses on object presence or absence, leaving it unclear how prompt phrasing and structural constraints can systematically induce hallucinations. In this paper, we investigate how different forms of prompt pressure influence hallucination behavior. We introduce Ghost-100, a procedurally generated dataset of synthetic scenes in which key visual details are deliberately removed, enabling controlled analysis of absence-based hallucinations. Using a structured…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Face Recognition and Perception · Multimodal Machine Learning Applications