HTDC: Hesitation-Triggered Differential Calibration for Mitigating Hallucination in Large Vision-Language Models

Xinyun Liu

arXiv:2604.12115 · cs.CV · April 15, 2026

TL;DR

HTDC is a training-free decoding method that detects hesitation signals in large vision-language models to selectively calibrate and reduce hallucinations without disrupting stable predictions.

Contribution

The paper introduces Hesitation-Triggered Differential Calibration (HTDC), a training-free decoding framework that keeps standard full-branch inference and activates calibration only at hesitation-prone decoding steps to mitigate hallucinations in LVLMs.

Findings

01. HTDC reduces hallucinations across benchmarks.

02. It maintains high task accuracy while lowering computational costs.

03. Selective calibration improves model reliability.

Abstract

Large vision-language models (LVLMs) achieve strong multimodal performance, but still suffer from hallucinations caused by unstable visual grounding and over-reliance on language priors. Existing training-free decoding methods typically apply calibration at every decoding step, introducing unnecessary computation and potentially disrupting stable predictions. We address this problem by identifying layer-wise hesitation, a simple signal of grounding instability reflected by fluctuations in token preference across intermediate layers. Based on this observation, we propose Hesitation-Triggered Differential Calibration (HTDC), a training-free decoding framework that preserves standard full-branch inference and activates calibration only at hesitation-prone steps. When triggered, HTDC contrasts the full branch with two lightweight probes, a visual-nullification probe and a…
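
The trigger-then-calibrate idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's formulation: hesitation is measured as the number of times the top-1 token changes between consecutive intermediate layers, the names `hesitation_score`, `calibrate_step`, `flip_threshold`, and `alpha` are hypothetical, and the calibration uses a generic contrastive form against a single visual-nullification probe. The second probe is not shown because the abstract is truncated before it is named.

```python
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def hesitation_score(layer_logits: Sequence[Sequence[float]]) -> int:
    """Count top-1 token changes between consecutive intermediate layers.

    Assumption: this flip count stands in for the paper's hesitation signal.
    """
    top_tokens = [argmax(logits) for logits in layer_logits]
    return sum(1 for a, b in zip(top_tokens, top_tokens[1:]) if a != b)


def calibrate_step(
    full_logits: Sequence[float],
    probe_logits: Sequence[float],
    layer_logits: Sequence[Sequence[float]],
    flip_threshold: int = 2,  # hypothetical trigger threshold
    alpha: float = 1.0,       # hypothetical contrast strength
) -> List[float]:
    """Return logits for one decoding step.

    Stable steps keep the full-branch logits unchanged. Hesitant steps apply
    a generic contrastive adjustment against the visual-nullification probe
    (an assumed form, not the paper's differential calibration).
    """
    if hesitation_score(layer_logits) < flip_threshold:
        return list(full_logits)
    return [(1 + alpha) * f - alpha * p for f, p in zip(full_logits, probe_logits)]


if __name__ == "__main__":
    # Toy 3-token vocabulary; each inner list is one layer's logits.
    stable_layers = [[0.1, 2.0, 0.3], [0.2, 2.5, 0.1], [0.0, 3.0, 0.2]]
    hesitant_layers = [[2.0, 1.0, 0.0], [0.5, 1.5, 0.0], [1.8, 1.2, 0.0], [1.0, 1.1, 0.0]]
    full = [1.0, 1.1, 0.0]    # full-branch logits at this step
    probe = [0.2, 1.0, 0.0]   # logits with visual input nullified

    print(hesitation_score(stable_layers), hesitation_score(hesitant_layers))
    print(argmax(calibrate_step(full, probe, stable_layers)))
    print(argmax(calibrate_step(full, probe, hesitant_layers)))
```

In this toy setup the stable step passes through untouched, while the hesitant step is recalibrated toward the token whose score depends more on the visual input. The design point the sketch tries to capture is that the extra probe comparison is only consulted on steps flagged as hesitant.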
