Revealing Perception and Generation Dynamics in LVLMs: Mitigating Hallucinations via Validated Dominance Correction

Guangtao Lyu; Xinyi Cheng; Chenghao Xu; Qi Liu; Muli Yang; Fen Fang; Huilin Chen; Jiexi Yan; Xu Yang; Cheng Deng

arXiv:2512.18813·cs.CV·December 23, 2025

Open Access

TL;DR

This paper analyzes the internal perception and generation processes of LVLMs, identifies hallucination patterns, and introduces VDC, a correction method that significantly reduces hallucinations in model outputs.

Contribution

It provides a systematic analysis of LVLMs' perception and generation dynamics and proposes VDC, a novel correction strategy to mitigate hallucinations.

Findings

01. VDC reduces hallucinations across multiple models.

02. Perception follows a three-stage GATE process.

03. Generation exhibits a Subdominant Accumulation to Dominant pattern.

Abstract

Large Vision-Language Models (LVLMs) have shown remarkable capabilities, yet hallucinations remain a persistent challenge. This work presents a systematic analysis of the internal evolution of visual perception and token generation in LVLMs, revealing two key patterns. First, perception follows a three-stage GATE process: early layers perform a Global scan, intermediate layers Approach and Tighten on core content, and later layers Explore supplementary regions. Second, generation exhibits an SAD (Subdominant Accumulation to Dominant) pattern, where hallucinated tokens arise from the repeated accumulation of subdominant tokens lacking support from attention (visual perception) or feed-forward network (internal knowledge). Guided by these findings, we devise the VDC (Validated Dominance Correction) strategy, which detects unsupported tokens and replaces them with validated dominant ones…
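
The detect-and-replace idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Candidate` fields, the `threshold` value, and the rule that a token counts as validated when either pathway supports it are all assumptions made for illustration. The abstract does not say how support is measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A candidate next token with hypothetical per-pathway support scores."""
    token: str
    score: float         # final output score for this token (assumed given)
    attn_support: float  # assumed support from attention (visual perception)
    ffn_support: float   # assumed support from the FFN (internal knowledge)


def is_validated(c: Candidate, threshold: float = 0.5) -> bool:
    # Assumption: support from either pathway is enough to validate a token.
    return c.attn_support >= threshold or c.ffn_support >= threshold


def correct_token(candidates: List[Candidate], threshold: float = 0.5) -> str:
    """Return the top-scoring token, unless it is unsupported and a
    validated alternative exists, in which case return the best validated one."""
    top = max(candidates, key=lambda c: c.score)
    if is_validated(top, threshold):
        return top.token
    validated = [c for c in candidates if is_validated(c, threshold)]
    if not validated:
        # Nothing to fall back on: keep the original prediction.
        return top.token
    return max(validated, key=lambda c: c.score).token


# Usage: "dog" scores highest but lacks support from both pathways,
# so the supported candidate "cat" is chosen instead.
step = [
    Candidate("dog", score=0.9, attn_support=0.1, ffn_support=0.2),
    Candidate("cat", score=0.7, attn_support=0.8, ffn_support=0.1),
]
print(correct_token(step))
```

The fallback to the original top token when no candidate is validated is a design choice of this sketch, chosen so that the correction never leaves a decoding step without an output.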

Taxonomy

Topics: Hallucinations in medical conditions · Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications