Dynamic Multimodal Activation Steering for Hallucination Mitigation in Large Vision-Language Models
Jianghao Yin, Qin Chen, Kedi Chen, Jie Zhou, Xingjiao Wu, Liang He

TL;DR
This paper introduces a training-free method called Dynamic Multimodal Activation Steering to reduce hallucinations in large vision-language models by dynamically selecting relevant attention heads based on semantic context.
Contribution
It reveals that truthfulness and visual perception rely on distinct subsets of attention heads, and proposes a novel, context-aware activation steering technique for hallucination mitigation.
Findings
Significant performance improvements over state-of-the-art methods
Effective in diverse models and datasets
Enhances truthfulness and visual perception capabilities
Abstract
Large Vision-Language Models (LVLMs) exhibit outstanding performance on vision-language tasks but struggle with hallucination problems. Through in-depth analysis of LVLM activation patterns, we reveal two key findings: 1) truthfulness and visual perception capabilities predominantly engage different subsets of attention heads within the model architecture; and 2) truthfulness steering vectors vary significantly across different semantic contexts. Based on these observations, we propose Dynamic Multimodal Activation Steering, a training-free approach for hallucination mitigation. Our method constructs a semantic-based truthfulness steering vector database and computes visual perception steering vectors, enabling context-aware interventions during inference by dynamically selecting the most relevant steering vectors based on input semantic similarity and applying them to the most…
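The inference-time flow described in the abstract (look up a truthfulness steering vector by semantic similarity, then shift a small set of attention heads) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the cosine-similarity lookup, the additive update, the per-head scores, and every name here (`select_steering_entry`, `top_k_heads`, `steer`, the toy database layout) are assumptions. The scaling factors `alpha` and `beta` and the head count `k` mirror the α, β, and K hyperparameters mentioned in the reviews, but their exact roles in the paper may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_steering_entry(query_emb, database):
    """Pick the database entry whose key embedding is closest to the query (assumed lookup rule)."""
    return max(database, key=lambda entry: cosine(query_emb, entry["key"]))


def top_k_heads(scores, k):
    """Return the k head ids with the highest (hypothetical) influence scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def steer(head_acts, truth_vecs, visual_vecs, truth_heads, visual_heads, alpha, beta):
    """Additively shift the selected heads; returns a new dict, inputs are left untouched."""
    steered = {h: list(a) for h, a in head_acts.items()}
    for h in truth_heads:
        steered[h] = [x + alpha * d for x, d in zip(steered[h], truth_vecs[h])]
    for h in visual_heads:
        steered[h] = [x + beta * d for x, d in zip(steered[h], visual_vecs[h])]
    return steered


# Toy data: heads are identified by (layer, head) tuples; all values are made up.
database = [
    {"key": [1.0, 0.0], "vectors": {(0, 1): [0.5, -0.5]}},
    {"key": [0.0, 1.0], "vectors": {(0, 1): [-1.0, 1.0]}},
]
query_emb = [0.9, 0.1]                       # semantic embedding of the input
entry = select_steering_entry(query_emb, database)

truth_scores = {(0, 1): 0.9, (2, 3): 0.2}    # hypothetical per-head truthfulness scores
visual_scores = {(0, 1): 0.1, (2, 3): 0.8}   # hypothetical per-head perception scores
truth_heads = top_k_heads(truth_scores, k=1)
visual_heads = top_k_heads(visual_scores, k=1)

head_acts = {(0, 1): [1.0, 1.0], (2, 3): [0.0, 2.0]}
visual_vecs = {(2, 3): [1.0, 0.0]}
steered = steer(head_acts, entry["vectors"], visual_vecs,
                truth_heads, visual_heads, alpha=2.0, beta=0.5)
```

In this toy setup the truthfulness and perception interventions touch different heads, which matches the paper's first finding in spirit; in a real model the activations would come from forward hooks on the attention layers rather than from a dictionary.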
Peer Reviews
Decision: ICLR 2026 Poster
The paper is well-motivated and proposes a clever approach to the hallucination problem in VLMs. The observation that truthfulness steering vectors vary significantly across semantic contexts, and the resulting dynamic, context-aware database approach, are novel within the activation steering literature. In addition, the training-free nature of this work is highly desirable, especially for problems of this kind where efficiency matters.
I am giving a conditional weak reject, and I think the following issue should be carefully addressed by the authors.
- A more comprehensive analysis of the constructed dataset should be provided. One of the major novelties of the method is to pre-construct a set of embeddings and select steering vectors accordingly from this dataset. As such, the properties of the dataset matter a lot, but very limited analysis is given. How different would the performance be if we started from a different dataset? W
- A training-free and context-aware method to curb hallucinations in LVLMs by nudging a small set of attention heads tied to visual grounding and factuality. It helps on both open-ended captioning (e.g., CHAIR) and VQA-style benchmarks (e.g., POPE/MME).
- The setup is easy to follow: datasets and metrics are spelled out, baselines are sensible, and the main knobs (α, β, and the number of intervened heads K) are reported with the ranges they tried.
- Ablations show both components (truthfulness a
- Although training-free, the method isn’t truly plug-and-play across LVLMs: the influential-head masks and truthfulness/visual steering vectors must be recomputed for each new model (with α/β/K re-tuned), so deploying on a different backbone requires non-trivial one-time setup rather than drop-in reuse.
- It’s encouraging to see gains without regressions and signs of cross-domain transfer (Table 5). To make the generality claim more convincing, reporting results against stronger baselines beyon
1. This paper conducts an interesting analysis of attention patterns, revealing which attention heads are most sensitive to truthfulness versus visual perception.
2. This paper proposes an interesting method, Dynamic Multimodal Activation Steering, which uses steering vectors to mitigate hallucination.
3. The paper demonstrates good writing quality and is easy to read.
1. Experiments were conducted on a limited set of backbones; it would be better to include experiments on more recent models.
2. Some important hyperparameters are missing from the paper. For example, the temperature, top-p, and top-k settings are not reported.
3. The paper lacks comparisons with recent decoding strategies, such as DECO [1] and DAMO [2]. As far as I know, they also perform well in hallucination mitigation.
- [1] MLLM can see? Dynamic Correction Decoding for Hallucination Mit
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Multimodal Machine Learning Applications
