ASCD: Attention-Steerable Contrastive Decoding for Reducing Hallucination in MLLM
Yujun Wang, Aniri, Jinhe Bi, Soeren Pirk, Yunpu Ma

TL;DR
This paper introduces ASCD, a decoding method that reduces hallucinations in multimodal large language models (MLLMs) by steering attention scores at inference time. It yields more accurate and faithful multimodal generation and needs no additional training.
Contribution
ASCD is an attention-steering decoding technique that directly manipulates attention scores during inference to mitigate hallucinations in MLLMs. It requires no extra training.
Findings
Reduces hallucination by up to 38.2% across benchmarks.
Improves accuracy on multiple VQA datasets.
Works across five MLLM backbones and three decoding schemes.
Abstract
Multimodal large language models (MLLMs) frequently hallucinate by over-committing to spurious visual cues. Prior remedies, Visual and Instruction Contrastive Decoding (VCD, ICD), mitigate this issue, yet the mechanism remains opaque. We first empirically show that their improvements systematically coincide with redistributions of cross-modal attention. Building on this insight, we propose Attention-Steerable Contrastive Decoding (ASCD), which directly steers the attention scores during decoding. ASCD combines (i) positive steering, which amplifies automatically mined text-centric heads (stable within a model and robust across domains), with (ii) negative steering, which dampens on-the-fly identified critical visual tokens. The method incurs negligible runtime and memory overhead and requires no additional training. Across five MLLM backbones and three decoding schemes, ASCD reduces…
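The two ingredients named in the abstract, steering attention scores and contrasting two decoding branches, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the helper names (steer_scores, contrastive_logits), the steering rule (shifting selected pre-softmax scores by delta times their magnitude), and the use of the VCD-style combination (1 + alpha) * positive - alpha * negative are not taken from the paper, whose exact formulas are not given on this page.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def steer_scores(scores, token_idx, delta):
    """Shift the pre-softmax scores at token_idx by delta * |score|.

    Hypothetical steering rule: delta > 0 amplifies attention to the
    selected tokens, delta < 0 dampens it. Other scores are unchanged.
    """
    steered = list(scores)
    for i in token_idx:
        steered[i] = scores[i] + delta * abs(scores[i])
    return steered


def contrastive_logits(pos_logits, neg_logits, alpha=1.0):
    """VCD-style contrast: (1 + alpha) * positive - alpha * negative."""
    return [(1 + alpha) * p - alpha * n for p, n in zip(pos_logits, neg_logits)]


# Toy attention row for one head: keys 0-1 are text tokens, keys 2-3 are visual tokens.
scores = [1.0, 0.5, 2.0, 1.5]
text_idx, visual_idx = [0, 1], [2, 3]

baseline = softmax(scores)
# Positive steering: boost text tokens (assumed to be applied in text-centric heads).
text_boosted = softmax(steer_scores(scores, text_idx, 0.5))
# Negative steering: dampen the visual tokens flagged as critical.
visual_damped = softmax(steer_scores(scores, visual_idx, -0.5))

# Toy next-token logits from the two branches, combined contrastively.
final_logits = contrastive_logits([2.0, 1.0], [1.0, 1.0], alpha=1.0)
```

In this toy row, positive steering raises the share of attention on the text tokens and negative steering lowers the share on the visual tokens. How ASCD selects heads and tokens, and how it pairs the steered branches, is described in the paper itself rather than here.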
Taxonomy
Topics: Advanced Data Compression Techniques · Algorithms and Data Compression · Topological and Geometric Data Analysis
