Beyond Dominant Patches: Spatial Credit Redistribution For Grounded Vision-Language Models

Niamul Hassan Samin; Md Arifur Rahman; Abdullah Ibne Hanif Arean; Juena Ahmed Noshin; and Md Ashikur Rahman

arXiv:2602.22469·cs.CV·March 5, 2026

Open Access

TL;DR

This paper introduces Spatial Credit Redistribution (SCR), a training-free inference method that reduces hallucinations in vision-language models by restoring visual context, improving accuracy without retraining the models.

Contribution

The paper proposes SCR, a training-free inference-time technique that mitigates hallucinations in VLMs by redistributing spatial credit across visual patches. It reports outperforming prior inference-time mitigation methods in both efficiency and effectiveness.

Findings

01 SCR reduces hallucination rates significantly across multiple benchmarks.

02 SCR maintains caption quality with minimal latency overhead.

03 It outperforms existing inference-time hallucination mitigation methods.

Abstract

Vision-Language Models (VLMs) often hallucinate objects that are not present in the input image. We identify a contributing cause of this behavior, which we term spatial credit collapse: in early transformer layers, hidden-state activation concentrates on a small number of visual patches, suppressing surrounding contextual evidence and increasing reliance on language priors. Across seven models we observe a strong correlation between visual attention entropy and hallucination rate (r = -0.65, p < 0.001), suggesting that reduced spatial credit diversity contributes to hallucination. To address this issue we propose Spatial Credit Redistribution (SCR), a training-free inference-time method. SCR uses a lightweight two-pass procedure. A diagnostic pass identifies the top-K high-attention source patches and their spatial neighbors. A redistribution pass then scales each source by 1/lambda…
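
The quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single per-patch score vector as a stand-in for visual attention, patches laid out row-major on a rectangular grid, and a 4-connected neighborhood. The function names (`attention_entropy`, `diagnostic_pass`, `scale_sources`) and the parameters `grid_w`, `k`, and `lam` are hypothetical. Because the abstract is cut off after "scales each source by 1/lambda", the sketch applies only that source scaling and leaves neighbor patches untouched.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (in nats) of non-negative per-patch scores.

    Scores are normalized to sum to 1 first. Lower entropy means the
    scores are concentrated on fewer patches.
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log(p) for p in probs if p > 0)


def diagnostic_pass(weights, grid_w, k):
    """Pick the k highest-scoring patches and their grid neighbors.

    ASSUMPTION: patches are stored row-major on a grid of width grid_w,
    and "spatial neighbors" means the 4-connected neighborhood.
    """
    grid_h = len(weights) // grid_w
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    sources = order[:k]
    source_set = set(sources)
    neighbors = set()
    for idx in sources:
        row, col = divmod(idx, grid_w)
        for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            n_row, n_col = row + d_row, col + d_col
            if 0 <= n_row < grid_h and 0 <= n_col < grid_w:
                n_idx = n_row * grid_w + n_col
                if n_idx not in source_set:
                    neighbors.add(n_idx)
    return sources, sorted(neighbors)


def scale_sources(values, sources, lam):
    """Return a copy of values with each source patch scaled by 1/lam.

    ASSUMPTION: only the source scaling stated in the abstract is applied;
    the treatment of neighbor patches is elided in the source text.
    """
    out = list(values)
    for idx in sources:
        out[idx] = out[idx] / lam
    return out


# Toy 3x3 grid with one dominant patch in the center (index 4).
scores = [0.02, 0.02, 0.02, 0.02, 0.80, 0.02, 0.02, 0.02, 0.04]
sources, neighbors = diagnostic_pass(scores, grid_w=3, k=1)
rescaled = scale_sources(scores, sources, lam=2.0)
print(sources, neighbors)  # [4] [1, 3, 5, 7]
print(attention_entropy(scores) < attention_entropy(rescaled))  # True
```

In this toy case, down-weighting the single dominant patch raises the entropy of the normalized scores, which is the direction the reported negative correlation between entropy and hallucination rate would favor. Whether a real model behaves this way depends on details the truncated abstract does not give.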

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis