Suppressing VLM Hallucinations with Spectral Representation Filtering

Ameen Ali; Tamim Zoabi; Lior Wolf

arXiv:2511.12220·cs.CV·November 18, 2025

Open Access

TL;DR

This paper presents Spectral Representation Filtering (SRF), a post-hoc, training-free technique that reduces hallucinations in vision-language models by analyzing and correcting their feature covariance structure, improving faithfulness across multiple benchmarks.

Contribution

Introduces SRF, a novel spectral filtering method that suppresses hallucinations in VLMs without retraining or architectural changes, operating entirely post-hoc.

Findings

01. SRF significantly reduces hallucination rates in VLMs.

02. SRF achieves state-of-the-art faithfulness on multiple benchmarks.

03. SRF maintains caption quality while suppressing hallucinations.

Abstract

Vision-language models (VLMs) frequently hallucinate, describing objects, attributes, or relations that do not exist in the image, because of over-reliance on language priors and imprecise cross-modal grounding. We introduce Spectral Representation Filtering (SRF), a lightweight, training-free method that suppresses such hallucinations by analyzing and correcting the covariance structure of the model's representations. SRF identifies low-rank hallucination modes through eigendecomposition of the covariance of the differences between features collected for truthful and hallucinatory captions, revealing structured biases in the feature space. A soft spectral filter then attenuates these modes in the feed-forward projection weights of deeper VLM layers, equalizing feature variance while preserving semantic fidelity. Unlike decoding or retraining-based approaches, SRF…
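The two steps named in the abstract (eigendecomposition of the covariance of feature differences, then a soft filter applied to projection weights) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `alpha` strength parameter, and the gain curve `1 / (1 + alpha * eigenvalue / max_eigenvalue)` are all assumptions made for illustration, and the abstract does not specify the exact filter, which layers are edited, or how features are collected.

```python
import numpy as np


def hallucination_modes(truthful, hallucinated):
    """Return eigenvalues (descending) and eigenvectors of the covariance
    of paired feature differences (hallucinated minus truthful)."""
    diffs = np.asarray(hallucinated, dtype=float) - np.asarray(truthful, dtype=float)
    diffs = diffs - diffs.mean(axis=0, keepdims=True)  # center before covariance
    cov = diffs.T @ diffs / max(len(diffs) - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return np.clip(eigvals[order], 0.0, None), eigvecs[:, order]


def soft_spectral_filter(weight, eigvals, eigvecs, alpha=1.0):
    """Damp the output directions of `weight` (shape d_out x d_in) along
    high-variance difference modes. The gain curve is an assumed choice,
    not the one from the paper."""
    scale = max(float(eigvals.max()), 1e-12)  # avoid division by zero
    gains = 1.0 / (1.0 + alpha * eigvals / scale)  # in (0, 1], smallest for the top mode
    filt = (eigvecs * gains) @ eigvecs.T  # V diag(gains) V^T
    return filt @ weight
```

In this sketch, `truthful` and `hallucinated` are hypothetical arrays of shape `(n_samples, d_out)` holding paired features, and `weight` stands in for one feed-forward projection matrix whose outputs live in the same `d_out`-dimensional space. Because the filter is folded into the weights once, it adds no cost at inference time, which matches the post-hoc, training-free framing above.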

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis