Do Not Leave a Gap: Hallucination-Free Object Concealment in Vision-Language Models
Amira Guesmi, Muhammad Shafique

TL;DR
This paper introduces a novel background-consistent object concealment attack on vision-language models that effectively hides objects without causing hallucination, outperforming previous suppression-based methods.
Contribution
The authors propose a new attack method that re-encodes objects to match background semantics, avoiding hallucination and improving concealment effectiveness in vision-language models.
Findings
Reduces hallucination by up to a factor of 3 compared with prior methods.
Preserves up to 86% of non-target objects during concealment.
Effectively conceals objects without creating semantic gaps.
Abstract
Vision-language models (VLMs) have recently shown remarkable capabilities in visual understanding and generation, but remain vulnerable to adversarial manipulations of visual content. Prior object-hiding attacks primarily rely on suppressing or blocking region-specific representations, often creating semantic gaps that inadvertently induce hallucination, where models invent plausible but incorrect objects. In this work, we demonstrate that hallucination arises not from object absence per se, but from semantic discontinuity introduced by such suppression-based attacks. We propose a new class of background-consistent object concealment attacks, which hide target objects by re-encoding their visual representations to be statistically and semantically consistent with surrounding background regions. Crucially, our approach preserves token structure and attention flow, avoiding…
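To make the idea of "statistically consistent with surrounding background regions" concrete, here is a minimal sketch of one possible consistency objective. It is not the authors' method: the abstract does not specify the loss, so this assumes that consistency is measured by matching the per-dimension mean and standard deviation of target-region token features to those of background tokens. The function names (`channel_stats`, `background_consistency_loss`) and the toy feature vectors are hypothetical.

```python
from statistics import fmean, pstdev


def channel_stats(tokens):
    """Per-dimension mean and population std over equal-length feature vectors."""
    dims = list(zip(*tokens))  # transpose: one tuple per feature dimension
    return [fmean(d) for d in dims], [pstdev(d) for d in dims]


def background_consistency_loss(tokens, target_mask):
    """Hypothetical objective: squared gap between target and background statistics.

    tokens: list of feature vectors (one per visual token).
    target_mask: list of bools, True where the token covers the object to hide.
    A value of 0.0 means the target tokens match the background in mean and std.
    """
    target = [t for t, m in zip(tokens, target_mask) if m]
    background = [t for t, m in zip(tokens, target_mask) if not m]
    if not target or not background:
        raise ValueError("need at least one target token and one background token")

    t_mean, t_std = channel_stats(target)
    b_mean, b_std = channel_stats(background)

    mean_gap = sum((a - b) ** 2 for a, b in zip(t_mean, b_mean))
    std_gap = sum((a - b) ** 2 for a, b in zip(t_std, b_std))
    return (mean_gap + std_gap) / len(t_mean)


# Toy example: two background tokens, two target tokens (assumed values).
background = [[0.0, 1.0], [0.2, 0.8]]
mask = [False, False, True, True]

object_like = background + [[5.0, -3.0], [5.2, -3.2]]
concealed = background + [[0.0, 1.0], [0.2, 0.8]]

loss_before = background_consistency_loss(object_like, mask)
loss_after = background_consistency_loss(concealed, mask)
```

In this sketch an attack would perturb the image so that the target tokens drive the loss toward zero while keeping the number and positions of tokens unchanged, in contrast to suppression-based attacks that zero out or block the region. How the paper enforces semantic consistency and preserves attention flow is not recoverable from the truncated abstract.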
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection
