Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors

Jiachen Sun, Changsheng Wang, Jiongxiao Wang, Yiwei Zhang, and Chaowei Xiao

arXiv:2405.10529 · cs.CV · August 27, 2024 · 1 citation

TL;DR

This paper introduces SmoothVLM, a smoothing-based defense that reduces the attack success rate of patched visual prompt injections against vision-language models to 0-5%, making the models more robust to adversarial patches.

Contribution

The paper proposes SmoothVLM, a novel smoothing technique that defends VLMs against patched adversarial prompts while still recovering most of the benign image context.
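To make the idea of a smoothing defense concrete, here is a minimal sketch of one generic form: perturb random pixels in several copies of the input, query the model on each copy, and take a majority vote. This is an illustration under stated assumptions, not the paper's algorithm. The names `randomize_pixels`, `smoothed_predict`, and `toy_model`, the perturbation rate, the number of copies, and the voting rule are all hypothetical, and the toy model is a stand-in for a real VLM that only "obeys" an injection when a 2x2 patch is fully intact.

```python
import random
from collections import Counter


def randomize_pixels(image, rate, rng):
    """Return a copy of `image` in which each pixel is replaced by a
    uniformly random value in [0, 255] with probability `rate`."""
    return [
        [rng.randrange(256) if rng.random() < rate else px for px in row]
        for row in image
    ]


def smoothed_predict(model, image, n_copies=25, rate=0.5, seed=0):
    """Query `model` on `n_copies` randomly perturbed copies of `image`
    and return the most common answer (majority vote)."""
    rng = random.Random(seed)
    votes = Counter(
        model(randomize_pixels(image, rate, rng)) for _ in range(n_copies)
    )
    return votes.most_common(1)[0][0]


# Hypothetical adversarial patch: a fixed 2x2 pixel pattern.
PATCH = [[255, 0], [0, 255]]


def toy_model(image):
    """Stand-in for a VLM (assumption): emits the attacker's target text
    only when the patch in the top-left corner is pixel-for-pixel intact."""
    intact = all(
        image[r][c] == PATCH[r][c] for r in range(2) for c in range(2)
    )
    return "INJECTED" if intact else "benign caption"


# Build an 8x8 grayscale image and a patched copy of it.
clean = [[128] * 8 for _ in range(8)]
patched = [row[:] for row in clean]
for r in range(2):
    for c in range(2):
        patched[r][c] = PATCH[r][c]

print(toy_model(patched))                    # undefended model on the patched image
print(smoothed_predict(toy_model, patched))  # majority vote over perturbed copies
```

In this toy setup the patch survives a single perturbed copy only if all four of its pixels are left effectively unchanged, which at a rate of 0.5 happens in well under a tenth of the copies, so the majority vote falls back to the benign answer. A real VLM's sensitivity to perturbation, and the rate needed to preserve image context, would have to be measured rather than assumed.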

Findings

1. Attack success rate reduced to 0-5%
2. Achieves 67.3-95% context recovery
3. Robust against adaptive adversarial attacks

Abstract

Large language models have become increasingly prominent, signaling a shift towards multimodality as the next frontier in artificial intelligence, where their embeddings are harnessed as prompts to generate textual content. Vision-language models (VLMs) stand at the forefront of this advancement, offering innovative ways to combine visual and textual data for enhanced understanding and interaction. However, this integration also enlarges the attack surface. Patch-based adversarial attacks are considered the most realistic threat model in physical vision applications, as demonstrated in much of the existing literature. In this paper, we propose to address patched visual prompt injection, where adversaries exploit adversarial patches to generate target content in VLMs. Our investigation reveals that patched adversarial prompts exhibit sensitivity to pixel-wise randomization, a trait that…

Taxonomy

Topics: Security and Verification in Computing