Enhancing Targeted Adversarial Attacks on Large Vision-Language Models via Intermediate Projector

Yiming Cao; Yanjie Li; Kaisheng Liang; Bin Xiao

arXiv:2508.13739·cs.CV·September 25, 2025

TL;DR

This paper introduces a novel black-box targeted attack framework on large vision-language models that leverages the projector and fine-grained query outputs to improve attack success rates and granularity, with strong transferability.

Contribution

The paper proposes the Intermediate Projector Guided Attack (IPGA) and the Residual Query Alignment (RQA) module, which exploit the Q-Former to improve attack effectiveness and transferability while preserving non-target content.

Findings

01

IPGA outperforms baselines in global targeted attacks.

02

IPGA-R achieves higher success rates and content preservation in fine-grained attacks.

03

The attack transfers effectively to commercial VLMs such as Google Gemini and OpenAI GPT.

Abstract

The growing deployment of Large Vision-Language Models (VLMs) raises safety concerns, as adversaries may exploit model vulnerabilities to induce harmful outputs, with targeted black-box adversarial attacks posing a particularly severe threat. However, existing methods primarily maximize encoder-level global similarity, which lacks the granularity needed for stealthy and practical fine-grained attacks, where only a specific target should be altered (e.g., modifying a car while preserving its background). Moreover, they largely neglect the projector, a key semantic bridge in VLMs for multimodal alignment. To address these limitations, we propose a novel black-box targeted attack framework that leverages the projector. Specifically, we utilize the widely adopted Querying Transformer (Q-Former), which transforms global image embeddings into fine-grained query outputs, to enhance attack effectiveness…
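The contrast the abstract draws between encoder-level global similarity and fine-grained query outputs can be illustrated with a toy sketch. This is not the paper's objective: the abstract is truncated and does not state the loss, so the function names (`global_similarity`, `query_similarity`, `mean_pool`), the use of cosine similarity at the query level, mean pooling as the global summary, and the 2-dimensional toy vectors are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pool(vectors):
    """Average a list of vectors into one vector (assumed global summary)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def global_similarity(adv_embedding, target_embedding):
    """Encoder-level view: a single score over one pooled embedding."""
    return cosine(adv_embedding, target_embedding)


def query_similarity(adv_queries, target_queries):
    """Query-level view (hypothetical form): mean of per-query cosine scores."""
    sims = [cosine(a, t) for a, t in zip(adv_queries, target_queries)]
    return sum(sims) / len(sims)


# Toy query outputs: two queries, each a 2-dimensional vector.
target_queries = [[1.0, 0.0], [0.0, 1.0]]
adv_queries = [[1.0, 0.0], [1.0, 0.0]]  # first query matches, second does not

per_query = [cosine(a, t) for a, t in zip(adv_queries, target_queries)]
pooled = global_similarity(mean_pool(adv_queries), mean_pool(target_queries))
fine = query_similarity(adv_queries, target_queries)
```

In this toy case the per-query scores expose which query mismatches the target, whereas the pooled score collapses that information into one number. A real attack would optimize an image perturbation against such an objective on a surrogate model, a step this sketch omits.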
