Understanding and Enhancing Encoder-based Adversarial Transferability against Large Vision-Language Models
Xinwei Zhang, Li Bai, Tianwei Zhang, Youqian Zhang, Qingqing Ye, Yingnan Zhao, Ruochen Du, Haibo Hu

TL;DR
This paper systematically studies why encoder-based adversarial attacks transfer poorly across large vision-language models (LVLMs) and proposes SGMA, a semantic-guided attack that improves transferability, exposing security vulnerabilities in these models.
Contribution
The paper presents the first comprehensive analysis of transferability issues in encoder-based attacks on LVLMs and introduces SGMA, a semantic-guided attack that improves attack effectiveness across models.
Findings
Existing attacks have poor transferability across LVLMs.
Two root causes are identified: inconsistent visual grounding and redundant semantic alignment.
SGMA significantly improves attack transferability.
Abstract
Large vision-language models (LVLMs) have achieved impressive success across multimodal tasks, but their reliance on visual inputs exposes them to significant adversarial threats. Existing encoder-based attacks perturb the input image by optimizing solely on the vision encoder, rather than the entire LVLM, offering a computationally efficient alternative to end-to-end optimization. However, their transferability across different LVLM architectures in realistic black-box scenarios remains poorly understood. To address this gap, we present the first systematic study of encoder-based adversarial transferability in LVLMs. Our contributions are threefold. First, through large-scale benchmarking over eight diverse LVLMs, we reveal that existing attacks exhibit severely limited transferability. Second, we perform in-depth analysis, disclosing two root causes that hinder the…
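To make the idea of an encoder-based attack concrete, here is a minimal sketch of a generic L-infinity PGD attack that optimizes only against an encoder. It is not the paper's SGMA method. The "encoder" is a hypothetical toy linear map `W` standing in for a real vision encoder, and the function name, loss, and hyperparameters (`eps`, `step`, `iters`) are illustrative assumptions.

```python
import numpy as np

def encoder_pgd(x, W, eps=8 / 255, step=2 / 255, iters=10, seed=0):
    """Untargeted L_inf PGD against a toy linear 'encoder' f(x) = W @ x.

    Assumption: the attack maximizes ||f(x + delta) - f(x)||^2 subject to
    ||delta||_inf <= eps. A real attack would backpropagate through an
    actual vision encoder instead of using this closed-form gradient.
    """
    rng = np.random.default_rng(seed)
    # Random start: the gradient of the loss is exactly zero at delta = 0.
    delta = rng.uniform(-eps, eps, size=x.shape)
    for _ in range(iters):
        diff = W @ (x + delta) - W @ x          # feature displacement
        grad = 2.0 * W.T @ diff                 # d/d(delta) of ||W @ delta||^2
        delta = delta + step * np.sign(grad)    # signed gradient ascent step
        delta = np.clip(delta, -eps, eps)       # project onto the L_inf ball
        delta = np.clip(x + delta, 0.0, 1.0) - x  # keep pixels in [0, 1]
    return x + delta

# Toy usage: a flattened 64-pixel "image" and a 16-dimensional feature space.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=64)
W = rng.normal(size=(16, 64))
x_adv = encoder_pgd(x, W)
feature_shift = np.linalg.norm(W @ x_adv - W @ x)
```

Only the encoder is queried during optimization, which is what makes this family of attacks cheaper than end-to-end optimization over the full LVLM. Whether the resulting perturbation still works on a different, unseen model is the transferability question the paper studies.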
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
