Transfer Attack for Bad and Good: Explain and Boost Adversarial Transferability across Multimodal Large Language Models
Hao Cheng, Erjia Xiao, Jiayan Yang, Jinhao Duan, Yichi Wang, Jiahang Cao, Qiang Zhang, Le Yang, Kaidi Xu, Jindong Gu, Renjing Xu

TL;DR
This paper analyzes adversarial transferability in Multimodal Large Language Models, identifies key influencing factors, and proposes data augmentation methods to enhance transferability, with implications for both harmful and protective societal applications.
Contribution
It provides a detailed analysis of adversarial transferability among MLLMs, identifies key factors affecting transferability, and introduces two semantic-level data augmentation techniques to improve it.
Findings
Transferability exists in cross-LLM scenarios with the same vision encoder.
Two key factors influence adversarial transferability.
Proposed methods boost transferability across MLLMs.
Abstract
Multimodal Large Language Models (MLLMs) demonstrate exceptional performance in cross-modality interaction, yet they also suffer from adversarial vulnerabilities. In particular, the transferability of adversarial examples remains an ongoing challenge. In this paper, we specifically analyze the manifestation of adversarial transferability among MLLMs and identify the key factors that influence this characteristic. We discover that the transferability of MLLMs exists in cross-LLM scenarios with the same vision encoder and identify two key factors that may influence transferability. We provide two semantic-level data augmentation methods, Adding Image Patch (AIP) and Typography Augment Transferability Method (TATM), which boost the transferability of adversarial examples across MLLMs. To explore the potential impact in the real world, we utilize two tasks that can have both…
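The abstract names Adding Image Patch (AIP) but does not describe how it works. As a rough illustration of what a patch-based image augmentation could look like in general, the sketch below pastes a small patch into an image at a random location and produces several such views. It is not the authors' implementation: the function names `paste_patch` and `augment_batch`, the nested-list image representation, and the uniform random placement are all assumptions made for this example.

```python
import random


def paste_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at (top, left).

    Hypothetical helper: `image` and `patch` are row-major lists of rows,
    and each pixel can be any value (for example an int or an RGB tuple).
    """
    h, w = len(image), len(image[0])
    ph, pw = len(patch), len(patch[0])
    if top < 0 or left < 0 or top + ph > h or left + pw > w:
        raise ValueError("patch does not fit inside the image")
    out = [row[:] for row in image]  # copy rows so the input is untouched
    for r in range(ph):
        out[top + r][left:left + pw] = patch[r]
    return out


def augment_batch(image, patches, n, seed=0):
    """Build `n` augmented views, each with one randomly placed patch.

    Assumption: placement is uniform over all positions where the patch
    fits; the paper may choose patches and positions differently.
    """
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    views = []
    for _ in range(n):
        patch = rng.choice(patches)
        top = rng.randint(0, h - len(patch))
        left = rng.randint(0, w - len(patch[0]))
        views.append(paste_patch(image, patch, top, left))
    return views
```

In transfer attacks generally, augmented views like these are often used by averaging the attack loss over them so the perturbation does not overfit to a single input; whether AIP or TATM is applied this way is not stated in the abstract.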
Taxonomy
Topics: Natural Language Processing Techniques · Translation Studies and Practices · Computational and Text Analysis Methods
