A Two-Stage Globally-Diverse Adversarial Attack for Vision-Language Pre-training Models
Wutao Chen, Huaqin Zou, Chen Wan, Lifeng Huang

TL;DR
This paper introduces 2S-GDA, a novel two-stage adversarial attack framework that enhances diversity and success rates in attacking vision-language pre-training models, especially in black-box scenarios.
Contribution
The paper presents a modular two-stage attack method combining textual and visual perturbations with globally-diverse strategies, improving attack success rates over existing methods.
Findings
Achieves attack success rates up to 11.17% higher than state-of-the-art methods in black-box settings.
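Enhances perturbation diversity on the text side through candidate text expansion with globally-aware replacement, and on the image side through multi-scale resizing and block-shuffle rotation.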
Consistently improves attack success rates over state-of-the-art adversarial attack methods on VLP models.
Abstract
Vision-language pre-training (VLP) models are vulnerable to adversarial examples, particularly in black-box scenarios. Existing multimodal attacks often suffer from limited perturbation diversity and unstable multi-stage pipelines. To address these challenges, we propose 2S-GDA, a two-stage globally-diverse attack framework. The proposed method first introduces textual perturbations through a globally-diverse strategy by combining candidate text expansion with globally-aware replacement. To enhance visual diversity, image-level perturbations are generated using multi-scale resizing and block-shuffle rotation. Extensive experiments on VLP models demonstrate that 2S-GDA consistently improves attack success rates over state-of-the-art methods, with gains of up to 11.17% in black-box settings. Our framework is modular and can be easily combined with existing methods to further enhance…
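The abstract names multi-scale resizing and block-shuffle rotation as the image-side sources of diversity but does not give their details. The following is a minimal NumPy sketch of what such transforms could look like, not the authors' implementation. The function names, the scale set (0.75, 1.0, 1.25), the 2x2 grid, nearest-neighbour interpolation, and rotations restricted to multiples of 90 degrees are all assumptions made for illustration.

```python
import numpy as np


def resize_nearest(img, scale):
    """Nearest-neighbour resize of an H x W x C array by a scale factor (assumed interpolation)."""
    h, w = img.shape[:2]
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    # Map each output row/column back to its nearest source index.
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[rows][:, cols]


def block_shuffle_rotate(img, grid, rng):
    """Split the image into grid x grid square blocks, permute them,
    and rotate each block by a random multiple of 90 degrees."""
    # Crop to a square whose side is divisible by `grid`, so every block
    # is square and rot90 keeps its shape.
    side = (min(img.shape[:2]) // grid) * grid
    img = img[:side, :side]
    b = side // grid
    blocks = [img[r * b:(r + 1) * b, c * b:(c + 1) * b]
              for r in range(grid) for c in range(grid)]
    order = rng.permutation(len(blocks))
    out = np.empty_like(img)
    for dst, src in enumerate(order):
        r, c = divmod(dst, grid)
        k = int(rng.integers(4))  # number of quarter turns
        out[r * b:(r + 1) * b, c * b:(c + 1) * b] = np.rot90(blocks[src], k=k)
    return out


def diverse_views(img, scales=(0.75, 1.0, 1.25), grid=2, seed=0):
    """Produce one shuffled-and-rotated view per scale (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return [block_shuffle_rotate(resize_nearest(img, s), grid, rng) for s in scales]


if __name__ == "__main__":
    image = np.random.default_rng(1).random((224, 224, 3), dtype=np.float32)
    for view in diverse_views(image):
        print(view.shape)
```

The sketch only shows how a set of transformed views might be generated from one image. How 2S-GDA uses such views when computing perturbations is not described in the abstract and is left out here.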
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Hate Speech and Cyberbullying Detection
