Sample-agnostic Adversarial Perturbation for Vision-Language Pre-training Models
Haonan Zheng, Wen Jiang, Xinyang Deng, Wenrui Li

TL;DR
This paper introduces a novel universal, sample-agnostic adversarial perturbation method for vision-language pre-training models, demonstrating its effectiveness and transferability across multiple models and datasets.
Contribution
To the authors' knowledge, this is the first work to develop a universal adversarial perturbation for multimodal models by leveraging decision boundaries in the input space.
Findings
Universal perturbation successfully impairs VLP model performance
Method transfers across different models and datasets
Supports creation of global perturbations and adversarial patches
Abstract
Recent studies on AI security have highlighted the vulnerability of Vision-Language Pre-training (VLP) models to subtle yet intentionally designed perturbations in images and texts. Investigating the robustness of multimodal systems via adversarial attacks is therefore crucial in this field. Most multimodal attacks are sample-specific: they generate a unique perturbation for each sample to construct adversarial samples. To the best of our knowledge, this is the first work to use multimodal decision boundaries to create a universal, sample-agnostic perturbation that applies to any image. We first explore strategies to move sample points beyond the decision boundaries of linear classifiers, refining the algorithm to ensure successful attacks under the top-k accuracy metric. Building on this foundation, in visual-language tasks, we treat visual and textual modalities as reciprocal sample…
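The abstract's starting point, moving a sample point beyond the decision boundary of a linear classifier, can be illustrated with a small sketch. This is not the paper's algorithm: it is the standard closed-form minimal L2 step for a multiclass linear classifier, covers only the top-1 case, and is sample-specific rather than universal. The function names, the `overshoot` parameter, and the toy weights are assumptions made for illustration.

```python
import math


def logits(W, b, x):
    """Scores f_k(x) = w_k . x + b_k of a linear classifier."""
    return [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]


def argmax(values):
    return max(range(len(values)), key=lambda k: values[k])


def min_boundary_perturbation(W, b, x, overshoot=0.02):
    """Smallest L2 perturbation that pushes x just past the nearest boundary.

    Hypothetical helper: for predicted class c and rival class j, the boundary
    f_c = f_j lies at distance (f_c - f_j) / ||w_j - w_c|| from x.
    """
    f = logits(W, b, x)
    c = argmax(f)
    best = None
    for j in range(len(W)):
        if j == c:
            continue
        diff = [wj - wc for wj, wc in zip(W[j], W[c])]
        norm_sq = sum(d * d for d in diff)
        if norm_sq == 0.0:
            continue  # identical weight rows: no boundary direction to follow
        gap = f[c] - f[j]
        dist = gap / math.sqrt(norm_sq)
        if best is None or dist < best[0]:
            best = (dist, diff, gap, norm_sq)
    if best is None:
        raise ValueError("no rival class with a distinct weight vector")
    _, diff, gap, norm_sq = best
    # Step along (w_j - w_c); the overshoot moves x slightly beyond the boundary.
    scale = (1.0 + overshoot) * gap / norm_sq
    return [scale * d for d in diff]


# Toy 3-class classifier in 2-D (made-up weights, for illustration only).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x = [2.0, 1.0]
r = min_boundary_perturbation(W, b, x)
x_adv = [xi + ri for xi, ri in zip(x, r)]
```

In this toy case the perturbation moves `x` from class 0 to class 1. A universal perturbation would instead be a single vector, aggregated over many samples, that is applied unchanged to any input.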
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling
