When Data Manipulation Meets Attack Goals: An In-depth Survey of Attacks for VLMs
Aobotao Dai, Xinyu Ma, Lei Chen, Songze Li, Lin Wang

TL;DR
This survey reviews attack strategies against Vision-Language Models (VLMs), categorizes attack methods and defenses, and proposes a taxonomy to guide future research on VLM robustness and security.
Contribution
It provides the first detailed taxonomy and evaluation metrics for VLM attacks, along with a summary of defense mechanisms and future research directions.
Findings
Categorized attack types: jailbreak, camouflage, exploitation
Outlined data manipulation methodologies for VLMs
Summarized evaluation metrics for attack impact
Abstract
Vision-Language Models (VLMs) have gained considerable prominence in recent years due to their remarkable capability to effectively integrate and process both textual and visual information. This integration has significantly enhanced performance across a diverse spectrum of applications, such as scene perception and robotics. However, the deployment of VLMs has also given rise to critical safety and security concerns, necessitating extensive research to assess the potential vulnerabilities these VLM systems may harbor. In this work, we present an in-depth survey of the attack strategies tailored for VLMs. We categorize these attacks based on their underlying objectives - namely jailbreak, camouflage, and exploitation - while also detailing the various methodologies employed for data manipulation of VLMs. Meanwhile, we outline corresponding defense mechanisms that have been proposed to…
Taxonomy
Topics: Imbalanced Data Classification Techniques
