When Alignment Fails: Multimodal Adversarial Attacks on Vision-Language-Action Models
Yuping Yan, Yuhan Xie, Yixin Zhang, Lingjuan Lyu, Handing Wang, Yaochu Jin

TL;DR
This paper investigates the robustness of vision-language-action models against multimodal adversarial attacks, revealing their vulnerability to perturbations that disrupt perception, reasoning, and decision-making in embodied environments.
Contribution
It introduces VLA-Fool, a comprehensive framework for multimodal adversarial attacks, including cross-modal misalignment, and develops a semantically guided prompting method for robustness testing.
Findings
Minor perturbations cause significant behavioral deviations
Multimodal adversarial attacks expose fragility in embodied models
Cross-modal misalignment critically impacts model reasoning
Abstract
Vision-Language-Action models (VLAs) have recently demonstrated remarkable progress in embodied environments, enabling robots to perceive, reason, and act through unified multimodal understanding. Despite their impressive capabilities, the adversarial robustness of these systems remains largely unexplored, especially under realistic multimodal and black-box conditions. Existing studies mainly focus on single-modality perturbations and overlook the cross-modal misalignment that fundamentally affects embodied reasoning and decision-making. In this paper, we introduce VLA-Fool, a comprehensive study of multimodal adversarial robustness in embodied VLA models under both white-box and black-box settings. VLA-Fool unifies three levels of multimodal adversarial attacks: (1) textual perturbations through gradient-based and prompt-based manipulations, (2) visual perturbations via patch and noise…
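To make the "patch and noise" visual perturbations mentioned in the abstract concrete, the following is a minimal sketch and not the VLA-Fool implementation. It assumes a grayscale image stored as nested lists of floats in [0, 1]. The helper names `add_bounded_noise`, `apply_patch`, and `linf_distance` are hypothetical. The noise here is random rather than optimized; a real attack would choose the perturbation direction from gradients or model queries.

```python
import random


def add_bounded_noise(image, epsilon, seed=0):
    """Return a copy of `image` with +/-epsilon noise per pixel, clipped to [0, 1].

    Assumption: random-sign noise stands in for an optimized perturbation.
    """
    rng = random.Random(seed)
    return [
        [min(1.0, max(0.0, px + epsilon * rng.choice((-1.0, 1.0)))) for px in row]
        for row in image
    ]


def apply_patch(image, patch, top, left):
    """Return a copy of `image` with `patch` pasted at row `top`, column `left`."""
    out = [row[:] for row in image]
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            out[top + i][left + j] = value
    return out


def linf_distance(a, b):
    """Largest absolute per-pixel difference between two equally sized images."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical 4x4 mid-gray input image.
image = [[0.5] * 4 for _ in range(4)]
noisy = add_bounded_noise(image, epsilon=8 / 255)
patched = apply_patch(image, [[1.0, 1.0], [1.0, 1.0]], top=1, left=1)
```

The noise perturbation changes every pixel by a small bounded amount, while the patch overwrites a small region with unconstrained values; these are the two common threat models for image-space attacks.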
Taxonomy
Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
