When Alignment Fails: Multimodal Adversarial Attacks on Vision-Language-Action Models

Yuping Yan; Yuhan Xie; Yixin Zhang; Lingjuan Lyu; Handing Wang; Yaochu Jin

arXiv:2511.16203 · cs.CV · December 12, 2025

Open Access

TL;DR

This paper investigates the robustness of vision-language-action models against multimodal adversarial attacks, revealing their vulnerability to perturbations that disrupt perception, reasoning, and decision-making in embodied environments.

Contribution

The paper introduces VLA-Fool, a framework for multimodal adversarial attacks that includes cross-modal misalignment attacks, and develops a semantically guided prompting method for robustness testing.

Findings

01. Minor perturbations cause significant behavioral deviations
02. Multimodal adversarial attacks expose fragility in embodied models
03. Cross-modal misalignment critically impacts model reasoning

Abstract

Vision-Language-Action models (VLAs) have recently demonstrated remarkable progress in embodied environments, enabling robots to perceive, reason, and act through unified multimodal understanding. Despite their impressive capabilities, the adversarial robustness of these systems remains largely unexplored, especially under realistic multimodal and black-box conditions. Existing studies mainly focus on single-modality perturbations and overlook the cross-modal misalignment that fundamentally affects embodied reasoning and decision-making. In this paper, we introduce VLA-Fool, a comprehensive study of multimodal adversarial robustness in embodied VLA models under both white-box and black-box settings. VLA-Fool unifies three levels of multimodal adversarial attacks: (1) textual perturbations through gradient-based and prompt-based manipulations, (2) visual perturbations via patch and noise…

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)