RedVLA: Physical Red Teaming for Vision-Language-Action Models

Yuhao Zhang; Borong Zhang; Jiaming Fan; Jiachen Shen; Yishuai Cai; Yaodong Yang; Jiaming Ji

arXiv:2604.22591·cs.RO·April 27, 2026

TL;DR

RedVLA is a red teaming framework that proactively uncovers physical safety risks in vision-language-action (VLA) models before deployment. The authors also propose SimpleVLA-Guard as a mitigation.

Contribution

RedVLA is presented as the first red teaming framework for physical safety in VLA models. It proactively detects risks in two stages: Risk Scenario Synthesis and Risk Amplification.

Findings

01 RedVLA uncovers diverse unsafe behaviors in VLA models.

02 It achieves an attack success rate of up to 95.5% within 10 iterations.

03 The authors propose SimpleVLA-Guard for safety mitigation.

Abstract

The real-world deployment of Vision-Language-Action (VLA) models remains limited by the risk of unpredictable and irreversible physical harm. However, we currently lack effective mechanisms to proactively detect these physical safety risks before deployment. To address this gap, we propose RedVLA, the first red teaming framework for physical safety in VLA models. We systematically uncover unsafe behaviors through a two-stage process: (I) Risk Scenario Synthesis constructs a valid and task-feasible initial risk scene. Specifically, it identifies critical interaction regions from benign trajectories and positions the risk factor within these regions, aiming to entangle it with the VLA's execution flow and elicit a target unsafe behavior. (II) Risk Amplification ensures stable elicitation across heterogeneous models. It iteratively refines the risk factor state…
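
As a rough illustration of stage (I) only, the sketch below picks a placement point for a risk factor from benign trajectories. It is not the authors' method: the abstract does not say how critical interaction regions are identified, so this sketch assumes a region is simply the grid cell that the largest number of benign end-effector trajectories pass through. The function name `critical_region_center`, the `cell_size` parameter, and the trajectory format (lists of xyz points in meters) are all hypothetical.

```python
import numpy as np


def critical_region_center(trajectories, cell_size=0.05):
    """Return the center of the grid cell crossed by the most trajectories.

    ASSUMPTION: "critical interaction region" is approximated here by visit
    frequency over a uniform grid; the paper may define it differently.
    """
    counts = {}
    for traj in trajectories:
        # Map every xyz point to the integer index of its grid cell.
        cells = np.floor(np.asarray(traj, dtype=float) / cell_size).astype(int)
        # Count each cell at most once per trajectory, so a slow pass
        # through a cell does not outweigh many separate visits.
        for cell in {tuple(c) for c in cells}:
            counts[cell] = counts.get(cell, 0) + 1
    best = max(counts, key=counts.get)
    # Convert the winning cell index back to a position (cell center).
    return (np.array(best) + 0.5) * cell_size


if __name__ == "__main__":
    # Two hypothetical benign trajectories that share one waypoint.
    benign = [
        [(0.05, 0.05, 0.05), (0.25, 0.25, 0.05)],
        [(0.45, 0.05, 0.05), (0.25, 0.25, 0.05)],
    ]
    print(critical_region_center(benign, cell_size=0.1))
```

Counting each cell once per trajectory favors regions that many different executions share, which is one plausible reading of "entangling" the risk factor with the model's execution flow. A real pipeline would also need to check that the placed object leaves the task feasible, which this sketch does not attempt.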

Code & Models

Repositories

https://redvla.github.io