JailWAM: Jailbreaking World Action Models in Robot Control

Hanqing Liu; Songping Wang; Jiahuan Long; Jiacheng Hou; Jialiang Sun; Chao Li; Yang Yang; Wei Peng; Xu Liu; Tingsong Jiang; Wen Yao; Yao Mu

arXiv:2604.05498·cs.RO·April 8, 2026

TL;DR

JailWAM is a framework for evaluating and exploiting vulnerabilities in World Action Models (WAMs) for robot control; it highlights the resulting safety risks and proposes defenses.

Contribution

JailWAM is the first dedicated jailbreak attack and evaluation framework for WAMs, and it includes a benchmark for assessing safety under attack.

Findings

01. JailWAM achieves an 84.2% attack success rate on LingBot-VA.

02. The framework efficiently exposes physical vulnerabilities in WAMs.

03. The paper proposes effective defense mechanisms for safe robot control.

Abstract

The World Action Model (WAM) can jointly predict future world states and actions, and it exhibits stronger physical manipulation capabilities than traditional models. This powerful physical interaction ability is a double-edged sword: if safety is ignored, it directly threatens personal safety, property, and the environment. However, existing research pays very little attention to a critical security gap: the vulnerability of WAM to jailbreak attacks. To fill this gap, we define the Three-Level Safety Classification Framework to systematically quantify the safety of robotic arm motions. Furthermore, we propose JailWAM, the first dedicated jailbreak attack and evaluation framework for WAM, which consists of three core components: (1) Visual-Trajectory Mapping, which unifies heterogeneous action spaces into visual trajectory representations and enables…
