Uncovering Linguistic Fragility in Vision-Language-Action Models via Diversity-Aware Red Teaming

Baoshun Tong; Haoran He; Ling Pan; Yang Liu; Liang Lin

arXiv:2604.05595·cs.RO·April 8, 2026

TL;DR

This paper introduces DAERT, a diversity-aware red teaming framework that uncovers linguistic vulnerabilities in vision-language-action (VLA) models. It reduces task success rate from 93.33% to 5.85% and reveals safety blind spots.

Contribution

We propose a novel diversity-aware red teaming method that generates a wide range of challenging instructions to expose vulnerabilities in VLA models.

Findings

01

DAERT reduces task success rate from 93.33% to 5.85%.

02

It discovers a broader set of adversarial instructions.

03

It demonstrates effectiveness across multiple robotic benchmarks.

Abstract

Vision-Language-Action (VLA) models have achieved remarkable success in robotic manipulation. However, their robustness to linguistic nuances remains a critical, under-explored concern that poses a significant safety risk to real-world deployment. Red teaming, or identifying environmental scenarios that elicit catastrophic behaviors, is an important step in ensuring the safe deployment of embodied AI agents. Reinforcement learning (RL) has emerged as a promising approach to automated red teaming that aims to uncover these vulnerabilities. However, standard RL-based adversaries often suffer from severe mode collapse due to their reward-maximizing nature: they tend to converge to a narrow set of trivial or repetitive failure patterns and fail to reveal the comprehensive landscape of meaningful risks. To bridge this gap, we propose a novel Diversity-Aware…
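
The mode collapse described in the abstract can be made concrete with a simple diversity score over a set of generated instructions. The sketch below is an illustration only and is not DAERT's metric or training objective: the function names, the token-level Jaccard distance, and the sample instructions are all assumptions chosen for the example.

```python
from itertools import combinations


def token_set(text):
    """Lowercase whitespace tokens of an instruction (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over token sets; 0.0 means identical token sets."""
    sa, sb = token_set(a), token_set(b)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def mean_pairwise_distance(instructions):
    """Average Jaccard distance over all unordered pairs; 0.0 for fewer than two items."""
    pairs = list(combinations(instructions, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical outputs of a collapsed adversary: the same attack repeated.
collapsed = [
    "grab the red block immediately",
    "grab the red block immediately",
    "grab the red block immediately",
]

# Hypothetical outputs of a more diverse adversary: varied rephrasings.
diverse = [
    "grab the red block immediately",
    "could you lift that crimson cube",
    "take the scarlet brick off the table",
]

collapsed_score = mean_pairwise_distance(collapsed)
diverse_score = mean_pairwise_distance(diverse)
print(f"collapsed: {collapsed_score:.3f}, diverse: {diverse_score:.3f}")
```

A score near zero signals that an adversary keeps rediscovering the same failure pattern, even if each individual attack succeeds. Token overlap is only a rough proxy; an embedding-based distance would capture paraphrases better, at the cost of an extra model dependency.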
