SafeAlign-VLA: A Negative-Enhanced Safe Alignment Framework for Risk-Aware Autonomous Driving

Kefei Tian; Yuansheng Lian; Kai Yang; Xiangdong Chen; Shen Li

arXiv:2605.19524 · cs.RO · May 20, 2026

TL;DR

SafeAlign-VLA introduces a negative data integration framework for risk-aware autonomous driving, enhancing safety and robustness by leveraging counterfactual reasoning and contrastive learning.

Contribution

The paper proposes a negative-enhanced safe alignment framework that incorporates negative samples into training, so that Vision-Language-Action (VLA) models gain a better understanding of safety boundaries.

Findings

01 Achieves 89.1 PDMS on NAVSIM v1, surpassing the baseline by 1.3%.

02 Reduces the collision rate to 3.36% on DeepAccident.

03 Maintains high language and risk prediction accuracy (84.2% and 85.8%, respectively).

Abstract

End-to-end autonomous driving systems excel in common scenarios but struggle with safety-critical long-tail cases. Vision-Language-Action (VLA) models are promising due to their strong reasoning capabilities. However, most VLA-based approaches rely on positive expert demonstrations, rarely exploiting negative samples, leading to insufficient understanding of risky behaviors and safety boundaries. To address this limitation, we propose SafeAlign-VLA, a unified negative-enhanced safe alignment framework that incorporates negative data into supervised learning and reinforcement learning. First, we develop a counterfactual safety pairing paradigm to generate structured safety labels and counterfactual positive trajectories from risky scenarios via counterfactual reasoning. Then, a two-stage training strategy is adopted: negative-enhanced supervised fine-tuning for failure feedback and…
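
To make the pairing idea concrete, here is a minimal sketch of how a risky trajectory and its counterfactual positive might be stored together and used in a contrastive objective. It is an illustration under assumptions, not the paper's implementation: the `SafetyPair` record, its field names, the `mean_l2` distance, and the margin-based loss are all hypothetical, and the abstract does not specify the actual label schema or loss.

```python
from dataclasses import dataclass
from math import hypot

Point = tuple[float, float]  # (x, y) waypoint in the ego frame (assumed layout)


@dataclass
class SafetyPair:
    """Hypothetical record: one risky trajectory plus a safer counterfactual."""
    scene_id: str
    risk_label: str              # structured safety label; label set is assumed
    negative: list[Point]        # risky trajectory from the scenario
    counterfactual: list[Point]  # safer alternative for the same scene


def mean_l2(a: list[Point], b: list[Point]) -> float:
    """Average Euclidean distance between matching waypoints."""
    if not a or len(a) != len(b):
        raise ValueError("trajectories must be non-empty and equally long")
    return sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b)) / len(a)


def margin_loss(pred: list[Point], pair: SafetyPair, margin: float = 1.0) -> float:
    """Assumed hinge-style contrastive loss.

    It is zero once the prediction is at least `margin` closer to the
    counterfactual positive than to the negative trajectory.
    """
    d_pos = mean_l2(pred, pair.counterfactual)
    d_neg = mean_l2(pred, pair.negative)
    return max(0.0, margin + d_pos - d_neg)


pair = SafetyPair(
    scene_id="demo-0",
    risk_label="lane_departure",
    negative=[(0.0, 0.0), (1.0, 2.0)],
    counterfactual=[(0.0, 0.0), (1.0, 0.0)],
)
loss_safe = margin_loss(pair.counterfactual, pair)  # prediction matches the safe path
loss_risky = margin_loss(pair.negative, pair)       # prediction matches the risky path
```

In this toy setup, a prediction that follows the counterfactual incurs no loss, while one that reproduces the risky trajectory is penalized. A real training loop would compute an analogous term on model outputs with an autodiff framework.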
