Safe-Night VLA: Seeing the Unseen via Thermal-Perceptive Vision-Language-Action Models for Safety-Critical Manipulation
Dian Yu, Qingchuan Zhou, Bingkun Huang, Majid Khadiv, Zewen Yang

TL;DR
Safe-Night VLA introduces a multimodal manipulation framework that integrates thermal perception with vision-language models and safety constraints, enabling robots to perceive unseen thermal signals and operate safely in unstructured environments.
Contribution
Safe-Night VLA is presented as the first framework to incorporate thermal perception into vision-language-action models for robotic manipulation with explicit safety constraints.
Findings
Outperforms RGB-only baselines in thermal-aware tasks
Enables temperature-conditioned manipulation and subsurface target localization
Maintains safety through control barrier functions during execution
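The last finding can be illustrated with a minimal sketch of a control-barrier-function (CBF) safety filter. This is a generic textbook construction, not the paper's implementation: it assumes a single-integrator model (the command `u` is directly the end-effector velocity), a single spherical keep-out region, and a closed-form solution of the one-constraint quadratic program. The names `cbf_filter`, `center`, `radius`, and `alpha` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def cbf_filter(
    x: Sequence[float],
    u_nom: Sequence[float],
    center: Sequence[float],
    radius: float,
    alpha: float = 1.0,
) -> List[float]:
    """Minimally modify u_nom so the state stays outside a sphere.

    Assumed model (not from the paper): x_dot = u.
    Barrier function: h(x) = ||x - center||^2 - radius^2, safe when h >= 0.
    CBF condition: grad_h(x) . u + alpha * h(x) >= 0.
    """
    diff = [xi - ci for xi, ci in zip(x, center)]
    h = sum(d * d for d in diff) - radius ** 2
    grad = [2.0 * d for d in diff]

    # Value of the CBF constraint under the nominal command.
    slack = sum(g * u for g, u in zip(grad, u_nom)) + alpha * h
    if slack >= 0.0:
        # Nominal command already satisfies the constraint.
        return list(u_nom)

    grad_sq = sum(g * g for g in grad)
    if grad_sq == 0.0:
        # Degenerate case (state exactly at the sphere centre): stop.
        return [0.0] * len(u_nom)

    # Closed-form solution of: min ||u - u_nom||^2  s.t.  grad . u + alpha*h >= 0
    scale = slack / grad_sq
    return [u - scale * g for u, g in zip(u_nom, grad)]


if __name__ == "__main__":
    # A policy command heading straight at the obstacle gets slowed down.
    print(cbf_filter(x=[2.0, 0.0], u_nom=[-5.0, 0.0], center=[0.0, 0.0], radius=1.0))
```

In this sketch the learned policy would supply `u_nom` and the filter would change it only when the barrier condition is violated, so nominal behavior is preserved away from obstacles. A real manipulator would need the actual robot dynamics and typically a QP solver handling several constraints at once.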
Abstract
Current Vision-Language-Action (VLA) models rely primarily on RGB perception, preventing them from capturing modalities such as thermal signals that are imperceptible to conventional visual sensors. Moreover, end-to-end generative policies lack explicit safety constraints, making them fragile when encountering obstacles and novel scenarios outside the training distribution. To address these limitations, we propose Safe-Night VLA, a multimodal manipulation framework that enables robots to see the unseen while enforcing rigorous safety constraints for thermal-aware manipulation in unstructured environments. Specifically, Safe-Night VLA integrates long-wave infrared thermal perception into a pre-trained vision-language backbone, enabling semantic reasoning grounded in thermodynamic properties. To ensure safe execution under out-of-distribution conditions, we incorporate a safety filter via…
Taxonomy
Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Adversarial Robustness in Machine Learning
