Your Vision-Language-Action Model Already Has Attention Heads For Path Deviation Detection
Jaehwan Jeong, Evelyn Zhu, Jinying Lin, Emmanuel Jaimes, Tuan-Anh Vu, Jungseock Joo, Sangpil Kim, M. Khalid Jawed

TL;DR
This paper reveals that specific attention heads within a frozen vision-language-action model can be used to detect navigation path deviations in real time, enabling a training-free anomaly detection and recovery system for robots.
Contribution
It introduces a training-free method that detects navigation hallucinations by monitoring a few attention heads in a frozen model, improving robustness without adding computational overhead.
Findings
A combination of three attention heads detects 44.6% of deviations
Detection has a false-positive rate of 11.7%
The system is successfully deployed on a physical robot
Abstract
Vision-Language-Action (VLA) models have shown strong potential for predicting semantic actions in navigation tasks, demonstrating the ability to reason over complex linguistic instructions and visual contexts. However, they are fundamentally hindered by visual-reasoning hallucinations that lead to trajectory deviations. Addressing this issue has conventionally required training external critic modules or relying on complex uncertainty heuristics. In this work, we discover that monitoring a few attention heads within a frozen VLA model can accurately detect path deviations without incurring additional computational overhead. We refer to these heads, which inherently capture the spatiotemporal causality between historical visual sequences and linguistic instructions, as Navigation Heads. Using these heads, we propose an intuitive, training-free anomaly-detection framework that…
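The idea of monitoring a few attention heads and reporting a detection rate alongside a false-positive rate can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the scoring rule (share of a head's attention mass on history-frame tokens), the assumption that a low score signals a deviation, the threshold, and the voting rule for combining heads are all hypothetical choices made for illustration; the excerpt does not specify any of them.

```python
from typing import Sequence, Tuple


def head_score(attn_row: Sequence[float], history_idx: Sequence[int]) -> float:
    """Share of one head's attention mass that falls on history-frame tokens.

    ASSUMPTION: this scoring rule is illustrative, not taken from the paper.
    """
    total = sum(attn_row)
    if total <= 0:
        return 0.0
    return sum(attn_row[i] for i in history_idx) / total


def deviation_flag(scores: Sequence[float], threshold: float, min_votes: int = 1) -> bool:
    """Flag a deviation when at least `min_votes` monitored heads score below `threshold`.

    ASSUMPTION: both the direction (low score means deviation) and the voting
    rule are hypothetical.
    """
    votes = sum(1 for s in scores if s < threshold)
    return votes >= min_votes


def detection_and_fp_rate(flags: Sequence[bool], labels: Sequence[bool]) -> Tuple[float, float]:
    """Return (detection rate, false-positive rate) for per-step flags.

    `labels[i]` is True when step i is a real deviation.
    """
    tp = sum(1 for f, y in zip(flags, labels) if f and y)
    fp = sum(1 for f, y in zip(flags, labels) if f and not y)
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    return (tp / pos if pos else 0.0, fp / neg if neg else 0.0)
```

Because the scores come from attention maps the model already computes during its forward pass, a check of this kind adds only a few comparisons per step, which is consistent with the abstract's claim of no additional computational overhead. Metrics such as the 44.6% detection rate and 11.7% false-positive rate listed under Findings are the kind of quantities `detection_and_fp_rate` returns, though the paper's exact evaluation protocol is not given in this excerpt.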
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
