AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation
Kai Yang, Zedong Chu, Yingnan Guo, Zhengbo Wang, Shichao Xie, Yanfen Shen, Xiaolong Wu, Xing Li, and Mu Xu

TL;DR
AsyncShield is a plug-and-play framework that enhances cloud-based VLA robot navigation by converting temporal delays into spatial corrections, improving safety and success rates without model fine-tuning.
Contribution
It introduces a deterministic white-box spatial mapping and reinforcement learning-based adaptation to address latency issues in cloud-based VLA navigation.
Findings
Improves navigation success rate in simulation and real-world tests.
Enhances physical safety by effectively avoiding collisions.
Operates as a lightweight, plug-and-play module without model fine-tuning.
Abstract
While Vision-Language-Action (VLA) models have been demonstrated possessing strong zero-shot generalization for robot control, their massive parameter sizes typically necessitate cloud-based deployment. However, cloud deployment introduces network jitter and inference latency, which can induce severe spatiotemporal misalignment in mobile navigation under continuous displacement, so that the stale intents expressed in past ego frames may become spatially incorrect in the current frame and lead to collisions. To address this issue, we propose AsyncShield, a plug-and-play asynchronous control framework. AsyncShield discards traditional black-box time-series prediction in favor of a deterministic physical white-box spatial mapping. By maintaining a temporal pose buffer and utilizing kinematic transformations, the system accurately converts temporal lag into spatial pose offsets to restore…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
