LiveVLN: Breaking the Stop-and-Go Loop in Vision-Language Navigation

Xiangchen Wang; Weiye Zhu; Teng Wang; TianTian Geng; Zekai Zhang; Zhiyuan Qi; Jinyu Yang; Feng Zheng

arXiv:2604.19536 · cs.RO · April 22, 2026

TL;DR

LiveVLN is a framework that makes vision-language navigation more continuous by overlapping perception with action execution, which reduces idle waiting time during deployment.

Contribution

LiveVLN introduces a training-free method that augments pretrained VLM navigators with multi-step action continuation, enabling continuous online execution.

Findings

01. Reduces average episode waiting time by up to 77.7% in real-world deployments.

02. Shortens wall-clock episode time by 12.6% on StreamVLN and 19.6% on NaVIDA.

03. Preserves benchmark performance while enabling more continuous navigation.

Abstract

Recent navigation systems achieve strong benchmark results, yet real-world deployment often remains visibly stop-and-go. This bottleneck arises because the sense-inference-execution loop is still blocking: after each new observation, the controller must wait for sensing, transmission, and inference before motion can continue. Reducing action-generation cost alone therefore does not remove redundant waiting. To address this issue, we present LiveVLN, a training-free framework for more continuous embodied navigation by augmenting pretrained VLM navigators with multi-step action continuation. Instead of pausing for each full sense-and-inference round, LiveVLN overlaps execution with the processing of newly arrived observations, allowing refreshed future actions to be handed off before the current executable prefix is exhausted. This design keeps actions continuously available during…
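
The difference between a blocking loop and an overlapped one can be illustrated with a small timing model. The sketch below is not LiveVLN's implementation. It assumes a constant per-round latency (sensing, transmission, and inference combined), a constant per-action execution time, and a fixed number of actions per inference round. The names `LoopTiming`, `blocking_idle`, and `overlapped_idle` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class LoopTiming:
    """Hypothetical timing parameters; all values are assumptions."""
    latency_s: float  # sensing + transmission + inference per round
    action_s: float   # time to execute one action
    horizon: int      # actions produced by one inference round


def blocking_idle(t: LoopTiming, rounds: int) -> float:
    """Idle time when the robot stands still during every round's latency."""
    return rounds * t.latency_s


def overlapped_idle(t: LoopTiming, rounds: int) -> float:
    """Idle time when the next round runs while current actions execute.

    The first round has nothing to overlap with, so its latency is paid in
    full. Each later round stalls only if its latency is longer than the
    time needed to execute the actions already in hand.
    """
    if rounds <= 0:
        return 0.0
    exec_window = t.horizon * t.action_s
    stall_per_round = max(0.0, t.latency_s - exec_window)
    return t.latency_s + (rounds - 1) * stall_per_round


timing = LoopTiming(latency_s=1.0, action_s=0.5, horizon=4)
print(blocking_idle(timing, 10), overlapped_idle(timing, 10))  # prints 10.0 1.0
```

In this toy model the overlapped loop stalls only when a round's latency exceeds the execution time of the actions already available, so a longer executable prefix hides more of the latency. Real latencies vary from round to round, which this sketch deliberately ignores.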

Code & Models

Repositories

NIneeeeeem/LiveVLN (GitHub)