A Deployable Embodied Vision-Language Navigation System with Hierarchical Cognition and Context-Aware Exploration

Kuan Xu; Ruimeng Liu; Yizhuo Yang; Denan Liang; Tongxing Jin; Shenghai Yuan; Chen Wang; Lihua Xie

arXiv:2604.21363·cs.RO·May 19, 2026

TL;DR

This paper introduces a deployable embodied vision-language navigation (VLN) system that balances high-level reasoning with efficiency, using hierarchical cognition and context-aware exploration for real-world robotic navigation.

Contribution

The authors propose a novel hierarchical VLN system with asynchronous layers and a shared memory, enabling efficient long-horizon reasoning and real-time deployment on resource-limited robots.

Findings

01. Achieves higher navigation success rates than existing VLN methods.

02. Maintains real-time performance on resource-constrained hardware.

03. Demonstrates effectiveness in both simulation and real-world environments.

Abstract

Bridging the gap between embodied intelligence and embedded deployment remains a key challenge in intelligent robotic systems, where perception, reasoning, and planning must operate under strict constraints on computation, memory, energy, and real-time execution. In vision-and-language navigation (VLN), existing approaches often face a trade-off between reasoning capability and deployment efficiency on real-world platforms. In this paper, we present a deployable embodied VLN system that achieves both high efficiency and strong high-level reasoning on real-world robots. The system is decomposed into a fast perception-action layer and a deep reasoning layer running asynchronously at different time scales, with a shared memory layer enabling efficient interaction between them. To support long-horizon reasoning, we incrementally construct a compact memory graph and progressively feed…
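
The two-rate design described in the abstract (a fast perception-action layer and a slower reasoning layer that interact only through shared memory) can be illustrated with a small sketch. This is not the authors' implementation: `SharedMemory`, `fast_layer`, `slow_layer`, the string subgoals, and the loop periods are all hypothetical names and values chosen for illustration, and Python's `asyncio` stands in for whatever concurrency mechanism the real system uses.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SharedMemory:
    """Hypothetical blackboard that both layers read and write."""
    observations: list = field(default_factory=list)  # written by the fast layer
    subgoal: str = "explore"                          # written by the slow layer
    plans_made: int = 0


async def fast_layer(mem: SharedMemory, steps: int, period: float) -> list:
    """Perception-action loop: acts every `period` seconds without waiting on the reasoner."""
    actions = []
    for t in range(steps):
        mem.observations.append(f"obs_{t}")  # stand-in for perception
        actions.append((t, mem.subgoal))     # act on the latest available subgoal
        await asyncio.sleep(period)
    return actions


async def slow_layer(mem: SharedMemory, period: float) -> None:
    """Reasoning loop: updates the subgoal at a coarser time scale."""
    while True:
        seen = len(mem.observations)         # snapshot of memory at planning time
        await asyncio.sleep(period)          # stand-in for slow model inference
        mem.subgoal = f"goal_after_{seen}_obs"
        mem.plans_made += 1


async def run(steps: int = 50, fast_period: float = 0.002, slow_period: float = 0.02):
    """Run both layers concurrently; stop the reasoner once the fast loop finishes."""
    mem = SharedMemory()
    reasoner = asyncio.create_task(slow_layer(mem, slow_period))
    actions = await fast_layer(mem, steps, fast_period)
    reasoner.cancel()
    try:
        await reasoner
    except asyncio.CancelledError:
        pass
    return mem, actions
```

Calling `asyncio.run(run())` returns the final memory and the list of `(step, subgoal)` pairs. In this sketch the fast loop keeps acting on the most recent subgoal while the reasoner is still working, so the slow layer's latency never blocks control, which is the property the abstract attributes to the asynchronous decomposition.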

Code & Models

Repositories

xukuanHIT/HiCo-Nav (GitHub)
