SysNav: Multi-Level Systematic Cooperation Enables Real-World, Cross-Embodiment Object Navigation
Haokun Zhu, Zongtai Li, Zihan Liu, Kevin Guo, Zhengzhi Lin, Yuxin Cai, Guofei Chen, Chen Lv, Wenshan Wang, Jean Oh, Ji Zhang

TL;DR
SysNav is a multi-level system that integrates vision-language models with hierarchical planning to enable reliable, efficient, and cross-embodiment object navigation in complex real-world environments.
Contribution
The paper introduces SysNav, a novel three-level system that decouples semantic reasoning, navigation, and motion control for real-world cross-embodiment object navigation.
Findings
Achieves high success rate and efficiency in real-world experiments across three embodiments.
Demonstrates state-of-the-art performance on four simulation benchmarks.
First system capable of building-scale long-range object navigation in complex environments.
Abstract
Object navigation (ObjectNav) in real-world environments is a complex problem that requires simultaneously addressing multiple challenges, including complex spatial structure, long-horizon planning and semantic understanding. Recent advances in Vision-Language Models (VLMs) offer promising capabilities for semantic understanding, yet effectively integrating them into real-world navigation systems remains a non-trivial challenge. In this work, we formulate real-world ObjectNav as a system-level problem and introduce SysNav, a three-level ObjectNav system designed for real-world crossembodiment deployment. SysNav decouples semantic reasoning, navigation planning and motion control to ensure robustness and generalizability. At the high-level, we summarize the environment into a structured scene representation and leverage VLMs to provide semantic-grounded navigation guidance. At the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms
