OpenNav: Open-World Navigation with Multimodal Large Language Models
Mingfeng Yuan, Letian Wang, Steven L. Waslander

TL;DR
OpenNav leverages multimodal large language models to enable robots to interpret complex natural language instructions and generate navigation trajectories in open-world environments, integrating semantic understanding with spatial mapping.
Contribution
This work introduces a zero-shot vision-language navigation framework that combines MLLMs with perception models, enabling the execution of diverse open-set instructions in both outdoor and indoor scenarios.
Findings
Effective zero-shot navigation on outdoor datasets
Robust performance on open-set natural language instructions
Successful deployment on a Husky robot in real-world scenes
Abstract
Pre-trained large language models (LLMs) have demonstrated strong common-sense reasoning abilities, making them promising for robotic navigation and planning tasks. However, despite recent progress, bridging the gap between language descriptions and actual robot actions in the open world, beyond merely invoking limited predefined motion primitives, remains an open challenge. In this work, we aim to enable robots to interpret and decompose complex language instructions, ultimately synthesizing a sequence of trajectory points to complete diverse navigation tasks given open-set instructions and open-set objects. We observe that multimodal large language models (MLLMs) exhibit strong cross-modal understanding when processing free-form language instructions, demonstrating robust scene comprehension. More importantly, leveraging their code-generation capability, MLLMs can interact with…
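To make the idea of "synthesizing a sequence of trajectory points" from a decomposed instruction concrete, here is a minimal Python sketch. It is not the authors' method: the function names (`interpolate`, `synthesize_trajectory`), the hand-written sub-goal list, and the 2D object positions are all hypothetical, and straight-line interpolation stands in for whatever planner the paper actually uses. It assumes an MLLM has already reduced the instruction to an ordered list of target object names and that a perception module has supplied their map coordinates.

```python
import math


def interpolate(start, goal, step=0.5):
    """Evenly spaced 2D points from start (exclusive) to goal (inclusive)."""
    dist = math.dist(start, goal)
    n = max(1, math.ceil(dist / step))  # at least one point, so the goal is always emitted
    return [
        (start[0] + (goal[0] - start[0]) * i / n,
         start[1] + (goal[1] - start[1]) * i / n)
        for i in range(1, n + 1)
    ]


def synthesize_trajectory(start, subgoals, object_positions, step=0.5):
    """Chain straight-line segments through each sub-goal's object position.

    subgoals: ordered object names (assumed to come from an MLLM's
              decomposition of the instruction; hand-written here).
    object_positions: name -> (x, y) in the map frame (assumed to come
              from a perception model; hard-coded here).
    """
    trajectory = [start]
    current = start
    for target in subgoals:
        if target not in object_positions:
            raise KeyError(f"object not found in scene: {target!r}")
        goal = object_positions[target]
        trajectory.extend(interpolate(current, goal, step))
        current = goal
    return trajectory


if __name__ == "__main__":
    # Hypothetical decomposition of "go to the bench, then stop at the tree".
    subgoals = ["bench", "tree"]
    positions = {"bench": (2.0, 0.0), "tree": (2.0, 1.0)}
    for point in synthesize_trajectory((0.0, 0.0), subgoals, positions):
        print(point)
```

The sketch deliberately ignores obstacles and map semantics; it only shows the data flow from ordered sub-goals to a list of waypoints.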
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems · Geographic Information Systems Studies
