GNM: A General Navigation Model to Drive Any Robot
Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine

TL;DR
This paper introduces GNM, a general goal-conditioned vision-based navigation model trained on diverse robot data, enabling broad generalization and robustness across different robots and environments.
Contribution
The paper presents a method for training a single navigation model on heterogeneous datasets from multiple robots, improving generalization and robustness in vision-based navigation tasks.
Findings
An omnipolicy trained on heterogeneous datasets outperforms policies trained on any single dataset.
Training on heterogeneous data enhances robustness to sensing and actuation variations.
GNM generalizes to new robots, including an underactuated quadrotor.
Abstract
Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data. If we could combine data from all available sources, including multiple kinds of robots, we could train more powerful navigation models. In this paper, we study how a general goal-conditioned model for vision-based navigation can be trained on data obtained from many distinct but structurally similar robots, and enable broad generalization across environments and embodiments. We analyze the necessary design decisions for effective data sharing across robots, including the use of temporal context and standardized action spaces, and demonstrate that an omnipolicy trained from heterogeneous datasets outperforms policies trained on any single dataset. We curate 60 hours of navigation trajectories from 6 distinct robots, and deploy the…
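The abstract names standardized action spaces as one design decision for sharing data across robots. The sketch below illustrates one way such a shared space could work: each robot's relative waypoints are divided by a per-robot scale so that trajectories from slow and fast platforms land in comparable units. The `RobotSpec` class, the scale definition (top speed times step interval), and all numeric values are assumptions made for illustration, not details taken from this page.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A relative waypoint (dx, dy) in metres, expressed in the robot's own frame.
Waypoint = Tuple[float, float]


@dataclass(frozen=True)
class RobotSpec:
    """Hypothetical per-robot metadata used only for this illustration."""
    name: str
    top_speed: float  # metres per second (assumed value)
    step_dt: float    # seconds between consecutive waypoints (assumed value)

    @property
    def scale(self) -> float:
        # Distance the robot covers in one step at top speed.
        return self.top_speed * self.step_dt


def normalize(waypoints: List[Waypoint], robot: RobotSpec) -> List[Waypoint]:
    """Map metric waypoints into a robot-agnostic, unitless action space."""
    return [(x / robot.scale, y / robot.scale) for x, y in waypoints]


def denormalize(waypoints: List[Waypoint], robot: RobotSpec) -> List[Waypoint]:
    """Map shared-space waypoints back to metres for a specific robot."""
    return [(x * robot.scale, y * robot.scale) for x, y in waypoints]


if __name__ == "__main__":
    small = RobotSpec(name="small_rover", top_speed=0.5, step_dt=1.0)
    large = RobotSpec(name="large_rover", top_speed=2.0, step_dt=1.0)

    # Each robot drives straight ahead at its own top speed for one step.
    print(normalize([(0.5, 0.0)], small))
    print(normalize([(2.0, 0.0)], large))
```

Under these assumptions, both robots produce the same normalized waypoint when moving at full speed, so a single policy can be trained on their pooled data and its outputs rescaled with `denormalize` for whichever robot it is deployed on.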
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
