Navigation World Models
Amir Bar, Gaoyue Zhou, Danny Tran, Trevor Darrell, Yann LeCun

TL;DR
The paper introduces the Navigation World Model (NWM), a controllable video generation model that predicts future observations from past observations and navigation actions. Built on a large-scale Conditional Diffusion Transformer, it supports trajectory planning in familiar environments and can imagine plausible trajectories in unfamiliar ones.
Contribution
It presents NWM, a controllable video generation model built on a large-scale Conditional Diffusion Transformer that can imagine and plan navigation trajectories in diverse environments.
Findings
In familiar environments, NWM plans navigation trajectories by simulating candidates and checking whether they reach the goal.
It can generate plausible trajectories in unseen environments from a single image.
NWM outperforms existing methods in trajectory planning and environment generalization.
Abstract
Navigation is a fundamental skill of agents with visual-motor capabilities. We introduce a Navigation World Model (NWM), a controllable video generation model that predicts future visual observations based on past observations and navigation actions. To capture complex environment dynamics, NWM employs a Conditional Diffusion Transformer (CDiT), trained on a diverse collection of egocentric videos of both human and robotic agents, and scaled up to 1 billion parameters. In familiar environments, NWM can plan navigation trajectories by simulating them and evaluating whether they achieve the desired goal. Unlike supervised navigation policies with fixed behavior, NWM can dynamically incorporate constraints during planning. Experiments demonstrate its effectiveness in planning trajectories from scratch or by ranking trajectories sampled from an external policy. Furthermore, NWM leverages…
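The planning idea in the abstract (simulate candidate trajectories, evaluate whether they reach the goal, optionally apply constraints, then rank) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_world_model` is a hypothetical stand-in that moves a 2D position by a 2D displacement instead of generating video frames, and `score`, `rank_trajectories`, and the `forbidden_x` constraint are assumed names chosen for illustration only.

```python
import numpy as np

def toy_world_model(state, action):
    """Hypothetical stand-in for a learned world model.
    The 'observation' is a 2D position; the action is a 2D displacement."""
    return state + action

def rollout(state, actions):
    """Simulate one candidate trajectory by applying the model step by step."""
    states = [state]
    for a in actions:
        state = toy_world_model(state, a)
        states.append(state)
    return np.stack(states)

def score(states, goal, forbidden_x=None):
    """Cost of a simulated trajectory (lower is better).
    Base cost: distance from the final predicted state to the goal.
    Assumed constraint: any state with x > forbidden_x rejects the trajectory."""
    cost = float(np.linalg.norm(states[-1] - goal))
    if forbidden_x is not None and np.any(states[:, 0] > forbidden_x):
        cost = float("inf")
    return cost

def rank_trajectories(start, candidates, goal, forbidden_x=None):
    """Rank candidate action sequences (e.g. sampled from an external policy)."""
    costs = [score(rollout(start, c), goal, forbidden_x) for c in candidates]
    order = sorted(range(len(candidates)), key=lambda i: costs[i])
    return order, costs

start = np.array([0.0, 0.0])
goal = np.array([2.0, 2.0])
candidates = [
    np.array([[1.0, 0.0], [1.0, 0.0]]),   # ends at (2, 0)
    np.array([[1.0, 1.0], [1.0, 1.0]]),   # ends at (2, 2)
    np.array([[3.0, 0.0], [-1.0, 2.0]]),  # ends at (2, 2) but passes through x = 3
]
order, costs = rank_trajectories(start, candidates, goal, forbidden_x=2.5)
```

In this toy setup the constraint is added at planning time without changing the model, which mirrors the contrast the abstract draws with supervised policies that have fixed behavior. In NWM itself the rollout would produce predicted visual observations, and the scoring would compare them with a goal observation.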
Taxonomy
Topics: Historical Geography and Cartography · Geographic Information Systems Studies
Methods: Attention Is All You Need · Absolute Position Encodings · Residual Connection · Adam · Softmax · Label Smoothing · Dropout · Dense Connections · Layer Normalization · Diffusion
