NavThinker: Action-Conditioned World Models for Coupled Prediction and Planning in Social Navigation
Tianshuai Hu, Zeying Gong, Lingdong Kong, XiaoDong Mei, Yiyi Ding, Qi Zeng, Ao Liang, Rong Li, Yangyi Zhong, Junwei Liang

TL;DR
NavThinker introduces a future-aware framework for social navigation that couples an action-conditioned world model with reinforcement learning, enabling robots to predict scene evolution and plan safer, more effective paths in dynamic human environments.
Contribution
NavThinker proposes an action-conditioned world model that operates in feature space and is integrated with reinforcement learning, coupling prediction and planning for social navigation.
Findings
Achieves state-of-the-art success in social navigation tasks.
Demonstrates effective zero-shot transfer to unseen environments.
Validates real-world deployment on a mobile robot.
Abstract
Social navigation requires robots to act safely in dynamic human environments. Effective behavior demands thinking ahead: reasoning about how the scene and pedestrians evolve under different robot actions rather than reacting to current observations alone. This creates a coupled prediction-planning challenge, where robot actions and human motion mutually influence each other. To address this challenge, we propose NavThinker, a future-aware framework that couples an action-conditioned world model with on-policy reinforcement learning. The world model operates in the Depth Anything V2 patch feature space and performs autoregressive prediction of future scene geometry and human motion; multi-head decoders then produce future depth maps and human trajectories, yielding a future-aware state aligned with traversability and interaction risk. Crucially, we train the policy with DD-PPO while…
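The autoregressive, action-conditioned rollout described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear-plus-tanh transition, the random weights, the two decoder heads, and all names (`make_toy_world_model`, `step`, `rollout`) are hypothetical stand-ins chosen for illustration, and the patch features here are plain arrays rather than real Depth Anything V2 features.

```python
import numpy as np

def make_toy_world_model(feat_dim, action_dim, seed=0):
    """Random weights standing in for a learned world model (assumption)."""
    rng = np.random.default_rng(seed)
    return {
        "W_f": rng.normal(scale=0.1, size=(feat_dim, feat_dim)),    # feature transition
        "W_a": rng.normal(scale=0.1, size=(action_dim, feat_dim)),  # action conditioning
        "W_depth": rng.normal(scale=0.1, size=(feat_dim, 1)),       # depth head
        "W_traj": rng.normal(scale=0.1, size=(feat_dim, 2)),        # human-position head
    }

def step(model, feats, action):
    """Predict next patch features from current features and a robot action.

    feats: (num_patches, feat_dim), action: (action_dim,)
    """
    return np.tanh(feats @ model["W_f"] + action @ model["W_a"])

def rollout(model, feats, actions):
    """Autoregressive rollout: each predicted feature map feeds the next step."""
    depth_preds, traj_preds = [], []
    for action in actions:
        feats = step(model, feats, action)
        # Per-patch scalar as a stand-in for a decoded depth map.
        depth_preds.append(feats @ model["W_depth"])
        # Pooled features decoded to one (x, y) point as a stand-in trajectory.
        traj_preds.append(feats.mean(axis=0) @ model["W_traj"])
    return np.stack(depth_preds), np.stack(traj_preds)

model = make_toy_world_model(feat_dim=8, action_dim=2)
feats0 = np.zeros((4, 8))                       # 4 patches, 8-dim features
plan = np.array([[0.5, 0.0], [0.5, 0.1], [0.0, 0.0]])
depth, traj = rollout(model, feats0, plan)
print(depth.shape, traj.shape)
```

Rolling out different candidate action sequences from the same starting features yields different predicted futures, which is what lets a policy compare actions by their anticipated consequences.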
Taxonomy
Topics: Multimodal Machine Learning Applications · Social Robot Interaction and HRI · Autonomous Vehicle Technology and Safety
