LongNav-R1: Horizon-Adaptive Multi-Turn RL for Long-Horizon VLA Navigation
Yue Hu, Avery Xi, Qixin Xiao, Seth Isaacson, Henry X. Liu, Ram Vasudevan, Maani Ghaffari

TL;DR
LongNav-R1 is a horizon-adaptive, multi-turn reinforcement learning (RL) framework for long-horizon Visual-Language-Action (VLA) navigation. The authors report better reasoning, more diverse behaviors, and higher success rates than existing methods.
Contribution
The paper presents a multi-turn RL approach with Horizon-Adaptive Policy Optimization for long-horizon VLA navigation, aimed at improving reasoning and robustness.
Findings
Success rate increased from 64.3% to 73.0%.
Outperforms state-of-the-art methods in sample efficiency.
Demonstrates zero-shot generalization in real-world navigation.
Abstract
This paper develops LongNav-R1, an end-to-end multi-turn reinforcement learning (RL) framework designed to optimize Visual-Language-Action (VLA) models for long-horizon navigation. Unlike the existing single-turn paradigm, LongNav-R1 reformulates the navigation decision process as a continuous multi-turn conversation between the VLA policy and the embodied environment. This multi-turn RL framework offers two distinct advantages: i) it enables the agent to reason about the causal effects of historical interactions and sequential future outcomes; and ii) it allows the model to learn directly from online interactions, fostering diverse trajectory generation and avoiding the behavioral rigidity often imposed by human demonstrations. Furthermore, we introduce Horizon-Adaptive Policy Optimization. This mechanism explicitly accounts for varying horizon lengths during advantage estimation,…
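To make the multi-turn framing concrete, here is a minimal Python sketch of a rollout in which the policy sees the full interaction history at every turn. It is an illustration only, not the authors' implementation: `ToyCorridorEnv`, `rollout`, `discounted_return`, and `always_forward` are hypothetical names, the 1-D corridor stands in for an embodied environment, and the sparse terminal reward and discount factor are assumptions. The abstract is cut off before the Horizon-Adaptive Policy Optimization mechanism is described, so the sketch does not attempt to reproduce it. It shows only the general fact that, under a sparse discounted reward, equally successful episodes of different lengths produce different returns, which is one reason horizon length matters when estimating advantages.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# One (observation, action) pair per conversation turn.
History = List[Tuple[int, int]]


@dataclass
class ToyCorridorEnv:
    """Hypothetical 1-D corridor: the agent starts at 0 and must reach `goal`."""
    goal: int
    max_turns: int
    position: int = 0
    turns: int = 0

    def reset(self) -> int:
        self.position = 0
        self.turns = 0
        return self.position

    def step(self, action: int) -> Tuple[int, float, bool]:
        # action is +1 (forward) or -1 (backward)
        self.position += action
        self.turns += 1
        success = self.position == self.goal
        done = success or self.turns >= self.max_turns
        reward = 1.0 if success else 0.0  # assumed sparse terminal reward
        return self.position, reward, done


def rollout(env: ToyCorridorEnv,
            policy: Callable[[History, int], int]) -> Tuple[History, List[float]]:
    """Run one episode; the policy is conditioned on all previous turns."""
    history: History = []
    rewards: List[float] = []
    obs = env.reset()
    done = False
    while not done:
        action = policy(history, obs)   # policy reads the whole history
        history.append((obs, action))   # the turn is appended to the "conversation"
        obs, reward, done = env.step(action)
        rewards.append(reward)
    return history, rewards


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Discounted return from the first turn of the episode."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def always_forward(history: History, obs: int) -> int:
    """Trivial stand-in policy; a real VLA policy would be a learned model."""
    return 1


short_hist, short_rewards = rollout(ToyCorridorEnv(goal=3, max_turns=10), always_forward)
long_hist, long_rewards = rollout(ToyCorridorEnv(goal=8, max_turns=10), always_forward)

# Both episodes succeed, but the longer one receives a smaller discounted return.
short_return = discounted_return(short_rewards, gamma=0.9)
long_return = discounted_return(long_rewards, gamma=0.9)
```

In this toy setting both episodes reach the goal, yet the 8-turn episode scores lower than the 3-turn one purely because of its length. Comparing such returns directly would penalize long-horizon tasks, which is the kind of effect a horizon-aware advantage estimate would need to correct for.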
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Robotic Path Planning Algorithms
