Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation

Shuo Wang; Yucheng Wang; Guoxin Lian; Yongcai Wang; Maiyue Chen; Kaihui Wang; Bo Zhang; Zhizhong Su; Yutian Zhou; Wanting Li; Deying Li; Zhaoxin Fan

arXiv:2511.17097 · cs.RO · April 15, 2026

TL;DR

Progress-Think introduces semantic progress reasoning for vision-language navigation: the agent predicts, from its visual observations, how far it has advanced through a multi-step instruction. A three-stage training framework teaches this without expensive annotations and improves navigation accuracy and efficiency.

Contribution

The paper proposes semantic progress reasoning, trained with a three-stage framework, which improves navigation performance without requiring expensive annotations.

Findings

01 Achieves state-of-the-art success rates on the R2R-CE and RxR-CE datasets.

02 Demonstrates improved navigation consistency and efficiency.

03 Introduces a novel differentiable alignment for progress pretraining.

Abstract

Vision-Language Navigation requires agents to act coherently over long horizons by understanding not only local visual context but also how far they have advanced within a multi-step instruction. However, recent Vision-Language-Action models focus on direct action prediction and earlier progress methods predict numeric achievements; both overlook the monotonic co-progression property of the observation and instruction sequences. Building on this insight, Progress-Think introduces semantic progress reasoning, predicting instruction-style progress from visual observations to enable more accurate navigation. To achieve this without expensive annotations, we propose a three-stage framework. In the initial stage, Self-Aligned Progress Pretraining bootstraps a reasoning module via a novel differentiable alignment between visual history and instruction prefixes. Then, Progress-Guided Policy…
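The abstract excerpt does not say what form the differentiable alignment takes. To give a rough picture of the monotonic co-progression property, the sketch below scores how well a sequence of observation steps lines up, in order, with a sequence of instruction segments. It uses a soft-DTW style recursion as a stand-in, which is an assumption and not necessarily the paper's method. The function names, the toy cost matrices, and the `gamma` smoothing parameter are all hypothetical.

```python
import numpy as np


def softmin(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    v = -np.asarray(values, dtype=float) / gamma
    m = v.max()
    return -gamma * (m + np.log(np.exp(v - m).sum()))


def soft_monotonic_alignment_cost(cost, gamma=0.1):
    """Soft-DTW style cost of aligning two sequences monotonically.

    cost[i, j] is a mismatch score between observation step i and
    instruction segment j (hypothetical inputs, for illustration only).
    Only moves that advance in one or both sequences are allowed, so
    neither sequence can go backwards.
    """
    T, K = cost.shape
    R = np.full((T + 1, K + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, T + 1):
        for j in range(1, K + 1):
            R[i, j] = cost[i - 1, j - 1] + softmin(
                [R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]], gamma
            )
    return R[T, K]


# Toy example: 4 observation steps, 2 instruction segments.
# In `in_order`, early observations match segment 0 and late ones match segment 1.
in_order = np.array([[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]])
# In `out_of_order`, the matching pattern is reversed, which violates monotonicity.
out_of_order = in_order[:, ::-1].copy()

aligned = soft_monotonic_alignment_cost(in_order)
shuffled = soft_monotonic_alignment_cost(out_of_order)
print(f"in-order cost: {aligned:.3f}, out-of-order cost: {shuffled:.3f}")
```

Because every step is built from smooth operations, a cost of this kind could in principle be used as a training signal. The in-order pairing gets a much lower cost than the reversed one, which is the behaviour a monotonic alignment objective is meant to reward.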
