VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs

Wensi Huang; Shaohao Zhu; Meng Wei; Jinming Xu; Xihui Liu; Hanqing Wang; Tai Wang; Feng Zhao; Jiangmiao Pang

arXiv:2512.22342 · cs.RO · January 26, 2026

TL;DR

This paper introduces VL-LN Bench, a new benchmark for long-horizon goal-oriented navigation that incorporates active dialog to resolve ambiguity, enabling more realistic and effective embodied navigation models.

Contribution

The paper proposes the VL-LN benchmark and a new task, Interactive Instance Goal Navigation (IIGN), for training dialog-enabled navigation agents under realistic, ambiguous instructions.

Findings

01. Dialog-enabled navigation models outperform baselines.

02. VL-LN dataset contains over 41k trajectories for training.

03. Active dialog improves navigation success in ambiguous instructions.

Abstract

In most existing embodied navigation tasks, instructions are well-defined and unambiguous, such as instruction following and object searching. Under this idealized setting, agents are required solely to produce effective navigation outputs conditioned on vision and language inputs. However, real-world navigation instructions are often vague and ambiguous, requiring the agent to resolve uncertainty and infer user intent through active dialog. To address this gap, we propose Interactive Instance Goal Navigation (IIGN), a task that requires agents not only to generate navigation actions but also to produce language outputs via active dialog, thereby aligning more closely with practical settings. IIGN extends Instance Goal Navigation (IGN) by allowing agents to freely consult an oracle in natural language while navigating. Building on this task, we present the Vision Language-Language…
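The interaction pattern described in the abstract, where an agent emits either a navigation action or a question to an oracle, can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions: the corridor world, the `Instance`, `Oracle`, and `run_episode` names, and the ask-about-color policy are all hypothetical and are not taken from VL-LN Bench or its code.

```python
from dataclasses import dataclass


# Hypothetical toy types for illustration; not part of VL-LN Bench.
@dataclass(frozen=True)
class Instance:
    category: str
    color: str
    position: int  # location along a 1-D corridor


class Oracle:
    """Knows the true target and answers attribute questions about it."""

    def __init__(self, target: Instance):
        self._target = target

    def answer(self, attribute: str) -> str:
        return str(getattr(self._target, attribute))


def run_episode(instances, category, oracle, start=0, max_steps=50):
    """Run a simple ask-then-walk agent; return (final_position, dialog)."""
    # The instruction names only a category, so several instances may match.
    candidates = [i for i in instances if i.category == category]
    dialog = []
    position = start
    for _ in range(max_steps):
        if len(candidates) > 1:
            # Language output: the goal is ambiguous, so consult the oracle.
            reply = oracle.answer("color")
            dialog.append(("color", reply))
            candidates = [c for c in candidates if c.color == reply]
            continue
        if not candidates:
            break  # nothing matches; give up
        goal = candidates[0].position
        if position == goal:
            break  # stop action: the agent believes it reached the target
        # Navigation output: take one step toward the remaining candidate.
        position += 1 if goal > position else -1
    return position, dialog


scene = [
    Instance("chair", "red", 3),
    Instance("chair", "blue", 7),
    Instance("table", "red", 5),
]
final_position, dialog = run_episode(scene, "chair", Oracle(scene[1]))
```

In this toy scene the instruction "chair" matches two instances, so the agent spends one step on a question before walking; a real agent would have to decide when and what to ask from visual observations rather than from a known instance list.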

Code & Models

Models

InternRobotics/VL-LN-Bench-basemodel (model, 13 downloads, 6 likes)

Datasets

InternRobotics/VL-LN-Bench (dataset, 477 downloads)

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Speech and dialogue systems