Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments

Haodong Hong; Sen Wang; Zi Huang; Qi Wu; Jiajun Liu

arXiv:2407.21452·cs.RO·August 1, 2024

TL;DR

This paper introduces R2R-UNO, a vision-and-language navigation dataset with path obstructions, shows that current methods struggle with such obstacles, and proposes ObVLN, a training strategy that improves navigation in obstructed environments.

Contribution

The paper creates the R2R-UNO dataset, which adds diverse obstructions to R2R, and proposes ObVLN, a novel training method that helps VLN agents navigate obstructed environments effectively.

Findings

1. State-of-the-art VLN methods struggle with obstructions.

2. ObVLN improves navigation performance in obstructed scenarios.

3. ObVLN maintains performance in unobstructed environments.

Abstract

Real-world navigation often involves dealing with unexpected obstructions such as closed doors, moved objects, and unpredictable entities. However, mainstream Vision-and-Language Navigation (VLN) tasks typically assume that instructions align perfectly with fixed, predefined navigation graphs that contain no obstructions. This assumption overlooks potential discrepancies between actual navigation graphs and given instructions, which can cause major failures for both indoor and outdoor agents. To address this issue, we integrate diverse obstructions into the R2R dataset by modifying both the navigation graphs and visual observations, introducing an innovative dataset and task, R2R with UNexpected Obstructions (R2R-UNO). R2R-UNO contains various types and numbers of path obstructions to generate instruction-reality mismatches for VLN research. Experiments on R2R-UNO reveal that state-of-the-art…
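The abstract does not spell out how the navigation graphs are modified, so the following is only a toy illustration of the general idea of an instruction-reality mismatch: an edge on the instructed path is removed from a small graph, and the agent must find a detour to the same goal. The graph, the node names, and the helpers `block_edge` and `shortest_path` are hypothetical and are not taken from the paper or its repository; the visual-observation changes the abstract mentions are not modeled here.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sort neighbours so the result is deterministic.
        for nxt in sorted(graph.get(node, ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


def block_edge(graph, u, v):
    """Return a copy of an undirected graph with the edge (u, v) removed."""
    blocked = {node: set(neighbours) for node, neighbours in graph.items()}
    blocked[u].discard(v)
    blocked[v].discard(u)
    return blocked


# Hypothetical toy navigation graph (undirected adjacency sets).
graph = {
    "A": {"B"},
    "B": {"A", "C", "X"},
    "C": {"B", "D", "Y"},
    "D": {"C"},
    "X": {"B", "Y"},
    "Y": {"X", "C"},
}

# The path the instruction describes (assumed to be the shortest path).
instructed_path = shortest_path(graph, "A", "D")

# Obstruct one edge on the instructed path, e.g. a closed door between B and C.
obstructed = block_edge(graph, "B", "C")
detour = shortest_path(obstructed, "A", "D")

print("instructed:", instructed_path)
print("detour:    ", detour)
```

In this sketch the instruction still describes the route through B and C, while the only remaining route to the goal passes through X and Y, which is the kind of mismatch the task is built around.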

Code & Models

Repositories

honghd16/ObstructedVLN (PyTorch, official)

Taxonomy

Topics: Spatial Cognition and Navigation

Methods: ALIGN