Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments
Haodong Hong, Sen Wang, Zi Huang, Qi Wu, Jiajun Liu

TL;DR
This paper introduces R2R-UNO, a vision-and-language navigation dataset with path obstructions, shows that current methods struggle with such obstacles, and proposes ObVLN, a training strategy that improves navigation in obstructed environments.
Contribution
The paper creates the R2R-UNO dataset with diverse obstructions and proposes ObVLN, a novel training method that enhances VLN agents' ability to navigate obstructed environments.
Findings
State-of-the-art VLN methods struggle with obstructions.
ObVLN improves navigation performance in obstructed scenarios.
ObVLN maintains performance in unobstructed environments.
Abstract
Real-world navigation often involves dealing with unexpected obstructions such as closed doors, moved objects, and unpredictable entities. However, mainstream Vision-and-Language Navigation (VLN) tasks typically assume instructions perfectly align with the fixed and predefined navigation graphs without any obstructions. This assumption overlooks potential discrepancies between actual navigation graphs and given instructions, which can cause major failures for both indoor and outdoor agents. To address this issue, we integrate diverse obstructions into the R2R dataset by modifying both the navigation graphs and visual observations, introducing an innovative dataset and task, R2R with UNexpected Obstructions (R2R-UNO). R2R-UNO contains various types and numbers of path obstructions to generate instruction-reality mismatches for VLN research. Experiments on R2R-UNO reveal that state-of-the-art…
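To make the idea of an instruction-reality mismatch concrete, here is a minimal sketch of what blocking an edge in a navigation graph might look like. It is not the authors' pipeline: the toy graph, the node names, and the helpers `block_edges`, `shortest_path`, and `path_is_traversable` are all hypothetical, and the sketch covers only the graph side, not the visual observations.

```python
from collections import deque


def block_edges(graph, blocked):
    """Return a copy of an undirected graph with the given edges removed."""
    blocked_set = {frozenset(edge) for edge in blocked}
    return {
        node: {n for n in neighbors if frozenset((node, n)) not in blocked_set}
        for node, neighbors in graph.items()
    }


def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def path_is_traversable(graph, path):
    """Check that every consecutive pair of nodes in the path is still connected."""
    return all(b in graph[a] for a, b in zip(path, path[1:]))


# Hypothetical toy navigation graph (viewpoints A..E), not taken from R2R.
graph = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "E"},
    "D": {"A", "E"},
    "E": {"C", "D"},
}

# The path the instruction describes, assumed for illustration.
instructed_path = ["A", "B", "C"]

# Simulate an unexpected obstruction (e.g. a closed door) between B and C.
obstructed = block_edges(graph, [("B", "C")])

print(path_is_traversable(obstructed, instructed_path))  # instruction no longer matches
print(shortest_path(obstructed, "A", "C"))               # detour the agent must discover
```

In this toy setup the instruction still describes the route through B, but the agent can only reach C by detouring through D and E, which is the kind of mismatch the abstract describes.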
Taxonomy
Topics: Spatial Cognition and Navigation
Methods: ALIGN
