AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans
Dillon Loh, Tomasz Bednarz, Xinxing Xia, Frank Guan

TL;DR
AdaVLN extends visual language navigation to dynamic indoor environments with moving humans, introducing a new simulator, datasets, and mechanisms to handle real-world complexities and improve reproducibility.
Contribution
We propose AdaVLN, a novel task and environment for navigation amidst moving humans, along with datasets and a freeze-time mechanism to facilitate research and reproducibility.
Findings
Baseline models face challenges with dynamic obstacles.
AdaVLN aims to narrow the sim-to-real gap in VLN.
New datasets and simulator support dynamic environment research.
Abstract
Visual Language Navigation is a task that challenges robots to navigate realistic environments based on natural language instructions. While previous research has largely focused on static settings, real-world navigation must often contend with dynamic human obstacles. Hence, we propose an extension to the task, termed Adaptive Visual Language Navigation (AdaVLN), which seeks to narrow this gap. AdaVLN requires robots to navigate complex 3D indoor environments populated with dynamically moving human obstacles, adding a layer of complexity that mimics the real world. To support exploration of this task, we also present the AdaVLN simulator and the AdaR2R datasets. The AdaVLN simulator enables easy inclusion of fully animated human models directly into common datasets like Matterport3D. We also introduce a "freeze-time" mechanism for both the navigation task and simulator,…
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Hand Gesture Recognition Systems
