Memory-Maze: Scenario Driven Visual Language Navigation Benchmark for Guiding Blind People
Masaki Kuribayashi, Kohei Uehara, Allan Wang, Daisuke Sato, Simon Chu, Shigeo Morishima

TL;DR
Memory-Maze introduces a new benchmark for visual language navigation that simulates human memory-based route instructions, highlighting the difficulty of understanding natural, error-prone route instructions when guiding blind people.
Contribution
The paper presents Memory-Maze, a novel benchmark with human memory-based instructions for VLN, addressing a gap in existing datasets and evaluating model robustness in realistic scenarios.
Findings
Memory-based instructions are longer and more varied.
Memory-based instructions contain more errors and ambiguities.
State-of-the-art models struggle with memory-derived guidance.
Abstract
Visual Language Navigation (VLN) powered robots have the potential to guide blind people by understanding route instructions provided by sighted passersby. This capability allows robots to operate in environments that are often unknown a priori. Existing VLN models are insufficient for the scenario of navigation guidance for blind people, as they need to understand routes described from human memory, which frequently contains stutters, errors, and omissions of details, as opposed to those obtained by thinking out loud, such as in the R2R dataset. However, existing benchmarks do not contain instructions obtained from human memory in natural environments. To this end, we present our benchmark, Memory-Maze, which simulates the scenario of seeking route instructions for guiding blind people. Our benchmark contains a maze-like structured virtual environment and novel route instruction data from human…
Taxonomy
Topics: Multimodal Machine Learning Applications
