Explore before Moving: A Feasible Path Estimation and Memory Recalling Framework for Embodied Navigation
Yang Wu, Shirui Feng, Guanbin Li, Liang Lin

TL;DR
This paper introduces PEMR, a novel route planning framework for embodied navigation that mimics human foresight and memory, significantly improving performance in unknown environments.
Contribution
The paper proposes PEMR, a new navigation framework with path estimation and memory recall modules, enhancing exploration in unfamiliar scenes for embodied agents.
Findings
PEMR outperforms existing navigation algorithms on EmbodiedQA tasks.
The framework effectively estimates feasible paths and leverages past experience.
Experimental results demonstrate improved accuracy in navigation and question answering.
Abstract
An embodied task such as embodied question answering (EmbodiedQA) requires an agent to explore the environment and collect clues to answer a given question related to specific objects in the scene. The solution to such a task usually includes two stages: a navigator and a visual Q&A module. In this paper, we focus on navigation and address the lack of experience and common sense in existing navigation algorithms, which essentially results in a failure to find the target when the robot is spawned in unknown environments. Inspired by the human ability to think twice before moving and conceive several feasible paths to seek a goal in unfamiliar scenes, we present a route planning method named the Path Estimation and Memory Recalling (PEMR) framework. PEMR includes a "looking ahead" process, i.e., a visual feature extractor module that estimates feasible paths for gathering 3D…
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Video Analysis and Summarization
