Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering

Muhammad Fadhil Ginting; Dong-Ki Kim; Xiangyun Meng; Andrzej Reinke; Bandi Jai Krishna; Navid Kayhani; Oriana Peltzer; David D. Fan; Amirreza Shaban; Sung-Kyun Kim; Mykel J. Kochenderfer; Ali-akbar Agha-mohammadi; and Shayegan Omidshafiei

arXiv:2507.12846 · cs.RO · September 26, 2025

Open Access

TL;DR

This paper introduces Long-term Active Embodied Question Answering (LA-EQA), a task in which a robot must reason over past, present, and possible future states to answer questions. It also proposes a structured memory system, inspired by the mind palace method, to support long-term environmental understanding and question answering.

Contribution

The paper proposes a novel structured memory system and reasoning algorithm for robots in LA-EQA, enabling targeted memory retrieval, active exploration, and effective decision-making for long-term tasks.

Findings

01. Significant improvement in answer accuracy over baselines.

02. Enhanced exploration efficiency in real-world environments.

03. Effective balance of exploration and recall through value-of-information stopping criteria.
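
As a rough illustration of what a value-of-information stopping rule can look like, the Python sketch below keeps gathering information only while the best available step is expected to improve the answer by more than it costs. This is a generic sketch, not the paper's formulation: the `Action` class, the `net_value` and `choose_next` functions, the gain and cost estimates, and the zero threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Action:
    """A candidate information-gathering step (hypothetical structure)."""
    name: str                  # e.g. a memory lookup or an exploration target
    expected_conf_gain: float  # assumed estimate of answer-confidence gain, in [0, 1]
    cost: float                # assumed normalized cost, on the same scale as the gain


def net_value(action: Action) -> float:
    """Expected benefit of the step minus the cost of taking it."""
    return action.expected_conf_gain - action.cost


def choose_next(actions: List[Action], threshold: float = 0.0) -> Optional[Action]:
    """Return the most valuable step, or None to stop and answer."""
    if not actions:
        return None
    best = max(actions, key=net_value)
    # Stop when even the best step is not worth more than the threshold.
    return best if net_value(best) > threshold else None


candidates = [
    Action("recall: kitchen, day 3", expected_conf_gain=0.30, cost=0.05),
    Action("explore: hallway", expected_conf_gain=0.20, cost=0.25),
]
step = choose_next(candidates)
print(step.name if step else "stop and answer")
```

In this toy setup, recalling a stored observation is cheap and exploring is costly, so the rule prefers recall when both are about equally informative. When no remaining step has positive net value, `choose_next` returns `None`, which stands for answering with the evidence already gathered.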

Abstract

As robots become increasingly capable of operating over extended periods -- spanning days, weeks, and even months -- they are expected to accumulate knowledge of their environments and leverage this experience to assist humans more effectively. This paper studies the problem of Long-term Active Embodied Question Answering (LA-EQA), a new task in which a robot must both recall past experiences and actively explore its environment to answer complex, temporally-grounded questions. Unlike traditional EQA settings, which typically focus either on understanding the present environment alone or on recalling a single past observation, LA-EQA challenges an agent to reason over past, present, and possible future states, deciding when to explore, when to consult its memory, and when to stop gathering observations and provide a final answer. Standard EQA approaches based on large models struggle in…

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Natural Language Processing Techniques