Explore with Long-term Memory: A Benchmark and Multimodal LLM-based Reinforcement Learning Framework for Embodied Exploration

Sen Wang; Bangwei Liu; Zhenkun Gao; Lizhuang Ma; Xuhong Wang; Yuan Xie; Xin Tan

arXiv:2601.10744 · cs.AI · March 24, 2026




TL;DR

This paper introduces Long-term Memory Embodied Exploration (LMEE), an embodied exploration task, together with a benchmark (LMEE-Bench) and a method (MemoryExplorer) that use long-term memory and multimodal LLMs to improve lifelong learning and proactive exploration in complex environments.

Contribution

It proposes a new benchmark, LMEE-Bench, and a novel method, MemoryExplorer, for enhancing long-term memory utilization and exploration in embodied agents using reinforcement learning.

Findings

01 MemoryExplorer improves proactive exploration in long-horizon tasks.

02 The LMEE-Bench dataset enables comprehensive evaluation of exploration processes.

03 The approach outperforms existing models in embodied exploration tasks.

Abstract

An ideal embodied agent should possess lifelong learning capabilities to handle long-horizon and complex tasks, enabling continuous operation in general environments. This not only requires the agent to accurately accomplish given tasks but also to leverage long-term episodic memory to optimize decision-making. However, existing mainstream one-shot embodied tasks primarily focus on task completion results, neglecting the crucial process of exploration and memory utilization. To address this, we propose Long-term Memory Embodied Exploration (LMEE), which aims to unify the agent's exploratory cognition and decision-making behaviors to promote lifelong learning. We further construct a corresponding dataset and benchmark, LMEE-Bench, incorporating multi-goal navigation and memory-based question answering to comprehensively evaluate both the process and outcome of embodied exploration. To…


Code & Models

Models

wangsen99/MemoryExplorer (Hugging Face model, 52 downloads)



Taxonomy

Topics: Multimodal Machine Learning Applications · Action Observation and Synchronization · Social Robot Interaction and HRI