MrSteve: Instruction-Following Agents in Minecraft with What-Where-When Memory
Junyeong Park, Junmo Cho, Sungjin Ahn

TL;DR
This paper introduces MrSteve, a low-level controller with episodic memory for Minecraft agents, significantly enhancing task-solving and exploration by addressing memory limitations in hierarchical AI approaches.
Contribution
We propose PEM, a novel episodic memory system for low-level controllers, and an exploration strategy that improves performance in Minecraft agents.
Findings
Enhanced task-solving efficiency
Improved exploration capabilities
Outperforms existing methods
Abstract
Significant advances have been made in developing general-purpose embodied AI in environments like Minecraft through the adoption of LLM-augmented hierarchical approaches. While these approaches, which combine high-level planners with low-level controllers, show promise, low-level controllers frequently become performance bottlenecks due to repeated failures. In this paper, we argue that the primary cause of failure in many low-level controllers is the absence of an episodic memory system. To address this, we introduce MrSteve (Memory Recall Steve), a novel low-level controller equipped with Place Event Memory (PEM), a form of episodic memory that captures what, where, and when information from episodes. This directly addresses the main limitation of the popular low-level controller, Steve-1. Unlike previous models that rely on short-term memory, PEM organizes spatial and event-based…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsDigital Games and Media · Artificial Intelligence in Games
