Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
Egor Cherepanov, Nikita Kachaev, Artem Zholus, Alexey K. Kovalev, Aleksandr I. Panov

TL;DR
This paper clarifies the concept of memory in reinforcement learning agents, proposes a standardized evaluation methodology, and demonstrates its importance through empirical experiments to enable objective comparison of memory capabilities.
Contribution
It provides precise definitions of memory types in RL, categorizes agent memory, and introduces a standardized evaluation framework for memory assessment.
Findings
Standardized methodology improves evaluation consistency
Empirical results show the importance of proper memory assessment
Violation of methodology leads to unreliable memory judgments
Abstract
The incorporation of memory into agents is essential for numerous tasks within the domain of Reinforcement Learning (RL). In particular, memory is paramount for tasks that require the use of past information, adaptation to novel environments, and improved sample efficiency. However, the term "memory" encompasses a wide range of concepts, which, coupled with the lack of a unified methodology for validating an agent's memory, leads to erroneous judgments about agents' memory capabilities and prevents objective comparison with other memory-enhanced agents. This paper aims to streamline the concept of memory in RL by providing practical precise definitions of agent memory types, such as long-term vs. short-term memory and declarative vs. procedural memory, inspired by cognitive science. Using these definitions, we categorize different classes of agent memory, propose a robust experimental…
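As a rough illustration of the long-term vs. short-term distinction mentioned in the abstract, the sketch below labels a task by comparing two quantities. The criterion itself is an assumption made for illustration and is not stated in the text above: the agent is assumed to process a fixed context of `context_length` steps, and the task is assumed to require recalling an event that happened `correlation_horizon` steps before the decision. The function name and both parameter names are hypothetical.

```python
def classify_memory_regime(context_length: int, correlation_horizon: int) -> str:
    """Label which kind of memory a task would exercise for a given agent.

    Assumed criterion (illustrative, not quoted from the paper):
    - "short-term": the event to recall still lies inside the agent's context.
    - "long-term":  the event lies further back than the context reaches.
    """
    if context_length <= 0:
        raise ValueError("context_length must be positive")
    if correlation_horizon < 0:
        raise ValueError("correlation_horizon must be non-negative")
    if correlation_horizon <= context_length:
        return "short-term"
    return "long-term"


if __name__ == "__main__":
    # Same agent (context of 10 steps), two hypothetical task configurations.
    print(classify_memory_regime(context_length=10, correlation_horizon=5))
    print(classify_memory_regime(context_length=10, correlation_horizon=50))
```

Under this assumed criterion, the same environment can probe either regime depending on how the experiment is configured, which is one way an evaluation could end up measuring a different kind of memory than intended.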
Peer Reviews
Decision·ICLR 2026 Poster
This paper is generally well-written and well-motivated. This paper studies the relatively underexplored subject of “memory in RL agents” and presents formal definitions of various memory concepts, highlighting the importance of appropriate experimental configurations for evaluating them. Pictorial illustrations help clarify the concepts.
Some of the definitions require more explanation to fully understand the concept. Paper formatting could be improved. Please see the questions and comments.
**Conceptual Clarification and Formalization** The paper successfully translates ambiguous cognitive science terms (e.g., STM/LTM, declarative/procedural memory) into precise, quantifiable, and verifiable definitions within RL (see Definitions 4.4–4.6). This formalization fills a significant void in current RL literature, where “memory” is often used loosely or inconsistently.
**Proposal of a Unified Evaluation Framework** The distinction between Memory Decision-Making (Memory DM) and Meta-RL i
**Abstract Treatment of Memory Mechanisms** While Definition 4.7 defines a memory mechanism as a mapping from base context K to effective context K_eff, it does not differentiate between implementation strategies (e.g., external memory, world models, state-space models). A deeper discussion of how different architectures realize µ(K) would strengthen the framework.
**Limited Coverage of Other Memory Types** The paper focuses primarily on declarative memory along the temporal axis. Although the
1. They standardized the definitions of different types of memory, thereby providing a framework for fair evaluation of subsequent models.
2. The definitions and formalizations are careful and rigorous.
The paper spends too much time on the basic theoretical and general analyses, while the experiments and analyses conducted under this theoretical framework are rather limited. Although the authors include several validation experiments demonstrating the necessity and importance of defining and distinguishing different types of memory, they do not provide any insightful conclusions or analyses derived from evaluations based on this distinction. In other words, what I expected to see was either a
Taxonomy
Topics: Reinforcement Learning in Robotics
