Memento 2: Learning by Stateful Reflective Memory
Jun Wang

TL;DR
This paper introduces a formal model and framework for continual learning in large language model agents using episodic memory and reflection, enabling adaptive decision-making without parameter updates.
Contribution
It presents the Stateful Reflective Decision Process and a convergent read-write reflective learning framework integrating memory with reinforcement learning principles.
Findings
Reflection enables generalised adaptation across tasks
Memory-based methods can be analysed with RL tools
Framework guarantees convergence to optimal policies
Abstract
We present a theoretical study of continual and experiential learning in large language model agents that combine episodic memory with reinforcement learning. We argue that the key mechanism for continual adaptation, without updating model parameters, is reflection: the agent's ability to use past experience to guide future actions. Empirical findings suggest that episodic, experience-driven reflection enables generalised adaptation across a wide range of open-ended, long-horizon tasks. This indicates that efficient learning can occur during deployment and weakens the traditional separation between training and testing. Motivated by this, we introduce the Stateful Reflective Decision Process, a formal model of reflective memory dynamics. In this abstraction, an agent maintains an episodic memory and performs two core operations. Writing stores interaction outcomes and plays the role of…
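The write and read operations named in the abstract can be illustrated with a toy sketch. This is not the paper's formalism. The abstract is truncated before it describes the read operation, so everything here is an assumption made for illustration: the class and function names (`Episode`, `EpisodicMemory`, `act`), the token-overlap retrieval score, and the rule of reusing the best-rewarded recalled action. The only point it makes is that behaviour can change as memory grows while no model parameters are updated.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    state: str     # textual description of the situation
    action: str    # what the agent did
    reward: float  # observed outcome of that action


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for any retrieval score (assumption)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


@dataclass
class EpisodicMemory:
    episodes: list[Episode] = field(default_factory=list)

    def write(self, state: str, action: str, reward: float) -> None:
        """Store one interaction outcome."""
        self.episodes.append(Episode(state, action, reward))

    def read(self, state: str, k: int = 2) -> list[Episode]:
        """Return the k stored episodes most similar to the current state."""
        ranked = sorted(
            self.episodes,
            key=lambda e: jaccard(state, e.state),
            reverse=True,
        )
        return ranked[:k]


def act(memory: EpisodicMemory, state: str, default_action: str = "explore") -> str:
    """Reuse the best-rewarded action among recalled episodes (hypothetical rule)."""
    recalled = memory.read(state)
    if not recalled:
        return default_action
    return max(recalled, key=lambda e: e.reward).action


if __name__ == "__main__":
    memory = EpisodicMemory()
    print(act(memory, "open the red door"))   # empty memory: falls back to the default
    memory.write("open the red door", "push", 0.0)
    memory.write("open the red door", "pull", 1.0)
    print(act(memory, "open the blue door"))  # now guided by the stored episodes
```

In this sketch the agent's choice changes only because the contents of memory change; a real system would replace the token-overlap score and the selection rule with whatever retrieval and policy components the paper actually specifies.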
Taxonomy
Topics: Topic Modeling · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
