Dynamic Mixture of Latent Memories for Self-Evolving Agents
Dianzhi Yu, Vireo Zhang, Hongru Wang, Yanyu Chen, Minda Hu, Wanghan Xu, Siki Chen, Philip Torr, Zhenfei Yin, Irwin King

TL;DR
The paper introduces MoLEM, a dynamic mixture-of-experts framework for self-evolving agents that continually acquire knowledge without forgetting, demonstrated across math, science, and code tasks.
Contribution
MoLEM is a generative mixture-of-latent-memory approach that enables continual learning without catastrophic forgetting: the base model stays frozen, and new knowledge is internalized into additional expert modules.
Findings
MoLEM improves average accuracy by 10.40% over the baseline.
It effectively preserves knowledge across multiple training stages.
Competing methods do not consistently outperform the baseline.
Abstract
Achieving self-evolution in intelligent agents requires the continual accumulation of new knowledge across changing task sequences without forgetting previously acquired abilities. Existing approaches either internalize knowledge by updating model parameters, which induces catastrophic forgetting, or rely on external memory, which fails to genuinely enhance the model's intrinsic capabilities. We propose MoLEM, a generative mixture of latent memory framework based on a dynamic mixture-of-experts (MoE). We treat multiple experts as independent carriers to generate memory. A router selects and weights experts through key-query matching, and the aggregated latent memory is injected into the reasoning process. The base model for reasoning remains entirely frozen, with all experiential knowledge internalized into the additional modules, avoiding catastrophic forgetting. For continual…
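The routing step described in the abstract (key-query matching, then a weighted aggregation of expert memories) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes dot-product scoring, top-k selection, and softmax weights, none of which the abstract specifies. It also stands in a fixed vector for each expert's generated memory, whereas the paper's experts generate memory. The names `route`, `aggregate_memory`, `expert_keys`, and `expert_memories` are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(query, expert_keys, top_k):
    """Score each expert key against the query and keep the top_k.

    Assumption: dot-product matching and softmax over the selected
    scores; the paper may use a different scoring or normalization.
    Returns a list of (expert_index, weight) pairs.
    """
    scores = [dot(query, key) for key in expert_keys]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([scores[i] for i in chosen])
    return list(zip(chosen, weights))


def aggregate_memory(query, expert_keys, expert_memories, top_k=2):
    """Blend the selected experts' memory vectors by their router weights.

    Assumption: each expert contributes one fixed vector here, as a
    placeholder for whatever latent memory the expert would generate.
    """
    routing = route(query, expert_keys, top_k)
    dim = len(expert_memories[0])
    memory = [0.0] * dim
    for idx, weight in routing:
        for d in range(dim):
            memory[d] += weight * expert_memories[idx][d]
    return memory, routing


# Toy example: three experts, two of which match the query equally well.
query = [1.0, 0.0]
expert_keys = [[2.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
expert_memories = [[1.0, 1.0], [5.0, 5.0], [3.0, 3.0]]
memory, routing = aggregate_memory(query, expert_keys, expert_memories, top_k=2)
```

In this sketch the aggregated `memory` vector is what would be injected into the frozen base model's reasoning process. Only the keys and expert modules would be trained, which is how the frozen-base design avoids overwriting earlier knowledge.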
