M$^\star$: Every Task Deserves Its Own Memory Harness
Wenbo Pan, Shujie Liu, Xiangyang Zhou, Shiwei Zhang, Wanlu Shi, Mirror Xu, Xiaohua Jia

TL;DR
M$^\star$ introduces an automated method to evolve task-specific memory systems for language model agents, improving performance across diverse domains by customizing memory design.
Contribution
The paper presents an approach that uses executable program evolution to automatically discover a memory architecture optimized for each task.
Findings
M$^\star$ outperforms fixed-memory baselines across multiple benchmarks.
Evolved memory programs show distinct structures for different tasks.
Task-specific memory design enhances agent performance and flexibility.
Abstract
Large language model agents rely on specialized memory systems to accumulate and reuse knowledge during extended interactions. Recent architectures typically adopt a fixed memory design tailored to specific domains, such as semantic retrieval for conversations or skill reuse for coding. However, a memory system optimized for one purpose frequently fails to transfer to others. To address this limitation, we introduce M$^\star$, a method that automatically discovers task-optimized memory harnesses through executable program evolution. Specifically, M$^\star$ models an agent memory system as a memory program written in Python. This program encapsulates the data Schema, the storage Logic, and the agent workflow Instructions. We optimize these components jointly using a reflective code evolution method; this approach employs a population-based search strategy and analyzes evaluation…
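To make the three components named in the abstract concrete, here is a minimal sketch of what a memory program could look like. It is not the authors' implementation: the class names (`Record`, `MemoryProgram`), the fields, the word-overlap retrieval, and the prompt layout are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Schema: what a single memory entry stores (hypothetical fields)."""
    task: str
    lesson: str


@dataclass
class MemoryProgram:
    """Toy memory program bundling Schema, Logic, and Instructions."""
    # Instructions: text that tells the agent how to use the memory.
    instructions: str = "Before acting, read the retrieved lessons."
    records: list[Record] = field(default_factory=list)

    # Logic (write path): how new experience is stored.
    def write(self, task: str, lesson: str) -> None:
        self.records.append(Record(task, lesson))

    # Logic (read path): naive word-overlap ranking, assumed for illustration.
    def read(self, query: str, k: int = 2) -> list[Record]:
        words = set(query.lower().split())
        scored = [(len(words & set(r.task.lower().split())), r) for r in self.records]
        ranked = sorted(scored, key=lambda s: s[0], reverse=True)
        return [r for score, r in ranked if score > 0][:k]

    def build_prompt(self, query: str) -> str:
        lessons = "\n".join(f"- {r.lesson}" for r in self.read(query))
        return f"{self.instructions}\n{lessons}\nTask: {query}"


memory = MemoryProgram()
memory.write("fix failing unit test", "Run the test suite before editing.")
memory.write("book a flight", "Confirm dates with the user first.")
prompt = memory.build_prompt("fix unit test in parser")
```

Because the schema, the read/write logic, and the instruction text all live in one Python object, a search procedure could in principle rewrite any of them together, which is the kind of joint optimization the abstract describes.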
