MemPO: Self-Memory Policy Optimization for Long-Horizon Agents

Ruoran Li; Xinghua Zhang; Haiyang Yu; Shitong Duan; Xiang Li; Wenxin Xiang; Chonghua Liao; Xudong Guo; Yongbin Li; Jinli Suo

arXiv:2603.00680·cs.AI·April 10, 2026

TL;DR

MemPO introduces a self-managing memory policy for long-horizon agents, enabling autonomous memory summarization and management to improve performance and reduce token usage.

Contribution

The paper presents MemPO, an algorithm that lets agents optimize their own memory management in line with task objectives.

Findings

01. MemPO achieves a 25.98% F1 score improvement over the base model.

02. MemPO reduces token consumption by approximately 70%.

03. MemPO outperforms previous state-of-the-art methods in experiments.

Abstract

Long-horizon agents face the challenge of a growing context size during interaction with the environment, which degrades performance and stability. Existing methods typically introduce an external memory module and look up relevant information from the stored memory, which prevents the model itself from proactively managing its memory content and aligning it with the agent's overarching task objectives. To address these limitations, we propose the self-memory policy optimization algorithm (MemPO), which enables the agent (policy model) to autonomously summarize and manage its memory during interaction with the environment. By improving the credit assignment mechanism based on memory effectiveness, the policy model can selectively retain crucial information, significantly reducing token consumption while preserving task performance. Extensive experiments and analyses confirm that MemPO…
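The contrast the abstract draws, between a context that grows with every interaction and a memory the agent rewrites itself, can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: `truncate_summary` is a hypothetical stand-in for the summary that the trained policy model would write, word counts stand in for tokens, and the reinforcement-learning training and credit assignment are not modeled at all.

```python
from typing import Callable, List, Optional


def word_count(text: str) -> int:
    """Crude proxy for token count: number of whitespace-separated words."""
    return len(text.split())


def truncate_summary(memory: str, observation: str, budget: int = 20) -> str:
    """Hypothetical stand-in for the policy's self-written summary.

    In MemPO the policy model itself decides what to retain; here we
    simply keep the most recent `budget` words (an assumption made
    purely for illustration).
    """
    words = (memory + " " + observation).split()
    return " ".join(words[-budget:])


def run_episode(
    observations: List[str],
    summarize: Optional[Callable[[str, str], str]] = None,
) -> List[int]:
    """Return the context size (in words) the policy sees at each step."""
    context = ""
    sizes = []
    for obs in observations:
        if summarize is None:
            # Append-only baseline: the full history is carried forward.
            context = (context + " " + obs).strip()
        else:
            # Self-memory: the context is replaced by a bounded summary.
            context = summarize(context, obs)
        sizes.append(word_count(context))
    return sizes


# Eight synthetic environment observations of 14 words each.
observations = [f"step {i} tool output " + "data " * 10 for i in range(8)]

append_only_sizes = run_episode(observations)
self_memory_sizes = run_episode(observations, truncate_summary)

print("append-only:", append_only_sizes)
print("self-memory:", self_memory_sizes)
```

In this toy setting the append-only context grows linearly with the number of steps, while the summarized memory stays within its fixed budget. How well a learned summary preserves task-relevant information is the part the paper addresses and this sketch does not.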

Code & Models

Repositories

TheNewBeeKing/MemPO
github

Models

NewBeeKing/MemPO_Qwen2.5-SFT-RL
Hugging Face model · 89 downloads
