Mem-T: Densifying Rewards for Long-Horizon Memory Agents

Yanwei Yue; Boci Peng; Xuanbo Fan; Jiaxin Guo; Qiankun Li; Yan Zhang

arXiv:2601.23014 · cs.LG · March 10, 2026

TL;DR

Mem-T is an autonomous memory agent that manages a lightweight hierarchical memory database over streaming inputs. It is trained with MoT-GRPO, a tree-guided reinforcement learning framework that turns sparse, delayed terminal rewards into dense, step-wise supervision for long-horizon memory operations.

Contribution

The paper presents two contributions: Mem-T, an autonomous memory agent that performs dynamic memory updates and multi-turn retrieval, and MoT-GRPO, a tree-guided reinforcement learning method that supplies dense training signals for long-horizon memory tasks.

Findings

01. Mem-T outperforms existing frameworks such as A-Mem and Mem0 by up to 14.92%.

02. Mem-T reduces inference tokens per query by approximately 24.45% compared to GAM.

03. Mem-T achieves a favorable accuracy-efficiency trade-off.

Abstract

Memory agents, which depart from predefined memory-processing pipelines by endogenously managing the processing, storage, and retrieval of memories, have garnered increasing attention for their autonomy and adaptability. However, existing training paradigms remain constrained: agents often traverse long-horizon sequences of memory operations before receiving sparse and delayed rewards, which hinders truly end-to-end optimization of memory management policies. To address this limitation, we introduce Mem-T, an autonomous memory agent that interfaces with a lightweight hierarchical memory database to perform dynamic updates and multi-turn retrieval over streaming inputs. To effectively train long-horizon memory management capabilities, we further propose MoT-GRPO, a tree-guided reinforcement learning framework that transforms sparse terminal feedback into dense, step-wise supervision via…

Code & Models

Models

🤗 EdwinYue/Mem-T-4B (model · 24 downloads · 2 likes)

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques · Reinforcement Learning in Robotics