MemQ: Integrating Q-Learning into Self-Evolving Memory Agents over Provenance DAGs

Junwei Liao; Haoting Shi; Ruiwen Zhou; Jiaqian Wang; Shengtao Zhang; Wei Zhang; Ying Wen; Zhiyu Li; Feiyu Xiong; Bo Tang; Weinan Zhang; Muning Wen

arXiv:2605.08374·cs.AI·May 15, 2026

TL;DR

MemQ introduces a memory-updating mechanism that applies TD($λ$) eligibility traces over provenance DAGs, improving generalization and learning in memory-augmented LLM agents across diverse benchmarks.

Contribution

It formalizes a new framework for memory credit assignment in LLM agents using DAG-structured provenance and eligibility traces, enhancing memory utilization and task performance.

Findings

01 Achieves the highest success rate on all six benchmarks tested.

02 Shows significant improvements on multi-step tasks with deep provenance chains.

03 Provides guidance for parameter selection based on DAG structure.

Abstract

Episodic memory allows LLM agents to accumulate and retrieve experience, but current methods treat each memory independently, i.e., they evaluate retrieval quality in isolation without accounting for the dependency chains through which memories enable the creation of future memories. We introduce MemQ, which applies TD($λ$) eligibility traces to memory Q-values, propagating credit backward through a provenance DAG that records which memories were retrieved when each new memory was created. Credit weight decays as $(γλ)^{d}$ with DAG depth $d$, replacing temporal distance with structural proximity. We formalize the setting as an Exogenous-Context MDP, whose factored transition decouples the exogenous task stream from the endogenous memory store. Across six benchmarks, spanning OS interaction, function calling, code generation, multimodal reasoning, embodied reasoning, and…
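To make the $(γλ)^{d}$ decay concrete, the following is a minimal sketch of depth-weighted credit propagation over a provenance DAG. It is not the authors' implementation: the function name `propagate_credit`, the `parents` dictionary layout, the learning rate `alpha`, and the choice of shortest-path depth when a memory is reachable by several routes are all assumptions made for illustration.

```python
from collections import deque


def propagate_credit(parents, q_values, source, td_error,
                     alpha=0.1, gamma=0.9, lam=0.8):
    """Spread one TD error from `source` back to its provenance ancestors.

    parents:  dict mapping a memory id to the list of memory ids that were
              retrieved when that memory was created (hypothetical layout).
    q_values: dict mapping memory id -> Q-value, updated in place.
    Returns the depth assigned to every memory that received credit.
    """
    # Breadth-first search assigns each ancestor its shortest DAG depth.
    # Using the shortest path is an assumption of this sketch.
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in depth:
                depth[parent] = depth[node] + 1
                queue.append(parent)

    # Credit weight decays as (gamma * lam) ** d with DAG depth d.
    for node, d in depth.items():
        weight = (gamma * lam) ** d
        q_values[node] = q_values.get(node, 0.0) + alpha * weight * td_error
    return depth


# Hypothetical chain: memory "c" was created using "b", which used "a".
chain = {"c": ["b"], "b": ["a"]}
q = {}
depths = propagate_credit(chain, q, "c", td_error=1.0,
                          alpha=1.0, gamma=0.5, lam=0.5)
print(depths, q)
```

With `gamma * lam = 0.25`, the memory one step upstream receives a quarter of the update and the memory two steps upstream receives a sixteenth, regardless of how much wall-clock time separated their creation.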

Code & Models

Repositories

jwliao-ai/MemQ (GitHub)
