Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory
Haozhen Zhang, Haodong Yue, Tao Feng, Quanyu Long, Jianzhu Bao, Bowen Jin, Weizhi Zhang, Xiao Li, Jiaxuan You, Chengwei Qin, Wenya Wang

TL;DR
BudgetMem is a runtime, query-aware memory management framework for LLM agents that balances performance and cost through tiered memory modules and reinforcement learning-based routing.
Contribution
It introduces a novel, explicit control mechanism for memory resource allocation in LLM agents using tiered modules and a neural policy trained with reinforcement learning.
Findings
BudgetMem outperforms strong baselines when high performance is prioritized.
It achieves better accuracy-cost trade-offs under tight budgets.
Analysis clarifies when different tiering strategies are most effective.
Abstract
Memory is increasingly central to Large Language Model (LLM) agents operating beyond a single context window, yet most existing systems rely on offline, query-agnostic memory construction that can be inefficient and may discard query-critical information. Although runtime memory utilization is a natural alternative, prior work often incurs substantial overhead and offers limited explicit control over the performance-cost trade-off. In this work, we present BudgetMem, a runtime agent memory framework for explicit, query-aware performance-cost control. BudgetMem structures memory processing as a set of memory modules, each offered in three budget tiers (Low, Mid, and High). A lightweight router performs budget-tier routing across modules to balance task performance and memory construction cost. The router is implemented as a compact neural policy trained…
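The routing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the module names (`retrieve`, `summarize`), the tier costs, the linear softmax scorer standing in for the "compact neural policy", and the reward shape (task score minus weighted cost) are all assumptions made for illustration. The abstract says only that a router picks one of three tiers per module and is trained to balance performance against cost.

```python
import math
from dataclasses import dataclass

TIERS = ("LOW", "MID", "HIGH")

# Hypothetical relative costs per tier; the source gives no numbers.
TIER_COST = {"LOW": 1.0, "MID": 3.0, "HIGH": 9.0}


@dataclass
class TierRouter:
    """Toy stand-in for a learned router: one linear scorer per (module, tier)."""

    # weights[module][tier] is a list of floats, one per query feature.
    weights: dict

    def tier_probs(self, module, features):
        """Softmax over the three tier logits for one module."""
        logits = [
            sum(w * x for w, x in zip(self.weights[module][tier], features))
            for tier in TIERS
        ]
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(logit - top) for logit in logits]
        total = sum(exps)
        return {tier: e / total for tier, e in zip(TIERS, exps)}

    def route(self, features):
        """Greedy routing: pick the most probable tier for every module."""
        plan = {}
        for module in self.weights:
            probs = self.tier_probs(module, features)
            plan[module] = max(TIERS, key=lambda tier: probs[tier])
        return plan


def plan_cost(plan):
    """Total cost of a routing plan under the assumed tier costs."""
    return sum(TIER_COST[tier] for tier in plan.values())


def reward(task_score, plan, cost_weight):
    """Assumed reward shape: task score penalized by weighted memory cost."""
    return task_score - cost_weight * plan_cost(plan)


if __name__ == "__main__":
    router = TierRouter(
        weights={
            "retrieve": {"LOW": [1.0, 0.0], "MID": [0.0, 0.0], "HIGH": [0.0, 1.0]},
            "summarize": {"LOW": [0.5, 0.5], "MID": [0.0, 0.0], "HIGH": [1.0, 0.0]},
        }
    )
    query_features = [0.2, 0.9]  # hypothetical query embedding
    plan = router.route(query_features)
    print(plan, plan_cost(plan), reward(0.8, plan, cost_weight=0.01))
```

In this sketch, raising `cost_weight` makes expensive tiers less attractive during training, which is one plausible way to expose the performance-cost trade-off as an explicit knob. A reinforcement learning setup would sample tiers from `tier_probs` rather than taking the greedy choice.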
Taxonomy
Topics: Natural Language Processing Techniques · Graph Theory and Algorithms · Topic Modeling
