MEMAUDIT: An Exact Package-Oracle Evaluation Protocol for Budgeted Long-Term LLM Memory Writing

Nishant Bhargava; Rodrigo Sobral Barrento

arXiv:2605.02199·cs.AI·May 5, 2026

TL;DR

MEMAUDIT introduces an exact evaluation protocol for budgeted long-term memory writing in LLM agents, enabling precise assessment of memory representations independent of retrieval and reasoning.

Contribution

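It formalizes write-time memory selection as a finite optimization problem and computes its exact optimum with certified solvers, so that a memory-writing strategy can be scored against a known best-achievable value rather than through downstream question-answering accuracy.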

Findings

01

Separates representation quality from retrieval effects

02

Provides certified solvers for memory optimization

03

Enables reproducible evaluation of memory writing strategies

Abstract

Long-term LLM agents must compress streams of past interactions into persistent memory before future queries are known. Existing evaluations usually measure final question-answering accuracy, which entangles memory writing with retrieval, prompting, and reader reasoning. We introduce MEMAUDIT, an exact package-oracle evaluation protocol for budgeted long-term memory writing. A MEMAUDIT package fixes an experience stream, candidate memory representations, storage costs, semantic evidence units, future-query requirements, and a budget, turning write-time memory selection into a finite auditable optimization problem with a certified denominator. We instantiate this protocol with a concave-over-modular semantic coverage objective under storage and one-representation-per-experience constraints, and compute exact package optima using branch-and-bound with MILP certification. Across controlled…
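To make the package idea concrete, here is a minimal Python sketch of a toy package and an exact optimum for it. It is an illustration under stated assumptions, not the authors' implementation: the package layout, the names `coverage`, `exact_optimum`, and `normalized_score`, the square root as the concave function, and the reading of the constraint as "at most one representation per experience" are all hypothetical. Exhaustive enumeration stands in for the branch-and-bound and MILP certification named in the abstract, which is only practical for very small packages.

```python
import math
from itertools import product

# Hypothetical toy package; all field names and numbers are illustrative.
PACKAGE = {
    "budget": 5,
    # Assumed weights of semantic evidence units required by future queries.
    "unit_weights": {"u1": 1.0, "u2": 1.0},
    # One list of candidate representations per experience in the stream.
    "experiences": [
        [
            {"name": "raw", "cost": 4, "covers": {"u1": 4.0}},
            {"name": "summary", "cost": 2, "covers": {"u1": 1.0}},
        ],
        [
            {"name": "raw", "cost": 3, "covers": {"u2": 9.0}},
            {"name": "summary", "cost": 1, "covers": {"u2": 1.0}},
        ],
    ],
}


def coverage(selection, unit_weights):
    """Concave-over-modular objective: a weighted sum, over evidence units,
    of a concave function (sqrt, by assumption) of the additive coverage."""
    totals = {unit: 0.0 for unit in unit_weights}
    for rep in selection:
        for unit, amount in rep["covers"].items():
            totals[unit] += amount
    return sum(w * math.sqrt(totals[u]) for u, w in unit_weights.items())


def exact_optimum(package):
    """Enumerate every feasible selection and return (best value, best selection).
    Each experience contributes at most one representation (None = skip it)."""
    options = [[None] + reps for reps in package["experiences"]]
    best_value, best_selection = 0.0, []
    for choice in product(*options):
        chosen = [rep for rep in choice if rep is not None]
        if sum(rep["cost"] for rep in chosen) > package["budget"]:
            continue  # violates the storage budget
        value = coverage(chosen, package["unit_weights"])
        if value > best_value:
            best_value, best_selection = value, chosen
    return best_value, best_selection


def normalized_score(selection, package):
    """Score a writer's selection against the exact package optimum."""
    optimum, _ = exact_optimum(package)
    return coverage(selection, package["unit_weights"]) / optimum
```

In this toy package, a writer that stores both summaries stays within budget but reaches only part of the exact optimum, and `normalized_score` reports that fraction without involving any retriever or reader model.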
