Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory

Boqin Yuan; Yue Su; Kun Yao

arXiv:2603.02473·cs.AI·April 14, 2026

TL;DR

This paper introduces a diagnostic framework for analyzing memory in LLM agents. On LoCoMo, the retrieval method affects accuracy far more than the write strategy, and raw chunk storage matches or outperforms more expensive lossy alternatives.

Contribution

A systematic 3x3 analysis that crosses three write strategies with three retrieval methods and shows that the retrieval method dominates LLM agent memory performance.

Findings

01

Retrieval method is the dominant factor: average accuracy on LoCoMo spans 20 points across retrieval methods (57.1% to 77.2%), versus only 3-8 points across write strategies.

02

Raw chunked storage, which requires zero LLM calls, performs as well as or better than the lossy fact-extraction and summarization alternatives.

03

Performance issues mostly arise during retrieval rather than utilization.
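
The retrieval-versus-utilization distinction in the third finding can be illustrated with a small sketch. This is not the paper's diagnostic code. It assumes a simple, hypothetical taxonomy: a wrong answer counts as a retrieval failure when the needed evidence never reached the context, and as a utilization failure when the evidence was retrieved but the answer is still wrong. The function names and the toy records are invented for illustration.

```python
from collections import Counter


def attribute_failure(answer_correct: bool, evidence_retrieved: bool) -> str:
    """Label one QA example by where it went wrong (hypothetical taxonomy)."""
    if answer_correct:
        return "correct"
    # Wrong answer and the needed evidence was never in context: blame retrieval.
    if not evidence_retrieved:
        return "retrieval_failure"
    # Evidence was in context but the answer is still wrong: blame utilization.
    return "utilization_failure"


def summarize(examples):
    """Count labels over (answer_correct, evidence_retrieved) pairs."""
    return Counter(attribute_failure(c, r) for c, r in examples)


# Synthetic toy records, not results from the paper.
toy = [(True, True), (False, False), (False, False), (False, True)]
counts = summarize(toy)
print(counts)
```

Under this assumed taxonomy, a pipeline whose errors fall mostly in the retrieval bucket would benefit more from a better retriever than from a stronger answering model.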

Abstract

Memory-augmented LLM agents store and retrieve information from prior interactions, yet the relative importance of how memories are written versus how they are retrieved remains unclear. We introduce a diagnostic framework that analyzes how performance differences manifest across write strategies, retrieval methods, and memory utilization behavior, and apply it to a 3x3 study crossing three write strategies (raw chunks, Mem0-style fact extraction, MemGPT-style summarization) with three retrieval methods (cosine, BM25, hybrid reranking). On LoCoMo, retrieval method is the dominant factor: average accuracy spans 20 points across retrieval methods (57.1% to 77.2%) but only 3-8 points across write strategies. Raw chunked storage, which requires zero LLM calls, matches or outperforms expensive lossy alternatives, suggesting that current memory pipelines may discard useful context that…
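
To make the retrieval axis of the study concrete, here is a minimal sketch of three ways to rank raw stored chunks against a query. It is an illustration under stated assumptions, not the authors' implementation. Cosine similarity is computed over term-count vectors as a stand-in for dense embeddings, BM25 uses the common Okapi formulation with assumed defaults k1=1.5 and b=0.75, and the hybrid step uses reciprocal rank fusion, which is a swapped-in technique and not necessarily the hybrid reranking used in the paper. The chunks and query are invented toy data.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def cosine_scores(query, chunks):
    """Cosine similarity over term-count vectors (stand-in for embeddings)."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    scores = []
    for chunk in chunks:
        d = Counter(tokenize(chunk))
        d_norm = math.sqrt(sum(v * v for v in d.values()))
        dot = sum(q[t] * d[t] for t in q)
        scores.append(dot / (q_norm * d_norm) if q_norm and d_norm else 0.0)
    return scores


def bm25_scores(query, chunks, k1=1.5, b=0.75):
    """Okapi BM25 with assumed default parameters."""
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in set(tokenize(query)):
            if t not in tf:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores


def ranking(scores):
    """Chunk indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def hybrid_ranking(query, chunks, k=60):
    """Reciprocal rank fusion of the cosine and BM25 rankings."""
    fused = [0.0] * len(chunks)
    for scores in (cosine_scores(query, chunks), bm25_scores(query, chunks)):
        for rank, i in enumerate(ranking(scores)):
            fused[i] += 1.0 / (k + rank + 1)
    return ranking(fused)


# Toy raw chunks, stored verbatim with no LLM calls at write time.
chunks = [
    "alice adopted a dog named biscuit last spring",
    "bob talked about his new job at the bakery",
    "alice and bob planned a hiking trip for june",
]
query = "what is the name of the dog alice adopted"

print(ranking(cosine_scores(query, chunks)))
print(ranking(bm25_scores(query, chunks)))
print(hybrid_ranking(query, chunks))
```

Because the chunks are stored verbatim, swapping the ranking function is the only change needed to compare retrieval methods over the same memory store.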

Code & Models

Repositories

boqiny/memory-probe (GitHub)