PRISM: Pareto-Efficient Retrieval over Intent-Aware Structured Memory for Long-Horizon Agents

Jingyi Peng; Zhongwei Wan; Weiting Liu; Qiuzhuang Sun

arXiv:2605.12260·cs.CL·May 13, 2026

TL;DR

PRISM is a training-free retrieval framework for long-horizon language agents. It combines graph-structured retrieval, query-sensitive traversal, and evidence compression to raise answer accuracy while shrinking the context that must be served.

Contribution

PRISM introduces a training-free, retrieval-side approach that treats long-horizon memory management as a joint retrieval-and-compression problem over a graph-structured memory, targeting both answer accuracy and serving cost.

Findings

01

PRISM outperforms baselines in accuracy on the LoCoMo benchmark.

02

PRISM achieves higher accuracy with significantly smaller context budgets.

03

PRISM demonstrates a superior balance between answer quality and retrieval efficiency.

Abstract

Long-horizon language agents accumulate conversation history far faster than any fixed context window can hold, making memory management critical to both answer accuracy and serving cost. Existing approaches either expand the context window without addressing what is retrieved, perform heavy ingestion-time fact extraction at substantial token cost, or rely on heuristic graph traversal that leaves both accuracy and efficiency on the table. We present PRISM, a training-free retrieval-side framework that treats long-horizon memory as a joint retrieval-and-compression problem over a graph-structured memory. PRISM combines four orthogonal inference-time components: Hierarchical Bundle Search over typed relation paths, Query-Sensitive Edge Costing that aligns traversal with detected query intent, Evidence Compression that compresses the candidate bundle into a compact answer-side context, and…
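
To make the idea of intent-dependent traversal cost concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a toy keyword-based intent detector, a hand-written cost table (`EDGE_COST`), and a plain cheapest-first (Dijkstra-style) expansion standing in for the paper's bundle search; every function name, relation type, and cost value here is hypothetical and chosen only for illustration.

```python
import heapq

# Hypothetical cost table: intent -> relation type -> traversal cost.
# The relation names and values are illustrative, not taken from the paper.
EDGE_COST = {
    "temporal": {"before": 0.5, "mentions": 1.0, "causes": 2.0},
    "causal": {"before": 2.0, "mentions": 1.0, "causes": 0.5},
}


def detect_intent(query):
    """Toy keyword-based intent detector (an assumption, not PRISM's)."""
    q = query.lower()
    if any(word in q for word in ("why", "because", "cause")):
        return "causal"
    return "temporal"


def retrieve(graph, seeds, query, budget):
    """Cheapest-first expansion from seed nodes under intent-specific costs.

    graph maps a node to a list of (relation, neighbor) pairs.
    Returns (node, cost) pairs with cost <= budget, cheapest first.
    """
    costs = EDGE_COST[detect_intent(query)]
    best = {}
    heap = [(0.0, seed) for seed in seeds]
    heapq.heapify(heap)
    while heap:
        cost, node = heapq.heappop(heap)
        if node in best:
            continue  # already settled at an equal or lower cost
        best[node] = cost
        for relation, neighbor in graph.get(node, []):
            new_cost = cost + costs.get(relation, 1.0)
            if neighbor not in best and new_cost <= budget:
                heapq.heappush(heap, (new_cost, neighbor))
    return sorted(best.items(), key=lambda item: (item[1], item[0]))


def compress(ranked_nodes, texts, max_items):
    """Crude stand-in for compression: keep the max_items cheapest texts."""
    return [texts[node] for node, _ in ranked_nodes[:max_items]]
```

In this sketch the same memory graph yields different evidence for different questions: a "why" question makes `causes` edges cheap, so the search reaches causally linked nodes within the budget, while a temporal question favors `before` edges instead. The truncation step in `compress` is only a placeholder for whatever compression the paper actually applies.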
