TL;DR
This paper introduces Causal Memory Intervention (CMI), a novel method for selecting relevant memories in long-horizon LLM agents by estimating their causal impact on task performance, improving robustness and answer quality.
Contribution
It proposes a causal memory-selection technique and a causally annotated benchmark to evaluate memory relevance beyond semantic similarity.
Findings
CMI outperforms baseline memory methods in answer quality and robustness.
The benchmark enables evaluation of memory relevance and causal usefulness.
Selecting memories by their estimated causal impact, rather than by semantic similarity alone, improves long-horizon LLM agent performance.
Abstract
Long-horizon LLM agents rely on persistent memory to support interactions across sessions, yet existing memory systems often retrieve context using semantic similarity or broad history inclusion, treating retrieved memories as uniformly useful. This assumption is fragile because memories may be topically related while remaining irrelevant, stale, or misleading. We propose Causal Memory Intervention (CMI), a causal memory-selection technique that estimates how candidate memories affect the model's answer under controlled interventions, selecting memories that improve task performance while suppressing unstable, irrelevant, or harmful ones. To evaluate this setting, we introduce Causal-LoCoMo, a causally annotated benchmark derived from long conversational data, where each example contains a user request, a structured memory bank, useful memories, irrelevant distractors, and synthetic…
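To illustrate intervention-based memory selection, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes a simple leave-one-out intervention, in which each memory is removed in turn and the change in a task score is treated as that memory's estimated effect. The names `select_memories`, `score_fn`, `threshold`, and `toy_score` are hypothetical, and the toy scorer stands in for running the model and grading its answer.

```python
from typing import Callable, List, Sequence


def select_memories(
    memories: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    threshold: float = 0.0,
) -> List[str]:
    """Keep memories whose removal lowers the task score by more than `threshold`.

    Assumption: a leave-one-out ablation is used as the intervention.
    The source does not specify the intervention scheme.
    """
    full_score = score_fn(memories)
    kept = []
    for i, memory in enumerate(memories):
        # Intervention: drop memory i and re-score with the remaining context.
        ablated = [m for j, m in enumerate(memories) if j != i]
        effect = full_score - score_fn(ablated)
        if effect > threshold:
            kept.append(memory)
    return kept


def toy_score(context: Sequence[str]) -> float:
    """Hypothetical scorer: rewards the up-to-date fact, penalizes the stale one."""
    score = 0.0
    if "User moved to Berlin in 2023" in context:
        score += 1.0
    if "User lives in Paris" in context:
        score -= 0.5
    return score


memory_bank = [
    "User moved to Berlin in 2023",  # useful
    "User lives in Paris",           # stale and misleading
    "User likes jazz",               # topically unrelated distractor
]
selected = select_memories(memory_bank, toy_score)
```

In this toy setup, the stale memory has a negative estimated effect and the distractor has zero effect, so only the useful memory passes the threshold. A semantic-similarity retriever could rank the stale "Paris" memory highly for a location question, which is the failure mode the abstract describes.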
