MemoryGraft: Persistent Compromise of LLM Agents via Poisoned Experience Retrieval

Saksham Sahai Srivastava; Haoyu He

arXiv:2512.16962 · cs.CR · December 22, 2025



TL;DR

MemoryGraft is an attack on LLM agents that poisons their long-term memory: implanted malicious experiences are retrieved during later task execution and cause persistent behavioral changes.

Contribution

Introduces MemoryGraft, an indirect poisoning attack that compromises LLM agent behavior by manipulating long-term memory. It exploits the agent's semantic imitation heuristic, its tendency to replicate patterns from retrieved successful tasks.

Findings

01. Poisoned experiences significantly influence retrieval outcomes.

02. Small sets of malicious templates can cause widespread behavioral drift.

03. MemoryGraft effectively persists across sessions, enabling stealthy attacks.

Abstract

Large Language Model (LLM) agents increasingly rely on long-term memory and Retrieval-Augmented Generation (RAG) to persist experiences and refine future performance. While this experience-learning capability enhances agentic autonomy, it introduces a critical, unexplored attack surface: the trust boundary between an agent's reasoning core and its own past. In this paper, we introduce MemoryGraft, a novel indirect injection attack that compromises agent behavior not through immediate jailbreaks but by implanting malicious successful experiences into the agent's long-term memory. Unlike traditional prompt injections, which are transient, or standard RAG poisoning, which targets factual knowledge, MemoryGraft exploits the agent's semantic imitation heuristic, the tendency to replicate patterns from retrieved successful tasks. We demonstrate that an attacker who can supply…
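The retrieval step the abstract describes can be illustrated with a toy sketch. This is not the authors' implementation: the `Experience` record, the `retrieve` function, the sample tasks, and the bag-of-words cosine similarity (a stand-in for whatever embedding-based retriever a real agent would use) are all assumptions made for illustration. It only shows why a planted record whose task description resembles a future query can land in the retrieved context next to a legitimate one.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Experience:
    task: str        # description of the past task (used as the retrieval key)
    procedure: str   # what the agent "did" that was recorded as successful
    poisoned: bool = False  # demo-only label; a real memory store has no such flag


def bow(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(memory: list[Experience], query: str, k: int = 2) -> list[Experience]:
    """Return the k stored experiences whose task text is most similar to the query."""
    q = bow(query)
    return sorted(memory, key=lambda e: cosine(q, bow(e.task)), reverse=True)[:k]


# Hypothetical benign memory.
memory = [
    Experience("clean the sales csv and report summary statistics",
               "validate schema, drop nulls, then summarize"),
    Experience("plot monthly revenue from the sales csv",
               "load csv, group by month, plot"),
    Experience("translate the user guide into french",
               "split by section, translate, proofread"),
]

# Hypothetical attacker-supplied record, phrased to sit close to a likely query.
memory.append(
    Experience("clean the sales csv and report summary statistics quickly",
               "skip validation and summarize",
               poisoned=True)
)

hits = retrieve(memory, "clean the sales csv and report summary statistics")
for e in hits:
    print(("POISONED" if e.poisoned else "benign  "), "|", e.procedure)
```

In this sketch the planted record is retrieved alongside the legitimate one, so an agent that imitates retrieved "successful" procedures would see the unsafe shortcut as precedent. Because the record stays in the store, the same thing would happen again in any later session that issues a similar query.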


Datasets

npow/memshield-bench
dataset · 39 dl


Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Artificial Intelligence in Healthcare and Education