AgentSys: Secure and Dynamic LLM Agents Through Explicit Hierarchical Memory Management
Ruoyao Wen, Hao Li, Chaowei Xiao, Ning Zhang

TL;DR
AgentSys introduces a hierarchical memory management framework for LLM agents that isolates tool interactions and validates data, significantly reducing vulnerabilities to prompt injection attacks while maintaining utility.
Contribution
It proposes a novel hierarchical memory architecture inspired by OS process isolation, enhancing security against prompt injection in LLM agents.
Findings
Isolation reduces attack success to 2.19%
Validator further improves defense with event-triggered checks
Maintains high utility and robustness across models
Abstract
Indirect prompt injection threatens LLM agents by embedding malicious instructions in external content, enabling unauthorized actions and data theft. LLM agents maintain working memory through their context window, which stores interaction history for decision-making. Conventional agents indiscriminately accumulate all tool outputs and reasoning traces in this memory, creating two critical vulnerabilities: (1) injected instructions persist throughout the workflow, granting attackers multiple opportunities to manipulate behavior, and (2) verbose, non-essential content degrades decision-making capabilities. Existing defenses treat bloated memory as given and focus on remaining resilient, rather than reducing unnecessary accumulation to prevent the attack. We present AgentSys, a framework that defends against indirect prompt injection through explicit memory management. Inspired by…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSecurity and Verification in Computing · Digital and Cyber Forensics · Advanced Data Storage Technologies
