TL;DR
AgentDyn is a new benchmark of 60 dynamic, open-ended tasks for evaluating agent security defenses in realistic environments; it reveals the limitations of current defenses.
Contribution
This work introduces AgentDyn, a comprehensive benchmark with dynamic tasks and helpful instructions, addressing flaws in existing static agent security benchmarks.
Findings
Most existing defenses are either insecure or over-conservative.
Current benchmarks do not reflect real-world dynamic environments.
AgentDyn exposes weaknesses in state-of-the-art defenses.
Abstract
AI agents that autonomously interact with external tools and environments have shown great promise across real-world applications. However, their reliance on external data exposes them to serious indirect prompt injection attacks, where malicious instructions embedded in third-party content hijack agent behaviors. To mitigate this threat, a growing number of defenses have been proposed and evaluated under existing agent security benchmarks. These benchmarks provide structured environments for comparing attacks and defenses, and have become a key driver for defense design and optimization. However, as agents move toward more complex and open-ended real-world deployments, there is a pressing need for benchmarks to become more adaptive and better reflect the dynamic environments faced by real-world agentic systems. In this work, we reveal three fundamental flaws in the current benchmarks…
