A Safety and Security Framework for Real-World Agentic Systems
Shaona Ghosh, Barnaby Simkin, Kyriacos Shiarlis, Soumili Nandi, Dan Zhao, Matthew Fiedler, Julia Bazinska, Nikki Pope, Roopa Prabhu, Daniel Rohrer, Michael Demoret, Bartley Richardson

TL;DR
This paper presents a comprehensive, dynamic framework for ensuring safety and security in agentic AI systems, emphasizing risk management through AI-driven red teaming and human oversight in complex enterprise environments.
Contribution
It introduces a novel agentic risk taxonomy and a dynamic safety framework that integrates auxiliary AI models and human oversight for risk discovery and mitigation.
Findings
Effectively identified novel agentic risks through sandboxed red teaming.
Demonstrated safety evaluation on NVIDIA's AI-Q Research Assistant.
Released a dataset with 10,000+ attack and defense traces for agentic workflows.
Abstract
This paper introduces a dynamic and actionable framework for securing agentic AI systems in enterprise deployment. We contend that safety and security are not merely fixed attributes of individual models but also emergent properties arising from the dynamic interactions among models, orchestrators, tools, and data within their operating environments. We propose a new way of identifying novel agentic risks through the lens of user safety. Although safety and security are clearly separated for traditional LLMs and agentic models in isolation, they appear to be connected when viewed through the lens of safety in agentic systems. Building on this foundation, we define an operational agentic risk taxonomy that unifies traditional safety and security concerns with novel, uniquely agentic risks, including tool misuse, cascading action chains, and unintended control amplification, among others. At…
Taxonomy
Topics: Information and Cyber Security · Access Control and Trust · Multi-Agent Systems and Negotiation
