Caging the Agents: A Zero Trust Security Architecture for Autonomous AI in Healthcare
Saikat Maiti

TL;DR
This paper introduces a comprehensive zero trust security architecture for autonomous AI agents in healthcare, addressing critical vulnerabilities and demonstrating effective defense over 90 days of deployment.
Contribution
It develops a six-domain threat model and implements a four-layer defense system, including kernel isolation, credential protection, network policies, and prompt integrity, tailored for healthcare AI agents.
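As an illustration of what the network-policy layer might look like in practice, the following is a minimal sketch of a default-deny egress check. It is not taken from the paper: the allowlist contents, the host names, and the function name `egress_allowed` are all hypothetical, and the paper's actual policy mechanism is not described in this summary.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of (host -> permitted ports). These hosts are
# placeholders for illustration, not values from the paper.
ALLOWED_EGRESS = {
    "api.internal.example": {443},
    "db.internal.example": {5432},
}

# Default ports used when a URL does not specify one explicitly.
DEFAULT_PORTS = {"https": 443, "http": 80}


def egress_allowed(url: str) -> bool:
    """Default-deny check: allow only explicitly listed host/port pairs."""
    parsed = urlparse(url)
    host = parsed.hostname
    if host is None:
        return False
    try:
        port = parsed.port or DEFAULT_PORTS.get(parsed.scheme)
    except ValueError:
        # Malformed or out-of-range port in the URL: deny.
        return False
    return port in ALLOWED_EGRESS.get(host, set())
```

Under these assumptions, an agent's outbound request is rejected unless both the host and the port appear in the allowlist, so an unknown destination or an unexpected port on a known host is denied by default.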
Findings
Discovered and remediated four high-severity vulnerabilities.
Achieved progressive fleet hardening over three VM image generations.
Mapped defense coverage to all eleven recent attack patterns.
Abstract
Autonomous AI agents powered by large language models are being deployed in production with capabilities including shell execution, file system access, database queries, and multi-party communication. Recent red teaming research demonstrates that these agents exhibit critical vulnerabilities in realistic settings: unauthorized compliance with non-owner instructions, sensitive information disclosure, identity spoofing, cross-agent propagation of unsafe practices, and indirect prompt injection through external resources [7]. In healthcare environments processing Protected Health Information, every such vulnerability becomes a potential HIPAA violation. This paper presents a security architecture deployed for nine autonomous AI agents in production at a healthcare technology company. We develop a six-domain threat model for agentic AI in healthcare covering credential exposure, execution…
Taxonomy
Topics: Security and Verification in Computing · Adversarial Robustness in Machine Learning · Access Control and Trust
