Security Considerations for Artificial Intelligence Agents

Ninghui Li; Kaiyuan Zhang; Kyle Polley; Jerry Ma

arXiv:2603.12230 · cs.LG · April 7, 2026

TL;DR

This paper describes security challenges for frontier AI agents and offers recommendations, covering attack surfaces, layered defenses, and standards intended to improve safety and robustness in real-world deployments.

Contribution

The paper analyzes security risks, attack vectors, and defense strategies for general-purpose AI agents, drawing on Perplexity's experience operating such systems.

Findings

01. Identifies key attack surfaces, such as prompt injection and cascading failures.

02. Evaluates layered defenses, including input-level mitigation and sandboxing.

03. Highlights research gaps in security standards and policy models.
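
As an illustration of one layer in such a defense stack, the following is a minimal sketch of a policy check that guards against confused-deputy behavior: a sensitive tool call is denied when it was triggered by retrieved content rather than by the user. This is not the paper's implementation. The names `ToolCall`, `SENSITIVE_TOOLS`, and `authorize`, and the tool names themselves, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical set of tools with side effects; not taken from the paper.
SENSITIVE_TOOLS = {"send_email", "delete_file"}


@dataclass(frozen=True)
class ToolCall:
    tool: str
    # True when the user asked for this action directly; False when the
    # request was derived from untrusted content (web page, email, file).
    user_initiated: bool


def authorize(call: ToolCall) -> bool:
    """Allow a call unless it is sensitive and came from untrusted content."""
    if call.tool in SENSITIVE_TOOLS and not call.user_initiated:
        return False
    return True


if __name__ == "__main__":
    # An instruction embedded in a fetched web page tries to send email.
    print(authorize(ToolCall("send_email", user_initiated=False)))
    # The user explicitly asks the agent to send email.
    print(authorize(ToolCall("send_email", user_initiated=True)))
```

A real deployment would need a reliable way to track provenance through the agent's context, which this sketch assumes is already available as a single boolean.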

Abstract

This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations concerning the security of frontier AI agents. These insights are informed by Perplexity's experience operating general-purpose agentic systems used by millions of users and thousands of enterprises in both controlled and open-world environments. Agent architectures change core assumptions around code-data separation, authority boundaries, and execution predictability, creating new confidentiality, integrity, and availability failure modes. We map principal attack surfaces across tools, connectors, hosting boundaries, and multi-agent coordination, with particular emphasis on indirect prompt injection, confused-deputy behavior, and cascading failures in long-running workflows. We then assess current defenses as a layered stack:…
