A Survey on Autonomy-Induced Security Risks in Large Model-Based Agents
Hang Su, Jun Luo, Chang Liu, Xiao Yang, Yichi Zhang, Yinpeng Dong, Jun Zhu

TL;DR
This survey reviews the security risks introduced by autonomous large-model agents, analyzes their vulnerabilities, and discusses defense strategies and a new risk-aware architecture to enhance safety.
Contribution
The survey systematically categorizes security risks in large-model agents, analyzes their architectural vulnerabilities, and proposes the R2A2 framework for proactive safety in autonomous AI.
Findings
Identification of novel security failure modes in autonomous agents, such as memory poisoning, tool misuse, reward hacking, and emergent misalignment
Analysis of vulnerabilities across perception, cognition, and action modules
Introduction of the R2A2 architecture for risk-aware decision-making
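To make the idea of risk-aware decision-making concrete, here is a minimal sketch in Python. It is not the R2A2 architecture, whose details this summary does not give. It only illustrates the general pattern of gating candidate actions by an estimated risk before optimizing for reward. The names `Candidate`, `choose_action`, and `risk_budget`, along with all of the scores, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """A hypothetical action proposal with illustrative scores."""
    name: str
    reward: float  # assumed estimate of task benefit
    risk: float    # assumed estimate of harm probability, in [0, 1]


def choose_action(
    candidates: Sequence[Candidate], risk_budget: float
) -> Optional[Candidate]:
    """Pick the highest-reward candidate whose risk fits the budget.

    Returns None when no candidate is within budget, which the caller
    can treat as a refusal or an escalation to a human.
    """
    within_budget = [c for c in candidates if c.risk <= risk_budget]
    if not within_budget:
        return None
    return max(within_budget, key=lambda c: c.reward)


# Made-up proposals: the highest-reward action is also the riskiest.
candidates = [
    Candidate("read_file", reward=0.4, risk=0.05),
    Candidate("send_email", reward=0.7, risk=0.30),
    Candidate("delete_repo", reward=0.9, risk=0.95),
]
chosen = choose_action(candidates, risk_budget=0.35)
```

Under these made-up scores, the agent skips the highest-reward action because its risk exceeds the budget, and it declines to act at all if the budget is tightened below every candidate's risk.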
Abstract
Recent advances in large language models (LLMs) have catalyzed the rise of autonomous AI agents capable of perceiving, reasoning, and acting in dynamic, open-ended environments. These large-model agents mark a paradigm shift from static inference systems to interactive, memory-augmented entities. While these capabilities significantly expand the functional scope of AI, they also introduce qualitatively novel security risks - such as memory poisoning, tool misuse, reward hacking, and emergent misalignment - that extend beyond the threat models of conventional systems or standalone LLMs. In this survey, we first examine the structural foundations and key capabilities that underpin increasing levels of agent autonomy, including long-term memory retention, modular tool use, recursive planning, and reflective reasoning. We then analyze the corresponding security vulnerabilities across the…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Software Engineering Methodologies · Explainable Artificial Intelligence (XAI)
