Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward
Renjun Xu, Yang Yan

TL;DR
This survey explores the rapidly evolving landscape of agent skills in large language models, covering architecture, acquisition, deployment, security, and future challenges to enable trustworthy, modular, and self-improving AI agents.
Contribution
It provides a comprehensive overview of the emerging agent skills paradigm, formalizes key concepts, and proposes a research agenda for secure and trustworthy skill ecosystems.
Findings
26.1% of community-contributed skills have vulnerabilities
Progress in benchmark tasks like OSWorld and SWE-bench
Introduction of the Skill Trust and Lifecycle Governance Framework
Abstract
The transition from monolithic language models to modular, skill-equipped agents marks a defining shift in how large language models (LLMs) are deployed in practice. Rather than encoding all procedural knowledge within model weights, agent skills -- composable packages of instructions, code, and resources that agents load on demand -- enable dynamic capability extension without retraining. This approach is formalized in a paradigm of progressive disclosure, portable skill definitions, and integration with the Model Context Protocol (MCP). This survey provides a comprehensive treatment of the agent skills landscape as it has rapidly evolved over the last few months. We organize the field along four axes: (i) architectural foundations, examining the SKILL.md specification, progressive context loading, and the complementary roles of skills and MCP; (ii) skill acquisition, covering reinforcement…
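To make the progressive-disclosure idea in the abstract concrete, here is a minimal Python sketch that is not taken from the paper. It assumes a hypothetical in-memory skill record with a short description that is always visible to the agent and a full instruction body that is loaded only when the skill is selected. The names `Skill`, `SkillRegistry`, `catalog`, and `load`, as well as the two example skills, are illustrative assumptions and are not part of any skill specification or of MCP.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical skill package: cheap metadata plus a larger body."""
    name: str
    description: str  # short summary, always shown to the agent
    body: str         # full instructions, loaded only on demand


@dataclass
class SkillRegistry:
    """Illustrative registry; not an API from the paper or from MCP."""
    skills: dict = field(default_factory=dict)
    loaded: set = field(default_factory=set)

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def catalog(self) -> str:
        """First level: only names and descriptions enter the context."""
        return "\n".join(
            f"{s.name}: {s.description}" for s in self.skills.values()
        )

    def load(self, name: str) -> str:
        """Second level: the full body enters the context when requested."""
        self.loaded.add(name)
        return self.skills[name].body


registry = SkillRegistry()
registry.register(Skill("pdf-fill", "Fill PDF forms.", "Step 1: open the form..."))
registry.register(Skill("csv-clean", "Clean CSV files.", "Step 1: detect the delimiter..."))

context = registry.catalog()                 # small: descriptions only
context += "\n" + registry.load("pdf-fill")  # body added once the skill is selected
```

Under these assumptions, the context cost grows with the skills actually used rather than with the number of skills installed, which is the property the abstract attributes to on-demand loading.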
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Adversarial Robustness in Machine Learning · Software System Performance and Reliability
