SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration
Zihan Guo, Zhiyu Chen, Xiaohang Nie, Jianghao Lin, Yuanjian Zhou, Weinan Zhang

TL;DR
SkillProbe is a multi-agent security auditing framework for emerging LLM agent skill marketplaces. It targets behavioral and combinatorial risks through a standardized, collaborative auditing pipeline, and its evaluation reveals systemic security issues in popular skills.
Contribution
We introduce SkillProbe, a novel multi-stage, multi-agent security auditing system utilizing a standardized 'Skills-for-Skills' paradigm to detect behavioral and combinatorial risks in skill marketplaces.
Findings
Over 90% of high-popularity skills failed security audits.
High-risk skills form a giant connected component, indicating systemic risks.
Download volume is unreliable as a proxy for skill security quality.
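To make the "giant connected component" finding concrete, the following is a minimal sketch of how the share of skills in the largest connected component of a risk graph could be measured. The graph construction is an assumption: the summary does not say how the authors link skills, so the node and edge lists here are purely hypothetical inputs, and the function name is illustrative.

```python
from collections import defaultdict


def largest_component_fraction(nodes, edges):
    """Return the share of nodes that sit in the largest connected component.

    `nodes` is a list of skill identifiers and `edges` a list of undirected
    (a, b) pairs between them. Both are hypothetical inputs for illustration.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    largest = 0
    for start in nodes:
        if start in seen:
            continue
        # Depth-first traversal of the component containing `start`.
        stack = [start]
        seen.add(start)
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        largest = max(largest, size)

    return largest / len(nodes) if nodes else 0.0


if __name__ == "__main__":
    # Made-up example graph: three linked skills and a separate pair.
    skills = ["a", "b", "c", "d", "e"]
    links = [("a", "b"), ("b", "c"), ("d", "e")]
    print(largest_component_fraction(skills, links))
```

A fraction close to 1 would mean most high-risk skills are reachable from one another, which is the sense in which such a component points to systemic rather than isolated risk.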
Abstract
With the rapid evolution of Large Language Model (LLM) agent ecosystems, centralized skill marketplaces have emerged as pivotal infrastructure for augmenting agent capabilities. However, these marketplaces face unprecedented security challenges, primarily stemming from semantic-behavioral inconsistency and inter-skill combinatorial risks, where individually benign skills induce malicious behaviors during collaborative invocation. To address these vulnerabilities, we propose SkillProbe, a multi-stage security auditing framework driven by multi-agent collaboration. SkillProbe introduces a "Skills-for-Skills" design paradigm, encapsulating auditing processes into standardized skill modules to drive specialized agents through a rigorous pipeline, including admission filtering, semantic-behavioral alignment detection, and combinatorial risk simulation. We conducted a large-scale evaluation…
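The three stages named in the abstract (admission filtering, semantic-behavioral alignment detection, combinatorial risk simulation) can be pictured with a toy pipeline. This is a sketch under stated assumptions, not the authors' implementation: the `Skill` fields, the `RISKY_PAIRS` rule, and every function name below are hypothetical, and simple set comparisons stand in for the agent-driven analysis the paper describes.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed example rule: two actions that are harmless apart but risky together.
RISKY_PAIRS = [frozenset({"read_credentials", "send_network"})]


@dataclass
class Skill:
    name: str
    description: str              # what the skill claims to do
    declared_actions: List[str]   # actions it says it performs
    observed_actions: List[str]   # actions seen in a (hypothetical) trace


@dataclass
class AuditReport:
    skill: str
    passed: bool
    findings: List[str] = field(default_factory=list)


def admission_filter(skill):
    """Stage 1: reject skills that lack basic metadata."""
    if skill.description.strip() and skill.declared_actions:
        return []
    return ["missing description or declared actions"]


def alignment_check(skill):
    """Stage 2: flag observed behavior that was never declared."""
    undeclared = sorted(set(skill.observed_actions) - set(skill.declared_actions))
    return [f"undeclared action: {action}" for action in undeclared]


def combination_check(skill, others):
    """Stage 3: flag risky action pairs that only arise across two skills."""
    findings = []
    own = set(skill.observed_actions)
    for other in others:
        theirs = set(other.observed_actions)
        for pair in RISKY_PAIRS:
            arises_jointly = pair <= (own | theirs)
            present_alone = pair <= own or pair <= theirs
            if arises_jointly and not present_alone:
                findings.append(f"risky combination with {other.name}: {sorted(pair)}")
    return findings


def audit(skill, others):
    """Run the stages in order; later stages run only for admitted skills."""
    findings = admission_filter(skill)
    if not findings:
        findings = alignment_check(skill) + combination_check(skill, others)
    return AuditReport(skill.name, not findings, findings)
```

In this toy setup, a skill that only reads credentials and a skill that only sends network traffic each pass when audited alone, while auditing one with the other installed yields a combination finding. That mirrors the abstract's point that individually benign skills can become risky during collaborative invocation.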
Taxonomy
Topics: Spam and Phishing Detection · Advanced Malware Detection Techniques · Access Control and Trust
