Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills
Wenhui He, Yue Li, Bang Fu, Huan Xing, Xing Fan, ZeHua Zhang, Baoning Niu

TL;DR
This paper investigates whether the implementation of programmatic skills in LLM ecosystems aligns with their natural-language descriptions, revealing significant inconsistencies that could impact security understanding.
Contribution
The authors introduce SKILLSCOPE, a source-level security property graph-based system for detecting description-implementation inconsistencies in LLM skills.
Findings
Confirmed inconsistency affects 9.4 ext{%} of skills.
SKILLSCOPE achieves 84.8 ext{%} precision and 96.5 ext{%} recall.
Both the security property graph and taxonomy are crucial for accuracy.
Abstract
Programmatic skills in LLM ecosystems consist of a natural-language description and executable implementation files. Users and LLMs rely on the description to understand the skill's scope. However, the implementation may perform security-relevant operations, such as credential access, network communication, or command execution, that the description does not state. We study this description--implementation inconsistency by asking whether the implementation stays within the security-relevant scope declared in the description. We manually analyze 920 real-world programmatic skills and construct an 11-category security property taxonomy. Based on this taxonomy, we build SKILLSCOPE, which constructs source-level security property graphs (SPGs) from implementations and performs LLM-assisted consistency checking. SPG nodes retain source-level code patterns rather than abstract taxonomy…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
