Black-Box Skill Stealing Attack from Proprietary LLM Agents: An Empirical Study

Zihan Wang; Rui Zhang; Yu Liu; Chi Liu; Qingchuan Zhao; Hongwei Li; Guowen Xu

arXiv:2604.21829·cs.CR·April 28, 2026

TL;DR

This study systematically investigates black-box skill stealing attacks on proprietary LLM agents. It reveals significant vulnerabilities and proposes defenses, but finds that the risk of copyright infringement persists.

Contribution

The first comprehensive analysis of skill stealing attacks on LLM agents, including an automated attack pipeline and a cross-platform evaluation that underscores the need for stronger protections.

Findings

01

Skills can often be easily extracted from commercial LLM agents.

02

Existing defenses reduce leakage but do not eliminate the risk.

03

A single attack can compromise proprietary skills, posing copyright concerns.

Abstract

Large language model (LLM) agents increasingly rely on skills to package reusable capabilities through instructions, tools, and resources. High-quality skills embed expert knowledge, curated workflows, and execution constraints into agents, fueling a growing skill economy through their value and scalability. Yet this ecosystem also creates a new attack surface, as adversaries can interact with public agent interfaces to extract hidden proprietary skill content. We present the first systematic study of black-box skill stealing against LLM agent systems. Compared with conventional system prompt stealing, skill stealing targets modular and structured capability packages whose leakage is directly actionable for copying, redistribution, and monetization, making the resulting harm potentially greater. To study this threat, we derive an attack taxonomy from prior prompt-stealing methods and…
