SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

Yunhao Feng; Yifan Ding; Yingshui Tan; Boren Zheng; Yanming Guo; Xiaolong Li; Kun Zhai; Yishan Li; Wenke Huang

arXiv:2604.06811·cs.CR·April 9, 2026

SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

Yunhao Feng, Yifan Ding, Yingshui Tan, Boren Zheng, Yanming Guo, Xiaolong Li, Kun Zhai, Yishan Li, Wenke Huang

PDF

TL;DR

This paper introduces SkillTrojan, a novel backdoor attack targeting skill-based agent systems by embedding malicious logic into skills, revealing a significant security vulnerability with high attack success rates.

Contribution

It presents a new backdoor attack method on skill-based agents, along with a dataset for evaluation and analysis of attack effectiveness and impact on benign performance.

Findings

01

SkillTrojan achieves up to 97.2% attack success rate.

02

Minimal degradation of benign task performance.

03

Supports automated synthesis of backdoored skills.

Abstract

Skill-based agent systems tackle complex tasks by composing reusable skills, improving modularity and scalability while introducing a largely unexamined security attack surface. We propose SkillTrojan, a backdoor attack that targets skill implementations rather than model parameters or training data. SkillTrojan embeds malicious logic inside otherwise plausible skills and leverages standard skill composition to reconstruct and execute an attacker-specified payload. The attack partitions an encrypted payload across multiple benign-looking skill invocations and activates only under a predefined trigger. SkillTrojan also supports automated synthesis of backdoored skills from arbitrary skill templates, enabling scalable propagation across skill-based agent ecosystems. To enable systematic evaluation, we release a dataset of 3,000+ curated backdoored skills spanning diverse skill patterns…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.