CoEvoSkills: Self-Evolving Agent Skills via Co-Evolutionary Verification

Hanrong Zhang; Shicheng Fan; Henry Peng Zou; Yankai Chen; Zhenting Wang; Jiayu Zhou; Chengze Li; Wei-Chieh Huang; Yifei Yao; Kening Zheng; Xue Liu; Xiaoxiao Li; Philip S. Yu

arXiv:2604.01687 · cs.AI · April 14, 2026

TL;DR

CoEvoSkills is a framework that enables large language model agents to autonomously generate and refine complex multi-file skills through co-evolutionary verification, improving performance on multi-step tasks.

Contribution

It introduces a novel self-evolving skills framework coupling a skill generator with a surrogate verifier for autonomous skill construction.

Findings

01

Achieves the highest pass rate on SkillsBench in a comparison with five baselines.

02

Demonstrates strong generalization to six additional LLMs.

03

Enables autonomous construction of complex multi-file skills.

Abstract

Anthropic proposes the concept of skills for LLM agents to tackle multi-step professional tasks that simple tool invocations cannot address. A tool is a single, self-contained function, whereas a skill is a structured bundle of interdependent multi-file artifacts. Currently, skill generation is not only labor-intensive due to manual authoring, but may also suffer from human-machine cognitive misalignment, which can lead to degraded agent performance, as evidenced by evaluations on SkillsBench. Therefore, we aim to enable agents to autonomously generate skills. However, existing self-evolving methods designed for tools cannot be directly applied to skills due to their increased complexity. To address these issues, we propose CoEvoSkills, a self-evolving skills framework that enables agents to autonomously construct complex, multi-file skill packages. Specifically, CoEvoSkills couples a…
