SkillOps: Managing LLM Agent Skill Libraries as Self-Maintaining Software Ecosystems
Hongji Pu, Xinyuan Song, Liang Zhao

TL;DR
SkillOps is a framework for maintaining the skill libraries of large language model agents, reducing library defects and improving task success without increasing runtime costs.
Contribution
It presents a method-agnostic plug-in framework that diagnoses and maintains skill libraries as self-maintaining ecosystems, with the aim of improving performance and robustness.
Findings
Achieves 79.5% task success on ALFWorld, outperforming baselines by 8.8 percentage points.
Improves retrieval-heavy baselines by 0.68 to 2.90 percentage points.
Uses minimal LLM calls, demonstrating low-overhead maintenance.
Abstract
Large language model agents increasingly rely on skill libraries for multi-step tasks, yet these libraries can accumulate persistent defects as skills are added, reused, patched, and linked to changing dependencies. We call this failure mode skill technical debt: library-level defects that may not break a single skill locally but can harm future retrieval, composition, and execution. Existing skill-based agents mainly focus on task-time retrieval, planning, and repair, while library-time maintenance remains underexplored. We propose SkillOps, a method-agnostic plug-in framework for maintaining skill libraries. SkillOps represents each skill as a typed Skill Contract (P, O, A, V, F), organizes skills with a Hierarchical Skill Ecosystem Graph, and diagnoses library health across utility, compatibility, risk, and validation dimensions. Given a raw skill library, SkillOps produces a…
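As a rough illustration of the ideas in the abstract, the sketch below models a skill contract as a five-field record and scores each skill on the four health dimensions. The excerpt does not expand P, O, A, V, F or say how health is computed, so the untyped fields, the names `SkillContract` and `HealthReport`, the [0, 1] score range, and the per-dimension averaging are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Any

# Hypothetical record. The abstract names the tuple (P, O, A, V, F) but the
# excerpt does not define the components, so they are left as opaque values.
@dataclass
class SkillContract:
    name: str
    P: Any = None
    O: Any = None
    A: Any = None
    V: Any = None
    F: Any = None

# The four diagnosis dimensions named in the abstract.
DIMENSIONS = ("utility", "compatibility", "risk", "validation")

@dataclass
class HealthReport:
    # Maps skill name -> {dimension: score}. Scores in [0, 1] are an assumption.
    scores: dict = field(default_factory=dict)

    def record(self, skill: SkillContract, **dims: float) -> None:
        """Store one score per dimension for a skill; reject incomplete input."""
        missing = set(DIMENSIONS) - set(dims)
        if missing:
            raise ValueError(f"missing dimensions: {sorted(missing)}")
        self.scores[skill.name] = {d: dims[d] for d in DIMENSIONS}

    def library_summary(self) -> dict:
        """Average each dimension over all recorded skills (assumed aggregation)."""
        return {d: mean(s[d] for s in self.scores.values()) for d in DIMENSIONS}

# Usage with made-up skills and made-up scores.
report = HealthReport()
report.record(SkillContract("open_drawer"),
              utility=0.9, compatibility=0.8, risk=0.1, validation=1.0)
report.record(SkillContract("heat_object"),
              utility=0.5, compatibility=0.6, risk=0.3, validation=0.5)
summary = report.library_summary()
print(summary)
```

Keeping the per-skill scores separate from the library-level summary mirrors the abstract's distinction between defects that are local to one skill and defects that only show up at the library level.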
