Library Drift: Diagnosing and Fixing a Silent Failure Mode in Self-Evolving LLM Skill Libraries

Xing Zhang; Yanwei Cui; Guanghui Wang; Ziyuan Li; Wei Qiu; Bing Zhu; Peiyang He

arXiv:2605.19576·cs.AI·May 20, 2026


TL;DR

This paper identifies a silent failure mode called library drift in self-evolving LLM skill libraries, and proposes diagnostics and governance fixes to improve performance and prevent degradation.

Contribution

The paper introduces a reproducible trigger for library drift, trace-level diagnostic tools, and a verified governance recipe that fixes and prevents drift in self-evolving LLM skill libraries.

Findings

01 Diagnostics reveal drift before end-task scores degrade.

02 Governance fixes significantly improve pass@1 performance.

03 A concrete playbook for diagnosing and fixing library drift.

Abstract

Self-evolving skill libraries face a silent failure mode we term library drift: unbounded skill accumulation without outcome-driven lifecycle management causes retrieval degradation, false-positive injections, and performance stagnation. Recent evaluation confirms the symptom: LLM-authored skills deliver a +0.0pp gain while human-curated ones deliver +16.2pp (SkillsBench). Yet the underlying mechanism has not been isolated. We provide (1) a reproducible trigger: ablations that isolate drift, where one disables skill injection (flat floor, +0.002) and one imposes premature retirement (active harm, -0.019); (2) trace-level diagnostics: an append-only evidence log with per-skill contribution scores, attribution verdicts, and router engagement metrics that make the failure visible before it reaches end-task scores; and (3) a verified fix: a minimal governance recipe (outcome-driven retirement…
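
To make the abstract's "append-only evidence log with per-skill contribution scores" and "outcome-driven retirement" concrete, here is a minimal illustrative sketch in Python. It is not the authors' implementation. Every name in it (`Evidence`, `EvidenceLog`, `contribution`, `should_retire`, `min_uses`, `threshold`) is hypothetical, and the scoring rule (pass rate with the skill injected minus pass rate with no skill injected) is an assumption chosen for simplicity, since the visible abstract does not define the score. The `min_uses` guard is likewise an assumed way to avoid the premature retirement that the abstract reports as harmful.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Evidence:
    """One immutable trace record (hypothetical schema)."""
    task_id: str
    skill_id: Optional[str]  # None means no skill was injected
    passed: bool


class EvidenceLog:
    """Append-only log: records can be added and read, never edited."""

    def __init__(self) -> None:
        self._records: List[Evidence] = []

    def append(self, record: Evidence) -> None:
        self._records.append(record)

    def records(self) -> Tuple[Evidence, ...]:
        # Return a tuple so callers cannot mutate the stored history.
        return tuple(self._records)


def pass_rate(records: List[Evidence]) -> Optional[float]:
    """Fraction of passed records, or None when there is no evidence."""
    if not records:
        return None
    return sum(r.passed for r in records) / len(records)


def contribution(log: EvidenceLog, skill_id: str) -> Optional[float]:
    """Assumed score: pass rate with the skill minus the no-skill baseline."""
    with_skill = [r for r in log.records() if r.skill_id == skill_id]
    baseline = [r for r in log.records() if r.skill_id is None]
    rate_with, rate_base = pass_rate(with_skill), pass_rate(baseline)
    if rate_with is None or rate_base is None:
        return None
    return rate_with - rate_base


def should_retire(log: EvidenceLog, skill_id: str,
                  min_uses: int = 3, threshold: float = 0.0) -> bool:
    """Retire only on enough evidence AND a below-threshold contribution."""
    uses = sum(1 for r in log.records() if r.skill_id == skill_id)
    if uses < min_uses:
        return False  # too little evidence: avoid premature retirement
    score = contribution(log, skill_id)
    return score is not None and score < threshold
```

Under these assumptions, a skill is retired only when the log holds at least `min_uses` outcomes for it and its measured contribution falls below `threshold`. A skill with too little evidence is kept rather than removed early.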
