TL;DR
This paper introduces SLIM, a framework for dynamically managing external skills in reinforcement learning agents. It adjusts which skills stay active according to each skill's contribution to task performance.
Contribution
SLIM enables dynamic, joint optimization of active external skills and policy learning, allowing for non-monotonic, task-dependent skill management in agentic RL.
Findings
SLIM outperforms baselines by 7.1% on ALFWorld and SearchQA.
Skills are selectively retained, retired, or expanded based on their marginal contribution.
Policy learning and skill retention can coexist, with some skills absorbed into the policy.
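The retain, retire, and expand decisions described in the findings can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `SkillPool` and `update_active_skills`, the two thresholds, and the candidate scores are all hypothetical. How SLIM actually estimates marginal contribution is not given in this summary, so the estimates are simply passed in as numbers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SkillPool:
    """Hypothetical container for the active and retired external skills."""
    active: set[str]
    retired: set[str] = field(default_factory=set)


def update_active_skills(
    pool: SkillPool,
    contributions: dict[str, float],
    candidates: dict[str, float],
    retire_below: float = 0.0,   # assumed threshold, not from the paper
    expand_above: float = 0.05,  # assumed threshold, not from the paper
) -> SkillPool:
    """Apply one illustrative lifecycle step to the active skill set.

    contributions: estimated marginal contribution of each active skill.
    candidates: estimated contribution of skills that could be (re)activated.
    """
    # Retire active skills whose estimated contribution is too low.
    # Skills without an estimate are kept unchanged.
    for skill in sorted(pool.active):
        score = contributions.get(skill)
        if score is not None and score <= retire_below:
            pool.active.discard(skill)
            pool.retired.add(skill)

    # Expand with candidates that look useful. A previously retired skill
    # may return, so the active set is not forced to shrink monotonically.
    for skill, score in candidates.items():
        if score > expand_above and skill not in pool.active:
            pool.active.add(skill)
            pool.retired.discard(skill)

    return pool


pool = SkillPool(active={"navigate", "search"})
update_active_skills(
    pool,
    contributions={"navigate": 0.2, "search": -0.1},
    candidates={"verify": 0.1, "summarize": 0.01},
)
```

In this toy run, `search` is retired, `verify` is added, and `summarize` stays out because its score falls below the assumed expansion threshold. A later call could bring `search` back if its candidate score rises, which is the non-monotonic behavior the summary describes.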
Abstract
Large language model agents increasingly rely on external skills to solve complex tasks, where skills act as modular units that extend their capabilities beyond what parametric memory alone supports. Existing methods assume that external skills either accumulate as persistent guidance or are internalized into the policy, eventually leading to zero-skill inference. We argue this assumption is overly restrictive: with limited parametric capacity and uneven marginal contribution across skills, the optimal active skill set is non-monotonic, task-dependent, and stage-dependent. In this work, we propose SLIM, a framework of dynamic Skill LIfecycle Management for agentic reinforcement learning (RL), which treats the active external skill set as a dynamic optimization variable jointly updated with policy learning. Specifically, SLIM estimates each active skill's marginal external contribution through…
