Knowledge is Not Enough: Injecting RL Skills for Continual Adaptation
Pingzhi Tang, Yiding Wang, Muhan Zhang

TL;DR
This paper introduces PaST, a modular framework that enhances continual knowledge adaptation in LLMs by linearly injecting learned skills, outperforming traditional fine-tuning methods across multiple benchmarks.
Contribution
The paper proposes a novel Parametric Skill Transfer (PaST) framework that enables efficient, modular skill transfer for continual knowledge adaptation in language models.
Findings
PaST outperforms the state-of-the-art self-editing SFT baseline on SQuAD by up to 9.9 points.
PaST achieves an 8.0-point accuracy gain on the long-context QA benchmark LooGLE.
PaST improves zero-shot ToolBench success rates by 10.3 points on average.
Abstract
Large Language Models (LLMs) face the "knowledge cutoff" challenge, where their frozen parametric memory prevents direct internalization of new information. While Supervised Fine-Tuning (SFT) is commonly used to update model knowledge, it often updates factual content without reliably improving the model's ability to use the newly incorporated information for question answering or decision-making. Reinforcement Learning (RL) is essential for acquiring reasoning skills; however, its high computational cost makes it impractical for efficient online adaptation. We empirically observe that the parameter updates induced by SFT and RL are nearly orthogonal. Based on this observation, we propose Parametric Skill Transfer (PaST), a framework that supports modular skill transfer for efficient and effective knowledge adaptation. By extracting a domain-agnostic Skill Vector from a source domain,…
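The abstract is cut off before it says how the Skill Vector is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the vector is the difference between RL-tuned and SFT-tuned weights on the source domain, and that injection adds a scaled copy of that difference to a target model that has only received SFT. The helper names (`delta`, `cosine`, `inject`), the `scale` parameter, and all numbers are hypothetical.

```python
import math

def delta(after, before):
    """Parameter update: element-wise difference of two flat weight lists."""
    return [a - b for a, b in zip(after, before)]

def cosine(u, v):
    """Cosine similarity; a value near 0 means the updates are nearly orthogonal."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def inject(weights, skill_vector, scale=1.0):
    """Linear injection: add a scaled skill vector to existing weights."""
    return [w + scale * s for w, s in zip(weights, skill_vector)]

# Toy 4-parameter "model"; every number here is made up for illustration.
base = [0.0, 0.0, 0.0, 0.0]
sft_source = [1.0, 0.5, 0.0, 0.0]    # after SFT on source-domain documents
rl_source = [1.0, 0.5, 0.25, -0.5]   # after RL on top of the SFT model

sft_update = delta(sft_source, base)
# Assumption: the skill vector is what RL added beyond SFT in the source domain.
skill_vector = delta(rl_source, sft_source)
similarity = cosine(sft_update, skill_vector)

# Target domain: SFT only, then reuse the source-domain skill without new RL.
sft_target = [0.25, 2.0, 0.0, 0.0]
adapted = inject(sft_target, skill_vector)

print(similarity)
print(adapted)
```

In this toy setup the two updates touch disjoint coordinates, so their cosine similarity is exactly zero and the injected skill leaves the target's SFT-learned coordinates unchanged. Real updates would be only approximately orthogonal, as the abstract notes.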
