Knowledge is Not Enough: Injecting RL Skills for Continual Adaptation

Pingzhi Tang; Yiding Wang; Muhan Zhang

arXiv:2601.11258·cs.LG·May 12, 2026

TL;DR

This paper introduces PaST, a modular framework that improves continual knowledge adaptation in LLMs by linearly injecting learned skills, outperforming traditional fine-tuning methods on SQuAD, LooGLE, and ToolBench.

Contribution

The paper proposes a novel Parametric Skill Transfer (PaST) framework that enables efficient, modular skill transfer for continual knowledge adaptation in language models.

Findings

01

PaST outperforms the state-of-the-art self-editing SFT baseline on SQuAD by up to 9.9 points.

02

PaST achieves an 8.0-point accuracy gain on the long-context QA benchmark LooGLE.

03

PaST improves zero-shot ToolBench success rates by 10.3 points on average.

Abstract

Large Language Models (LLMs) face the "knowledge cutoff" challenge, where their frozen parametric memory prevents direct internalization of new information. While Supervised Fine-Tuning (SFT) is commonly used to update model knowledge, it often updates factual content without reliably improving the model's ability to use the newly incorporated information for question answering or decision-making. Reinforcement Learning (RL) is essential for acquiring reasoning skills; however, its high computational cost makes it impractical for efficient online adaptation. We empirically observe that the parameter updates induced by SFT and RL are nearly orthogonal. Based on this observation, we propose Parametric Skill Transfer (PaST), a framework that supports modular skill transfer for efficient and effective knowledge adaptation. By extracting a domain-agnostic Skill Vector from a source domain,…
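The abstract is cut off before it defines the Skill Vector, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes (hypothetically) that the skill vector is the weight difference between an RL-tuned checkpoint and the SFT checkpoint it started from, and that injection is a scaled addition to a target-domain SFT checkpoint. The helper names `task_vector`, `cosine`, and `inject`, the `scale` parameter, and the toy four-dimensional weights are all illustrative assumptions.

```python
import math

def task_vector(after, before):
    """Per-parameter difference between two checkpoints that share the same keys."""
    return {k: [a - b for a, b in zip(after[k], before[k])] for k in before}

def cosine(u, v):
    """Cosine similarity of two update dicts, treating all parameters as one flat vector."""
    dot = sum(x * y for k in u for x, y in zip(u[k], v[k]))
    norm_u = math.sqrt(sum(x * x for k in u for x in u[k]))
    norm_v = math.sqrt(sum(x * x for k in v for x in v[k]))
    return dot / (norm_u * norm_v)

def inject(weights, skill, scale=1.0):
    """Linearly add a (scaled) skill vector to a checkpoint; `scale` is an assumed knob."""
    return {k: [w + scale * s for w, s in zip(weights[k], skill[k])] for k in weights}

# Toy source-domain checkpoints (assumed setup: RL is run on top of the SFT model).
base = {"w": [0.0, 0.0, 0.0, 0.0]}
source_sft = {"w": [1.0, 0.0, 0.0, 0.0]}
source_rl = {"w": [1.0, 2.0, 0.0, 0.0]}

delta_sft = task_vector(source_sft, base)   # knowledge update from SFT
skill = task_vector(source_rl, source_sft)  # assumed definition of the skill vector

# The two toy updates touch different coordinates, so their cosine similarity is zero.
overlap = cosine(delta_sft, skill)

# Target domain: only cheap SFT is run, then the source-domain skill is added.
target_sft = {"w": [0.0, 0.0, 3.0, 0.0]}
adapted = inject(target_sft, skill)
```

In this toy setup the SFT and RL updates are exactly orthogonal by construction; the abstract reports only that real updates are nearly orthogonal, which is what would make adding the skill vector unlikely to overwrite the knowledge written by target-domain SFT.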
