Self-Policy Distillation via Capability-Selective Subspace Projection
Guangya Hao, Yitong Shang, Yunbo Long, Zhuokai Zhao, Hanxue Liang

TL;DR
This paper introduces Self-Policy Distillation (SPD), a self-distillation method that improves large language models without external signals by extracting a capability-specific low-rank subspace from model gradients and projecting activations onto it during generation.
Contribution
SPD is a novel capability-selective self-distillation approach that operates without external signals by extracting a low-rank subspace from model gradients and projecting activations during generation.
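The two steps named above (extract a low-rank subspace from gradients, then project activations) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the summary does not say how the subspace is computed, which layers are involved, or how projected activations are used during generation. The sketch assumes a truncated SVD over a matrix of stacked gradients and a plain orthogonal projection. The names `extract_subspace` and `project` and the synthetic data are hypothetical.

```python
import numpy as np


def extract_subspace(grads: np.ndarray, rank: int) -> np.ndarray:
    """Return an orthonormal basis of shape (d, rank) for the dominant
    directions of `grads`, a matrix of shape (n, d) with one gradient per row.

    Assumption: truncated SVD is used here only as one common way to obtain
    a low-rank subspace; the paper's procedure may differ.
    """
    # Rows of vt are right singular vectors, ordered by singular value.
    _, _, vt = np.linalg.svd(grads, full_matrices=False)
    return vt[:rank].T


def project(activations: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Orthogonally project activations of shape (..., d) onto span(basis)."""
    return activations @ basis @ basis.T


# Synthetic stand-in for gradients: a rank-4 signal plus small noise.
rng = np.random.default_rng(0)
d, n, r = 16, 64, 4
true_dirs = np.linalg.qr(rng.normal(size=(d, r)))[0]
grads = rng.normal(size=(n, r)) @ true_dirs.T + 0.01 * rng.normal(size=(n, d))

basis = extract_subspace(grads, r)   # (d, r) orthonormal basis
h = rng.normal(size=(3, d))          # hypothetical hidden activations
h_proj = project(h, basis)           # components inside the subspace only
```

Because the basis is orthonormal, the projection is idempotent: projecting `h_proj` a second time leaves it unchanged, and its norm never exceeds that of `h`.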
Findings
SPD achieves up to 13% improvement over state-of-the-art methods.
SPD outperforms pre-trained baselines by up to 16%.
SPD shows 15% better performance in out-of-domain settings.
Abstract
Self-distillation bootstraps large language models (LLMs) by training on their own generations. However, existing methods either rely on external signals to curate self-generated outputs (e.g., correctness filtering, execution feedback, and reward search), which are costly and unavailable for the best-performing frontier models, or skip curation entirely and train on all raw outputs, an approach that is often domain-specific and hard to generalize. Both also share a deeper weakness: self-generated outputs entangle the task-relevant capability with other factors, such as stylistic patterns, formatting artifacts, and model-specific errors, diluting the signal for the specific capability one aims to improve. In this paper, we propose Self-Policy Distillation (SPD), which achieves generalizable, capability-selective self-distillation without any external signal. Specifically, SPD extracts a low-rank capability…
