RAST: Reasoning Activation in LLMs via Small-model Transfer
Siru Ouyang, Xinyu Zhu, Zilin Xiao, Minhao Jiang, Yu Meng, Jiawei Han

TL;DR
RAST introduces a transfer method that leverages small-model RL training to enhance large language models' reasoning abilities efficiently, reducing resource requirements while maintaining or improving performance.
Contribution
The paper proposes RAST, a novel transfer approach that applies RL-induced probability shifts from small models to large models, enabling scalable reasoning improvements.
Findings
RAST significantly improves large models' reasoning performance.
RL-induced output distribution shifts are largely model-size invariant.
RAST requires less GPU memory than direct RL training, often outperforming it.
Abstract
Reinforcement learning (RL) has become a powerful approach for improving the reasoning capabilities of large language models (LLMs), as evidenced by recent successes such as OpenAI's o1 and Deepseek-R1. However, applying RL at scale remains intimidatingly resource-intensive, requiring multiple model copies and extensive GPU workloads. On the other hand, while being powerful, recent studies suggest that RL does not fundamentally endow models with new knowledge; rather, it primarily reshapes the model's output distribution to activate reasoning capabilities latent in the base model. Building on this insight, we hypothesize that the changes in output probabilities induced by RL are largely model-size invariant, opening the door to a more efficient paradigm: training a small model with RL and transferring its induced probability shifts to larger base models. To verify our hypothesis, we…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
