On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models

Charlie Zhang; Graham Neubig; Xiang Yue

arXiv:2512.07783·cs.CL·December 9, 2025

TL;DR

This paper develops a controlled experimental framework to disentangle the effects of pre-training, mid-training, and reinforcement learning on reasoning capabilities in language models, revealing their interplay and conditions for improvement.

Contribution

The paper introduces a controlled method for isolating and analyzing the causal contributions of pre-training, mid-training, and RL to reasoning in language models.

Findings

01

RL improves reasoning only when pre-training leaves sufficient headroom.

02

Contextual generalization requires at least minimal pre-training exposure before it transfers effectively.

03

Mid-training significantly boosts performance compared to RL alone.

Abstract

Recent reinforcement learning (RL) techniques have yielded impressive reasoning improvements in language models, yet it remains unclear whether post-training truly extends a model's reasoning ability beyond what it acquires during pre-training. A central challenge is the lack of control in modern training pipelines: large-scale pre-training corpora are opaque, mid-training is often underexamined, and RL objectives interact with unknown prior knowledge in complex ways. To resolve this ambiguity, we develop a fully controlled experimental framework that isolates the causal contributions of pre-training, mid-training, and RL-based post-training. Our approach employs synthetic reasoning tasks with explicit atomic operations, parseable step-by-step reasoning traces, and systematic manipulation of training distributions. We evaluate models along two axes: extrapolative generalization to more…
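To make the idea of "explicit atomic operations" and "parseable step-by-step reasoning traces" concrete, here is a minimal Python sketch of a synthetic task generator and trace checker. It is an illustration under stated assumptions, not the paper's actual benchmark: the operation set (`add`, `sub`, `mul`), the trace format `op a b = c`, and the names `make_problem` and `check_trace` are all hypothetical. Varying `num_steps` is shown only as one possible way to manipulate a training distribution.

```python
import random
from dataclasses import dataclass

# Hypothetical atomic operations; the paper's real operation set is not given here.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


@dataclass
class Problem:
    start: int    # initial value
    steps: list   # (op_name, operand) pairs
    trace: list   # one "op a b = c" string per step (assumed format)
    answer: int   # value after the last step


def make_problem(num_steps, seed):
    """Build a chain of atomic operations together with its reasoning trace."""
    rng = random.Random(seed)  # seeded so the same problem can be regenerated
    start = rng.randint(1, 9)
    value = start
    steps, trace = [], []
    for _ in range(num_steps):
        op = rng.choice(sorted(OPS))
        operand = rng.randint(1, 9)
        result = OPS[op](value, operand)
        steps.append((op, operand))
        trace.append(f"{op} {value} {operand} = {result}")
        value = result
    return Problem(start, steps, trace, value)


def check_trace(start, trace):
    """Re-execute every trace line; return the final value, or None if any step is invalid."""
    value = start
    for line in trace:
        parts = line.split()
        if len(parts) != 5:
            return None
        op, a, b, eq, c = parts
        # Each step must use a known op and continue from the previous value.
        if op not in OPS or eq != "=" or int(a) != value:
            return None
        if OPS[op](int(a), int(b)) != int(c):
            return None
        value = int(c)
    return value


if __name__ == "__main__":
    problem = make_problem(num_steps=4, seed=0)
    print("\n".join(problem.trace))
    print("trace valid:", check_trace(problem.start, problem.trace) == problem.answer)
```

Because every step can be re-executed, a checker like this can score intermediate reasoning as well as the final answer, which is the property a parseable trace provides.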

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)