PretrainZero: Reinforcement Active Pretraining

Xingrun Xing; Zhiyuan Fan; Jie Lou; Guoqi Li; Jiajun Zhang; Debing Zhang

arXiv:2512.03442 · cs.CL · December 4, 2025

Open Access

TL;DR

PretrainZero introduces a reinforcement active pretraining framework that enables large models to learn reasoning abilities from unlabeled data, significantly improving general reasoning without relying on domain-specific rewards.

Contribution

It proposes a novel reinforcement active pretraining method that enhances reasoning capabilities of large models using self-supervised RL on general corpora, bypassing the need for labeled data.

Findings

01

Improves the MMLU-Pro reasoning benchmark score by 8.43 points

02

Enhances general reasoning abilities by tackling complex masked spans

03

Enables pretrained models to serve as reasoning foundation models for downstream tasks
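
To make the "masked spans" idea concrete, here is a minimal sketch of how a span hidden in unlabeled text can supply its own verifiable reward. It is an illustration only, not the paper's method: the `[MASK]` placeholder, the helper names `mask_span` and `exact_match_reward`, and the exact-match scoring rule are all assumptions introduced here, and the summary above does not say how PretrainZero selects or scores spans.

```python
MASK = "[MASK]"  # hypothetical placeholder token, chosen for this sketch


def mask_span(tokens, start, length):
    """Hide tokens[start:start+length] behind one mask token.

    Returns the masked token list and the hidden span (the answer).
    """
    if length <= 0 or start < 0 or start + length > len(tokens):
        raise ValueError("span out of range")
    answer = tokens[start:start + length]
    masked = tokens[:start] + [MASK] + tokens[start + length:]
    return masked, answer


def exact_match_reward(predicted, answer):
    """Assumed reward: 1.0 on a case-insensitive exact match, else 0.0."""
    def normalize(toks):
        return [t.lower() for t in toks]
    return 1.0 if normalize(predicted) == normalize(answer) else 0.0


tokens = "water boils at 100 degrees Celsius at sea level".split()
masked, answer = mask_span(tokens, start=3, length=1)
print(" ".join(masked))
print(exact_match_reward(["100"], answer))
print(exact_match_reward(["90"], answer))
```

Because the answer comes from the text itself, no human label is needed to score a prediction. In an RL setup, a policy would produce the prediction (possibly after reasoning), and this scalar would serve as the reward signal.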

Abstract

Mimicking the human ability to actively learn from general experience, and thereby achieving artificial general intelligence, has long been a human dream. Recent reinforcement learning (RL) based large-thinking models demonstrate impressive expert-level abilities, e.g., in software and math, but still rely heavily on verifiable rewards in specific domains, which poses a significant bottleneck to extending the performance boundary of general reasoning capabilities. In this work, we propose PretrainZero, a reinforcement active learning framework built on the pretraining corpus to extend RL from domain-specific post-training to general pretraining. PretrainZero features the following characteristics: 1) Active pretraining: inspired by the active learning ability of humans, PretrainZero learns a unified reasoning policy to actively identify reasonable and informative contents from the pretraining corpus, and…

Taxonomy

Topics: Topic Modeling · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications