Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum

Nived Rajaraman; Audrey Huang; Miro Dudik; Robert Schapire; Dylan J. Foster; Akshay Krishnamurthy

arXiv:2603.18325 · cs.LG · March 20, 2026

Open Access

TL;DR

This paper shows that autocurriculum, an adaptive training method in which the model uses its own performance to decide which problems to focus training on, provably improves on standard training recipes for reasoning models in both supervised fine-tuning (SFT) and reinforcement learning (RL), reducing data and compute costs.

Contribution

The paper establishes autocurriculum as a provably effective approach that improves training efficiency for reasoning models in both supervised fine-tuning and reinforcement learning.

Findings

01 In supervised fine-tuning, autocurriculum requires exponentially fewer reasoning demonstrations than non-adaptive fine-tuning.

02 In reinforcement learning, autocurriculum decouples training cost from reference model quality.

03 Letting the model's own performance decide which problems to train on improves on standard training recipes for both SFT and RL.

Abstract

Chain-of-thought reasoning, where language models expend additional computation by producing thinking tokens prior to final responses, has driven significant advances in model capabilities. However, training these reasoning models is extremely costly in terms of both data and compute, as it involves collecting long traces of reasoning behavior from humans or synthetic generators and further post-training the model via reinforcement learning. Are these costs fundamental, or can they be reduced through better algorithmic design? We show that autocurriculum, where the model uses its own performance to decide which problems to focus training on, provably improves upon standard training recipes for both supervised fine-tuning (SFT) and reinforcement learning (RL). For SFT, we show that autocurriculum requires exponentially fewer reasoning demonstrations than non-adaptive fine-tuning, by…
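The core idea in the abstract, that the model's own performance decides which problems receive training effort, can be illustrated with a toy sketch. This is not the paper's algorithm and does not reproduce its exponential separation; it only contrasts adaptive and non-adaptive selection of demonstrations. The "model" here is a hypothetical lookup table that solves exactly the problems it has seen demonstrated, and the names `adaptive_round`, `uniform_round`, and `run` are assumptions introduced for illustration.

```python
import random

def adaptive_round(problems, solved, budget, rng):
    """Spend the demonstration budget only on problems the model currently fails."""
    failed = [p for p in problems if p not in solved]
    chosen = rng.sample(failed, min(budget, len(failed)))
    return solved | set(chosen)

def uniform_round(problems, solved, budget, rng):
    """Non-adaptive baseline: sample problems regardless of current performance."""
    chosen = rng.sample(problems, min(budget, len(problems)))
    return solved | set(chosen)

def run(round_fn, problems, budget, rounds, seed=0):
    """Train the toy memorizing model for a fixed number of rounds."""
    rng = random.Random(seed)
    solved = set()  # assumption: the model solves exactly what it has been shown
    for _ in range(rounds):
        solved = round_fn(problems, solved, budget, rng)
    return solved

problems = list(range(20))
adaptive = run(adaptive_round, problems, budget=5, rounds=4)
uniform = run(uniform_round, problems, budget=5, rounds=4)
print(len(adaptive), len(uniform))
```

Both runs spend the same total budget of 20 demonstrations. The adaptive run never repeats a demonstration for a problem the model already solves, so it covers all 20 problems, while the uniform baseline may waste budget on problems that were already solved.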

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)