A Task-Centric Theory for Iterative Self-Improvement with Easy-to-Hard Curricula

Chenruo Liu; Yijun Dong; Yiqiu Shen; Qi Lei

arXiv:2602.10014 · cs.LG · March 23, 2026

Open Access

TL;DR

This paper develops a theoretical framework for iterative self-improvement of large language models, analyzing how curricula from easy to hard tasks can enhance learning and providing finite-sample guarantees for reward-based fine-tuning.

Contribution

It introduces a task-centric theory for self-improvement, deriving conditions under which easy-to-hard curricula outperform fixed task mixtures, supported by theoretical analysis and experiments.

Findings

01

Better models accept more of their own reward-verified outputs per iteration, which sustains self-improvement and also explains why it eventually saturates.

02

Easy-to-hard curricula can provably outperform fixed task mixtures under certain conditions.

03

Finite-sample guarantees on the expected reward are established for iterative maximum-likelihood fine-tuning on reward-filtered data.

Abstract

Iterative self-improvement fine-tunes an autoregressive large language model (LLM) on reward-verified outputs generated by the LLM itself. In contrast to the empirical success of self-improvement, the theoretical foundation of this generative, iterative procedure in a practical, finite-sample setting remains limited. We make progress toward this goal by modeling each round of self-improvement as maximum-likelihood fine-tuning on a reward-filtered distribution and deriving finite-sample guarantees for the expected reward. Our analysis reveals an explicit feedback loop where better models accept more data per iteration, supporting sustained self-improvement while explaining eventual saturation of such improvement. Adopting a task-centric view by considering reasoning tasks with multiple difficulty levels, we further prove quantifiable conditions on model initialization, task difficulty,…
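The loop described in the abstract (sample from the current model, keep only reward-verified outputs, refit by maximum likelihood, repeat) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the "model" is a categorical distribution over four hypothetical outputs, the reward is a binary check, and the refit uses an add-one-smoothed maximum-likelihood estimate so that a finite sample never drives any probability to zero. The function names `self_improve_round`, `expected_reward`, and `run_self_improvement` are hypothetical.

```python
import random
from collections import Counter


def self_improve_round(probs, reward, n_samples, rng, smoothing=1.0):
    """One toy round: sample from the model, filter by reward, refit on what passes."""
    outputs = list(probs)
    samples = rng.choices(outputs, weights=[probs[y] for y in outputs], k=n_samples)
    # Reward filter: keep only outputs the verifier accepts.
    accepted = [y for y in samples if reward(y) == 1]
    if not accepted:
        # Nothing was accepted, so there is no data to fine-tune on this round.
        return dict(probs), 0
    counts = Counter(accepted)
    # Smoothed MLE on the accepted set (smoothing is an illustrative assumption).
    total = len(accepted) + smoothing * len(outputs)
    new_probs = {y: (counts[y] + smoothing) / total for y in outputs}
    return new_probs, len(accepted)


def expected_reward(probs, reward):
    """Expected reward of a categorical model under a deterministic reward."""
    return sum(p * reward(y) for y, p in probs.items())


def run_self_improvement(probs, reward, rounds=5, n_samples=20, seed=0):
    """Iterate the round above; record (accepted count, expected reward) per round."""
    rng = random.Random(seed)
    history = [(0, expected_reward(probs, reward))]
    for _ in range(rounds):
        probs, n_accepted = self_improve_round(probs, reward, n_samples, rng)
        history.append((n_accepted, expected_reward(probs, reward)))
    return probs, history


# Hypothetical starting model that is right only 10% of the time.
initial = {"correct": 0.1, "wrong_a": 0.3, "wrong_b": 0.3, "wrong_c": 0.3}
final, history = run_self_improvement(initial, lambda y: int(y == "correct"))
for round_index, (n_accepted, value) in enumerate(history):
    print(round_index, n_accepted, round(value, 3))
```

In this toy setting the number of accepted samples tends to grow as the model improves, which mirrors the feedback loop the abstract describes, while the fixed sample budget and the smoothing term cap the achievable expected reward below 1, a crude stand-in for saturation. It does not reproduce the paper's guarantees or its multi-difficulty task setup.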

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Intelligent Tutoring Systems and Adaptive Learning