Avoiding Overthinking and Underthinking: Curriculum-Aware Budget Scheduling for LLMs
Amirul Rahman, Aisha Karim, Kenji Nakamura, Yi-Fan Ng

TL;DR
This paper introduces Budget-Adaptive Curriculum Reasoning (BACR), a framework that optimizes reasoning quality and token efficiency for large language models by adaptively managing compute budgets during training and inference.
Contribution
It proposes a unified, budget-aware approach with a curriculum scheduler, a conditioned policy, and a dense reward mechanism, improving reasoning efficiency and accuracy across diverse tasks.
Findings
BACR outperforms baselines on mathematical reasoning benchmarks.
Achieves up to 8.3% accuracy improvement under tight budgets.
Reduces average token consumption by 34% compared to unconstrained reasoning.
Abstract
Scaling test-time compute via extended reasoning has become a key paradigm for improving the capabilities of large language models (LLMs). However, existing approaches optimize reasoning under fixed or uniformly sampled token budgets, ignoring the mismatch between problem difficulty and allocated compute. The result is overthinking on easy problems and underthinking on hard ones, which wastes tokens across diverse reasoning scenarios. In this paper, we propose Budget-Adaptive Curriculum Reasoning (BACR), a unified framework that jointly optimizes reasoning quality and token efficiency through three synergistic components: (1) a budget-conditioned unified policy that embeds the token budget as a continuous conditioning signal, eliminating the need for decoupled thinking and summarization strategies; (2) a curriculum-aware budget scheduler…
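To make "embeds the token budget as a continuous conditioning signal" concrete, here is a minimal sketch of one way a scalar budget could be turned into a smooth feature vector. The abstract does not specify the encoding, so everything here is an assumption for illustration: the function name `budget_features`, the log-scale normalization, the sinusoidal features, and the `max_budget` and `dim` parameters are hypothetical and not taken from the paper.

```python
import math


def budget_features(budget_tokens, max_budget=8192, dim=8):
    """Encode a token budget as a continuous vector (hypothetical scheme).

    Assumption: the budget is log-normalized and expanded into sinusoidal
    features. This is an illustrative choice, not the paper's method.
    """
    if budget_tokens <= 0:
        raise ValueError("budget_tokens must be positive")
    if max_budget <= 1:
        raise ValueError("max_budget must be greater than 1")
    if dim <= 0 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")

    # Clamp to the maximum, then map to [0, 1] on a log scale so that
    # small budgets are not squeezed together near zero.
    clamped = min(budget_tokens, max_budget)
    x = math.log(clamped) / math.log(max_budget)

    # Sine/cosine pairs at doubling frequencies give a smooth encoding
    # in which nearby budgets map to nearby vectors.
    features = []
    for k in range(dim // 2):
        freq = 2.0 ** k
        features.append(math.sin(math.pi * freq * x))
        features.append(math.cos(math.pi * freq * x))
    return features


tight = budget_features(256)
generous = budget_features(4096)
```

In a conditioned policy, a vector like this would typically be projected and added to, or concatenated with, the model's input representation, so that a single policy can be queried at any budget instead of training one model per budget level.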
