Nice Fold or Hero Call: Learning Budget-Efficient Thinking for Adaptive Reasoning
Zhaomeng Zhou, Lan Zhang, Junyang Wang, Mu Yuan, Junda Lin

TL;DR
This paper introduces Budget-Efficient Thinking (BET), a framework for adaptive reasoning that allocates test-time compute by learning when to solve, fold, or call, substantially reducing token usage while maintaining or improving performance.
Contribution
The paper frames adaptive reasoning as a computational investment under uncertainty and trains the model in two stages, a behavioral cold-start followed by GRPO with an investment-cost-aware reward, to improve both efficiency and performance across multiple benchmarks.
Findings
BET reduces reasoning tokens by ~55% on average.
BET improves overall performance while saving compute.
BET transfers zero-shot to new reasoning tasks with efficiency gains.
Abstract
Large reasoning models (LRMs) improve problem solving through extended reasoning, but often misallocate test-time compute. Existing efficiency methods reduce cost by compressing reasoning traces or conditioning budget on perceived difficulty, yet largely overlook solvability. As a result, they may spend large budgets on queries beyond the model's capability while compressing hard-but-solvable queries that require deeper reasoning. In this work, we formulate adaptive reasoning as a computational investment under uncertainty, where budget should follow the expected return of reasoning rather than perceived difficulty alone. To instantiate this principle, we propose Budget-Efficient Thinking (BET), a two-stage framework that combines behavioral cold-start with GRPO under an investment-cost-aware reward. By aligning solve-or-fold decisions with rollout-derived solvability, BET learns three…
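The investment framing in the abstract can be illustrated with a small sketch. This is not the paper's reward or training procedure: the names (`Rollout`, `solvability`, `expected_return`, `decide`), the linear per-token cost, and the zero threshold are all assumptions chosen for illustration. The sketch estimates solvability as the fraction of sampled rollouts that reach a correct answer, then spends the budget only when the expected return of reasoning is positive.

```python
from dataclasses import dataclass


@dataclass
class Rollout:
    """One sampled reasoning attempt (hypothetical record, not from the paper)."""
    correct: bool
    tokens: int


def solvability(rollouts):
    """Fraction of sampled rollouts that reached a correct answer."""
    if not rollouts:
        return 0.0
    return sum(r.correct for r in rollouts) / len(rollouts)


def expected_return(p_solve, budget_tokens, reward_correct=1.0, cost_per_token=1e-4):
    """Expected payoff of spending the budget: success payoff minus token cost.

    The linear cost model and its default coefficient are assumptions.
    """
    return p_solve * reward_correct - cost_per_token * budget_tokens


def decide(rollouts, budget_tokens, **kwargs):
    """Return "solve" if reasoning is expected to pay off, otherwise "fold"."""
    p_solve = solvability(rollouts)
    if expected_return(p_solve, budget_tokens, **kwargs) > 0:
        return "solve"
    return "fold"


# A hard-but-solvable query: only 1 of 4 rollouts succeeds, yet reasoning still pays off.
hard_but_solvable = [Rollout(True, 1800), Rollout(False, 2000),
                     Rollout(False, 2000), Rollout(False, 1900)]
# A query beyond the model's capability: no rollout succeeds, so the budget is wasted.
unsolvable = [Rollout(False, 2000) for _ in range(4)]

print(decide(hard_but_solvable, budget_tokens=2000))
print(decide(unsolvable, budget_tokens=2000))
```

Under these assumed numbers, the rule keeps the budget for the query that is difficult but solvable and folds on the one that no rollout solves, which is the distinction a difficulty-only budget rule would miss.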
