CODA: Difficulty-Aware Compute Allocation for Adaptive Reasoning

Siye Wu; Jian Xie; Yikai Zhang; Yanghua Xiao

arXiv:2603.08659 · cs.CL · April 7, 2026

TL;DR

CODA introduces a dynamic compute allocation method for reasoning models that adjusts reasoning depth based on instance difficulty, reducing costs on simple tasks and enhancing performance on complex ones.

Contribution

The paper formalizes adaptive reasoning as a utility maximization problem and proposes CODA, a novel difficulty-aware compute allocation method that operates without external annotations.

Findings

01. CODA reduces token costs by over 60% on easy tasks while maintaining accuracy.

02. On hard tasks, CODA encourages more deliberative reasoning to improve performance.

03. CODA achieves adaptive reasoning across different model scales and benchmarks.

Abstract

The emergence of large reasoning models demonstrates that scaling inference-time compute significantly enhances performance on complex tasks. However, it often falls into another trap: overthinking simple problems, where repetitive rationales yield minimal accuracy gains at a disproportionately high cost. This motivates adaptive reasoning: dynamically aligning reasoning depth with instance difficulty. In this paper, we study adaptive reasoning from an optimality perspective, formalizing it as a utility maximization problem where tokens are allocated until the marginal accuracy gain falls below the incremental cost. Based on this, we propose CODA (Compute Allocation by Difficulty Awareness), a method that operationalizes this principle by allocating tokens via a policy-internal difficulty signal. Specifically, CODA estimates difficulty via group-based rollouts and maps it to two…
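
The allocation principle in the abstract (spend tokens until the marginal accuracy gain falls below the incremental cost) can be illustrated with a small sketch. This is not the paper's implementation: the function names, the toy accuracy curves, the per-token cost, and the choice of "one minus the group success rate" as the difficulty signal are all assumptions made for illustration only.

```python
import math
from typing import Callable, Sequence


def estimate_difficulty(rollout_correct: Sequence[bool]) -> float:
    """Hypothetical difficulty signal: 1 minus the success rate of a rollout group.

    Assumption for illustration; the paper's exact estimator is not shown here.
    """
    if not rollout_correct:
        raise ValueError("need at least one rollout")
    return 1.0 - sum(rollout_correct) / len(rollout_correct)


def allocate_tokens(
    accuracy: Callable[[int], float],
    cost_per_token: float,
    step: int,
    max_tokens: int,
) -> int:
    """Grow the budget in `step`-token increments while the marginal accuracy
    gain of the next increment is at least its incremental cost."""
    budget = 0
    while budget + step <= max_tokens:
        marginal_gain = accuracy(budget + step) - accuracy(budget)
        incremental_cost = cost_per_token * step
        if marginal_gain < incremental_cost:
            break  # further tokens cost more than they return
        budget += step
    return budget


# Toy saturating accuracy curves (assumed, not from the paper):
# the easy instance saturates quickly, the hard one slowly.
def easy_curve(n: int) -> float:
    return 1.0 - math.exp(-n / 100)


def hard_curve(n: int) -> float:
    return 1.0 - math.exp(-n / 1000)


easy_budget = allocate_tokens(easy_curve, cost_per_token=1e-4, step=100, max_tokens=8000)
hard_budget = allocate_tokens(hard_curve, cost_per_token=1e-4, step=100, max_tokens=8000)
print(easy_budget, hard_budget)
```

Under these toy curves the stopping rule assigns a smaller budget to the easy instance than to the hard one, which is the qualitative behavior the abstract describes. The greedy rule relies on the accuracy curve having diminishing returns; on a curve that plateaus and then rises again it would stop too early.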
