SCALE: Selective Resource Allocation for Overcoming Performance Bottlenecks in Mathematical Test-time Scaling

Yang Xiao; Chunpu Xu; Ruifeng Yuan; Jiashuo Wang; Wenjie Li; Pengfei Liu

arXiv:2512.00466·cs.CL·December 2, 2025

TL;DR

SCALE introduces a selective resource allocation framework for large language models, improving mathematical reasoning accuracy and efficiency by focusing computational effort on challenging sub-problems during inference.

Contribution

The paper proposes SCALE, a novel framework that dynamically allocates resources based on sub-problem difficulty, overcoming limitations of uniform scaling methods in LLM inference.

Findings

01 Achieves up to 13.75% accuracy improvement on AIME25

02 Reduces computational costs by 33%-53%

03 Outperforms uniform scaling baselines significantly

Abstract

Test-time compute scaling has emerged as a powerful paradigm for enhancing mathematical reasoning in large language models (LLMs) by allocating additional computational resources during inference. However, current methods distribute resources uniformly across all reasoning sub-problems, so challenging sub-problems receive insufficient attention while routine operations consume disproportionate resources. This uniform allocation creates performance bottlenecks where additional computational resources yield diminishing returns. Inspired by dual-process theory, we propose SCALE (Selective Resource Allocation), a framework that selectively allocates computational resources based on sub-problem difficulty. SCALE operates through four stages: (1) problem decomposition into sequential reasoning sub-problems, (2) difficulty assessment of each…
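
The idea of allocating compute by sub-problem difficulty can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `SubProblem` type, the `allocate_budget` function, the difficulty scores, the 0.5 threshold, and the 4x weight for hard sub-problems are all hypothetical, chosen only to show how a fixed token budget might be split unevenly once sub-problems have been decomposed and scored. It covers only the two stages the truncated abstract names, followed by an assumed allocation step.

```python
from dataclasses import dataclass


@dataclass
class SubProblem:
    text: str
    difficulty: float  # hypothetical score in [0, 1]; higher means harder


def allocate_budget(subproblems, total_tokens, threshold=0.5, hard_weight=4.0):
    """Split a fixed token budget so hard sub-problems get a larger share.

    Assumption: a sub-problem counts as "hard" when its difficulty is at or
    above `threshold`; hard ones are weighted `hard_weight` times an easy one.
    """
    if not subproblems:
        return []
    weights = [
        hard_weight if sp.difficulty >= threshold else 1.0
        for sp in subproblems
    ]
    total_weight = sum(weights)
    budgets = [int(total_tokens * w / total_weight) for w in weights]
    # Integer truncation can leave a few tokens unassigned;
    # give them to the hardest sub-problem.
    remainder = total_tokens - sum(budgets)
    hardest = max(range(len(subproblems)), key=lambda i: subproblems[i].difficulty)
    budgets[hardest] += remainder
    return budgets


steps = [
    SubProblem("Rewrite the equation in standard form", 0.1),
    SubProblem("Find all integer solutions of the constraint", 0.9),
    SubProblem("Sum the solutions", 0.2),
]
budgets = allocate_budget(steps, total_tokens=600)
for step, budget in zip(steps, budgets):
    print(f"{budget:4d} tokens: {step.text}")
```

Under uniform allocation each of the three steps would receive 200 tokens. In this sketch the single hard step receives most of the budget, while the routine steps get less.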

Code & Models

Datasets

YangXiao-nlp/DualThinking
dataset · 68 downloads

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Natural Language Processing Techniques