TL;DR
This paper introduces MUR, a dynamic reasoning method guided by uncertainty and momentum concepts, which improves reasoning efficiency and accuracy in large language models without additional training.
Contribution
MUR adaptively allocates reasoning resources using uncertainty tracking and a gamma-control mechanism, enhancing efficiency and stability over existing test-time scaling methods.
Findings
MUR reduces computation by over 45% on average.
MUR improves accuracy by 0.33% to 3.46%.
MUR outperforms various TTS methods across multiple benchmarks.
Abstract
Large Language Models have achieved impressive performance on reasoning-intensive tasks, yet optimizing their reasoning efficiency remains an open challenge. While Test-Time Scaling (TTS) improves reasoning quality, it often leads to overthinking, wasting tokens on redundant computations. This work investigates how to efficiently and adaptively guide current models' test-time scaling without additional training. Inspired by the concept of momentum in physics, we propose Momentum Uncertainty-guided Reasoning (MUR), which dynamically allocates thinking budgets to critical reasoning steps by tracking and aggregating stepwise uncertainty over time. To support flexible inference-time control, we introduce gamma-control, a simple mechanism that tunes the reasoning budget via a single hyperparameter. We provide an in-depth theoretical proof to support the superiority of MUR in terms of stability…
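The abstract excerpt above is truncated and does not give MUR's actual update rule, so the following is only an illustrative sketch of the general idea it describes: keep a momentum-style running aggregate of per-step uncertainty, and spend extra thinking budget only on steps whose uncertainty stands out against that aggregate. The function name, the exponential-moving-average form, the `alpha` smoothing factor, and the `u > momentum / gamma` trigger are all assumptions made for illustration, not the paper's definitions.

```python
from typing import Iterable, List


def steps_needing_extra_thinking(
    step_uncertainties: Iterable[float],
    alpha: float = 0.9,   # assumed smoothing factor for the running aggregate
    gamma: float = 0.9,   # assumed budget knob in (0, 1]; smaller flags fewer steps
) -> List[int]:
    """Return indices of steps whose uncertainty spikes above the running momentum.

    Hypothetical rule (not taken from the paper): flag step i when its
    uncertainty exceeds the momentum of earlier steps divided by gamma.
    """
    flagged: List[int] = []
    momentum = None  # no history exists before the first step
    for i, u in enumerate(step_uncertainties):
        if momentum is None:
            momentum = u  # seed the aggregate; the first step is never flagged
            continue
        if u > momentum / gamma:
            flagged.append(i)  # this step would receive extra reasoning budget
        # Exponential moving average: past steps dominate, recent ones nudge it.
        momentum = alpha * momentum + (1.0 - alpha) * u
    return flagged


if __name__ == "__main__":
    uncertainties = [1.0, 1.0, 1.0, 3.0, 1.0]
    print(steps_needing_extra_thinking(uncertainties))
    print(steps_needing_extra_thinking(uncertainties, gamma=0.2))
```

In this sketch a single hyperparameter controls how often extra computation is triggered, which is the role the abstract attributes to gamma-control; whether the paper's mechanism takes this exact form is not confirmed by the excerpt.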
