Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence

Amirhosein Ghasemabadi; Keith G. Mills; Baochun Li; Di Niu

arXiv:2505.20325 · cs.CL · May 28, 2025


TL;DR

Guided by Gut (GG) is a self-guided test-time scaling (TTS) method for large language models that improves efficiency and accuracy without external reward models, relying instead on intrinsic signals and reinforcement learning.

Contribution

Introduces GG, a novel self-guided TTS framework that replaces costly external verifiers with intrinsic signals and reinforcement learning, enabling smaller models to match larger ones efficiently.

Findings

01. Smaller models reach accuracy comparable to much larger ones (e.g., 1.5B vs. 70B parameters).

02. GPU memory usage drops by up to 10x and inference time by 8x.

03. KV cache memory drops by approximately 50% compared to Best-of-N (BoN).

Abstract

Test-Time Scaling (TTS) methods for enhancing Large Language Model (LLM) reasoning often incur substantial computational costs, primarily due to extensive reliance on external Process Reward Models (PRMs) or sampling methods like Best-of-N (BoN). This paper introduces Guided by Gut (GG), an efficient self-guided TTS framework that achieves PRM-level performance without costly external verifier models. Our method employs a lightweight tree search guided solely by intrinsic LLM signals: token-level confidence and step novelty. One critical innovation is improving the reliability of internal confidence estimates via a targeted reinforcement learning fine-tuning phase. Empirical evaluations on challenging mathematical reasoning benchmarks demonstrate that GG enables smaller models (e.g., 1.5B parameters) to achieve accuracy matching or surpassing significantly larger models (e.g., 32B-70B…
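The abstract names two intrinsic signals, token-level confidence and step novelty, that steer the tree search. The sketch below illustrates one way such a selection step could look; it is not the authors' implementation. The specific choices here are assumptions made for illustration: confidence is taken as the geometric-mean probability of a step's tokens, novelty as one minus the highest Jaccard word overlap with earlier steps, and the two are combined with a hypothetical `novelty_weight`. The function names are likewise hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def step_confidence(token_logprobs: Sequence[float]) -> float:
    """Geometric-mean token probability of one step (assumed confidence proxy)."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def step_novelty(candidate: str, previous_steps: Sequence[str]) -> float:
    """1 minus the highest Jaccard word overlap with any earlier step (assumed proxy)."""
    cand_words = set(candidate.lower().split())
    if not cand_words or not previous_steps:
        return 1.0
    best_overlap = 0.0
    for prev in previous_steps:
        prev_words = set(prev.lower().split())
        union = cand_words | prev_words
        if union:
            best_overlap = max(best_overlap, len(cand_words & prev_words) / len(union))
    return 1.0 - best_overlap


def select_step(
    candidates: List[Tuple[str, Sequence[float]]],
    previous_steps: Sequence[str],
    novelty_weight: float = 0.2,  # hypothetical weighting, not taken from the paper
) -> int:
    """Return the index of the candidate step with the highest combined score."""
    scores = [
        step_confidence(logprobs) + novelty_weight * step_novelty(text, previous_steps)
        for text, logprobs in candidates
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Two candidate next steps, each with the log-probabilities of its tokens.
candidates = [
    ("x = 2 + 2", [-0.1, -0.2]),             # repeats an earlier step
    ("x = 2 + 2 so x = 4", [-0.05, -0.05]),  # more confident and adds new content
]
chosen = select_step(candidates, previous_steps=["x = 2 + 2"])
```

Because both signals come from the generating model itself, a selection step of this kind needs no second verifier model in memory, which is the efficiency argument the abstract makes against PRM-guided search.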

Taxonomy

Topics: Image Processing Techniques and Applications · Neural Networks and Applications