Value-Guided Search for Efficient Chain-of-Thought Reasoning

Kaiwen Wang; Jin Peng Zhou; Jonathan Chang; Zhaolin Gao; Nathan Kallus; Kianté Brantley; Wen Sun

arXiv:2505.17373·cs.LG·October 1, 2025

TL;DR

This paper introduces value-guided search (VGS) for long-context reasoning models. A 1.5B token-level value model, trained on 2.5 million reasoning traces without step-level annotations, guides block-wise search, which improves test-time scaling and reduces the inference FLOPs needed to match majority voting.

Contribution

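The paper presents a simple method for training a token-level value model that, unlike process reward models (PRMs), does not require a fine-grained notion of "step". The value model is trained on a dataset of 2.5 million reasoning traces and used for block-wise value-guided search on DeepSeek models.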

Findings

01

Block-wise VGS with a final weighted majority vote achieves better test-time scaling than standard methods such as majority voting or best-of-n.

02

VGS significantly reduces the inference FLOPs required to match the performance of majority voting.

03

The dataset, model, and code are publicly available.
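
The aggregation rules mentioned on this page (majority voting, best-of-n, and the weighted majority vote used at the end of VGS) differ only in how they combine sampled answers with their scores. The sketch below is a minimal illustration, not the authors' implementation: the function names, the example answers, and the scores are all hypothetical, and in the paper the scores would come from the trained value model.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Return the most frequent answer; scores are ignored."""
    return Counter(answers).most_common(1)[0][0]


def best_of_n(answers, scores):
    """Return the answer of the single highest-scoring response."""
    best_index = max(range(len(answers)), key=lambda i: scores[i])
    return answers[best_index]


def weighted_majority_vote(answers, scores):
    """Sum the scores of identical answers and return the heaviest answer."""
    totals = defaultdict(float)
    for answer, score in zip(answers, scores):
        totals[answer] += score
    return max(totals, key=totals.get)


# Hypothetical final answers from six sampled responses, with made-up scores.
answers = ["42", "42", "17", "17", "17", "8"]
scores = [0.8, 0.9, 0.3, 0.2, 0.3, 0.95]

print(majority_vote(answers))                   # 17 (three votes)
print(best_of_n(answers, scores))               # 8  (single best score 0.95)
print(weighted_majority_vote(answers, scores))  # 42 (total weight 1.7)
```

The toy data is chosen so that the three rules disagree: a frequent but low-scoring answer wins the plain vote, an isolated high-scoring answer wins best-of-n, and the weighted vote favors the answer that is both repeated and well scored.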

Abstract

In this paper, we propose a simple and efficient method for value model training on long-context reasoning traces. Compared to existing process reward models (PRMs), our method does not require a fine-grained notion of "step," which is difficult to define for long-context reasoning models. By collecting a dataset of 2.5 million reasoning traces, we train a 1.5B token-level value model and apply it to DeepSeek models for improved performance with test-time compute scaling. We find that block-wise value-guided search (VGS) with a final weighted majority vote achieves better test-time scaling than standard methods such as majority voting or best-of-n. Moreover, VGS significantly reduces the inference FLOPs required to achieve the same performance as majority voting. Our dataset, model, and codebase are open-sourced.
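
As a rough illustration of the block-wise search described in the abstract, the sketch below runs a beam search in which each partial response is extended by several candidate blocks and a value function decides which extensions survive. It is a simplified sketch under stated assumptions, not the authors' code: `blockwise_value_guided_search`, its parameters, and the toy generator and value function are hypothetical stand-ins for the DeepSeek generator and the 1.5B value model, the beam width and branching factor are arbitrary, and the final weighted majority vote over the returned candidates is left out to keep the example short.

```python
from itertools import cycle


def blockwise_value_guided_search(prompt, generate_block, value_fn, is_finished,
                                  beam_width=2, branches=3, max_blocks=8):
    """Beam search over blocks of text, ranked by a value function.

    generate_block(partial) -> str    proposes the next block (hypothetical generator)
    value_fn(partial)       -> float  scores a partial response (hypothetical value model)
    is_finished(partial)    -> bool   says whether a response is complete
    """
    beams = [prompt]
    finished = []
    for _ in range(max_blocks):
        # Expand every surviving partial response with several candidate blocks.
        candidates = [partial + generate_block(partial)
                      for partial in beams
                      for _ in range(branches)]
        # Keep only the highest-valued extensions.
        candidates.sort(key=value_fn, reverse=True)
        beams = []
        for candidate in candidates[:beam_width]:
            (finished if is_finished(candidate) else beams).append(candidate)
        if not beams:
            break
    # Beams still unfinished when max_blocks runs out are returned as-is.
    scored = [(text, value_fn(text)) for text in finished + beams]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins: the "generator" cycles through three fixed blocks, and the
# "value model" simply counts the letter "a" in the text.
_blocks = cycle(["ab", "aa", "bb"])
results = blockwise_value_guided_search(
    prompt="",
    generate_block=lambda partial: next(_blocks),
    value_fn=lambda text: text.count("a"),
    is_finished=lambda text: len(text) >= 6,
    beam_width=2,
    branches=3,
)
print(results)  # [('aaaaaa', 6), ('aaaaab', 5)]
```

Scoring whole blocks rather than individual reasoning steps is what lets this kind of search avoid defining a "step": the value function only needs to rate an arbitrary prefix of the response.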

Code & Models

Repositories

kaiwenw/value-guided-search
PyTorch · Official

Models

VGS-AI/DeepSeek-VM-1.5B
model · 167 downloads
