Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models
Jung Hyun Lee, June Yong Yang, Byeongho Heo, Dongyoon Han, Kyungsu Kim, Eunho Yang, Kang Min Yoo

TL;DR
This paper introduces token-supervised value models (TVMs) that improve the evaluation of partial solutions during tree search in large language models, significantly enhancing their mathematical problem-solving accuracy.
Contribution
The paper proposes TVMs, a novel token-level verifier that explicitly assesses partial solutions, addressing limitations of existing verifiers in tree search strategies for LLMs.
Findings
TVMs outperform existing verifiers in mathematical problem-solving tasks.
Combining TVMs with tree search strategies improves LLM accuracy.
Experimental results show significant performance gains over prior methods.
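To make the second finding concrete, here is a minimal sketch of how a value model that scores partial solutions could steer a step-wise beam search. It is a generic value-guided beam search, not the specific tree search procedure evaluated in the paper. The names `expand`, `value`, and `is_finished` are hypothetical callables introduced only for illustration, and the toy value function stands in for a trained verifier.

```python
from typing import Callable, List, Sequence, Tuple

Prefix = Tuple[str, ...]


def value_guided_beam_search(
    expand: Callable[[Prefix], Sequence[str]],   # hypothetical: proposes next tokens/steps
    value: Callable[[Prefix], float],            # hypothetical: verifier score for a partial solution
    is_finished: Callable[[Prefix], bool],       # hypothetical: detects a complete solution
    beam_width: int = 2,
    max_steps: int = 10,
) -> Prefix:
    """Keep the `beam_width` highest-valued partial solutions at every step."""
    beam: List[Prefix] = [()]
    for _ in range(max_steps):
        candidates: List[Prefix] = []
        for prefix in beam:
            if is_finished(prefix):
                candidates.append(prefix)  # carry finished solutions forward unchanged
                continue
            for tok in expand(prefix):
                candidates.append(prefix + (tok,))
        if not candidates:
            break
        # Pruning step: a verifier that under-values a promising prefix would drop it here.
        beam = sorted(candidates, key=value, reverse=True)[:beam_width]
        if all(is_finished(p) for p in beam):
            break
    return max(beam, key=value)


# Toy usage: the "value" is simply the fraction of "a" tokens in the prefix.
best = value_guided_beam_search(
    expand=lambda p: ["a", "b"],
    value=lambda p: sum(t == "a" for t in p) / max(len(p), 1),
    is_finished=lambda p: len(p) == 3,
)
print(best)
```

The pruning line is where verifier quality matters: any prefix scored too low at an intermediate step is discarded and can never be recovered later in the search.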
Abstract
With the rapid advancement of test-time compute search strategies to improve the mathematical problem-solving capabilities of large language models (LLMs), the need for building robust verifiers has become increasingly important. However, all these inference strategies rely on existing verifiers originally designed for Best-of-N search, which makes them sub-optimal for tree search techniques at test time. During tree search, existing verifiers can only offer indirect and implicit assessments of partial solutions or under-value prospective intermediate steps, thus resulting in the premature pruning of promising intermediate steps. To overcome these limitations, we propose token-supervised value models (TVMs) - a new class of verifiers that assign each token a probability that reflects the likelihood of reaching the correct final answer. This new token-level supervision enables TVMs to…
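As an illustration of what a per-token probability of "reaching the correct final answer" could look like as a training target, the sketch below estimates it empirically from sampled solutions: for each token prefix, it takes the fraction of sampled solutions sharing that prefix that end in a correct answer. This is an assumption made for illustration. The truncated abstract does not state how the paper actually constructs its token-level labels, and the function name `prefix_value_labels` and the toy samples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def prefix_value_labels(
    samples: Sequence[Tuple[List[str], bool]],
) -> Dict[Tuple[str, ...], float]:
    """Map each token prefix to the fraction of samples with that prefix that are correct.

    `samples` is a list of (tokens, is_correct) pairs for one problem.
    Assumption: correctness is judged only on the final answer of each sample.
    """
    totals: Dict[Tuple[str, ...], int] = defaultdict(int)
    correct: Dict[Tuple[str, ...], int] = defaultdict(int)
    for tokens, is_correct in samples:
        for t in range(1, len(tokens) + 1):
            prefix = tuple(tokens[:t])
            totals[prefix] += 1
            correct[prefix] += int(is_correct)
    return {p: correct[p] / totals[p] for p in totals}


# Toy samples for a single problem (made-up data, for illustration only).
samples = [
    (["2", "+", "3", "=", "5"], True),
    (["2", "+", "3", "=", "6"], False),
    (["2", "*", "3", "=", "5"], False),
]
labels = prefix_value_labels(samples)
print(labels[("2", "+")])  # two samples share this prefix, one of them is correct
```

Under this estimate, a prefix that still leads to a correct answer in some samples keeps a non-zero value even when other continuations fail, which is the property that helps avoid pruning promising intermediate steps.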
Taxonomy
Topics: Natural Language Processing Techniques
