Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models

Jung Hyun Lee; June Yong Yang; Byeongho Heo; Dongyoon Han; Kyungsu Kim; Eunho Yang; Kang Min Yoo

arXiv:2407.12863 · cs.CL · March 11, 2025

Open Access

TL;DR

This paper introduces token-supervised value models (TVMs) that improve the evaluation of partial solutions during tree search in large language models, significantly enhancing their mathematical problem-solving accuracy.

Contribution

The paper proposes TVMs, a novel token-level verifier that explicitly assesses partial solutions, addressing limitations of existing verifiers in tree search strategies for LLMs.

Findings

01 TVMs outperform existing verifiers in mathematical problem-solving tasks.

02 Combining TVMs with tree search strategies improves LLM accuracy.

03 Experimental results show significant performance gains over prior methods.

Abstract

With the rapid advancement of test-time compute search strategies to improve the mathematical problem-solving capabilities of large language models (LLMs), the need for building robust verifiers has become increasingly important. However, all these inference strategies rely on existing verifiers originally designed for Best-of-N search, which makes them sub-optimal for tree search techniques at test time. During tree search, existing verifiers can only offer indirect and implicit assessments of partial solutions or under-value prospective intermediate steps, thus resulting in the premature pruning of promising intermediate steps. To overcome these limitations, we propose token-supervised value models (TVMs) - a new class of verifiers that assign each token a probability that reflects the likelihood of reaching the correct final answer. This new token-level supervision enables TVMs to…
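To make the idea of token-level supervision concrete, here is a minimal Python sketch of one way such per-token targets could be estimated. The abstract excerpt above does not say how the labels are built, so the scheme here is an assumption for illustration only: several solutions are sampled for the same problem, and each token prefix is labeled with the fraction of samples sharing that prefix that end in the correct answer. The function names `token_value_labels` and `labels_for_solution` and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Prefix = Tuple[str, ...]


def token_value_labels(
    samples: Sequence[Tuple[Sequence[str], bool]],
) -> Dict[Prefix, float]:
    """Label every token prefix with an empirical success rate.

    ASSUMPTION: this counting scheme is illustrative, not taken from the
    source text. Each sample is (tokens, reached_correct_answer).
    """
    total: Dict[Prefix, int] = defaultdict(int)
    correct: Dict[Prefix, int] = defaultdict(int)
    for tokens, is_correct in samples:
        # Every prefix of a sampled solution counts once toward its totals.
        for end in range(1, len(tokens) + 1):
            prefix = tuple(tokens[:end])
            total[prefix] += 1
            correct[prefix] += int(is_correct)
    return {prefix: correct[prefix] / total[prefix] for prefix in total}


def labels_for_solution(
    tokens: Sequence[str], labels: Dict[Prefix, float]
) -> List[float]:
    """Return one target value per token of a single sampled solution."""
    return [labels[tuple(tokens[:end])] for end in range(1, len(tokens) + 1)]


# Toy data: four sampled "solutions" to the same problem, two of them correct.
samples = [
    (["2", "+", "3", "=", "5"], True),
    (["2", "+", "3", "=", "6"], False),
    (["2", "*", "3", "=", "5"], False),
    (["2", "+", "3", "=", "5"], True),
]
labels = token_value_labels(samples)
per_token = labels_for_solution(samples[0][0], labels)
```

In this toy setup, a prefix shared by both correct and incorrect samples receives an intermediate value, while a prefix that only appears in incorrect samples receives 0. A verifier trained on targets of this kind could score a partial solution directly during tree search instead of waiting for a complete answer.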

Taxonomy

Topics: Natural Language Processing Techniques