Trust but Verify! A Survey on Verification Design for Test-time Scaling

V Venktesh; Mandeep Rathee; Avishek Anand

arXiv:2508.16665·cs.CL·September 10, 2025

TL;DR

This survey reviews verification methods used in test-time scaling of large language models, categorizing them by how the verifiers are trained and by how they are used to improve inference-time performance.

Contribution

It provides a comprehensive categorization and analysis of verifier training approaches in test-time scaling for LLMs, which was lacking in prior literature.

Findings

01 Verifiers may be prompt-based, or fine-tuned as discriminative or generative models.

02 Verification approaches improve LLM performance by guiding exploration of the decoding search space.

03 The survey offers a unified view and a repository of verification methods.

Abstract

Test-time scaling (TTS) has emerged as a new frontier for scaling the performance of Large Language Models (LLMs). In test-time scaling, LLMs use more computational resources during inference to improve their reasoning process and task performance. Several approaches to TTS have emerged, such as distilling reasoning traces from another model or exploring the vast decoding search space with the help of a verifier. Verifiers serve as reward models that score candidate outputs from the decoding process, so that the vast solution space can be explored diligently and the best outcome selected. This paradigm has emerged as a superior approach owing to parameter-free scaling at inference time and high performance gains. Verifiers can be prompt-based, or fine-tuned as discriminative or generative models, and can verify process paths, outcomes, or both. Despite their widespread adoption,…
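As a rough illustration of the verifier-guided selection the abstract describes, the sketch below scores a set of candidate outputs and keeps the highest-scoring one (often called best-of-N selection). It is a minimal sketch under stated assumptions, not the survey's method: `best_of_n` and `toy_verifier` are hypothetical names, the candidates are hard-coded rather than sampled from an LLM, and the scoring rule is a stand-in for a trained outcome verifier.

```python
from typing import Callable, Sequence


def best_of_n(candidates: Sequence[str], verifier: Callable[[str], float]) -> str:
    """Return the candidate with the highest verifier score.

    On ties, the earliest candidate wins (the behaviour of built-in max).
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=verifier)


def toy_verifier(answer: str) -> float:
    # Hypothetical outcome verifier: a real one would be a prompted or
    # fine-tuned model. This stand-in only rewards answers ending in "4".
    return 1.0 if answer.strip().endswith("4") else 0.0


# Assumed candidate outputs; in practice these would be sampled from an LLM.
candidates = ["2 + 2 = 5", "2 + 2 = 4", "2 + 2 = 22"]
best = best_of_n(candidates, toy_verifier)
print(best)
```

A process verifier would instead score intermediate reasoning steps rather than only the final output; the selection logic stays the same, only the scoring function changes.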
