FUSE: Ensembling Verifiers with Zero Labeled Data

Joonhyuk Lee; Virginia Ma; Sarah Zhao; Yash Nair; Asher Spector; Regev Cohen; Emmanuel J. Candès

arXiv:2604.18547 · stat.ML · April 21, 2026

TL;DR

FUSE is an unsupervised method for ensembling verifiers of model outputs. It requires no labeled data, and the authors report that it typically matches or improves upon semi-supervised alternatives.

Contribution

Introduces FUSE, a zero-label ensembling technique that controls conditional dependencies between verifiers to improve verification quality without ground truth correctness labels.

Findings

01

FUSE typically matches or improves upon semi-supervised alternatives in test-time scaling experiments across diverse generator models, verifiers, and benchmarks.

02

FUSE is effective across both academic and frontier verification tasks.

03

FUSE improves verification quality without any ground truth correctness labels.

Abstract

Verification of model outputs is rapidly emerging as a key primitive for both training and real-world deployment of large language models (LLMs). In practice, this often involves using imperfect LLM judges and reward models since ground truth acquisition can be time-consuming and expensive. We introduce Fully Unsupervised Score Ensembling (FUSE), a method for improving verification quality by ensembling verifiers without access to ground truth correctness labels. The key idea behind FUSE is to control conditional dependencies between verifiers in a manner that improves the unsupervised performance of a class of spectral algorithms from the ensembling literature. Despite requiring zero ground truth labels, FUSE typically matches or improves upon semi-supervised alternatives in test-time scaling experiments with diverse sets of generator models, verifiers, and benchmarks. In particular,…
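
The abstract refers to a class of spectral algorithms from the ensembling literature without naming a specific one. As a rough illustration of that general idea, and not of FUSE itself, the sketch below relies on a standard fact: if verifiers are conditionally independent given a binary correctness label, the off-diagonal entries of their covariance matrix factor into a product of per-verifier terms, so the leading eigenvector carries information about relative verifier quality. The function names `spectral_weights` and `simulate`, the synthetic data model, and the sign convention are all assumptions made for this example. The sketch also ignores the diagonal of the correlation matrix, which does not follow the rank-one pattern, and it does not show the dependency-control step that the abstract describes as the key idea behind FUSE.

```python
import numpy as np


def spectral_weights(scores):
    """Estimate relative verifier quality from unlabeled scores.

    scores: array of shape (n_items, n_verifiers). No correctness labels are used.
    Returns (weights, fused), where fused is the weighted sum of standardized scores.
    """
    x = np.asarray(scores, dtype=float)
    # Standardize each verifier so that scale differences do not dominate.
    # Assumption: every verifier's scores vary (nonzero standard deviation).
    z = (x - x.mean(axis=0)) / x.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the last column
    # of the eigenvector matrix is the leading eigenvector.
    _, vecs = np.linalg.eigh(corr)
    w = vecs[:, -1]
    # Eigenvectors are only defined up to sign. Assumption: most verifiers
    # are better than chance, so orient the vector to have a positive sum.
    if w.sum() < 0:
        w = -w
    return w, z @ w


def simulate(n_items=5000, strengths=(2.0, 1.5, 1.0, 0.5, 0.2), seed=0):
    """Hypothetical data: verifiers are conditionally independent given the label."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_items)          # hidden correctness labels
    noise = rng.normal(size=(n_items, len(strengths)))
    scores = y[:, None] * np.asarray(strengths) + noise
    return y, scores


if __name__ == "__main__":
    y, scores = simulate()
    weights, fused = spectral_weights(scores)
    print("estimated weights:", np.round(weights, 3))
    # Labels are used here only to evaluate the result, never to fit it.
    print("corr(fused, label):", round(float(np.corrcoef(fused, y)[0, 1]), 3))
```

In this synthetic setting the conditional independence assumption holds by construction. When verifiers are dependent given the label, as LLM judges often may be, the rank-one structure breaks down, which is the situation the abstract says FUSE is designed to address.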
