Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Shalev Lifshitz; Sheila A. McIlraith; Yilun Du

arXiv:2502.20379·cs.AI·February 28, 2025

Open Access

TL;DR

This paper introduces Multi-Agent Verification (MAV), a test-time compute paradigm that combines multiple verifiers to improve large language model performance without additional training, and reports that performance scales as verifiers are added.

Contribution

The paper proposes Aspect Verifiers (AVs), off-the-shelf LLMs prompted to verify different aspects of outputs, as one choice of verifier within the MAV framework, and introduces BoN-MAV, a simple algorithm that combines best-of-n sampling with multiple verifiers.

Findings

01. BoN-MAV outperforms self-consistency and reward model verification.

02. Combining weak verifiers enhances strong LLMs.

03. Using multiple verifiers enables effective test-time performance scaling.

Abstract

By utilizing more computational resources at test-time, large language models (LLMs) can improve without additional training. One common strategy uses verifiers to evaluate candidate outputs. In this work, we propose a novel scaling dimension for test-time compute: scaling the number of verifiers. We introduce Multi-Agent Verification (MAV) as a test-time compute paradigm that combines multiple verifiers to improve performance. We propose using Aspect Verifiers (AVs), off-the-shelf LLMs prompted to verify different aspects of outputs, as one possible choice for the verifiers in a MAV system. AVs are a convenient building block for MAV since they can be easily combined without additional training. Moreover, we introduce BoN-MAV, a simple multi-agent verification algorithm that combines best-of-n sampling with multiple verifiers. BoN-MAV demonstrates stronger scaling patterns than…
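The selection step that the abstract describes for BoN-MAV (sample n candidates, then let several verifiers judge each one) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not say how verifier outputs are aggregated, so this sketch assumes each verifier returns a binary approval and that the candidate with the most approvals is selected. The names `bon_mav`, `approval_count`, and the three toy verifiers are hypothetical stand-ins; in a real system each verifier would be an LLM prompted to check one aspect of the output.

```python
from typing import Callable, Sequence

# A verifier takes (question, candidate) and returns an approval.
# ASSUMPTION: binary approvals; the abstract does not specify the output type.
Verifier = Callable[[str, str], bool]


def approval_count(question: str, candidate: str, verifiers: Sequence[Verifier]) -> int:
    """Count how many verifiers approve this candidate."""
    return sum(1 for verify in verifiers if verify(question, candidate))


def bon_mav(question: str, candidates: Sequence[str], verifiers: Sequence[Verifier]) -> str:
    """Pick the candidate with the most approvals (best-of-n over verifier votes).

    ASSUMPTION: unweighted vote counting; ties go to the earliest candidate,
    which is how Python's max() resolves equal keys.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda c: approval_count(question, c, verifiers))


# Hypothetical toy verifiers standing in for prompted LLM aspect verifiers.
def has_digit(question: str, candidate: str) -> bool:
    return any(ch.isdigit() for ch in candidate)


def ends_with_period(question: str, candidate: str) -> bool:
    return candidate.strip().endswith(".")


def is_concise(question: str, candidate: str) -> bool:
    return len(candidate.split()) <= 6


question = "What is 2 + 2?"
candidates = [
    "4",
    "The answer is 4.",
    "I think it might be five or so, maybe",
]
verifiers = [has_digit, ends_with_period, is_concise]

best = bon_mav(question, candidates, verifiers)
print(best)
```

In this toy run the second candidate is the only one approved by all three verifiers, so it is selected. Adding verifiers to the list is the scaling dimension the abstract refers to; adding candidates is the usual best-of-n dimension.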

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Balanced Selection