Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
Shalev Lifshitz, Sheila A. McIlraith, Yilun Du

TL;DR
This paper introduces Multi-Agent Verification (MAV), a test-time compute paradigm that combines multiple verifiers to improve large language model performance without additional training, and proposes the number of verifiers as a new scaling dimension.
Contribution
The paper proposes the MAV framework, with Aspect Verifiers (off-the-shelf LLMs prompted to verify different aspects of outputs) as one possible choice of verifier, and introduces BoN-MAV, a simple algorithm that combines best-of-n sampling with multiple verifiers.
Findings
BoN-MAV outperforms self-consistency and reward model verification.
Combining weak verifiers enhances strong LLMs.
Using multiple verifiers enables effective test-time performance scaling.
Abstract
By utilizing more computational resources at test-time, large language models (LLMs) can improve without additional training. One common strategy uses verifiers to evaluate candidate outputs. In this work, we propose a novel scaling dimension for test-time compute: scaling the number of verifiers. We introduce Multi-Agent Verification (MAV) as a test-time compute paradigm that combines multiple verifiers to improve performance. We propose using Aspect Verifiers (AVs), off-the-shelf LLMs prompted to verify different aspects of outputs, as one possible choice for the verifiers in a MAV system. AVs are a convenient building block for MAV since they can be easily combined without additional training. Moreover, we introduce BoN-MAV, a simple multi-agent verification algorithm that combines best-of-n sampling with multiple verifiers. BoN-MAV demonstrates stronger scaling patterns than…
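As a rough illustration of how best-of-n sampling might be combined with several verifiers, the sketch below scores each candidate by the number of verifiers that approve it and returns the top-scoring one. The binary approval signal, the sum-of-votes aggregation, the tie-breaking rule, and every name here (`bon_mav`, `Verifier`, the three toy verifiers) are assumptions made for this example, not details given in the summary above. In a real system the candidates would be sampled from a generator LLM and each verifier would be a prompted LLM; simple string checks stand in for both.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical verifier type: takes (question, candidate) and returns
# True to approve the candidate or False to reject it.
Verifier = Callable[[str, str], bool]


def bon_mav(
    question: str,
    candidates: Sequence[str],
    verifiers: Sequence[Verifier],
) -> Tuple[str, List[int]]:
    """Pick the candidate approved by the most verifiers.

    Assumption: approvals are binary and aggregated by a plain sum.
    Ties go to the earliest candidate.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    # One score per candidate: how many verifiers approved it.
    scores = [sum(1 for v in verifiers if v(question, c)) for c in candidates]
    # max() returns the first index with the highest score.
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index], scores


# Toy stand-ins for aspect verifiers, each checking one aspect of the output.
def is_numeric(question: str, candidate: str) -> bool:
    return candidate.strip().isdigit()


def is_short(question: str, candidate: str) -> bool:
    return len(candidate) <= 3


def arithmetic_ok(question: str, candidate: str) -> bool:
    # Hard-coded check for the toy question below.
    return candidate.strip() == str(2 + 3)


best, scores = bon_mav(
    "What is 2 + 3?",
    ["6", "five", "5"],  # stands in for n sampled outputs
    [is_numeric, is_short, arithmetic_ok],
)
print(best, scores)  # prints: 5 [2, 0, 3]
```

Under these assumptions, adding a verifier only means appending another function to the list, which is one way to read the claim that such verifiers can be combined without additional training.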
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Balanced Selection
