MAS-ProVe: Understanding the Process Verification of Multi-Agent Systems
Vishal Venkataramani, Haizhou Shi, Zixuan Ke, Austin Xu, Xiaoxiao He, Yingbo Zhou, Semih Yavuz, Hao Wang, Shafiq Joty

TL;DR
This paper empirically investigates process verification methods for multi-agent systems built on large language models. It finds that verification does not reliably improve performance and that results vary widely, which points to the need for more effective verification techniques.
Contribution
It provides a systematic empirical evaluation of various process verification paradigms and strategies for MAS, highlighting their limitations and performance gaps.
Findings
Process verification does not consistently improve MAS performance.
LLM-as-a-Judge generally outperforms reward-based methods.
There is a trade-off between context length and verification performance.
Abstract
Multi-Agent Systems (MAS) built on Large Language Models (LLMs) often exhibit high variance in their reasoning trajectories. Process verification, which evaluates intermediate steps in trajectories, has shown promise in general reasoning settings and has been suggested as a potential tool for guiding coordination of MAS; however, its actual effectiveness in MAS remains unclear. To fill this gap, we present MAS-ProVe, a systematic empirical study of process verification for MAS. Our study spans three verification paradigms (LLM-as-a-Judge, reward models, and process reward models), evaluated across two levels of verification granularity (agent-level and iteration-level). We further examine five representative verifiers and four context management strategies, and conduct experiments over six diverse MAS frameworks on multiple reasoning benchmarks. We find that…
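The two verification granularities named in the abstract can be pictured with a small sketch. This is not the paper's implementation: it assumes that agent-level verification scores each agent's output as it is produced, that iteration-level verification scores one full round of agents at a time, and that the verifier is used for best-of-n selection. The names `agent_level`, `iteration_level`, `make_agent`, and `toy_verifier` are hypothetical, and the toy verifier stands in for an LLM judge or reward model.

```python
from typing import Callable, List

# Hypothetical interfaces, assumed for illustration only.
Agent = Callable[[List[str], int], str]        # (context, sample index) -> candidate step
Verifier = Callable[[List[str], str], float]   # (context, candidate) -> score

def run_iteration(agents: List[Agent], context: List[str], k: int) -> List[str]:
    """Run every agent once, in order, drawing sample k from each."""
    steps: List[str] = []
    for agent in agents:
        steps.append(agent(context + steps, k))
    return steps

def agent_level(agents: List[Agent], verifier: Verifier,
                context: List[str], n_samples: int) -> List[str]:
    """Assumed agent-level scheme: score after each agent call, keep the best of n."""
    steps: List[str] = []
    for agent in agents:
        seen = context + steps
        candidates = [agent(seen, k) for k in range(n_samples)]
        steps.append(max(candidates, key=lambda c: verifier(seen, c)))
    return steps

def iteration_level(agents: List[Agent], verifier: Verifier,
                    context: List[str], n_samples: int) -> List[str]:
    """Assumed iteration-level scheme: score whole rounds, keep the best of n."""
    rounds = [run_iteration(agents, context, k) for k in range(n_samples)]
    return max(rounds, key=lambda r: verifier(context, "\n".join(r)))

def make_agent(name: str, good_sample: int) -> Agent:
    """Toy agent whose output is 'ok' only for one particular sample index."""
    def agent(context: List[str], k: int) -> str:
        return f"{name}:{'ok' if k == good_sample else 'bad'}"
    return agent

def toy_verifier(context: List[str], candidate: str) -> float:
    """Toy stand-in for a judge or reward model: counts 'ok' markers."""
    return float(candidate.count("ok"))

agents = [make_agent("planner", 0), make_agent("solver", 1)]
agent_steps = agent_level(agents, toy_verifier, ["task"], n_samples=2)
iter_steps = iteration_level(agents, toy_verifier, ["task"], n_samples=2)
print(agent_steps)  # ['planner:ok', 'solver:ok']
print(iter_steps)   # ['planner:ok', 'solver:bad']
```

In this toy setup the finer granularity can combine the best step from each agent, while the coarser one must accept or reject a round as a whole. That is a property of the contrived example only and says nothing about the paper's empirical results.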
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multi-Agent Systems and Negotiation · Ethics and Social Impacts of AI
