MAS-ProVe: Understanding the Process Verification of Multi-Agent Systems

Vishal Venkataramani; Haizhou Shi; Zixuan Ke; Austin Xu; Xiaoxiao He; Yingbo Zhou; Semih Yavuz; Hao Wang; Shafiq Joty

arXiv:2602.03053·cs.AI·February 4, 2026

TL;DR

This paper empirically investigates process verification methods for multi-agent systems built on large language models. It shows that current methods suffer from unreliability and high variance, and that more effective verification techniques are still needed.

Contribution

The paper provides a systematic empirical evaluation of process verification paradigms and strategies for MAS, highlighting their limitations and performance gaps.

Findings

01

Process verification does not consistently improve MAS performance.

02

LLM-as-a-Judge generally outperforms reward-based methods.

03

There is a trade-off between context length and verification performance.

Abstract

Multi-Agent Systems (MAS) built on Large Language Models (LLMs) often exhibit high variance in their reasoning trajectories. Process verification, which evaluates intermediate steps in trajectories, has shown promise in general reasoning settings and has been suggested as a potential tool for guiding coordination in MAS; however, its actual effectiveness in MAS remains unclear. To fill this gap, we present MAS-ProVe, a systematic empirical study of process verification for MAS. Our study spans three verification paradigms (LLM-as-a-Judge, reward models, and process reward models), evaluated across two levels of verification granularity (agent-level and iteration-level). We further examine five representative verifiers and four context management strategies, and conduct experiments over six diverse MAS frameworks on multiple reasoning benchmarks. We find that…
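The two verification granularities named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that agent-level verification scores each agent's output individually, while iteration-level verification scores the combined output of one full round. The trajectory layout, the `toy_verifier` keyword check, and both function names are hypothetical stand-ins for a real LLM judge or reward model.

```python
from typing import Callable, List

# Hypothetical layout: a trajectory is a list of iterations,
# and each iteration is a list of agent outputs (strings).
Trajectory = List[List[str]]
Verifier = Callable[[str, str], float]  # (context, candidate) -> score


def toy_verifier(context: str, candidate: str) -> float:
    """Placeholder verifier (assumption): rewards text that mentions a check."""
    return 1.0 if "check" in candidate else 0.0


def agent_level_scores(traj: Trajectory, verifier: Verifier) -> List[List[float]]:
    """Score every agent output separately, given all earlier outputs as context."""
    scores: List[List[float]] = []
    context = ""
    for iteration in traj:
        row = []
        for output in iteration:
            row.append(verifier(context, output))
            context += output + "\n"
        scores.append(row)
    return scores


def iteration_level_scores(traj: Trajectory, verifier: Verifier) -> List[float]:
    """Score each iteration once, using the joined outputs of all its agents."""
    scores: List[float] = []
    context = ""
    for iteration in traj:
        joined = "\n".join(iteration)
        scores.append(verifier(context, joined))
        context += joined + "\n"
    return scores


trajectory = [
    ["plan: split the problem", "solve part A, then check it"],
    ["solve part B", "merge answers"],
]
print(agent_level_scores(trajectory, toy_verifier))      # [[0.0, 1.0], [0.0, 0.0]]
print(iteration_level_scores(trajectory, toy_verifier))  # [1.0, 0.0]
```

Under these assumptions, agent-level scoring calls the verifier once per agent output and iteration-level scoring calls it once per round, so the finer granularity gives more localized signal at the cost of more verifier calls.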

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multi-Agent Systems and Negotiation · Ethics and Social Impacts of AI