Auditing Multi-Agent LLM Reasoning Trees Outperforms Majority Vote and LLM-as-Judge

Wei Yang; Shixuan Li; Heng Ping; Peiyu Zhang; Paul Bogdan; Jesse Thomason

arXiv:2602.09341·cs.AI·February 11, 2026

Open Access

TL;DR

AgentAuditor introduces a reasoning tree-based aggregation method for multi-agent LLM systems, outperforming majority voting and LLM-as-Judge by explicitly analyzing reasoning divergences and conflicts.

Contribution

The paper presents a Reasoning Tree approach to aggregating multi-agent LLM outputs, together with a training method, Anti-Consensus Preference Optimization (ACPO), that targets correlated agent biases and confabulation consensus.

Findings

01. Up to 5% accuracy improvement over majority voting

02. Up to 3% accuracy improvement over LLM-as-Judge

03. Effective across 5 different multi-agent settings

Abstract

Multi-agent systems (MAS) can substantially extend the reasoning capacity of large language models (LLMs), yet most frameworks still aggregate agent outputs with majority voting. This heuristic discards the evidential structure of reasoning traces and is brittle under the confabulation consensus, where agents share correlated biases and converge on the same incorrect rationale. We introduce AgentAuditor, which replaces voting with a path search over a Reasoning Tree that explicitly represents agreements and divergences among agent traces. AgentAuditor resolves conflicts by comparing reasoning branches at critical divergence points, turning global adjudication into efficient, localized verification. We further propose Anti-Consensus Preference Optimization (ACPO), which trains the adjudicator on majority-failure cases and rewards evidence-based minority selections over popular errors.…
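
To make the idea of a Reasoning Tree with divergence points concrete, the following is a minimal sketch and not the authors' implementation. It assumes each agent trace is a list of step strings and that two steps agree only when the strings match exactly; a real system would presumably need semantic matching. The names `Node`, `build_tree`, `divergence_points`, and `majority_vote`, as well as the toy traces, are hypothetical. The adjudicator that compares branches is omitted, since the abstract does not specify it.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """One reasoning step, shared by every agent whose trace passes through it."""
    step: str
    agents: list = field(default_factory=list)
    children: dict = field(default_factory=dict)


def build_tree(traces):
    """Merge traces (agent name -> list of step strings) into a prefix tree.

    Assumption: steps agree only on exact string equality.
    """
    root = Node(step="<root>", agents=list(traces))
    for agent, steps in traces.items():
        node = root
        for step in steps:
            node = node.children.setdefault(step, Node(step=step))
            node.agents.append(agent)
    return root


def divergence_points(node, path=()):
    """Yield (shared_prefix, branches) wherever traces split into 2+ branches."""
    if len(node.children) > 1:
        yield path, {s: list(c.agents) for s, c in node.children.items()}
    for step, child in node.children.items():
        yield from divergence_points(child, path + (step,))


def majority_vote(answers):
    """Baseline aggregation: the most common final answer wins."""
    return Counter(answers.values()).most_common(1)[0][0]


# Toy example (hypothetical data): two agents share the same arithmetic slip.
traces = {
    "a1": ["parse: 3 boxes of 4", "compute: 3 + 4 = 7", "answer: 7"],
    "a2": ["parse: 3 boxes of 4", "compute: 3 + 4 = 7", "answer: 7"],
    "a3": ["parse: 3 boxes of 4", "compute: 3 * 4 = 12", "answer: 12"],
}
answers = {agent: steps[-1] for agent, steps in traces.items()}

tree = build_tree(traces)
splits = list(divergence_points(tree))

print("majority vote:", majority_vote(answers))
for prefix, branches in splits:
    print("agreed prefix:", list(prefix))
    for step, agents in branches.items():
        print("  branch:", step, "<-", agents)
```

In this toy case majority voting returns the shared error, while the tree isolates a single split after the agreed parsing step. An adjudicator would then only need to compare the two `compute` branches rather than re-read every full trace, which is the localized verification the abstract describes.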

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Natural Language Processing Techniques · Mobile Crowdsensing and Crowdsourcing