Epistemic Gain, Aleatoric Cost: Uncertainty Decomposition in Multi-Agent Debate for Math Reasoning

Dan Qiao; Binbin Chen; Fengyu Cai; Jianlong Chen; Wenhao Li; Fuxin Jiang; Zuzhi Chen; Hongyuan Zha; Tieying Zhang; Baoxiang Wang

arXiv:2603.01221 · cs.MA · March 3, 2026

TL;DR

This paper introduces a Bayesian uncertainty analysis framework for Multi-Agent Debate (MAD) in math reasoning that decomposes total predictive uncertainty into epistemic and aleatoric components, and proposes an uncertainty-guided multi-agent reinforcement learning (MARL) algorithm that improves post-debate accuracy and stability.

Contribution

The paper presents a Bayesian uncertainty decomposition for MAD and develops an uncertainty-guided MARL method to improve reasoning performance.

Findings

01. Effective debate achieves high epistemic gain with controlled aleatoric cost.

02. Training with the proposed method improves post-debate accuracy and stability.

03. The approach enhances individual reasoning beyond single-agent reinforcement learning.

Abstract

Multi-Agent Debate (MAD) has shown promise in leveraging collective intelligence to improve reasoning and reduce hallucinations, yet it remains unclear how information exchange shapes the underlying ability. Empirically, MAD exhibits paradoxical phenomena, such as accuracy improvement accompanied by a substantial increase in token entropy, and remarkable divergence between homogeneous and heterogeneous model combinations. In this paper, we propose a Bayesian uncertainty analysis framework for MAD, which decomposes total predictive uncertainty into epistemic uncertainty reducible by debate context and aleatoric uncertainty induced by internal model noise. Across multiple model configurations, we find that effective debate hinges on achieving high epistemic gain under controlled aleatoric cost. Building on this insight, we design an uncertainty-guided multi-agent reinforcement learning…
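
To make the idea of decomposing total predictive uncertainty concrete, here is a minimal sketch of the standard entropy-based split used in Bayesian deep learning: total entropy of the averaged prediction equals the average per-sample entropy (aleatoric) plus the remaining disagreement term (epistemic, a mutual information). This is an assumption for illustration only; the abstract does not give the paper's exact definitions, and the names `entropy` and `decompose` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0)

def decompose(samples):
    """Split predictive uncertainty over a set of answer distributions.

    samples: list of probability vectors over the same answer set, e.g. one
    per agent or per sampled context (an assumption, not the paper's setup).
    Returns (total, aleatoric, epistemic), with total = aleatoric + epistemic.
    """
    n = len(samples)
    k = len(samples[0])
    # Average prediction across samples.
    mean = [sum(s[i] for s in samples) / n for i in range(k)]
    total = entropy(mean)                            # entropy of the mean
    aleatoric = sum(entropy(s) for s in samples) / n # mean of the entropies
    epistemic = total - aleatoric                    # disagreement between samples
    return total, aleatoric, epistemic

# Two confident but disagreeing predictors: all uncertainty is epistemic.
print(decompose([[1.0, 0.0], [0.0, 1.0]]))
# Two identical, maximally unsure predictors: all uncertainty is aleatoric.
print(decompose([[0.5, 0.5], [0.5, 0.5]]))
```

Under this standard split, a debate round that makes the sampled distributions agree lowers the epistemic term, while one that flattens each individual distribution raises the aleatoric term; whether this matches the paper's notion of gain and cost is not stated in the abstract.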

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Embodied and Extended Cognition · Reinforcement Learning in Robotics