From Debate to Decision: Conformal Social Choice for Safe Multi-Agent Deliberation
Mengdie Flora Wang, Haochen Xie, Guanghui Wang, Aijing Gao, Guang Yang, Ziyuan Li, Qucy Wei Qiu, Fangwei Han, Hengzhi Qiu, Yajing Huang, Bing Zhu, Jae Oh Woo

TL;DR
This paper introduces Conformal Social Choice, a post-hoc decision layer that aggregates debate outputs from multiple agents into calibrated, safe decisions with coverage guarantees, improving safety in multi-agent AI systems.
Contribution
It proposes a novel conformal prediction-based aggregation method that intercepts wrong consensus cases, enhancing safety without assuming individual model calibration.
Findings
Coverage stays within 1-2 points of the target across domains.
81.9% of wrong consensus cases are intercepted at α=0.05.
Remaining singleton decisions reach 90.0-96.8% accuracy, significantly above consensus.
Abstract
Multi-agent debate improves LLM reasoning, yet agreement among agents is not evidence of correctness. When agents converge on a wrong answer through social reinforcement, consensus-based stopping commits that error to an automated action with no recourse. We introduce Conformal Social Choice, a post-hoc decision layer that converts debate outputs into calibrated act-versus-escalate decisions. Verbalized probability distributions from heterogeneous agents are aggregated via a linear opinion pool and calibrated with split conformal prediction, yielding prediction sets with a marginal coverage guarantee: the correct answer is included with probability at least 1 − α, without assumptions on individual model calibration. A hierarchical action policy maps singleton sets to autonomous action and larger sets to human escalation. On eight MMLU-Pro domains with three agents (Claude Haiku,…
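The pipeline the abstract describes (pool, calibrate, then act or escalate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal pool weights, the nonconformity score (one minus the pooled probability of the true answer), and the choice to escalate on an empty set are all assumptions made here for illustration.

```python
import math

def linear_opinion_pool(agent_probs, weights=None):
    """Weighted average of per-agent distributions over the same answer options.
    Equal weights are an assumption; the source does not state the weights."""
    n_agents = len(agent_probs)
    if weights is None:
        weights = [1.0 / n_agents] * n_agents
    n_options = len(agent_probs[0])
    return [sum(w * p[k] for w, p in zip(weights, agent_probs))
            for k in range(n_options)]

def calibrate_threshold(pooled_cal, labels_cal, alpha):
    """Split conformal calibration on held-out questions.
    Assumed score: 1 - pooled probability of the true answer.
    Returns the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    scores = sorted(1.0 - p[y] for p, y in zip(pooled_cal, labels_cal))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: include every option.
        return float("inf")
    return scores[rank - 1]

def prediction_set(pooled, q_hat):
    """All options whose score does not exceed the calibrated threshold."""
    return [k for k, p in enumerate(pooled) if 1.0 - p <= q_hat]

def decide(pred_set):
    """Singleton set -> act autonomously; anything else -> escalate.
    Treating an empty set as escalation is an assumption of this sketch."""
    if len(pred_set) == 1:
        return ("act", pred_set[0])
    return ("escalate", None)
```

A typical call sequence would pool each question's agent distributions, compute `q_hat` once on a labeled calibration split, and then apply `decide(prediction_set(pooled, q_hat))` to each new question. The marginal coverage guarantee relies on calibration and test questions being exchangeable, which is the standard split conformal assumption.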
