DiscoUQ: Structured Disagreement Analysis for Uncertainty Quantification in LLM Agent Ensembles

Bo Jiang

arXiv:2603.20975·cs.CL·March 24, 2026

TL;DR

DiscoUQ is a framework that analyzes the structure of inter-agent disagreement in multi-agent LLM systems, using linguistic features and embedding geometry to produce better-calibrated, more robust uncertainty estimates.

Contribution

The paper presents DiscoUQ, a structured disagreement analysis framework that improves uncertainty quantification in LLM agent ensembles by using linguistic and embedding-geometry features of inter-agent disagreement.

Findings

01

DiscoUQ-LLM achieves an average AUROC of 0.802, outperforming baselines.

02

DiscoUQ provides well-calibrated confidence estimates with lower ECE.

03

Features generalize across diverse benchmarks with minimal performance loss.
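The two metrics named in the findings, AUROC and ECE, can be sketched in a few lines. This is a minimal illustration and not the paper's evaluation code: the function names, the equal-width binning, and the default of 10 bins are assumptions, since the listing does not state how the paper computes either metric.

```python
from typing import Sequence


def auroc(confidences: Sequence[float], correct: Sequence[int]) -> float:
    """Probability that a correct answer receives higher confidence than an
    incorrect one, counting ties as 0.5 (pairwise definition of AUROC)."""
    pos = [c for c, y in zip(confidences, correct) if y == 1]
    neg = [c for c, y in zip(confidences, correct) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one correct and one incorrect example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between average confidence and accuracy per bin.
    Assumption: equal-width bins over [0, 1]; the paper's binning may differ."""
    total = len(confidences)
    if total == 0:
        raise ValueError("need at least one prediction")
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((c, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A higher AUROC means the confidence score separates correct from incorrect ensemble answers better; a lower ECE means the stated confidence tracks the observed accuracy more closely.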

Abstract

Multi-agent LLM systems, where multiple prompted instances of a language model independently answer questions, are increasingly used for complex reasoning tasks. However, existing methods for quantifying the uncertainty of their collective outputs rely on shallow voting statistics that discard the rich semantic information in agents' reasoning. We introduce DiscoUQ, a framework that extracts and leverages the structure of inter-agent disagreement -- both linguistic properties (evidence overlap, argument strength, divergence depth) and embedding geometry (cluster distances, dispersion, cohesion) -- to produce well-calibrated confidence estimates. We propose three methods of increasing complexity: DiscoUQ-LLM (logistic regression on LLM-extracted structure features), DiscoUQ-Embed (logistic regression on embedding geometry), and DiscoUQ-Learn (a neural network combining all features).…
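To make the embedding-geometry idea concrete, here is a hedged sketch of how cluster distance, dispersion, and cohesion might be computed from per-agent embeddings. The feature definitions, the function name `disagreement_geometry`, and the majority/minority split are illustrative assumptions; the truncated abstract does not give the paper's exact formulas.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

Vector = Sequence[float]


def _centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def _distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def disagreement_geometry(
    embeddings: List[Vector], answers: List[str]
) -> Dict[str, float]:
    """Hypothetical geometry features for one question.

    embeddings: one vector per agent (e.g. an embedding of its reasoning).
    answers:    the final answer each agent gave, in the same order.
    """
    if not embeddings or len(embeddings) != len(answers):
        raise ValueError("need one embedding per answer")

    # Split agents into the majority-answer cluster and everyone else.
    majority_answer, majority_count = Counter(answers).most_common(1)[0]
    majority = [e for e, a in zip(embeddings, answers) if a == majority_answer]
    minority = [e for e, a in zip(embeddings, answers) if a != majority_answer]

    global_centroid = _centroid(embeddings)
    majority_centroid = _centroid(majority)

    # Dispersion: mean distance of all agents from the global centroid.
    dispersion = sum(_distance(e, global_centroid) for e in embeddings) / len(embeddings)
    # Cohesion proxy: mean distance inside the majority cluster (lower is tighter).
    majority_spread = sum(_distance(e, majority_centroid) for e in majority) / len(majority)
    # Cluster distance: gap between majority and minority centroids (0 if unanimous).
    cluster_distance = _distance(majority_centroid, _centroid(minority)) if minority else 0.0

    return {
        "vote_share": majority_count / len(answers),
        "cluster_distance": cluster_distance,
        "dispersion": dispersion,
        "majority_spread": majority_spread,
    }
```

Features like these could then serve as inputs to a logistic regression that predicts whether the majority answer is correct, which is the role the abstract assigns to DiscoUQ-Embed.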

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI)