AGSC: Adaptive Granularity and Semantic Clustering for Uncertainty Quantification in Long-text Generation
Guanran Luo, Wentao Qiu, Wanru Zhao, Wenhan Lv, Zhongquan Jian, Meihong Wang, Qingqiang Wu

TL;DR
AGSC is a novel uncertainty quantification framework for long-text generation that improves factuality assessment and reduces computational costs by leveraging semantic clustering and neutral information cues.
Contribution
Introduces AGSC, an uncertainty quantification (UQ) method that combines NLI neutral-probability triggers with GMM soft clustering for efficient, accurate evaluation of long-form text.
Findings
Achieves state-of-the-art correlation with factuality on BIO and LongFact datasets.
Reduces inference time by approximately 60% compared to atomic decomposition.
Effectively distinguishes irrelevant content from uncertain information.
Abstract
Large Language Models (LLMs) have demonstrated impressive capabilities in long-form generation, yet their application is hindered by the hallucination problem. While Uncertainty Quantification (UQ) is essential for assessing reliability, the complex structure of long-form text makes reliable aggregation across heterogeneous themes difficult. In addition, existing methods often overlook the nuance of neutral information and suffer from the high computational cost of fine-grained decomposition. To address these challenges, we propose AGSC (Adaptive Granularity and GMM-based Semantic Clustering), a UQ framework tailored for long-form generation. AGSC first uses NLI neutral probabilities as triggers to distinguish irrelevance from uncertainty, reducing unnecessary computation. It then applies Gaussian Mixture Model (GMM) soft clustering to model latent semantic themes and assign topic-aware weights for…
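The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0.5 threshold, the reading that a high neutral probability is what fires the trigger, and the soft-weighted mean used per topic are all assumptions made for illustration. The soft cluster responsibilities are taken as given, as they might come from the posterior probabilities of a fitted GMM.

```python
from typing import List, Sequence


def neutral_trigger(nli_probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Return True when the NLI neutral probability reaches the threshold.

    nli_probs is (entailment, neutral, contradiction). Both the threshold
    and the "high neutral fires the trigger" rule are assumptions.
    """
    _entail, neutral, _contra = nli_probs
    return neutral >= threshold


def topic_uncertainty(
    responsibilities: Sequence[Sequence[float]],
    scores: Sequence[float],
) -> List[float]:
    """Soft-weighted mean uncertainty per latent topic (hypothetical scheme).

    responsibilities[i][k] is the soft membership of sentence i in topic k.
    scores[i] is an uncertainty score for sentence i.
    """
    n_topics = len(responsibilities[0])
    result = []
    for k in range(n_topics):
        # Total soft membership assigned to topic k.
        mass = sum(row[k] for row in responsibilities)
        # Uncertainty scores weighted by membership in topic k.
        weighted = sum(row[k] * s for row, s in zip(responsibilities, scores))
        result.append(weighted / mass if mass > 0 else 0.0)
    return result


if __name__ == "__main__":
    # Toy inputs, invented for illustration only.
    print(neutral_trigger((0.1, 0.8, 0.1)))
    print(topic_uncertainty([[0.9, 0.1], [0.2, 0.8]], [0.3, 0.7]))
```

Keeping one score per topic, rather than a single document-level mean, is what lets a later step weight themes differently. How AGSC actually sets those weights is not stated in the text above.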
