Atomic Calibration of LLMs in Long-Form Generations
Caiqi Zhang, Ruihan Yang, Zhisong Zhang, Xinting Huang, Sen Yang, Dong Yu, and Nigel Collier

TL;DR
This paper introduces atomic calibration for long-form LLM outputs: responses are decomposed into atomic claims so that confidence calibration can be evaluated and improved at the claim level rather than with a single response-level score.
Contribution
It systematically studies atomic calibration for long-form LLM generation, categorizes existing confidence elicitation methods into discriminative and generative types, and proposes two new confidence fusion strategies to improve calibration.
Findings
LLMs show poorer calibration at the atomic level during long-form generation.
Atomic calibration analysis reveals patterns in confidence alignment and in how confidence changes over the course of generation.
The proposed confidence fusion strategies improve calibration in long-form outputs.
Abstract
Large language models (LLMs) often suffer from hallucinations, posing significant challenges for real-world applications. Confidence calibration, as an effective indicator of hallucination, is thus essential to enhance the trustworthiness of LLMs. Prior work mainly focuses on short-form tasks using a single response-level score (macro calibration), which is insufficient for long-form outputs that may contain both accurate and inaccurate claims. In this work, we systematically study atomic calibration, which evaluates factuality calibration at a fine-grained level by decomposing long responses into atomic claims. We further categorize existing confidence elicitation methods into discriminative and generative types, and propose two new confidence fusion strategies to improve calibration. Our experiments demonstrate that LLMs exhibit poorer calibration at the atomic level during long-form…
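To make the distinction between macro and atomic calibration concrete, the following is a minimal sketch rather than the authors' implementation. It computes a standard equal-width-bin expected calibration error (ECE) twice: once over individual atomic claims, and once over whole responses. The names `AtomicClaim`, `atomic_ece`, and `macro_ece` are hypothetical, and two choices are assumptions made only for illustration: the response-level confidence is the mean of its claim confidences, and the response-level factuality is the fraction of correct claims.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AtomicClaim:
    """Hypothetical container for one atomic claim (not from the paper)."""
    text: str
    confidence: float  # elicited confidence in [0, 1]
    is_correct: bool   # factuality label for this claim


def expected_calibration_error(
    confidences: Sequence[float], targets: Sequence[float], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE: sample-weighted mean of |accuracy - confidence| per bin."""
    if len(confidences) != len(targets):
        raise ValueError("confidences and targets must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0
    bins: List[List[tuple]] = [[] for _ in range(n_bins)]
    for conf, target in zip(confidences, targets):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, float(target)))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(t for _, t in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def atomic_ece(responses: Sequence[Sequence[AtomicClaim]], n_bins: int = 10) -> float:
    """ECE computed over every atomic claim, pooled across all responses."""
    claims = [claim for response in responses for claim in response]
    return expected_calibration_error(
        [c.confidence for c in claims], [c.is_correct for c in claims], n_bins
    )


def macro_ece(responses: Sequence[Sequence[AtomicClaim]], n_bins: int = 10) -> float:
    """ECE over one score per response.

    Assumption for illustration: response confidence is the mean claim
    confidence, and response factuality is the fraction of correct claims.
    """
    confs, targets = [], []
    for response in responses:
        if not response:
            continue
        confs.append(sum(c.confidence for c in response) / len(response))
        targets.append(sum(c.is_correct for c in response) / len(response))
    return expected_calibration_error(confs, targets, n_bins)


# One response whose two claims are each confidently wrong in opposite directions.
example = [[
    AtomicClaim("claim stated with full confidence", 1.0, False),
    AtomicClaim("claim stated with no confidence", 0.0, True),
]]
print(macro_ece(example))   # response-level view
print(atomic_ece(example))  # claim-level view
```

In this toy example the response-level score averages the two errors away, while the claim-level score exposes them, which is the motivation the abstract gives for evaluating long-form outputs at the atomic level.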
Taxonomy
Topics: Electron and X-Ray Spectroscopy Techniques · X-ray Spectroscopy and Fluorescence Analysis · Advancements in Photolithography Techniques
