Double-Calibration: Towards Reliable LLMs via Calibrating Knowledge and Reasoning Confidence

Yuyin Lu; Ziran Liang; Yanghui Rao; Wenqi Fan; Fu Lee Wang; Qing Li

arXiv:2601.11956·cs.CL·May 19, 2026

TL;DR

DoublyCal is a framework that improves the reliability of Large Language Models (LLMs) by calibrating confidence in both the retrieved knowledge evidence and the LLM's reasoning, yielding predictions that are more accurate and better calibrated.

Contribution

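The paper introduces a double-calibration principle and a lightweight proxy model that supplies Knowledge Graph (KG) evidence together with a calibrated evidence confidence, improving the factual accuracy and confidence calibration of black-box LLMs.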

Findings

01 Significantly improves accuracy on knowledge-intensive benchmarks.

02 Enhances confidence calibration of black-box LLMs.

03 Maintains low token cost during inference.

Abstract

Reliable reasoning in Large Language Models (LLMs) is challenged by their propensity for hallucination. While augmenting LLMs with Knowledge Graphs (KGs) improves factual accuracy, existing KG-augmented methods fail to quantify epistemic uncertainty in both the retrieved evidence and LLMs' reasoning. To bridge this gap, we introduce DoublyCal, a framework built on a novel double-calibration principle. DoublyCal employs a lightweight proxy model to first generate KG evidence alongside a calibrated evidence confidence. This calibrated supporting evidence then guides a black-box LLM, yielding final predictions that are not only more accurate but also well-calibrated, with confidence scores traceable to the uncertainty of the supporting evidence. Experiments on knowledge-intensive benchmarks show that DoublyCal significantly improves both the accuracy and confidence calibration of black-box…
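The abstract refers to confidence calibration, but this excerpt does not say how it is measured. One widely used measure is Expected Calibration Error (ECE), which bins predictions by stated confidence and averages the gap between confidence and accuracy in each bin. The sketch below is an illustration under that assumption; the function name, the bin count, and the sample data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[int],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE (illustrative; not the paper's evaluation code).

    confidences: predicted confidence in [0, 1] for each answer.
    correct:     1 if the answer was right, 0 otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Per bin: [sample count, sum of confidences, number correct].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence values must lie in [0, 1]")
        # Bin i covers [i/n_bins, (i+1)/n_bins); 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += is_correct

    ece = 0.0
    for count, conf_sum, correct_sum in bins:
        if count == 0:
            continue
        accuracy = correct_sum / count
        mean_conf = conf_sum / count
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (count / total) * abs(accuracy - mean_conf)
    return ece


# Hypothetical data: a model that always reports 0.9 but is right 3 times in 4.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

A lower value means stated confidence tracks empirical accuracy more closely; a model that reports 0.9 confidence and is correct 90% of the time in that bin contributes no error.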

Taxonomy

Topics: Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI) · Topic Modeling