Calibration without Ground Truth

Yuqing Kong; Mingyu Song; Yizhou Wang; Yifan Wu

arXiv:2601.19862 · cs.LG · January 28, 2026

TL;DR

This paper introduces a label-free post-processing method that improves the calibration and proper loss of a strong but miscalibrated model by using a weaker but better-calibrated reference model. It requires no ground-truth labels and guarantees a reduction in worst-case loss.

Contribution

The paper presents a calibration framework that guarantees a strict improvement under any proper loss without labels. It rests on a characterization of when such improvement is possible (the strong and reference models are not mutually calibrated) and on an efficient Bregman projection algorithm.

Findings

01 Significantly reduces calibration errors and proper losses in large language models.

02 Achieves performance comparable to supervised methods without using labels.

03 Provides theoretical guarantees for worst-case loss reduction.

Abstract

Villalobos et al. [2024] predict that publicly available human text will be exhausted within the next decade. Thus, improving models without access to ground-truth labels becomes increasingly important. We propose a label-free post-processing framework that improves a strong but miscalibrated model using a weaker yet better-calibrated reference. Our framework guarantees a strict performance improvement under any proper loss. Our approach is based on a characterization of when strict improvement is possible: when the strong and reference models are not mutually calibrated. We formalize this condition, connect it to arbitrage and no-trade results from economics, and develop an efficient Bregman projection algorithm that guarantees worst-case loss reduction without labels. Experiments on representative LLMs across varying scales demonstrate that our label-free method significantly reduces…
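
The following is a minimal sketch of why a Bregman projection can only lower a proper loss. It is not the paper's algorithm. It assumes the Brier score, whose associated Bregman divergence is squared Euclidean distance, and a single hypothetical constraint: that the average prediction must equal a known base rate. The names `brier`, `project_to_mean`, and `target_mean` are illustrative. The base rate is computed from simulated labels only so that the identity holds exactly; in a label-free setting it would have to come from the reference model.

```python
import numpy as np

def brier(pred, outcome):
    """Mean squared error between predictions and outcomes (Brier score)."""
    return float(np.mean((pred - outcome) ** 2))

def project_to_mean(pred, target_mean):
    """Euclidean projection onto the set {f : mean(f) = target_mean}.

    Squared Euclidean distance is the Bregman divergence of the Brier score.
    The [0, 1] box constraint is ignored to keep the sketch short.
    """
    return pred - (np.mean(pred) - target_mean)

rng = np.random.default_rng(0)
n = 1000

# Hypothetical overconfident "strong" model: predictions skew high.
strong = rng.uniform(0.6, 1.0, size=n)

# Simulated outcomes, used here only to evaluate the result.
labels = (rng.uniform(size=n) < 0.5).astype(float)

# Assumption: stand-in for a statistic a well-calibrated reference would supply.
target_mean = float(labels.mean())

projected = project_to_mean(strong, target_mean)

# Pythagorean identity: the loss reduction equals the squared distance moved,
# for every outcome vector that satisfies the same mean constraint.
gain = brier(strong, labels) - brier(projected, labels)
shift = float(np.mean((strong - projected) ** 2))
```

Because `gain` equals `shift` for any outcome vector in the constraint set, the improvement does not depend on which admissible outcomes actually occur, which is the sense in which a projection-based guarantee is worst-case.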

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Text and Document Classification Technologies