Testable and Actionable Calibration for Full Swap Regret

Konstantina Bairaktari; Lunjia Hu; Huy L. Nguyen; Jonathan Ullman

arXiv:2605.17749 · cs.LG · May 19, 2026

TL;DR

This paper introduces SCDL, a new calibration measure for AI predictions that is both fully actionable and testable, addressing key limitations of existing measures.

Contribution

The paper proposes SCDL, a calibration measure that is simultaneously actionable and testable, with proven theoretical properties and empirical validation.

Findings

1. SCDL is fully actionable without weakening calibration requirements.

2. SCDL can be tested with nearly optimal estimation error.

3. Experiments show SCDL outperforms existing calibration measures in practice.

Abstract

AI-generated predictions increasingly inform decision-making in critical tasks, and therefore must be trustworthy. One widely used measure of trustworthiness is calibration, which requires that the predictions match the true frequencies and can be treated like real probabilities of a given outcome. However, defining calibration is subtle, and designing good measures of calibration error has been an active topic of recent research. The first goal is to find calibration measures that are actionable, meaning they can inform decision makers about their utility loss when predictions are treated as true probabilities, which is known as swap regret. The second goal is to find calibration measures that are testable, meaning that calibration error can be measured from a small sample of predictions and outcomes. Although these are very basic requirements, there is no existing calibration measure…
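To make the idea of "predictions match the true frequencies" concrete, here is a minimal sketch of the standard binned expected calibration error (ECE) for binary outcomes. This is not the paper's SCDL measure, whose definition does not appear in this summary; it is a common baseline used only for illustration. The function name `binned_calibration_error` and the `num_bins` parameter are assumptions made for this example.

```python
from collections import defaultdict


def binned_calibration_error(predictions, outcomes, num_bins=10):
    """Binned expected calibration error (ECE) for binary outcomes.

    Illustrative baseline only; this is NOT the SCDL measure from the paper.
    predictions: probabilities in [0, 1]; outcomes: 0/1 labels.
    """
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have equal length")
    if not predictions:
        raise ValueError("at least one prediction is required")

    # Group (prediction, outcome) pairs into equal-width probability bins.
    bins = defaultdict(list)
    for p, y in zip(predictions, outcomes):
        index = min(int(p * num_bins), num_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))

    # Weighted average of |mean prediction - observed frequency| over bins.
    total = len(predictions)
    error = 0.0
    for members in bins.values():
        mean_prediction = sum(p for p, _ in members) / len(members)
        observed_frequency = sum(y for _, y in members) / len(members)
        error += (len(members) / total) * abs(mean_prediction - observed_frequency)
    return error


# Forecasts of 0.8 that come true 4 times out of 5 are calibrated in this sense.
calibrated = binned_calibration_error([0.8] * 5, [1, 1, 1, 1, 0])
# Forecasts of 0.9 that come true only half the time are not.
overconfident = binned_calibration_error([0.9] * 4, [1, 0, 0, 1])
```

In this sketch `calibrated` is approximately 0 and `overconfident` is approximately 0.4. The result of a binned estimate like this depends on the choice of bins, which is one illustration of why defining calibration error is subtle.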
