Can a calibration metric be both testable and actionable?
Raphael Rossellini, Jake A. Soloff, Rina Foygel Barber, Zhimei Ren, Rebecca Willett

TL;DR
This paper introduces Cutoff Calibration Error, a new calibration measure that is both testable and actionable, addressing limitations of existing metrics like ECE and dCE in high-stakes decision-making.
Contribution
The paper proposes Cutoff Calibration Error, a novel calibration metric that combines testability with decision-theoretic actionability, bridging a key gap in calibration assessment.
Findings
Cutoff Calibration Error is both testable and actionable.
It provides insights into calibration methods like isotonic regression and Platt scaling.
The measure improves calibration assessment for high-stakes applications.
Abstract
Forecast probabilities often serve as critical inputs for binary decision making. In such settings, calibration (ensuring forecasted probabilities match empirical frequencies) is essential. Although the common notion of Expected Calibration Error (ECE) provides actionable insights for decision making, it is not testable: it cannot be empirically estimated in many practical cases. Conversely, the recently proposed Distance from Calibration (dCE) is testable, but it is not actionable since it lacks decision-theoretic guarantees needed for high-stakes applications. To resolve this question, we consider Cutoff Calibration Error, a calibration measure that bridges this gap by assessing calibration over intervals of forecasted probabilities. We show that Cutoff Calibration Error is both testable and actionable, and we examine its implications for popular post-hoc…
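To make "assessing calibration over intervals of forecasted probabilities" concrete, here is a minimal Python sketch of one plausible empirical version. It assumes the measure is the largest absolute average residual, (1/n) times the sum of (y_i - p_i) over forecasts p_i falling in an interval, maximized over all intervals. That definition is an assumption inferred from the abstract, not a formula stated on this page, and the function name `interval_calibration_error` is hypothetical.

```python
import numpy as np

def interval_calibration_error(probs, labels):
    """Max over intervals I of |(1/n) * sum_{i: p_i in I} (y_i - p_i)|.

    Assumed definition (not taken from the paper text shown here).
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Sort by forecast so every interval corresponds to a contiguous run.
    order = np.argsort(probs, kind="stable")
    p_sorted = probs[order]
    residual_cumsum = np.cumsum(labels[order] - p_sorted)

    # An interval cannot separate tied forecasts, so only keep prefix sums
    # taken at the last index of each distinct forecast value.
    is_group_end = np.append(p_sorted[1:] != p_sorted[:-1], True)
    prefix = np.concatenate(([0.0], residual_cumsum[is_group_end]))

    # The largest |difference of two prefix sums| is max(prefix) - min(prefix).
    return (prefix.max() - prefix.min()) / len(probs)

# Forecast 0.5 with half positives is calibrated on every interval.
calibrated = interval_calibration_error([0.5, 0.5], [0, 1])

# Forecasts 0.2 and 0.8 paired with outcomes 0 and 1 are off by 0.2 per point.
miscalibrated = interval_calibration_error([0.2, 0.2, 0.8, 0.8], [0, 0, 1, 1])
```

Sorting once and taking the range of the prefix sums avoids enumerating every pair of interval endpoints, so the sketch runs in O(n log n) time.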
Taxonomy
Topics: Forecasting Techniques and Applications · Risk and Portfolio Optimization · Meteorological Phenomena and Simulations
