Discovery of Hidden Miscalibration Regimes
Katarzyna Kobalczyk, Mihaela van der Schaar

TL;DR
This paper introduces a method for identifying and analyzing hidden, input-dependent calibration errors, showing that models are often miscalibrated in localized regions of the input space that global metrics miss.
Contribution
It proposes a diagnostic framework that learns a calibration-aware input representation to discover and correct local miscalibration regimes without predefined data slices.
Findings
Input-dependent calibration heterogeneity is common across LLMs.
Discovered miscalibration fields enable effective local confidence correction.
The approach improves calibration in systematically miscalibrated regions.
Abstract
Calibration is commonly evaluated by comparing model confidence with its empirical correctness, implicitly treating reliability as a function of the confidence score alone. However, this view can hide substantial structure: models may be systematically overconfident on some kinds of inputs and underconfident on others, causing global reliability diagnostics to obscure localised calibration failures. To address this, we formulate the problem of discovering hidden miscalibration regimes without assuming access to predefined data slices. We define the corresponding miscalibration field and propose a diagnostic framework for estimating it. Our approach learns a calibration-aware representation of the input space and estimates signed local miscalibration by kernel smoothing in the learned geometry. Across four real-world LLM benchmarks and twelve LLMs, we find that input-dependent…
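To make the kernel-smoothing step concrete, here is a minimal sketch of a signed local miscalibration estimate. It is not the authors' implementation. It assumes the representation `z` is already given (the paper learns it, which is not reproduced here), and it uses a Gaussian kernel with a hand-picked `bandwidth` and a Nadaraya-Watson average of the residual `conf - correct`. The function name, the sign convention (positive means overconfident), and the toy data are all illustrative assumptions.

```python
import numpy as np

def local_miscalibration(z, conf, correct, bandwidth=1.0):
    """Nadaraya-Watson estimate of E[conf - correct | z] at each row of z.

    Assumptions (not from the paper): Gaussian kernel, fixed bandwidth,
    and a precomputed representation z of shape (n, d).
    """
    z = np.asarray(z, dtype=float)
    resid = np.asarray(conf, dtype=float) - np.asarray(correct, dtype=float)
    # Pairwise squared Euclidean distances between all representations.
    sq_dist = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    weights = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # Kernel-weighted average of the signed residual around each point.
    return (weights @ resid) / weights.sum(axis=1)

# Toy data: two well-separated regions with opposite calibration errors.
z = np.array([[0.0]] * 5 + [[10.0]] * 5)
conf = np.array([0.9] * 5 + [0.3] * 5)
correct = np.array([1, 0, 0, 0, 0, 1, 1, 1, 1, 1])

global_gap = float(np.mean(conf - correct))   # the two regions cancel out
field = local_miscalibration(z, conf, correct, bandwidth=1.0)
print(round(global_gap, 6), np.round(field, 3))
```

In this toy example the average gap between confidence and correctness is zero, yet the local estimate is strongly positive in the first region and strongly negative in the second, which is the kind of structure a global diagnostic would hide.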
