Stratification of uncertainties recalibrated by isotonic regression and its impact on calibration error statistics
Pascal Pernot

TL;DR
This paper investigates how isotonic regression-based recalibration of prediction uncertainties can introduce stratification and ordering effects that impact the accuracy of bin-based calibration error statistics in machine learning regression models.
Contribution
It reveals the potential issues caused by stratified uncertainties and data ordering in calibration error estimation, highlighting the need for careful binning strategies.
Findings
Stratification of uncertainties affects calibration error estimates.
Data ordering and tie-breaking influence bin-based calibration statistics.
Calibration diagnostics can be significantly impacted by stratification and binning methods.
Abstract
Post hoc recalibration of prediction uncertainties of machine learning regression problems by isotonic regression might present a problem for bin-based calibration error statistics (e.g. ENCE). Isotonic regression often produces stratified uncertainties, i.e. subsets of uncertainties with identical numerical values. Partitioning of the resulting data into equal-sized bins introduces an aleatoric component to the estimation of bin-based calibration statistics. The partitioning of stratified data into bins depends on the order of the data, which is typically an uncontrolled property of calibration test/validation sets. The tie-breaking method of the ordering algorithm used for binning might also introduce an aleatoric component. I show on an example how this might significantly affect the calibration diagnostics.
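The ordering effect described in the abstract can be illustrated with a minimal sketch. It is not the paper's code or data: the three-level stratified uncertainties, the sample size, the bin count, and the helper names `ence` and `tie_shuffled_order` are all assumptions made for illustration. The ENCE formula used here, the mean over bins of |RMV - RMSE| / RMV, is the commonly used definition and is assumed rather than taken from this page.

```python
import numpy as np


def ence(errors, uncertainties, n_bins=10, order=None):
    """ENCE over (near) equal-sized bins of data sorted by uncertainty.

    Assumed definition: mean over bins of |RMV - RMSE| / RMV, where RMV is
    the root mean variance (from uncertainties) and RMSE the root mean
    squared error in the bin.
    """
    if order is None:
        order = np.argsort(uncertainties, kind="stable")
    e2 = np.asarray(errors)[order] ** 2
    u2 = np.asarray(uncertainties)[order] ** 2
    terms = []
    for e_bin, u_bin in zip(np.array_split(e2, n_bins), np.array_split(u2, n_bins)):
        rmv = np.sqrt(u_bin.mean())
        rmse = np.sqrt(e_bin.mean())
        terms.append(abs(rmv - rmse) / rmv)
    return float(np.mean(terms))


def tie_shuffled_order(uncertainties, rng):
    """Sort by uncertainty, but break ties in a random order.

    Mimics an uncontrolled data order: shuffle first, then stable-sort.
    """
    perm = rng.permutation(len(uncertainties))
    return perm[np.argsort(uncertainties[perm], kind="stable")]


rng = np.random.default_rng(0)

# Hypothetical stratified uncertainties: only three distinct values,
# as might result from an isotonic (piecewise-constant) recalibration.
u = rng.choice(np.array([0.5, 1.0, 2.0]), size=900)
# Synthetic errors drawn with standard deviation equal to the uncertainty.
e = rng.normal(0.0, u)

# Same data, same sorted uncertainties, different arrangement within ties.
ence_values = [ence(e, u, order=tie_shuffled_order(u, rng)) for _ in range(20)]
print(f"ENCE range over tie orders: {min(ence_values):.4f} to {max(ence_values):.4f}")
```

Because every tie order yields the same sorted uncertainty sequence, any spread in `ence_values` comes only from which errors happen to fall into which bin, which is the aleatoric component the abstract refers to.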
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Statistical Methods and Models · Fault Detection and Control Systems
Methods: High-Order Consensuses
