Properties of the ENCE and other MAD-based calibration metrics
Pascal Pernot

TL;DR
This paper examines the properties of the ENCE and other MAD-based calibration metrics, shows how their values depend on the number of bins, and proposes a way to estimate calibration errors that do not depend on the number of bins for datasets assumed to be calibrated.
Contribution
It introduces a binning-independent estimation method for ENCE and ZVE calibration metrics, along with a statistical calibration test, addressing issues of bin sensitivity.
Findings
For well-calibrated or nearly calibrated data, the ENCE is proportional to the square root of the number of bins
ZVE is less sensitive to outliers than ENCE
The proposed method provides calibration error estimates that do not depend on the number of bins
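The square-root dependence on the number of bins can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's code: it uses the usual binned ENCE definition (mean over bins of |RMV - RMSE| / RMV), assumes a log-variance form for the ZVE (mean over bins of |ln Var(z)|), uses equal-count bins sorted by uncertainty, and draws a synthetic calibrated dataset. The names `ence`, `zve`, `E`, and `uE` are hypothetical.

```python
import numpy as np

def _equal_count_bins(uE, n_bins):
    # Sort points by predicted uncertainty and split into bins of (nearly) equal size.
    return np.array_split(np.argsort(uE), n_bins)

def ence(E, uE, n_bins):
    # Mean over bins of |RMV - RMSE| / RMV (a MAD-type statistic).
    terms = []
    for idx in _equal_count_bins(uE, n_bins):
        rmv = np.sqrt(np.mean(uE[idx] ** 2))   # root mean variance
        rmse = np.sqrt(np.mean(E[idx] ** 2))   # root mean squared error
        terms.append(abs(rmv - rmse) / rmv)
    return float(np.mean(terms))

def zve(E, uE, n_bins):
    # Assumed form: mean over bins of |ln Var(z)|, with z = E / uE.
    z = E / uE
    terms = [abs(np.log(np.var(z[idx]))) for idx in _equal_count_bins(uE, n_bins)]
    return float(np.mean(terms))

# Synthetic calibrated data: each error is drawn with its own predicted uncertainty.
rng = np.random.default_rng(0)
M = 100_000
uE = rng.uniform(0.5, 2.0, M)
E = rng.normal(0.0, uE)

bins = [10, 100, 1000]
ence_vals = [ence(E, uE, n) for n in bins]
zve_vals = [zve(E, uE, n) for n in bins]
for n, a, b in zip(bins, ence_vals, zve_vals):
    print(f"N={n:5d}  ENCE={a:.4f}  ZVE={b:.4f}")
```

Although the data are calibrated by construction, both statistics are nonzero and grow with the number of bins. Each bin contributes the absolute value of a sampling fluctuation, and that fluctuation grows as the bins hold fewer points.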
Abstract
The Expected Normalized Calibration Error (ENCE) is a popular calibration statistic used in Machine Learning to assess the quality of prediction uncertainties for regression problems. Estimation of the ENCE is based on the binning of calibration data. In this short note, I illustrate an annoying property of the ENCE, i.e. its proportionality to the square root of the number of bins for well calibrated or nearly calibrated datasets. A similar behavior affects the calibration error based on the variance of z-scores (ZVE), and in both cases this property is a consequence of the use of a Mean Absolute Deviation (MAD) statistic to estimate calibration errors. Hence, the question arises of which number of bins to choose for a reliable estimation of calibration error statistics. A solution is proposed to infer ENCE and ZVE values that do not depend on the number of bins for datasets assumed to…
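The truncated abstract does not spell out how the bin-independent value is obtained. One plausible reading, offered here only as an assumption, is to compute the ENCE for several numbers of bins N, regress it linearly on the square root of N, and take the intercept as the bin-independent estimate. For a calibrated dataset the intercept should be compatible with zero. The function `ence`, the synthetic data, and the chosen bin counts are all hypothetical.

```python
import numpy as np

def ence(E, uE, n_bins):
    # Binned ENCE: mean over equal-count bins of |RMV - RMSE| / RMV.
    terms = []
    for idx in np.array_split(np.argsort(uE), n_bins):
        rmv = np.sqrt(np.mean(uE[idx] ** 2))
        rmse = np.sqrt(np.mean(E[idx] ** 2))
        terms.append(abs(rmv - rmse) / rmv)
    return float(np.mean(terms))

# Synthetic calibrated dataset (assumption: errors ~ Normal(0, uE)).
rng = np.random.default_rng(1)
M = 50_000
uE = rng.uniform(0.5, 2.0, M)
E = rng.normal(0.0, uE)

# ENCE for a range of bin counts.
bins = np.array([10, 20, 40, 80, 160, 320])
vals = np.array([ence(E, uE, int(n)) for n in bins])

# Linear fit ENCE ~ intercept + slope * sqrt(N); polyfit returns the highest degree first.
slope, intercept = np.polyfit(np.sqrt(bins), vals, 1)
print(f"slope={slope:.5f}  intercept={intercept:.5f}")
```

Under this reading, an intercept far from zero relative to its fit uncertainty would indicate miscalibration. Turning that into a formal test would need an uncertainty estimate on the intercept, which this sketch does not compute.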
Taxonomy
Topics: Fault Detection and Control Systems · Advanced Statistical Process Monitoring · Advanced Statistical Methods and Models
