Variable-Based Calibration for Machine Learning Classifiers
Markelle Kelly, Padhraic Smyth

TL;DR
This paper introduces variable-based calibration to assess how well a model's confidence scores are calibrated across different data features, revealing limitations of traditional calibration metrics and proposing new detection and mitigation strategies.
Contribution
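The paper generalizes score-based calibration metrics such as expected calibration error (ECE) to variable-based measures, shows where score-based metrics and existing calibration methods fall short, and offers methods for assessing and improving calibration with respect to a variable of interest.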
Findings
Models with near-perfect ECE can still be miscalibrated across data features
Existing calibration methods may not address variable-based miscalibration
Variable-based calibration is crucial for fairness and interpretability
Abstract
The deployment of machine learning classifiers in high-stakes domains requires well-calibrated confidence scores for model predictions. In this paper we introduce the notion of variable-based calibration to characterize calibration properties of a model with respect to a variable of interest, generalizing traditional score-based metrics such as expected calibration error (ECE). In particular, we find that models with near-perfect ECE can exhibit significant miscalibration as a function of features of the data. We demonstrate this phenomenon both theoretically and in practice on multiple well-known datasets, and show that it can persist after the application of existing calibration methods. To mitigate this issue, we propose strategies for detection, visualization, and quantification of variable-based calibration error. We then examine the limitations of current score-based calibration…
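The central claim, that a model can have near-zero ECE yet be miscalibrated as a function of a feature, can be illustrated with a small synthetic sketch. Everything below is an assumption made for illustration rather than the paper's own code or exact metric definition: the helper name `binned_calibration_error`, the use of equal-count bins, the synthetic feature `x`, and the chosen accuracy model. The same binned error is computed twice, once with bins over the confidence score (an ECE-style estimate) and once with bins over the feature (a variable-based estimate).

```python
import numpy as np


def binned_calibration_error(bin_key, confidence, correct, n_bins=10):
    """Weighted mean of |accuracy - mean confidence| over equal-count bins.

    bin_key decides how examples are grouped: pass the confidence scores
    for an ECE-style estimate, or a feature for a variable-based estimate.
    (Hypothetical helper; equal-count binning is an assumption.)
    """
    order = np.argsort(bin_key, kind="stable")
    n = len(bin_key)
    error = 0.0
    for idx in np.array_split(order, n_bins):
        if len(idx) == 0:
            continue
        gap = abs(correct[idx].mean() - confidence[idx].mean())
        error += len(idx) / n * gap
    return error


rng = np.random.default_rng(0)
n = 100_000

# Synthetic variable of interest (for example, a normalized age), in [0, 1].
x = rng.uniform(0.0, 1.0, size=n)

# Confidence scores drawn independently of x.
confidence = rng.uniform(0.6, 0.85, size=n)

# True accuracy equals the confidence on average, but is too low for
# small x and too high for large x. The shift averages to zero over x.
p_correct = confidence + 0.3 * (x - 0.5)
correct = rng.random(n) < p_correct

ece = binned_calibration_error(confidence, confidence, correct)
vece = binned_calibration_error(x, confidence, correct)

print(f"score-based error (binned by confidence): {ece:.4f}")
print(f"variable-based error (binned by x):       {vece:.4f}")
```

In this construction the over- and under-confidence cancel within every confidence bin, so the score-based estimate stays close to zero, while binning by `x` exposes the systematic gap.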
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
