Multiclass Calibration Assessment and Recalibration of Probability Predictions via the Linear Log Odds Calibration Function
Amy Vennos, Xin Xing, and Christopher T. Franck

TL;DR
This paper introduces MCLLO, a new multicategory recalibration method that assesses and improves probability predictions without needing internal model access, offering interpretability and broad applicability.
Contribution
The paper presents MCLLO, a novel multicategory recalibration approach that includes a hypothesis test, is model-agnostic, and is easy for humans to interpret.
Findings
MCLLO effectively assesses calibration across diverse models.
It outperforms existing recalibration techniques in simulations.
MCLLO provides reliable calibration in real-world case studies.
Abstract
Machine-generated probability predictions are essential in modern classification tasks such as image classification. A model is well calibrated when its predicted probabilities correspond to observed event frequencies. Despite the need for multicategory recalibration methods, existing methods are limited to (i) comparing calibration between two or more models rather than directly assessing the calibration of a single model, (ii) requiring under-the-hood model access, e.g., accessing logit-scale predictions within the layers of a neural network, and (iii) providing output which is difficult for human analysts to understand. To overcome (i)-(iii), we propose Multicategory Linear Log Odds (MCLLO) recalibration, which (i) includes a likelihood ratio hypothesis test to assess calibration, (ii) does not require under-the-hood access to models and is thus applicable on a wide range of…
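To make the linear log odds idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a natural multicategory form in which the log odds of each category relative to a baseline (here, the last category) are shifted by log(delta) and scaled by gamma; the exact parameterization used by MCLLO is not stated in this summary. The function name `llo_recalibrate` and the parameters `delta` and `gamma` are illustrative assumptions. Only the predicted probability vectors are needed, which matches the abstract's point about not requiring under-the-hood model access.

```python
import numpy as np

def llo_recalibrate(probs, delta, gamma, eps=1e-12):
    """Hypothetical linear-log-odds recalibration sketch.

    probs : (n, K) array of predicted probabilities, rows summing to 1.
    delta : (K-1,) positive shift parameters (assumed parameterization).
    gamma : (K-1,) scale parameters (assumed parameterization).
    The last column is treated as the baseline category.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    delta = np.asarray(delta, dtype=float)
    gamma = np.asarray(gamma, dtype=float)

    # Log odds of each non-baseline category against the baseline.
    log_odds = np.log(probs[:, :-1]) - np.log(probs[:, -1:])

    # Linear transformation on the log odds scale.
    new_log_odds = np.log(delta) + gamma * log_odds

    # The baseline has log odds 0 against itself; map back with a softmax.
    full = np.hstack([new_log_odds, np.zeros((probs.shape[0], 1))])
    full -= full.max(axis=1, keepdims=True)  # numerical stability
    out = np.exp(full)
    return out / out.sum(axis=1, keepdims=True)

# Made-up example predictions for a three-category problem.
example = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.3, 0.6]])

# delta = 1 and gamma = 1 leave the predictions unchanged.
unchanged = llo_recalibrate(example, delta=[1.0, 1.0], gamma=[1.0, 1.0])

# gamma < 1 pulls predictions toward uniform (less confident).
softened = llo_recalibrate(example, delta=[1.0, 1.0], gamma=[0.5, 0.5])
```

Under this assumed form, delta = 1 and gamma = 1 correspond to leaving the predictions as they are, so a likelihood ratio test of that null against freely estimated parameters is one way a calibration test of the kind the abstract mentions could be set up.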
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
