Optimizing Estimators of Squared Calibration Errors in Classification

Sebastian G. Gruber; Francis Bach

arXiv:2410.07014 · cs.LG · February 24, 2025

TL;DR

This paper introduces a mean-squared error risk for comparing and optimizing estimators of squared calibration errors in classification. It reformulates calibration estimation as a regression problem, which makes estimator selection and hyperparameter tuning measurable.

Contribution

It reformulates calibration estimation as a regression problem, providing a systematic way to compare and optimize calibration estimators using a new risk measure.

Findings

01. The proposed pipeline improves calibration estimator performance.

02. Kernel ridge regression-based estimators outperform existing methods.

03. The approach enhances calibration accuracy on standard image classification tasks.

Abstract

In this work, we propose a mean-squared error-based risk that enables the comparison and optimization of estimators of squared calibration errors in practical settings. Improving the calibration of classifiers is crucial for enhancing the trustworthiness and interpretability of machine learning models, especially in sensitive decision-making scenarios. Although various calibration (error) estimators exist in the current literature, there is a lack of guidance on selecting the appropriate estimator and tuning its hyperparameters. By leveraging the bilinear structure of squared calibration errors, we reformulate calibration estimation as a regression problem with independent and identically distributed (i.i.d.) input pairs. This reformulation allows us to quantify the performance of different estimators even for the most challenging calibration criterion, known as canonical calibration.…
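
To make the tuning problem mentioned in the abstract concrete, here is a minimal sketch of a common binned plug-in estimator of the squared top-label calibration error. It is a generic baseline, not the estimator or risk proposed in the paper, and the function name `binned_squared_calibration_error`, its arguments, and the equal-width binning scheme are assumptions made for illustration. The point is that `n_bins` is a hyperparameter, and the estimate can change noticeably with it.

```python
import numpy as np


def binned_squared_calibration_error(confidences, correct, n_bins=10):
    """Binned plug-in estimate of the squared top-label calibration error.

    Hypothetical helper for illustration only; not the paper's method.
    confidences: predicted top-label probabilities in [0, 1]
    correct:     1 if the top-label prediction was right, else 0
    n_bins:      number of equal-width bins (the hyperparameter to tune)
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    n = len(confidences)

    # Assign each confidence to an equal-width bin; 1.0 goes to the last bin.
    bin_ids = np.minimum((confidences * n_bins).astype(int), n_bins - 1)

    total = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        # Gap between empirical accuracy and mean confidence in this bin.
        gap = correct[mask].mean() - confidences[mask].mean()
        total += mask.sum() / n * gap ** 2
    return total


# Usage: the same predictions scored with two different bin counts.
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=1000)
hit = (rng.uniform(size=1000) < conf).astype(float)  # calibrated by construction
print(binned_squared_calibration_error(conf, hit, n_bins=5))
print(binned_squared_calibration_error(conf, hit, n_bins=50))
```

Because the simulated predictions are calibrated by construction, the true squared calibration error is zero, so any nonzero value printed here is estimation error that depends on the bin count. Comparing such choices in a principled way is the kind of question the abstract says the proposed risk is meant to address.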

Taxonomy

Topics: Fault Detection and Control Systems