Calibration tests beyond classification

David Widmann; Fredrik Lindsten; Dave Zachariah

arXiv:2210.13355 · stat.ML · October 25, 2022 · 1 citation

TL;DR

This paper introduces a unified framework for evaluating calibration in probabilistic models across classification and regression tasks, generalizing existing measures and tests to improve interpretability and applicability.

Contribution

The paper proposes the first framework that unifies calibration evaluation and testing for general probabilistic predictive models, including regression and multi-class classification models.

Findings

01. Reformulates and generalizes the kernel calibration error, its estimators, and its hypothesis tests using scalar-valued kernels

02. Evaluates the calibration of models for real-valued regression problems

03. Provides a more intuitive reformulation of existing calibration measures

Abstract

Most supervised machine learning tasks are subject to irreducible prediction errors. Probabilistic predictive models address this limitation by providing probability distributions that represent a belief over plausible targets, rather than point estimates. Such models can be a valuable tool in decision-making under uncertainty, provided that the model output is meaningful and interpretable. Calibrated models guarantee that the probabilistic predictions are neither over- nor under-confident. In the machine learning literature, different measures and statistical tests have been proposed and studied for evaluating the calibration of classification models. For regression problems, however, research has been focused on a weaker condition of calibration based on predicted quantiles for real-valued targets. In this paper, we propose the first framework that unifies calibration evaluation and…
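As a rough illustration of the weaker, quantile-based notion of calibration that the abstract mentions for real-valued targets, the sketch below checks whether the predicted quantiles of a Gaussian regression model match observed frequencies. It is not the framework proposed in the paper. The function names (`normal_cdf`, `pit_values`, `quantile_calibration_gap`), the quantile levels, and the synthetic data are all assumptions made for this example.

```python
import math
import random


def normal_cdf(y, mean, std):
    """CDF of a Gaussian predictive distribution evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))


def pit_values(targets, means, stds):
    """Probability integral transform: predicted CDF at each observed target."""
    return [normal_cdf(y, m, s) for y, m, s in zip(targets, means, stds)]


def quantile_calibration_gap(pits, levels=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Largest gap between a nominal quantile level and its observed frequency.

    For a quantile-calibrated model, the fraction of targets falling below
    the predicted q-quantile should be close to q at every level.
    """
    n = len(pits)
    return max(abs(sum(p <= q for p in pits) / n - q) for q in levels)


# Synthetic data (hypothetical): targets are drawn from N(mean, 1).
rng = random.Random(0)
n_samples = 5000
means = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
targets = [rng.gauss(m, 1.0) for m in means]

# Model A predicts the true standard deviation; model B is over-confident.
gap_calibrated = quantile_calibration_gap(
    pit_values(targets, means, [1.0] * n_samples)
)
gap_overconfident = quantile_calibration_gap(
    pit_values(targets, means, [0.5] * n_samples)
)

print(f"calibrated model gap:     {gap_calibrated:.3f}")
print(f"over-confident model gap: {gap_overconfident:.3f}")
```

In this setup the over-confident model should show a clearly larger gap than the model with the correct predictive variance. A small gap here only supports quantile calibration, which is the weaker condition the abstract contrasts with the paper's more general framework.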

Code & Models

Repositories

devmotion/calibration_iclr2021 (official)

Videos

Calibration tests beyond classification · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Statistical Methods and Models · Statistical Methods and Inference