Probabilistic Scores of Classifiers, Calibration is not Enough

Agathe Fernandes Machado; Arthur Charpentier; Emmanuel Flachaire; Ewen Gallic; François Hu

arXiv:2408.03421 · cs.LG · August 8, 2024

TL;DR

This paper argues that traditional calibration metrics are insufficient for probabilistic predictions in binary classification, and demonstrates that optimizing KL divergence with tree-based models improves score alignment with true probabilities.

Contribution

The paper prioritizes aligning the distribution of predicted scores with the distribution of true probabilities, measured by KL divergence, over minimizing traditional calibration metrics, particularly when tuning tree-based models.

Findings

1. Optimizing KL divergence improves score-probability alignment.

2. Relying on traditional calibration metrics may lead to a decline in performance.

3. Tree-based models can be tuned to minimize distributional divergence.
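
To make the notion of distributional divergence concrete, here is a minimal sketch of a KL divergence between the distribution of true probabilities and the distribution of predicted scores. The histogram discretization, the number of bins, the smoothing constant `eps`, and the function name `kl_divergence_hist` are all assumptions made for illustration, not the authors' estimator. The sketch also assumes a simulation setting, since true probabilities are generally not observable on real data.

```python
import numpy as np

def kl_divergence_hist(p_true, scores, n_bins=20, eps=1e-10):
    """KL(P || Q) between binned true probabilities (P) and binned scores (Q).

    Hypothetical helper: both samples are assumed to lie in [0, 1].
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    p_counts, _ = np.histogram(p_true, bins=bins)
    q_counts, _ = np.histogram(scores, bins=bins)
    # Smooth with eps so empty bins do not produce log(0) or a division by zero.
    p = (p_counts + eps) / (p_counts + eps).sum()
    q = (q_counts + eps) / (q_counts + eps).sum()
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
# Simulated "true" probabilities (an assumption; unknown on real data).
p_true = rng.beta(2.0, 5.0, size=5000)
# Scores that track the true probabilities closely.
aligned = np.clip(p_true + rng.normal(0.0, 0.02, size=p_true.size), 0.0, 1.0)
# Scores collapsed to a single value, with no heterogeneity at all.
shrunk = np.full_like(p_true, p_true.mean())

kl_aligned = kl_divergence_hist(p_true, aligned)
kl_shrunk = kl_divergence_hist(p_true, shrunk)
print(f"KL aligned: {kl_aligned:.4f}  KL collapsed: {kl_shrunk:.4f}")
```

Under these assumptions, scores that reproduce the spread of the true probabilities yield a small divergence, while scores concentrated on one value yield a large one.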

Abstract

In binary classification tasks, accurate representation of probabilistic predictions is essential for various real-world applications such as predicting payment defaults or assessing medical risks. The model must then be well-calibrated to ensure alignment between predicted probabilities and actual outcomes. However, when score heterogeneity deviates from the underlying data probability distribution, traditional calibration metrics lose reliability, failing to align score distribution with actual probabilities. In this study, we highlight approaches that prioritize optimizing the alignment between predicted scores and true probability distributions over minimizing traditional performance or calibration metrics. When employing tree-based models such as Random Forest and XGBoost, our analysis emphasizes the flexibility these models offer in tuning hyperparameters to minimize the…
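
A small simulation can illustrate why calibration alone may not guarantee that scores reflect the underlying probabilities. The sketch below uses a simple binned expected calibration error (ECE) as a stand-in for a traditional calibration metric; the function name, the binning scheme, and the Beta-distributed true probabilities are assumptions for illustration and are not taken from the paper. A constant score equal to the observed event rate is calibrated by construction, yet it carries none of the heterogeneity present in the true probabilities.

```python
import numpy as np

def expected_calibration_error(y, scores, n_bins=10):
    """Binned ECE: weighted mean gap between observed rate and mean score per bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each score to a bin index in 0..n_bins-1.
    idx = np.clip(np.digitize(scores, bins[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            gap = abs(y[mask].mean() - scores[mask].mean())
            ece += mask.mean() * gap  # weight by the share of observations in the bin
    return float(ece)

rng = np.random.default_rng(0)
# Simulated true probabilities and binary outcomes (assumed setting).
p_true = rng.beta(2.0, 5.0, size=20000)
y = rng.binomial(1, p_true)

# A constant score equal to the observed event rate.
constant = np.full(p_true.size, y.mean())

ece_constant = expected_calibration_error(y, constant)
ece_true = expected_calibration_error(y, p_true)
print(f"ECE constant: {ece_constant:.4f}  ECE true probabilities: {ece_true:.4f}")
print(f"Std of constant scores: {constant.std():.4f}  Std of true probabilities: {p_true.std():.4f}")
```

Both sets of scores look well calibrated under this metric, but only one of them has a distribution resembling that of the true probabilities.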

Code & Models

Repositories

fer-agathe/scores-classif-calibration (official)

Taxonomy

Topics: Statistical and Computational Modeling

Methods: ALIGN