Using Platt's scaling for calibration after undersampling -- limitations and how to address them
Nathan Phelps, Daniel J. Lizotte, and Douglas G. Woolford

TL;DR
This paper critically examines the use of Platt's scaling for calibrating models trained on undersampled data, revealing its limitations and proposing a modified approach for better calibration.
Contribution
It provides what the authors believe is the first detailed analysis of the limitations of Platt's scaling after undersampling, and introduces a theoretically motivated modification using logistic generalized additive models (GAMs).
Findings
Platt's scaling often fails to calibrate models trained on undersampled data.
A modified calibration method using logistic GAMs performs better in various scenarios.
Using the original Platt's scaling without critical assessment can lead to biased calibration.
Abstract
When modelling data where the response is dichotomous and highly imbalanced, response-based sampling where a subset of the majority class is retained (i.e., undersampling) is often used to create more balanced training datasets prior to modelling. However, the models fit to this undersampled data, which we refer to as base models, generate predictions that are severely biased. There are several calibration methods that can be used to combat this bias, one of which is Platt's scaling. Here, a logistic regression model is used to model the relationship between the base model's original predictions and the response. Despite its popularity for calibrating models after undersampling, Platt's scaling was not designed for this purpose. Our work presents what we believe is the first detailed study focused on the validity of using Platt's scaling to calibrate models after undersampling. We show…
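The calibration step described in the abstract can be sketched as follows. This is a minimal illustration of standard Platt's scaling only, not the authors' code and not their proposed modification. Everything specific here is an assumption made for the example: the data are simulated, the base model is idealized (its predictions are taken to be the exact posterior under undersampling), the retention rate `BETA` is hypothetical, the raw base predictions are used as the single input, and the calibration model is fit on data drawn from the original, imbalanced distribution.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, max_iter=100, tol=1e-10):
    """Fit P(y=1 | s) = sigmoid(a*s + b) by Newton-Raphson (maximum likelihood)."""
    prevalence = sum(labels) / len(labels)
    # Start from the intercept-only fit: slope 0, intercept = logit(prevalence).
    a, b = 0.0, math.log(prevalence / (1.0 - prevalence))
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            w = p * (1.0 - p)
            g_a += (y - p) * s   # gradient of the log-likelihood w.r.t. a
            g_b += (y - p)       # gradient of the log-likelihood w.r.t. b
            h_aa += w * s * s    # entries of the 2x2 information matrix
            h_ab += w * s
            h_bb += w
        det = h_aa * h_bb - h_ab * h_ab
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


random.seed(0)
BETA = 0.05  # hypothetical probability of retaining a majority-class case

# Simulated calibration data from the original (imbalanced) distribution.
true_p, labels = [], []
for _ in range(5000):
    x = random.gauss(0.0, 1.0)
    p = sigmoid(-4.0 + x)
    true_p.append(p)
    labels.append(1 if random.random() < p else 0)

# Idealized base model (assumption): undersampling divides the odds of the
# minority class by BETA, so its prediction is p / (p + BETA * (1 - p)).
base_pred = [p / (p + BETA * (1.0 - p)) for p in true_p]

# Platt's scaling: logistic regression of the response on the base prediction.
a, b = fit_platt(base_pred, labels)
calibrated = [sigmoid(a * s + b) for s in base_pred]

prevalence = sum(labels) / len(labels)
mean_base = sum(base_pred) / len(base_pred)
mean_cal = sum(calibrated) / len(calibrated)
print(f"observed prevalence:      {prevalence:.4f}")
print(f"mean base prediction:     {mean_base:.4f}")
print(f"mean calibrated (Platt):  {mean_cal:.4f}")
```

Because the logistic fit includes an intercept, the mean calibrated prediction matches the observed prevalence on the data used for fitting. That property alone says nothing about whether predictions are well calibrated across their whole range, which is the question the paper examines.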
Taxonomy
Topics: Advanced Statistical Methods and Models · Fault Detection and Control Systems
Methods: Logistic Regression · Balanced Selection
