Improving Calibration by Relating Focal Loss, Temperature Scaling, and Properness
Viacheslav Komisarenko, Meelis Kull

TL;DR
This paper explains why focal loss improves test calibration over cross-entropy by decomposing it into a confidence-raising transformation and a proper loss, and introduces focal temperature scaling, a new calibration method that outperforms standard temperature scaling.
Contribution
It provides a theoretical explanation for focal loss's calibration benefits, reveals its connection to temperature scaling, and proposes a novel calibration method called focal temperature scaling.
Findings
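Focal loss decomposes into a confidence-raising transformation and a proper loss, so training yields under-confident predictions on the training data; this offsets the overconfidence that the generalization gap would otherwise cause on test data.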
Focal temperature scaling outperforms standard temperature scaling in experiments.
A new theoretical link between focal loss and temperature scaling is established.
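For readers unfamiliar with the baseline, standard temperature scaling can be sketched as below. This is a minimal illustration of the standard method only, not of the paper's focal temperature scaling; the logits and temperature values are hypothetical, and in practice the temperature is usually fit on held-out validation data rather than chosen by hand.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def temperature_scale(logits, temperature):
    # Standard temperature scaling: divide the logits by a scalar T > 0
    # before applying softmax. T > 1 softens the predicted distribution.
    return softmax(logits / temperature)

# Hypothetical logits for a single example with three classes.
logits = np.array([[4.0, 1.0, 0.0]])
probs_raw = temperature_scale(logits, 1.0)     # unscaled prediction
probs_scaled = temperature_scale(logits, 2.0)  # assumed temperature T = 2
```

With a temperature above 1, the top-class probability shrinks while the predicted class stays the same, which is why the method can correct overconfidence without changing accuracy.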
Abstract
Proper losses such as cross-entropy incentivize classifiers to produce class probabilities that are well-calibrated on the training data. Due to the generalization gap, these classifiers tend to become overconfident on the test data, mandating calibration methods such as temperature scaling. The focal loss is not proper, but training with it has been shown to often result in classifiers that are better calibrated on test data. Our first contribution is a simple explanation about why focal loss training often leads to better calibration than cross-entropy training. For this, we prove that focal loss can be decomposed into a confidence-raising transformation and a proper loss. This is why focal loss pushes the model to provide under-confident predictions on the training data, resulting in being better calibrated on the test data, due to the generalization gap. Secondly, we reveal a strong…
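The claim that focal loss is not proper and favors under-confident predictions can be checked numerically in the binary case. The sketch below assumes the usual focal loss form, -(1 - p)^γ log p for the true class, and uses hypothetical values q = 0.8 and γ = 2; it searches a grid for the prediction that minimizes the expected loss. It illustrates the effect only and does not reproduce the paper's decomposition.

```python
import numpy as np

def expected_focal_loss(p, q, gamma):
    # Expected binary focal loss when the positive class occurs with true
    # probability q and the model predicts probability p for it.
    # gamma = 0 recovers the cross-entropy loss.
    return -(q * (1 - p) ** gamma * np.log(p)
             + (1 - q) * p ** gamma * np.log(1 - p))

def best_prediction(q, gamma):
    # Grid search for the prediction that minimizes the expected loss.
    grid = np.linspace(0.001, 0.999, 999)
    return grid[np.argmin(expected_focal_loss(grid, q, gamma))]

q = 0.8  # hypothetical true probability of the positive class
p_ce = best_prediction(q, gamma=0.0)     # cross-entropy (proper loss)
p_focal = best_prediction(q, gamma=2.0)  # focal loss with gamma = 2
```

Under cross-entropy the minimizer matches q up to the grid resolution, as expected of a proper loss. Under focal loss the minimizer lies strictly between 0.5 and q, so the loss rewards a prediction that is less confident than the true probability.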
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Imbalanced Data Classification Techniques
Methods: Focal Loss
