Revisiting the Calibration of Modern Neural Networks
Matthias Minderer, Josip Djolonga, Rob Romijnders, Frances Hubis, Xiaohua Zhai, Neil Houlsby, Dustin Tran, Mario Lucic

TL;DR
This paper revisits the calibration of recent state-of-the-art image classification models and finds that the most recent ones, notably those that do not use convolutions, are among the best calibrated. Model size and amount of pretraining do not fully explain the differences, which suggests that architecture is a major determinant of calibration.
Contribution
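It systematically relates calibration to accuracy across recent image classification models and identifies architecture as a major determinant of calibration, beyond what model size and amount of pretraining explain. This contrasts with earlier reports suggesting that newer, more accurate models are poorly calibrated.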
Findings
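The most recent models, notably those not using convolutions, are among the best calibrated.
Trends seen in earlier model generations, such as calibration decaying with distribution shift or model size, are less pronounced in recent architectures.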
Model architecture significantly impacts calibration properties.
Abstract
Accurate estimation of predictive uncertainty (model calibration) is essential for the safe application of neural networks. Many instances of miscalibration in modern neural networks have been reported, suggesting a trend that newer, more accurate models produce poorly calibrated predictions. Here, we revisit this question for recent state-of-the-art image classification models. We systematically relate model calibration and accuracy, and find that the most recent models, notably those not using convolutions, are among the best calibrated. Trends observed in prior model generations, such as decay of calibration with distribution shift or model size, are less pronounced in recent architectures. We also show that model size and amount of pretraining do not fully explain these differences, suggesting that architecture is a major determinant of calibration properties.
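The abstract does not say how calibration is measured. One widely used summary statistic is the expected calibration error (ECE), which bins predictions by confidence and averages the gap between accuracy and mean confidence in each bin, weighted by bin size. The sketch below is a minimal illustration under that assumption, not the authors' evaluation code; the function name, the equal-width binning, and the default of 10 bins are illustrative choices.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE (illustrative helper, not from the paper).

    confidences: top-class predicted probabilities in [0, 1]
    correct:     1 if the top-class prediction was right, else 0
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Interior bin edges, e.g. 0.1, 0.2, ..., 0.9 for n_bins=10.
    edges = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    # Bin index in [0, n_bins - 1]; each bin is closed on the right.
    bin_ids = np.digitize(confidences, edges, right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between empirical accuracy and mean confidence in this bin.
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            # Weight the gap by the fraction of samples in the bin.
            ece += mask.mean() * gap
    return float(ece)
```

A perfectly calibrated model has an ECE of 0: among predictions made with confidence 0.75, for example, 75% are correct. Larger values mean confidence and accuracy diverge. The estimate depends on the number of bins and on sample size, so values are only comparable when computed the same way.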
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
