A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration
Ramya Hebbalaguppe, Jatin Prakash, Neelabh Madan, Chetan Arora

TL;DR
This paper introduces MDCA, a novel train-time auxiliary loss for neural networks that significantly improves calibration across classification, segmentation, and language tasks, especially under domain shift and class imbalance.
Contribution
The paper proposes MDCA, a new auxiliary loss function for training neural networks that enhances calibration directly during training, unlike existing post-hoc methods.
Findings
MDCA achieves lower ECE and SCE scores on CIFAR-100 than SOTA methods.
MDCA reduces calibration error under domain shift on the PACS dataset.
MDCA doubles calibration accuracy on the PASCAL-VOC segmentation task.
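For readers unfamiliar with the ECE metric named in the findings, the following is a minimal sketch of the standard binned Expected Calibration Error. The function name, the choice of 10 equal-width bins, and the `(lo, hi]` bin convention are assumptions made for illustration; this page does not state which binning the paper uses.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - confidence| per bin.

    confidences: top-1 predicted probability for each sample, in (0, 1].
    correct:     1 if the top-1 prediction was right, else 0.
    n_bins:      number of equal-width bins (assumed, not from the source).
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Samples whose confidence falls in the half-open bin (lo, hi].
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            # Weight the gap by the fraction of samples in this bin.
            ece += in_bin.mean() * gap
    return float(ece)
```

A model that predicts with 95% confidence but is right only 75% of the time gets an ECE of about 0.2 under this definition; a model whose confidence matches its accuracy in every bin gets 0.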
Abstract
Deep Neural Networks (DNNs) are known to make overconfident mistakes, which makes their use problematic in safety-critical applications. State-of-the-art (SOTA) calibration techniques improve on the confidence of predicted labels alone and leave the confidence of non-max classes (e.g. top-2, top-5) uncalibrated. Such calibration is not suitable for label refinement using post-processing. Further, most SOTA techniques learn a few hyper-parameters post-hoc, leaving out the scope for image, or pixel specific calibration. This makes them unsuitable for calibration under domain shift, or for dense prediction tasks like semantic segmentation. In this paper, we argue for intervening at the train time itself, so as to directly produce calibrated DNN models. We propose a novel auxiliary loss function: Multi-class Difference in Confidence and Accuracy (MDCA), to achieve the same. MDCA can be…
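The abstract is cut off before the loss is defined, so the following is only a rough sketch of what a "multi-class difference in confidence and accuracy" term could look like. It assumes the loss is the mean, over classes, of the absolute gap between the batch-averaged predicted probability for a class and the batch frequency of that class in the labels. That form, and the names `softmax` and `mdca_loss`, are assumptions for illustration and should be checked against the paper itself.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mdca_loss(logits, labels):
    """Assumed MDCA-style term for one mini-batch.

    logits: array of shape (batch, classes).
    labels: integer class indices of shape (batch,).
    """
    probs = softmax(np.asarray(logits, dtype=float))
    num_classes = probs.shape[1]
    one_hot = np.eye(num_classes)[np.asarray(labels)]
    mean_confidence = probs.mean(axis=0)    # average predicted prob per class
    mean_frequency = one_hot.mean(axis=0)   # fraction of batch with each label
    # Average the per-class gap over all classes, not only the predicted one.
    return float(np.abs(mean_confidence - mean_frequency).mean())
```

Because the gap is computed for every class rather than only the arg-max class, a term of this shape would penalize miscalibration of non-max classes as well, which matches the motivation given in the abstract. In training it would presumably be added to a classification loss with a weighting factor; the choice of that factor is not given on this page.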
Taxonomy
Topics: COVID-19 diagnosis using AI · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
Methods: Focal Loss
