VOLTA: The Surprising Ineffectiveness of Auxiliary Losses for Calibrated Deep Learning
Rahul D Ray, Utkarsh Srivastava

TL;DR
This paper benchmarks ten uncertainty quantification baselines for deep learning against a simplified variant of VOLTA, finding that the variant is well calibrated and often matches or outperforms more complex baselines across multiple datasets.
Contribution
The paper introduces a simplified variant of VOLTA that is lightweight and deterministic yet achieves competitive calibration and accuracy, challenging the necessity of auxiliary losses in uncertainty quantification (UQ).
Findings
VOLTA achieves accuracy of up to 0.864 on CIFAR-10.
VOLTA lowers expected calibration error (ECE) to 0.010.
VOLTA reaches an AUROC of 0.802 on out-of-distribution detection.
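For readers unfamiliar with the calibration metric quoted above, the following is a minimal sketch of how expected calibration error is commonly computed with equal-width confidence bins. The function name, the choice of 15 bins, and the binning convention are assumptions made for illustration; the summary does not state which binning scheme the paper uses.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """Equal-width-bin ECE (illustrative; bin count is an assumption).

    probs:  (N, C) array of predicted class probabilities
    labels: (N,) array of integer class labels
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    confidences = probs.max(axis=1)       # confidence of the top prediction
    predictions = probs.argmax(axis=1)    # predicted class
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # |accuracy - mean confidence| in the bin, weighted by bin mass
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)
```

A model that predicts with 0.9 confidence but is right only half the time has a gap of 0.4 in that bin, so a lower ECE means reported confidence tracks observed accuracy more closely.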
Abstract
Uncertainty quantification (UQ) is essential for deploying deep learning models in safety-critical applications, yet no consensus exists on which UQ method performs best across different data modalities and distribution shifts. This paper presents a comprehensive benchmark of ten widely used UQ baselines (MC Dropout, SWAG, ensemble methods, temperature scaling, energy-based OOD, Mahalanobis, hyperbolic classifiers, ENN, Taylor Sensus, and split conformal prediction) against a simplified yet highly effective variant of VOLTA that retains only a deep encoder, learnable prototypes, cross-entropy loss, and post hoc temperature scaling. We evaluate all methods on CIFAR-10 (in-distribution); CIFAR-100, SVHN, and uniform noise (out-of-distribution); CIFAR-10-C (corruptions); and Tiny ImageNet features (tabular). VOLTA achieves competitive or superior accuracy (up to 0.864 on CIFAR-10),…
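The components named in the abstract (learnable prototypes followed by post hoc temperature scaling) can be sketched roughly as follows. This is not the authors' implementation: the use of negative squared Euclidean distance as the logit, the grid search over temperatures, and all function names are assumptions chosen to keep the example small. The encoder is omitted, so `features` stands in for its output.

```python
import numpy as np

def prototype_logits(features, prototypes):
    """Logits as negative squared distance to each class prototype.

    Assumption: the distance-based logit is illustrative only; the
    summary does not specify how VOLTA scores prototypes.
    features:   (N, D) encoder outputs
    prototypes: (C, D) one vector per class
    """
    diffs = features[:, None, :] - prototypes[None, :, :]
    return -np.sum(diffs ** 2, axis=2)

def softmax(logits):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def nll(logits, labels, temperature):
    """Mean negative log-likelihood of the labels at a given temperature."""
    probs = softmax(logits / temperature)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def fit_temperature(val_logits, val_labels, grid=None):
    """Pick the temperature that minimizes validation NLL.

    Assumption: a simple grid search stands in for the gradient-based
    optimizers often used for temperature scaling.
    """
    if grid is None:
        grid = np.linspace(0.05, 10.0, 200)
    losses = [nll(val_logits, val_labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])
```

Because dividing logits by a positive temperature does not change their ranking, this step adjusts confidence without changing predicted classes, so accuracy is unaffected.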
