Something for (almost) nothing: Improving deep ensemble calibration using unlabeled data
Konstantinos Pitas, Julyan Arbel

TL;DR
This paper improves the calibration of deep ensembles in the small-data regime by having each ensemble member fit a different randomly selected label on every unlabeled data point. A PAC-Bayes analysis and experiments show gains in ensemble diversity and calibration.
Contribution
The paper proposes a novel, easy-to-implement approach for improving deep ensemble calibration with unlabeled data, supported by theoretical analysis and empirical validation.
Findings
Improved calibration and diversity in deep ensembles with unlabeled data.
Better calibration than standard ensembles on small to moderately sized training sets, sometimes significantly.
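A PAC-Bayes bound guarantees low negative log-likelihood and high ensemble diversity on test samples when members fit the true labels on the training data and random labels on the unlabeled data.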
Abstract
We present a method to improve the calibration of deep ensembles in the small training data regime in the presence of unlabeled data. Our approach is extremely simple to implement: given an unlabeled set, for each unlabeled data point, we simply fit a different randomly selected label with each ensemble member. We provide a theoretical analysis based on a PAC-Bayes bound which guarantees that if we fit such a labeling on unlabeled data, and the true labels on the training data, we obtain low negative log-likelihood and high ensemble diversity on testing samples. Empirically, through detailed experiments, we find that for low to moderately-sized training sets, our ensembles are more diverse and provide better calibration than standard ensembles, sometimes significantly.
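The labeling step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `assign_random_labels` and its parameters are hypothetical, and it assumes that "a different randomly selected label with each ensemble member" means labels are drawn without replacement across members for each point, which requires the number of members to be at most the number of classes.

```python
import numpy as np


def assign_random_labels(num_unlabeled, num_members, num_classes, seed=0):
    """Return an integer array of shape (num_members, num_unlabeled).

    Entry [k, i] is the random label that ensemble member k fits on
    unlabeled point i. For a fixed point i, all members get distinct labels.
    (Hypothetical helper; assumes num_members <= num_classes.)
    """
    if num_members > num_classes:
        raise ValueError("distinct labels per point need num_members <= num_classes")
    rng = np.random.default_rng(seed)
    # Sorting i.i.d. uniform noise along each row yields an independent
    # random permutation of the class indices for every unlabeled point.
    perms = rng.random((num_unlabeled, num_classes)).argsort(axis=1)
    # Member k takes the k-th entry of each point's permutation.
    return perms[:, :num_members].T


# Example: 4 ensemble members, 6 unlabeled points, 10 classes.
labels = assign_random_labels(num_unlabeled=6, num_members=4, num_classes=10)
for k, member_labels in enumerate(labels):
    print(f"member {k}: {member_labels}")
```

Under this sketch, member k would then be trained on the labeled training set with its true labels together with the unlabeled points paired with `labels[k]`.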
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Anomaly Detection Techniques and Applications
Methods: Deep Ensembles
