Is BatchEnsemble a Single Model? On Calibration and Diversity of Efficient Ensembles
Anton Zamyatin, Patrick Indri, Sagar Malhotra, Thomas Gärtner

TL;DR
BatchEnsemble aims to provide ensemble-like uncertainty estimates efficiently, but it behaves more like a single model with limited diversity, closely matching a single-model baseline in accuracy and calibration.
Contribution
The paper critically evaluates BatchEnsemble and shows that it functions more like a single model than a true ensemble, challenging assumptions about its diversity and effectiveness.
Findings
BatchEnsemble underperforms Deep Ensembles in accuracy and calibration.
Members of BatchEnsemble are nearly identical in function and parameters.
BatchEnsemble behaves more like a single model than a true ensemble.
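As an illustration of what "nearly identical in function and parameters" can mean in practice, the sketch below computes two simple diversity measures for a set of ensemble members: the average pairwise disagreement of predicted labels (function space) and the average pairwise cosine similarity of flattened parameters (parameter space). These metrics and the helper names `pairwise_disagreement` and `mean_pairwise_cosine` are assumptions chosen for illustration; the listing does not state which measures the paper uses.

```python
import numpy as np


def pairwise_disagreement(probs):
    """Mean fraction of inputs on which two members predict different labels.

    probs: array of shape (M, N, C) with per-member class probabilities.
    Returns the disagreement rate averaged over all member pairs.
    """
    labels = probs.argmax(axis=-1)  # (M, N) hard predictions per member
    m = labels.shape[0]
    rates = [
        np.mean(labels[i] != labels[j])
        for i in range(m)
        for j in range(i + 1, m)
    ]
    return float(np.mean(rates))


def mean_pairwise_cosine(params):
    """Mean cosine similarity between flattened member parameter vectors.

    params: array of shape (M, P), one flattened parameter vector per member.
    """
    unit = params / np.linalg.norm(params, axis=1, keepdims=True)
    sim = unit @ unit.T  # (M, M) cosine similarity matrix
    upper = np.triu_indices(params.shape[0], k=1)  # distinct pairs only
    return float(sim[upper].mean())


# Toy data (hypothetical): three members with identical predictions.
probs_same = np.tile(np.array([[[0.9, 0.1], [0.2, 0.8]]]), (3, 1, 1))
# Two members that disagree on one of two inputs.
probs_diff = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.4, 0.6], [0.3, 0.7]],
])

same_rate = pairwise_disagreement(probs_same)
diff_rate = pairwise_disagreement(probs_diff)
collinear = mean_pairwise_cosine(np.array([[1.0, 2.0], [2.0, 4.0]]))
orthogonal = mean_pairwise_cosine(np.array([[1.0, 0.0], [0.0, 1.0]]))
```

Under these measures, a disagreement rate near zero together with a cosine similarity near one would be consistent with members that act as a single model.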
Abstract
In resource-constrained and low-latency settings, uncertainty estimates must be efficiently obtained. Deep Ensembles provide robust epistemic uncertainty (EU) but require training multiple full-size models. BatchEnsemble aims to deliver ensemble-like EU at far lower parameter and memory cost by applying learned rank-1 perturbations to a shared base network. We show that BatchEnsemble not only underperforms Deep Ensembles but closely tracks a single model baseline in terms of accuracy, calibration and out-of-distribution (OOD) detection on CIFAR10/10C/SVHN. A controlled study on MNIST finds members are near-identical in function and parameter space, indicating limited capacity to realize distinct predictive modes. Thus, BatchEnsemble behaves more like a single model than a true ensemble.
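The rank-1 perturbation mentioned in the abstract can be sketched for a single linear layer. This is a minimal NumPy illustration, assuming the standard BatchEnsemble formulation in which member i uses the weight W_i = W ∘ (r_i s_iᵀ), with W shared and r_i, s_i learned per member. The function names, shapes, and random initialization are assumptions for illustration, not the authors' code.

```python
import numpy as np


def batchensemble_linear(x, W, r, s, b):
    """Efficient forward pass for all members at once.

    x: (M, B, d_in)  inputs, one batch per member
    W: (d_in, d_out) shared weight matrix
    r: (M, d_in)     per-member input scaling vectors
    s: (M, d_out)    per-member output scaling vectors
    b: (M, d_out)    per-member biases
    Computes ((x * r_i) @ W) * s_i + b_i without materializing W_i.
    """
    return ((x * r[:, None, :]) @ W) * s[:, None, :] + b[:, None, :]


def explicit_member_weights(W, r, s):
    """Materialize W_i = W * outer(r_i, s_i) for every member: (M, d_in, d_out)."""
    return W[None, :, :] * r[:, :, None] * s[:, None, :]


rng = np.random.default_rng(0)
M, B, d_in, d_out = 4, 8, 5, 3  # hypothetical sizes

W = rng.normal(size=(d_in, d_out))
r = 1.0 + 0.1 * rng.normal(size=(M, d_in))   # small perturbations around 1
s = 1.0 + 0.1 * rng.normal(size=(M, d_out))
b = np.zeros((M, d_out))

x = rng.normal(size=(B, d_in))
x_rep = np.broadcast_to(x, (M, B, d_in))  # same batch fed to every member

y_fast = batchensemble_linear(x_rep, W, r, s, b)

# Reference: apply each member's explicit weight matrix.
W_members = explicit_member_weights(W, r, s)
y_slow = np.einsum("bi,mio->mbo", x, W_members) + b[:, None, :]

# Parameter counts: the shared matrix dominates; each member adds only vectors.
shared_params = W.size
per_member_params = r.shape[1] + s.shape[1] + b.shape[1]
```

In this sketch each member adds only d_in + 2·d_out parameters on top of the shared d_in·d_out matrix, which is where the parameter and memory savings come from. It also means that every member's weights are an elementwise rescaling of the same W.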
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Graph Neural Networks
