BatchEnsemble: An Alternative Approach to Efficient Ensemble and Lifelong Learning
Yeming Wen, Dustin Tran, Jimmy Ba

TL;DR
BatchEnsemble is a computationally efficient ensemble method in which each member's weight matrix is the elementwise product of a shared weight matrix and a per-member rank-one matrix. It achieves accuracy and uncertainty comparable to those of traditional ensembles at significantly lower training and inference cost.
Contribution
The paper presents BatchEnsemble, a novel ensemble approach that reduces computational and memory costs while maintaining competitive accuracy, and demonstrates its effectiveness in lifelong learning scenarios.
Findings
3X speedup at test time at an ensemble size of 4, compared to traditional ensembles
3X memory reduction at an ensemble size of 4, compared to traditional ensembles
Accuracy and predictive uncertainty comparable to traditional ensembles across the evaluated tasks
Abstract
Ensembles, where multiple neural networks are trained individually and their predictions are averaged, have been shown to be widely successful for improving both the accuracy and predictive uncertainty of single neural networks. However, an ensemble's cost for both training and testing increases linearly with the number of networks, which quickly becomes untenable. In this paper, we propose BatchEnsemble, an ensemble method whose computational and memory costs are significantly lower than typical ensembles. BatchEnsemble achieves this by defining each weight matrix to be the Hadamard product of a shared weight among all ensemble members and a rank-one matrix per member. Unlike ensembles, BatchEnsemble is not only parallelizable across devices, where one device trains one member, but also parallelizable within a device, where multiple ensemble members are updated simultaneously for a…
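The weight construction described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the use of plain Python lists, and the layer shapes are assumptions made for illustration. It builds one member's weight as the Hadamard product of a shared matrix W and a rank-one matrix r s^T, then checks the algebraic identity (W ∘ r s^T)^T x = (W^T (x ∘ r)) ∘ s, which means the per-member matrix never has to be materialized.

```python
import random

def matvec_t(w, x):
    """Compute W^T x, where w holds m rows of n entries and x has length m."""
    m, n = len(w), len(w[0])
    return [sum(w[i][j] * x[i] for i in range(m)) for j in range(n)]

def member_weight(shared_w, r, s):
    """Per-member weight: shared_w multiplied elementwise by the rank-one matrix r s^T."""
    return [[shared_w[i][j] * r[i] * s[j] for j in range(len(s))]
            for i in range(len(r))]

def explicit_forward(shared_w, r, s, x):
    """Materialize the member's full weight matrix, then apply it to x."""
    return matvec_t(member_weight(shared_w, r, s), x)

def factored_forward(shared_w, r, s, x):
    """Same output using only the shared matrix: scale the input by r, the output by s."""
    scaled_in = [xi * ri for xi, ri in zip(x, r)]
    out = matvec_t(shared_w, scaled_in)
    return [oj * sj for oj, sj in zip(out, s)]

def parameter_counts(m, n, members):
    """Parameters in one m-by-n dense layer (biases ignored): (independent copies, shared + rank-one)."""
    independent = members * m * n
    shared_plus_rank_one = m * n + members * (m + n)
    return independent, shared_plus_rank_one

# Hypothetical sizes, chosen only for the demonstration.
rng = random.Random(0)
m, n, members = 5, 3, 4
shared_w = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
x = [rng.gauss(0, 1) for _ in range(m)]

max_gap = 0.0
for _ in range(members):
    r = [rng.gauss(0, 1) for _ in range(m)]  # per-member input-side vector
    s = [rng.gauss(0, 1) for _ in range(n)]  # per-member output-side vector
    a = explicit_forward(shared_w, r, s, x)
    b = factored_forward(shared_w, r, s, x)
    max_gap = max(max_gap, max(abs(p - q) for p, q in zip(a, b)))

print("largest difference between the two forward passes:", max_gap)
print("parameter counts (independent, shared + rank-one):", parameter_counts(m, n, members))
```

Each member adds only m + n parameters per layer on top of the shared m-by-n matrix, which is where the memory saving in this sketch comes from. The counts here cover a single dense layer and are not the paper's measured memory figures.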
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Sparse and Compressive Sensing Techniques
Methods: Test
