Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors
Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran

TL;DR
This paper introduces a rank-1 parameterization for Bayesian neural networks that enhances scalability and efficiency, combining benefits of BNNs and deep ensembles, and achieves state-of-the-art results on multiple benchmarks.
Contribution
The authors propose a novel rank-1 parameterization for BNNs and a memory-efficient mixture posterior approach, improving scalability and performance over existing methods.
Findings
Rank-1 BNNs outperform traditional BNNs in accuracy and calibration.
Mixture posteriors with minimal memory increase capture multiple modes effectively.
Rank-1 BNNs achieve state-of-the-art results on the ImageNet, CIFAR, and MIMIC-III datasets.
Abstract
Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning. However, they generally struggle with underfitting at scale and parameter efficiency. On the other hand, deep ensembles have emerged as alternatives for uncertainty quantification that, while outperforming BNNs on certain problems, also suffer from efficiency issues. It remains unclear how to combine the strengths of these two approaches and remediate their common issues. To tackle this challenge, we propose a rank-1 parameterization of BNNs, where each weight matrix involves only a distribution on a rank-1 subspace. We also revisit the use of mixture approximate posteriors to capture multiple modes, where unlike typical mixtures, this approach admits a significantly smaller memory increase (e.g., only a 0.4% increase for a ResNet-50 mixture…
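The rank-1 idea in the abstract can be illustrated with a small NumPy sketch. It assumes one common way to realize such a parameterization: a shared deterministic weight matrix `W` is perturbed elementwise by the outer product of two random vectors `r` and `s`, so only the vectors carry a distribution. The Gaussian form, the layer sizes, the number of mixture components `K`, and every name below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: one dense layer and K mixture components.
d_in, d_out, batch, K = 512, 256, 4, 4

# Shared weight matrix, treated as a point estimate in this sketch.
W = rng.normal(scale=0.05, size=(d_in, d_out))

# Assumed Gaussian variational parameters for the rank-1 factors,
# one set per mixture component. All components share W.
r_mean = 1.0 + 0.1 * rng.normal(size=(K, d_in))
s_mean = 1.0 + 0.1 * rng.normal(size=(K, d_out))
r_std = np.full((K, d_in), 0.1)
s_std = np.full((K, d_out), 0.1)


def sample_factors(k):
    """Draw one (r, s) pair from mixture component k."""
    r = r_mean[k] + r_std[k] * rng.normal(size=d_in)
    s = s_mean[k] + s_std[k] * rng.normal(size=d_out)
    return r, s


def rank1_dense(x, r, s):
    """Compute x @ (W * outer(r, s)) without building the perturbed matrix."""
    return ((x * r) @ W) * s


x = rng.normal(size=(batch, d_in))
r, s = sample_factors(k=0)

y_fast = rank1_dense(x, r, s)
y_full = x @ (W * np.outer(r, s))  # explicit rank-1 perturbation, for comparison
equivalent = np.allclose(y_fast, y_full)

# Extra parameters added by all K components, relative to the shared matrix.
extra_params = r_mean.size + s_mean.size + r_std.size + s_std.size
relative_increase = extra_params / W.size
```

Because each component stores only vectors of length `d_in` and `d_out`, the added memory grows with `d_in + d_out` rather than `d_in * d_out`, which is why a mixture over rank-1 factors can be far cheaper than a full ensemble. The ratio computed here is for this toy layer only and is not the figure quoted in the abstract.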
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
Methods: Deep Ensembles · 1x1 Convolution · Bottleneck Residual Block · Batch Normalization · Average Pooling · Max Pooling · Global Average Pooling · Residual Connection · Kaiming Initialization
