Ensemble Model Patching: A Parameter-Efficient Variational Bayesian Neural Network
Oscar Chang, Yuling Yao, David Williams-King, Hod Lipson

TL;DR
This paper introduces ensemble model patching, a parameter-efficient approach to variational Bayesian neural networks that avoids the high parameter and implementation overhead of earlier methods while improving accuracy and calibration on large-scale image classification.
Contribution
It proposes a general variational family for ensemble-based Bayesian neural networks that includes dropout as a special case, along with two members of that family that work well with batch normalization while keeping parameter and programming overhead low.
Findings
Improved predictive accuracy on ImageNet with ResNet-18.
Achieved near-perfect calibration in Bayesian neural networks.
Reduced parameter and programming overhead compared to traditional methods.
Abstract
Two main obstacles preventing the widespread adoption of variational Bayesian neural networks are the high parameter overhead that makes them infeasible on large networks, and the difficulty of implementation, which can be thought of as "programming overhead." MC dropout [Gal and Ghahramani, 2016] is popular because it sidesteps these obstacles. Nevertheless, dropout is often harmful to model performance when used in networks with batch normalization layers [Li et al., 2018], which are an indispensable part of modern neural networks. We construct a general variational family for ensemble-based Bayesian neural networks that encompasses dropout as a special case. We further present two specific members of this family that work well with batch normalization layers, while retaining the benefits of low parameter and programming overhead, comparable to non-Bayesian training. Our proposed…
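For readers unfamiliar with the baseline the abstract refers to, the following is a minimal NumPy sketch of MC dropout prediction in the sense of Gal and Ghahramani [2016]: dropout stays active at test time and the softmax outputs of several stochastic forward passes are averaged. It is not the ensemble model patching method proposed in the paper. The two-layer network, the random weights, and the names `mc_dropout_predict`, `drop_rate`, and `n_samples` are illustrative assumptions, not taken from the source.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mc_dropout_predict(x, w1, w2, drop_rate=0.5, n_samples=100, seed=0):
    """Average class probabilities over stochastic forward passes.

    Hypothetical helper: a two-layer ReLU network whose hidden units are
    dropped independently on every pass, including at prediction time.
    """
    rng = np.random.default_rng(seed)
    keep = 1.0 - drop_rate
    probs = []
    for _ in range(n_samples):
        h = np.maximum(x @ w1, 0.0)           # hidden activations
        mask = rng.random(h.shape) < keep     # fresh Bernoulli mask per pass
        h = h * mask / keep                   # inverted-dropout rescaling
        probs.append(softmax(h @ w2))
    probs = np.stack(probs)                   # (n_samples, batch, classes)
    # The mean approximates the predictive distribution; the standard
    # deviation across passes is a rough per-class uncertainty signal.
    return probs.mean(axis=0), probs.std(axis=0)


# Toy data and weights, chosen only to make the sketch runnable.
rng = np.random.default_rng(42)
w1 = rng.normal(size=(4, 16))
w2 = rng.normal(size=(16, 3))
x = rng.normal(size=(5, 4))

mean_probs, std_probs = mc_dropout_predict(x, w1, w2)
```

Because the only extra state is the dropout mask, this baseline adds no parameters over a deterministic network, which is the low-overhead property the abstract attributes to MC dropout.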
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Dropout · Batch Normalization
