Ensemble Model Patching: A Parameter-Efficient Variational Bayesian Neural Network

Oscar Chang; Yuling Yao; David Williams-King; Hod Lipson

arXiv:1905.09453·cs.LG·May 24, 2019·5 cites

Open Access

TL;DR

This paper introduces Ensemble Model Patching, a parameter-efficient approach to variational Bayesian neural networks that avoids the high parameter and implementation overhead of earlier methods while improving accuracy and calibration on large-scale image classification.

Contribution

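It constructs a general variational family for ensemble-based Bayesian neural networks that includes dropout as a special case, and presents two members of that family that work well with batch normalization while keeping parameter and programming overhead low.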

Findings

01. Improved predictive accuracy on ImageNet with ResNet-18.

02. Achieved near-perfect calibration in Bayesian neural networks.

03. Reduced parameter and programming overhead compared to traditional methods.

Abstract

Two main obstacles preventing the widespread adoption of variational Bayesian neural networks are the high parameter overhead that makes them infeasible on large networks, and the difficulty of implementation, which can be thought of as "programming overhead." MC dropout [Gal and Ghahramani, 2016] is popular because it sidesteps these obstacles. Nevertheless, dropout is often harmful to model performance when used in networks with batch normalization layers [Li et al., 2018], which are an indispensable part of modern neural networks. We construct a general variational family for ensemble-based Bayesian neural networks that encompasses dropout as a special case. We further present two specific members of this family that work well with batch normalization layers, while retaining the benefits of low parameter and programming overhead, comparable to non-Bayesian training. Our proposed…
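As a point of reference for the dropout baseline the abstract mentions, MC dropout keeps dropout active at prediction time and averages several stochastic forward passes. The following is a minimal NumPy sketch under assumed conditions: the two-layer network, its random weights, and the names `mc_dropout_predict`, `drop_prob`, and `num_samples` are hypothetical and not taken from the paper, and the code does not implement the paper's ensemble model patching.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mc_dropout_predict(x, w1, w2, drop_prob=0.5, num_samples=100, rng=None):
    """Average class probabilities over several dropout-perturbed passes.

    Illustrative only: a toy two-layer network with hypothetical weights.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    keep = 1.0 - drop_prob
    probs = []
    for _ in range(num_samples):
        h = np.maximum(x @ w1, 0.0)            # hidden layer with ReLU
        mask = rng.random(h.shape) < keep      # fresh dropout mask per pass
        h = h * mask / keep                    # inverted-dropout scaling
        probs.append(softmax(h @ w2))
    probs = np.stack(probs)                    # (num_samples, batch, classes)
    # The mean approximates the predictive distribution; the spread across
    # passes gives a rough per-class uncertainty signal.
    return probs.mean(axis=0), probs.std(axis=0)


# Toy inputs and weights (assumed shapes, random values).
rng = np.random.default_rng(42)
x = rng.normal(size=(4, 8))
w1 = rng.normal(size=(8, 16))
w2 = rng.normal(size=(16, 3))
mean_probs, std_probs = mc_dropout_predict(x, w1, w2, rng=rng)
```

Each row of `mean_probs` is a probability vector over the three toy classes. The only change from ordinary inference is that the dropout mask is resampled on every pass instead of being disabled, which is why the abstract can describe the technique as low in programming overhead.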

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning

Methods: Dropout · Batch Normalization