On Stein Variational Neural Network Ensembles
Francesco D'Angelo, Vincent Fortuin, Florian Wenzel

TL;DR
This paper investigates Stein variational gradient descent (SVGD) methods for neural network ensembles. It shows that functional and hybrid kernels increase ensemble diversity and give a closer approximation of the Bayesian posterior, and that stochastic updates improve the approximation further.
Contribution
The study compares SVGD in weight space, function space, and a hybrid setting, showing that functional and hybrid kernels outperform traditional deep ensembles in both diversity and fidelity to the Bayesian posterior.
Findings
SVGD with functional and hybrid kernels surpasses deep ensembles in diversity.
Stochastic SVGD updates improve approximation of the Bayesian posterior.
SVGD with functional and hybrid kernels approximates the true Bayesian posterior more closely than traditional deep ensembles.
Abstract
Ensembles of deep neural networks have achieved great success recently, but they do not offer a proper Bayesian justification. Moreover, while they allow for averaging of predictions over several hypotheses, they do not provide any guarantees for their diversity, leading to redundant solutions in function space. In contrast, particle-based inference methods, such as Stein variational gradient descent (SVGD), offer a Bayesian framework, but rely on the choice of a kernel to measure the similarity between ensemble members. In this work, we study different SVGD methods operating in the weight space, function space, and in a hybrid setting. We compare the SVGD approaches to other ensembling-based methods in terms of their theoretical properties and assess their empirical performance on synthetic and real-world tasks. We find that SVGD using functional and hybrid kernels can overcome the…
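To illustrate the role the kernel plays in SVGD, here is a minimal sketch of one generic SVGD update on a set of particles, using an RBF kernel with a fixed bandwidth. It is not the paper's implementation: the function names (`rbf_kernel`, `svgd_step`), the fixed `bandwidth`, and the `step_size` are assumptions chosen for illustration, and the particles are plain vectors rather than network weights or function values. The functional and hybrid kernels studied in the paper are not shown.

```python
import numpy as np


def rbf_kernel(particles, bandwidth):
    """RBF kernel matrix and its gradient (hypothetical helper).

    particles: array of shape (n, d).
    Returns k with k[j, i] = exp(-||x_j - x_i||^2 / bandwidth) and
    grad_k with grad_k[j, i] = gradient of k(x_j, x_i) with respect to x_j.
    """
    diffs = particles[:, None, :] - particles[None, :, :]  # diffs[j, i] = x_j - x_i
    sq_dists = np.sum(diffs ** 2, axis=-1)
    k = np.exp(-sq_dists / bandwidth)
    grad_k = -2.0 / bandwidth * diffs * k[:, :, None]
    return k, grad_k


def svgd_step(particles, grad_log_prob, step_size=0.1, bandwidth=1.0):
    """One SVGD update: a kernel-weighted score term plus a repulsive term."""
    n = particles.shape[0]
    k, grad_k = rbf_kernel(particles, bandwidth)
    scores = grad_log_prob(particles)   # shape (n, d), one score per particle
    drive = k.T @ scores                # sum_j k(x_j, x_i) * score(x_j)
    repulse = grad_k.sum(axis=0)        # sum_j grad_{x_j} k(x_j, x_i)
    phi = (drive + repulse) / n
    return particles + step_size * phi


# Usage: particles start almost collapsed at one point and are pushed
# towards a standard normal target, whose score is -x.
rng = np.random.default_rng(0)
particles = 3.0 + 0.01 * rng.standard_normal((20, 1))
for _ in range(500):
    particles = svgd_step(particles, lambda x: -x)
```

The `repulse` term is what distinguishes this update from independently trained ensemble members: it moves each particle away from its neighbours under the chosen kernel, so the notion of "similarity" the kernel encodes determines what kind of diversity the ensemble gets.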
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning
