Bayesian Dark Knowledge

Anoop Korattikara; Vivek Rathod; Kevin Murphy; Max Welling

arXiv:1506.04416 · cs.LG · November 10, 2015 · 135 cites

TL;DR

This paper distills a Monte Carlo (SGLD) approximation of a Bayesian neural network's posterior predictive density into a single deep network, which avoids storing and evaluating many parameter samples and improves on existing approaches in accuracy, simplicity, and computational efficiency.

Contribution

It proposes a distillation technique for Bayesian neural networks that outperforms recent methods based on expectation propagation and variational Bayes.

Findings

01. Better predictive accuracy than existing methods

02. Simpler implementation and lower computational cost

03. More efficient at test time

Abstract

We consider the problem of Bayesian parameter estimation for deep neural networks, which is important in problem settings where we may have little data, and/or where we need accurate posterior predictive densities, e.g., for applications involving bandits or active learning. One simple approach to this is to use online Monte Carlo methods, such as SGLD (stochastic gradient Langevin dynamics). Unfortunately, such a method needs to store many copies of the parameters (which wastes memory), and needs to make predictions using many versions of the model (which wastes time). We describe a method for "distilling" a Monte Carlo approximation to the posterior predictive density into a more compact form, namely a single deep neural network. We compare to two very recent approaches to Bayesian neural networks, namely an approach based on expectation propagation [Hernandez-Lobato and Adams,…
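The distillation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a logistic-regression "teacher" and "student" in place of deep networks, a Gaussian prior, and an online student update on noise-perturbed inputs. The names `make_toy_data` and `distill_sgld` and every hyperparameter value are hypothetical choices made for illustration only.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_toy_data(n=200, seed=0):
    """Two well-separated Gaussian blobs in 2-D (hypothetical toy data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n).astype(float)
    X = rng.normal(size=(n, 2)) + 2.0 * (2.0 * y - 1.0)[:, None]
    return X, y


def distill_sgld(X, y, n_steps=2000, batch=32, eta=1e-3, rho=0.05,
                 prior_prec=1.0, burn_in=200, noise_std=0.1, seed=0):
    """Run an SGLD teacher and distill its samples online into one student.

    Assumption: both models are logistic regressions; the paper uses
    deep networks. Returns the last teacher sample and the student weights.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)  # teacher parameters (one SGLD chain)
    w = np.zeros(d)      # student parameters (the single compact model)

    for t in range(n_steps):
        idx = rng.choice(n, size=batch, replace=False)
        Xb, yb = X[idx], y[idx]

        # SGLD step: half-step along a minibatch estimate of the
        # log-posterior gradient, plus Gaussian noise with variance eta.
        grad_log_lik = Xb.T @ (yb - sigmoid(Xb @ theta))
        grad_log_post = -prior_prec * theta + (n / batch) * grad_log_lik
        theta = (theta + 0.5 * eta * grad_log_post
                 + rng.normal(0.0, np.sqrt(eta), size=d))

        if t < burn_in:
            continue  # discard early samples before the chain settles

        # Student step: match the current teacher sample's predictions on
        # perturbed inputs (assumed choice of distillation inputs).
        Xs = Xb + rng.normal(0.0, noise_std, size=Xb.shape)
        p_teacher = sigmoid(Xs @ theta)
        p_student = sigmoid(Xs @ w)
        # Gradient of the cross-entropy between teacher and student outputs.
        grad_w = Xs.T @ (p_student - p_teacher) / batch
        w = w - rho * grad_w

    return theta, w


if __name__ == "__main__":
    X, y = make_toy_data()
    theta, w = distill_sgld(X, y)
    acc = np.mean((sigmoid(X @ w) > 0.5) == (y > 0.5))
    print("student training accuracy:", acc)
```

Because the student is updated against each teacher sample as it is drawn, no samples need to be stored, and prediction uses only `w`. This is the memory and test-time saving the abstract refers to.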

Code & Models

Repositories

ofnt/real_anot
tf

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods · Machine Learning and Algorithms