Reconsidering Analytical Variational Bounds for Output Layers of Deep Networks
Otmane Sakhi, Stephen Bonner, David Rohde, Flavian Vasile

TL;DR
This paper explores alternative analytical variational bounds for output layers in deep networks, enabling efficient Bayesian training without re-parameterization or Monte Carlo methods, especially for classification tasks.
Contribution
The paper applies the Jaakkola and Jordan bound to binary classification and the Bouchard bound to multi-class classification, so that Bayesian output layers of a neural network can be trained without sampling-based gradient estimates.
Findings
A Bayesian binary classification layer can be trained with standard SGD using the Jaakkola-Jordan bound.
A multi-class latent variable model can be trained efficiently using the Bouchard bound.
Fast probabilistic training of large-scale classification models.
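The two bounds named above can be checked numerically. The summary does not state their formulas, so the sketch below uses the standard forms from the literature: the Jaakkola-Jordan quadratic lower bound on log σ(x) with variational parameter ξ, and the first step of the Bouchard upper bound on log-sum-exp with variational parameter α. Whether the paper uses exactly this parameterization is an assumption, and all function names are hypothetical. Because the Jaakkola-Jordan bound is quadratic in x, its expectation under a Gaussian posterior has a closed form, which is why neither the re-parameterization trick nor Monte Carlo sampling is needed.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def jj_lambda(xi):
    # lambda(xi) = tanh(xi / 2) / (4 * xi); xi must be non-zero.
    return np.tanh(xi / 2.0) / (4.0 * xi)

def jj_lower_bound(x, xi):
    # Quadratic lower bound on log(sigmoid(x)), tight at x = +xi and x = -xi.
    return log_sigmoid(xi) + (x - xi) / 2.0 - jj_lambda(xi) * (x ** 2 - xi ** 2)

def jj_expected_lower_bound(mean, var, xi):
    # Closed-form expectation of the bound under x ~ N(mean, var),
    # using E[x] = mean and E[x^2] = mean^2 + var. No sampling involved.
    second_moment = mean ** 2 + var
    return log_sigmoid(xi) + (mean - xi) / 2.0 - jj_lambda(xi) * (second_moment - xi ** 2)

def bouchard_upper_bound(logits, alpha):
    # Upper bound on log(sum_k exp(logits_k)):
    #   alpha + sum_k log(1 + exp(logits_k - alpha)).
    # Each log(1 + exp(.)) term can then be bounded with the quadratic
    # Jaakkola-Jordan form (that second step is not shown here).
    return alpha + np.sum(np.logaddexp(0.0, logits - alpha))

if __name__ == "__main__":
    x = np.linspace(-6.0, 6.0, 7)
    print("log sigmoid :", log_sigmoid(x))
    print("JJ bound    :", jj_lower_bound(x, xi=1.5))
    logits = np.array([1.0, -2.0, 0.5, 3.0])
    print("logsumexp   :", np.log(np.sum(np.exp(logits))))
    print("Bouchard    :", bouchard_upper_bound(logits, alpha=2.0))
```

In a training loop, ξ and α would typically be optimized alongside the model parameters, since tighter values of either give a tighter bound on the evidence.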
Abstract
The combination of the re-parameterization trick with variational auto-encoders has caused a sensation in Bayesian deep learning: it allows the training of realistic generative models of images and has considerably increased our ability to use scalable latent variable models. The re-parameterization trick is necessary for models in which no analytical variational bound is available, and it allows noisy gradients to be computed for arbitrary models. However, for certain standard output layers of a neural network, analytical bounds are available, and the variational auto-encoder may be used without either the re-parameterization trick or any Monte Carlo approximation. In this work, we show that, using the Jaakkola and Jordan bound, we can produce a binary classification layer that allows a Bayesian output layer to be trained using the standard stochastic gradient descent…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Bayesian Methods and Mixture Models
