Investigating Batch Inference in a Sequential Monte Carlo Framework for Neural Networks
Andrew Millard, Joshua Murphy, Peter Green, Simon Maskell

TL;DR
This paper explores data annealing techniques in Sequential Monte Carlo methods for neural network Bayesian inference, achieving up to 6x faster training with minimal accuracy loss.
Contribution
It introduces methods for gradually incorporating mini-batches into SMC-based Bayesian neural network inference, improving computational efficiency.
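As a rough illustration of what "gradually incorporating mini-batches" could look like, the sketch below grows the batch size linearly from a small starting value up to the full dataset over the sampler's iterations. The function name `batch_size_schedule`, its parameters, and the linear shape of the schedule are assumptions made for illustration; this summary does not state which schedules the paper actually evaluates.

```python
import numpy as np

def batch_size_schedule(n_data, n_iters, start_size):
    """Hypothetical linear data-annealing schedule (an assumption, not the paper's).

    Returns one batch size per iteration, growing from `start_size`
    to `n_data` and never exceeding the dataset size.
    """
    sizes = np.linspace(start_size, n_data, n_iters)
    return np.minimum(np.ceil(sizes).astype(int), n_data)

# Example: 1000 data points, 5 sampler iterations, starting from 100 points.
sizes = batch_size_schedule(n_data=1000, n_iters=5, start_size=100)
print(sizes)
```

Early iterations then evaluate the likelihood on few data points and are cheap, while the final iterations see the full dataset.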
Findings
Up to 6x faster training times.
Minimal accuracy loss with data annealing.
Effective for image classification benchmarks.
Abstract
Bayesian inference allows us to define a posterior distribution over the weights of a generic neural network (NN). Exact posteriors are usually intractable, in which case approximations can be employed. One such approximation, variational inference, is computationally efficient when using mini-batch stochastic gradient descent, as subsets of the data are used for likelihood and gradient evaluations, though the approach relies on the selection of a variational distribution which sufficiently matches the form of the posterior. Particle-based methods such as Markov chain Monte Carlo and Sequential Monte Carlo (SMC) do not assume a parametric family for the posterior but typically require higher computational cost. These sampling methods typically use the full batch of data for likelihood and gradient evaluations, which contributes to this computational expense. We explore several methods…
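To make the idea of mini-batch likelihood evaluations inside an SMC sampler concrete, here is a minimal sketch on a toy linear model rather than a neural network. Everything in it is an assumption for illustration: the standard normal prior, the unit-noise Gaussian likelihood, the symmetric random-walk proposal, the choice of a backward kernel equal to the forward proposal (so the proposal densities cancel in the weight update), and the names `smc_minibatch`, `log_prior`, and `log_lik`. It is not the authors' algorithm, which this summary does not describe in detail.

```python
import numpy as np

def log_prior(theta):
    # Standard normal prior on each weight (assumption).
    return -0.5 * np.sum(theta**2, axis=1)

def log_lik(theta, X, y, scale):
    # Unit-noise Gaussian log-likelihood of a linear model, per particle.
    # `scale` = n_data / batch_size rescales the mini-batch sum so it
    # estimates the full-data log-likelihood.
    resid = y[None, :] - theta @ X.T            # (n_particles, batch)
    return -0.5 * scale * np.sum(resid**2, axis=1)

def normalise(logw):
    # Turn log-weights into weights that sum to one, stably.
    w = np.exp(logw - logw.max())
    return w / w.sum()

def smc_minibatch(X, y, n_particles=200, n_iters=50, batch_size=32,
                  step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = rng.standard_normal((n_particles, d))    # particles drawn from the prior
    logw = np.full(n_particles, -np.log(n_particles))
    log_target = log_prior(theta)                    # initial target is the prior
    for _ in range(n_iters):
        # Resample when the effective sample size falls below half.
        w = normalise(logw)
        if 1.0 / np.sum(w**2) < n_particles / 2:
            keep = rng.choice(n_particles, size=n_particles, p=w)
            theta, log_target = theta[keep], log_target[keep]
            logw = np.full(n_particles, -np.log(n_particles))
        # Fresh mini-batch: only these rows are touched in this iteration.
        idx = rng.choice(n, size=batch_size, replace=False)
        proposal = theta + step * rng.standard_normal(theta.shape)
        new_log_target = (log_prior(proposal)
                          + log_lik(proposal, X[idx], y[idx], n / batch_size))
        # Symmetric proposal with matching backward kernel: the weight
        # update is the ratio of new to old unnormalised targets.
        logw = logw + new_log_target - log_target
        theta, log_target = proposal, new_log_target
    return theta, normalise(logw)

# Toy usage on synthetic data (hypothetical true weights).
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
y = X @ np.array([1.0, -2.0]) + rng.standard_normal(500)
theta, w = smc_minibatch(X, y)
print("weighted mean estimate:", w @ theta)
```

Each iteration costs one likelihood evaluation on `batch_size` rows instead of all 500, which is where the saving over full-batch evaluation would come from. The price is a noisier target at every step, since each mini-batch gives only an estimate of the full-data likelihood.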
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
