Law of Large Numbers for Bayesian two-layer Neural Network trained with Variational Inference

Arnaud Descours (LMBP); Tom Huix (X); Arnaud Guillin (LMBP); Manon Michel (LMBP); Éric Moulines (X); Boris Nectoux (LMBP)

arXiv:2307.04779·stat.ML·July 12, 2023·COLT

TL;DR

This paper rigorously analyzes variational-inference training of Bayesian two-layer neural networks in the infinite-width regime. It proves a law of large numbers for three training schemes and shows that all of them converge to the same mean-field limit.

Contribution

The paper proves a law of large numbers that covers three VI training schemes for Bayesian neural networks: an idealized scheme, Bayes by Backprop, and Minimal VI, a new and computationally cheaper algorithm.

Findings

1. All three training schemes converge to the same mean-field limit.

2. The paper introduces Minimal VI, a new and computationally cheaper training algorithm.

3. Numerical illustrations support the theoretical results.

Abstract

We provide a rigorous analysis of training by variational inference (VI) of Bayesian neural networks in the two-layer and infinite-width case. We consider a regression problem with a regularized evidence lower bound (ELBO) which is decomposed into the expected log-likelihood of the data and the Kullback-Leibler (KL) divergence between the a priori distribution and the variational posterior. With an appropriate weighting of the KL, we prove a law of large numbers for three different training schemes: (i) the idealized case with exact estimation of a multiple Gaussian integral from the reparametrization trick, (ii) a minibatch scheme using Monte Carlo sampling, commonly known as Bayes by Backprop, and (iii) a new and computationally cheaper algorithm which we introduce as Minimal VI. An important result is that all methods converge to the same mean-field limit. Finally, we illustrate our…
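The objective described in the abstract, a data-fit term plus a weighted KL term estimated by Monte Carlo with the reparametrization trick, can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the Gaussian variational family with diagonal covariance, the standard normal prior, the tanh activation, the squared-error data term, and every name below (`negative_elbo_estimate`, `kl_weight`, and so on) are hypothetical choices made for the example.

```python
import numpy as np


def kl_diag_gaussian_to_standard_normal(mean, log_std):
    """Closed-form KL( N(mean, diag(std^2)) || N(0, I) )."""
    var = np.exp(2.0 * log_std)
    return 0.5 * np.sum(var + mean**2 - 1.0 - 2.0 * log_std)


def negative_elbo_estimate(means, log_stds, x, y, kl_weight, rng, num_samples=1):
    """Monte Carlo estimate of a regularized negative ELBO for one data point.

    Assumed (hypothetical) model: a two-layer network whose output is the
    average of tanh(w_i . x) over neurons, with each w_i drawn from its own
    Gaussian variational distribution N(means[i], diag(exp(log_stds[i])^2)).
    """
    n_neurons, dim = means.shape
    data_term = 0.0
    for _ in range(num_samples):
        # Reparametrization trick: w = mean + std * eps with eps ~ N(0, I).
        eps = rng.standard_normal((n_neurons, dim))
        weights = means + np.exp(log_stds) * eps
        prediction = np.mean(np.tanh(weights @ x))
        # Squared error stands in for the negative log-likelihood (assumption).
        data_term += 0.5 * (prediction - y) ** 2
    data_term /= num_samples

    # KL between each neuron's variational distribution and the prior.
    kl_term = sum(
        kl_diag_gaussian_to_standard_normal(m, s) for m, s in zip(means, log_stds)
    )
    return data_term + kl_weight * kl_term


rng = np.random.default_rng(0)
means = np.zeros((4, 2))
log_stds = np.zeros((4, 2))
loss = negative_elbo_estimate(
    means, log_stds, x=np.array([1.0, -1.0]), y=0.5, kl_weight=0.25, rng=rng, num_samples=8
)
```

The `kl_weight` argument plays the role of the KL weighting mentioned in the abstract; the specific value the paper uses is not stated in this summary, so the 0.25 above is arbitrary.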

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks

Methods: Variational Inference