Bayesian Inference with Deep Weakly Nonlinear Networks
Boris Hanin, Alexander Zlokapa

TL;DR
This paper analytically studies Bayesian inference in deep neural networks with a specific nonlinearity, revealing how network width, depth, and data size influence the inference regime and generalization, including kernel and feature learning regimes.
Contribution
It provides a perturbative analytical framework for Bayesian inference in deep networks with shaped nonlinearity, connecting kernel and feature learning regimes, and analyzing the effects of depth and width.
Findings
Neural network Bayesian inference matches kernel methods when width is large.
The nonlinearity shape determines the embedded data geometry.
Depth enhances evidence and generalization in certain regimes.
Abstract
We show at a physics level of rigor that Bayesian inference with a fully connected neural network and a shaped nonlinearity of the form is (perturbatively) solvable in the regime where the number of training datapoints , the input dimension , the network layer widths , and the network depth are simultaneously large. Our results hold with weak assumptions on the data; the main constraint is that . We provide techniques to compute the model evidence and posterior to arbitrary order in and at arbitrary temperature. We report the following results from the first-order computation: 1. When the width is much larger than the depth and training set size , neural network Bayesian inference coincides with Bayesian inference using a kernel. The value of determines the curvature of a sphere, hyperbola, or plane into which…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsBayesian Modeling and Causal Inference · Anomaly Detection Techniques and Applications · Gaussian Processes and Bayesian Inference
MethodsSparse Evolutionary Training
