Deep Neural Networks as Gaussian Processes
Jaehoon Lee, Yasaman Bahri, Roman Novak, Samuel S. Schoenholz, Jeffrey Pennington, Jascha Sohl-Dickstein

TL;DR
This paper establishes an exact equivalence between infinitely wide deep neural networks and Gaussian processes (GPs). The equivalence enables exact Bayesian inference for such networks and shows how trained-network performance and uncertainty relate to the GP as width increases.
Contribution
It derives the exact correspondence between deep neural networks and GPs, and develops an efficient method to compute the covariance functions for Bayesian inference.
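As a rough illustration of what "computing the covariance function" involves, the layer-wise kernel recursion can be written as below. The notation is an assumption on our part, since this summary does not define it: $K^{l}$ is the covariance of pre-activations at layer $l$, $\sigma_w^2$ and $\sigma_b^2$ are the weight and bias prior variances, $\phi$ is the nonlinearity, and $d_{\text{in}}$ is the input dimension.

```latex
\begin{align}
K^{0}(x, x') &= \sigma_b^2 + \sigma_w^2 \, \frac{x \cdot x'}{d_{\text{in}}}, \\
K^{l}(x, x') &= \sigma_b^2 + \sigma_w^2 \,
  \mathbb{E}_{z \sim \mathcal{N}(0,\, \Sigma^{l-1})}
  \bigl[ \phi(z_1)\, \phi(z_2) \bigr],
\qquad
\Sigma^{l-1} =
\begin{pmatrix}
K^{l-1}(x, x)  & K^{l-1}(x, x') \\
K^{l-1}(x, x') & K^{l-1}(x', x')
\end{pmatrix}.
\end{align}
```

Each layer's kernel depends on the previous layer only through three scalars per input pair, which is what makes the computation tractable for deep networks.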
Findings
GP predictions typically outperform those of finite-width trained networks
GP uncertainty estimates correlate strongly with trained-network prediction error
Test performance of trained networks improves as width increases, approaching that of the corresponding GP
Abstract
It has long been known that a single-layer fully-connected neural network with an i.i.d. prior over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network width. This correspondence enables exact Bayesian inference for infinite width neural networks on regression tasks by means of evaluating the corresponding GP. Recently, kernel functions which mimic multi-layer random neural networks have been developed, but only outside of a Bayesian framework. As such, previous work has not identified that these kernels can be used as covariance functions for GPs and allow fully Bayesian prediction with a deep neural network. In this work, we derive the exact equivalence between infinitely wide deep networks and GPs. We further develop a computationally efficient pipeline to compute the covariance function for these GPs. We then use the resulting GPs to perform…
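The regression setting described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes a ReLU nonlinearity, for which the layer-wise expectation has a known closed form (the arc-cosine kernel), and the function names `nngp_relu_kernel` and `gp_posterior`, the hyperparameter defaults, and the `noise` jitter are all hypothetical choices made for the example.

```python
import numpy as np


def nngp_relu_kernel(X1, X2, depth=3, sigma_w2=2.0, sigma_b2=0.1):
    """Covariance between rows of X1 and X2 for an infinitely wide ReLU network.

    Assumes ReLU activations, so each layer update uses the closed-form
    arc-cosine expression instead of a numerical integral.
    """
    d_in = X1.shape[1]
    # Input-layer kernel: cross terms and the two diagonals.
    K12 = sigma_b2 + sigma_w2 * (X1 @ X2.T) / d_in
    K11 = sigma_b2 + sigma_w2 * np.sum(X1 * X1, axis=1) / d_in
    K22 = sigma_b2 + sigma_w2 * np.sum(X2 * X2, axis=1) / d_in

    for _ in range(depth):
        norm = np.sqrt(np.outer(K11, K22))
        cos_theta = np.clip(K12 / norm, -1.0, 1.0)  # guard against round-off
        theta = np.arccos(cos_theta)
        K12 = sigma_b2 + sigma_w2 / (2.0 * np.pi) * norm * (
            np.sin(theta) + (np.pi - theta) * cos_theta
        )
        # On the diagonal theta = 0, so the update reduces to sigma_w2 / 2 * K.
        K11 = sigma_b2 + sigma_w2 / 2.0 * K11
        K22 = sigma_b2 + sigma_w2 / 2.0 * K22
    return K12


def gp_posterior(X_train, y_train, X_test, noise=1e-3, **kernel_args):
    """Posterior mean and variance of GP regression with the kernel above."""
    K_tt = nngp_relu_kernel(X_train, X_train, **kernel_args)
    K_st = nngp_relu_kernel(X_test, X_train, **kernel_args)
    K_ss = nngp_relu_kernel(X_test, X_test, **kernel_args)

    # Cholesky factorization for a numerically stable solve.
    L = np.linalg.cholesky(K_tt + noise * np.eye(len(X_train)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_st @ alpha
    v = np.linalg.solve(L, K_st.T)
    var = np.diag(K_ss) - np.sum(v * v, axis=0)
    return mean, var


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(20, 5))
    y_train = np.sin(X_train[:, 0])
    X_test = rng.normal(size=(4, 5))
    mean, var = gp_posterior(X_train, y_train, X_test)
    print("posterior mean:", mean)
    print("posterior variance:", var)
```

The posterior variance returned here is the kind of uncertainty estimate that the findings compare against trained-network prediction error. For nonlinearities without a closed form, the expectation in each layer would have to be evaluated numerically instead.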
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Target Tracking and Data Fusion in Sensor Networks · Statistical Mechanics and Entropy
