Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes
Roman Novak, Lechao Xiao, Jaehoon Lee, Yasaman Bahri, Greg Yang, Jiri Hron, Daniel A. Abolafia, Jeffrey Pennington, Jascha Sohl-Dickstein

TL;DR
This paper establishes an equivalence between multi-layer convolutional neural networks with many channels and Gaussian processes, enabling Bayesian analysis and state-of-the-art CIFAR10 results for GPs without trainable kernels.
Contribution
It derives the Gaussian process equivalence for CNNs with and without pooling, introduces a Monte Carlo method for estimation, and explores the impact of translation equivariance in the infinite channel limit.
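The idea behind Monte Carlo kernel estimation can be illustrated on a much simpler case than the paper's CNNs. The sketch below assumes a single-hidden-layer fully connected ReLU network with no biases and weights drawn from N(0, 1/d); it averages products of hidden activations over random draws and compares the result to the known closed-form (arc-cosine) kernel for that setting. The function names `mc_kernel` and `analytic_kernel` are hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def mc_kernel(x, width, n_samples, seed=0):
    """Monte Carlo estimate of the kernel E[relu(w.x) relu(w.x')].

    Assumption: one hidden ReLU layer, no bias, weights ~ N(0, 1/d).
    The estimate averages over `width` hidden units and `n_samples`
    independent random networks.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    k = np.zeros((n, n))
    for _ in range(n_samples):
        w = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, width))
        h = np.maximum(x @ w, 0.0)   # hidden activations, shape (n, width)
        k += h @ h.T / width         # average over hidden units
    return k / n_samples             # average over random networks


def analytic_kernel(x):
    """Closed-form value of the same expectation (arc-cosine kernel, order 1)."""
    d = x.shape[1]
    norms = np.linalg.norm(x, axis=1)
    outer = np.outer(norms, norms)
    cos = np.clip(x @ x.T / outer, -1.0, 1.0)
    theta = np.arccos(cos)
    return outer * (np.sin(theta) + (np.pi - theta) * cos) / (2.0 * np.pi * d)


if __name__ == "__main__":
    x = np.random.default_rng(1).normal(size=(4, 8))
    k_mc = mc_kernel(x, width=512, n_samples=64)
    print("max abs error:", np.max(np.abs(k_mc - analytic_kernel(x))))
```

Increasing `width` or `n_samples` should shrink the gap between the two kernels. In this toy setting the closed form is cheap to evaluate, so the sampling route is only there to show the mechanics; its practical use is for architectures where the analytic form is too costly to compute.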
Findings
GP equivalence for CNNs with/without pooling
State-of-the-art CIFAR10 results for GPs without trainable kernels
SGD-trained CNNs can outperform their GP counterparts
Abstract
There is a previously identified equivalence between wide fully connected neural networks (FCNs) and Gaussian processes (GPs). This equivalence enables, for instance, test set predictions that would have resulted from a fully Bayesian, infinitely wide trained FCN to be computed without ever instantiating the FCN, but by instead evaluating the corresponding GP. In this work, we derive an analogous equivalence for multi-layer convolutional neural networks (CNNs) both with and without pooling layers, and achieve state of the art results on CIFAR10 for GPs without trainable kernels. We also introduce a Monte Carlo method to estimate the GP corresponding to a given neural network architecture, even in cases where the analytic form has too many terms to be computationally feasible. Surprisingly, in the absence of pooling layers, the GPs corresponding to CNNs with and without weight sharing…
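To make "evaluating the corresponding GP" concrete, here is a minimal sketch of GP posterior-mean prediction with a fixed kernel, so that no network is ever instantiated. It assumes the closed-form kernel of a single-hidden-layer bias-free ReLU network with N(0, 1/d) weights (not the CNN kernels derived in the paper), a Gaussian likelihood with a small noise term, and toy data. The names `relu_nngp_kernel` and `gp_posterior_mean` are hypothetical.

```python
import numpy as np


def relu_nngp_kernel(x1, x2):
    """Kernel E[relu(w.x) relu(w.x')] for w ~ N(0, I/d) (assumed toy setting)."""
    d = x1.shape[1]
    n1 = np.linalg.norm(x1, axis=1)
    n2 = np.linalg.norm(x2, axis=1)
    outer = np.outer(n1, n2)
    cos = np.clip(x1 @ x2.T / outer, -1.0, 1.0)
    theta = np.arccos(cos)
    return outer * (np.sin(theta) + (np.pi - theta) * cos) / (2.0 * np.pi * d)


def gp_posterior_mean(x_train, y_train, x_test, noise=1e-3):
    """Posterior mean K_*x (K_xx + noise * I)^{-1} y of GP regression."""
    k_tt = relu_nngp_kernel(x_train, x_train)
    k_st = relu_nngp_kernel(x_test, x_train)
    # Solve the linear system instead of forming an explicit inverse.
    alpha = np.linalg.solve(k_tt + noise * np.eye(len(x_train)), y_train)
    return k_st @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_train = rng.normal(size=(20, 5))
    y_train = np.sin(x_train[:, :1])      # toy regression targets
    x_test = rng.normal(size=(3, 5))
    print(gp_posterior_mean(x_train, y_train, x_test))
```

The kernel is fixed by the architecture and the weight prior, so nothing here is trained; the only cost is building the kernel matrices and solving one linear system.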
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
Methods: Max Pooling · Convolution · Fully Convolutional Network · Stochastic Gradient Descent
