The role of a layer in deep neural networks: a Gaussian Process perspective
Oded Ben-David, Zohar Ringel

TL;DR
This paper establishes a novel connection between Gaussian Processes and SGD-trained deep neural networks, leading to explicit layer-wise loss functions that enhance understanding and optimization of internal representations.
Contribution
It introduces Deep Gaussian Layer-wise loss functions (DGLs) derived from a Gaussian Process correspondence, enabling layer-wise optimization and analysis of deep neural networks.
Findings
DGLs are explicit and competitive in accuracy.
The approach offers a new analytic perspective on internal representations.
The DGLs provide a foundation for layer-wise optimization in deep learning.
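As a rough illustration of what optimizing layers separately against a layer-wise loss can look like, here is a minimal sketch of a greedy layer-wise loop. It does not reproduce the paper's DGL formula, which this summary does not state. As an assumed stand-in it uses the data-fit term of Gaussian Process regression, y^T (K + noise * I)^{-1} y, with an RBF kernel computed on a layer's outputs. The names `gp_fit_loss`, `apply_layer`, and `greedy_layerwise`, the one-parameter tanh layer, and the candidate grid are all hypothetical.

```python
import math


def rbf_kernel(reps, length_scale=1.0):
    """Gram matrix K[i][j] = exp(-||r_i - r_j||^2 / (2 * length_scale^2))."""
    n = len(reps)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(reps[i], reps[j]))
                      / (2.0 * length_scale ** 2))
             for j in range(n)] for i in range(n)]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def gp_fit_loss(reps, targets, noise=0.1):
    """Assumed stand-in layer-wise loss (not the paper's DGL):
    y^T (K + noise * I)^{-1} y, small when the kernel on `reps` fits the labels."""
    k = rbf_kernel(reps)
    for i in range(len(k)):
        k[i][i] += noise
    alpha = solve(k, targets)
    return sum(y * a for y, a in zip(targets, alpha))


def apply_layer(reps, weight):
    """Hypothetical one-parameter layer: elementwise tanh(weight * r)."""
    return [[math.tanh(weight * v) for v in r] for r in reps]


def greedy_layerwise(inputs, targets, n_layers, candidates):
    """Choose each layer's weight by minimizing the layer-wise loss, then freeze it."""
    reps, chosen = inputs, []
    for _ in range(n_layers):
        best = min(candidates,
                   key=lambda w: gp_fit_loss(apply_layer(reps, w), targets))
        chosen.append(best)
        reps = apply_layer(reps, best)  # earlier layers are never revisited
    return chosen, reps


if __name__ == "__main__":
    labels = [1.0, 1.0, -1.0, -1.0]
    separated = [[0.0], [0.0], [3.0], [3.0]]  # same-class points coincide
    mixed = [[0.0], [3.0], [0.0], [3.0]]      # classes share locations
    print(gp_fit_loss(separated, labels), gp_fit_loss(mixed, labels))
    print(greedy_layerwise([[-2.0], [-1.0], [1.0], [2.0]],
                           [-1.0, -1.0, 1.0, 1.0], 2, [0.5, 1.0, 2.0]))
```

The stand-in loss scores a representation without reference to the layers above it: a representation that groups same-label points together gets a lower value than one that mixes them. The loop then selects and freezes one layer at a time using only that score. A real implementation would replace the grid search with gradient-based training and the stand-in with the loss derived in the paper.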
Abstract
A fundamental question in deep learning concerns the role played by individual layers in a deep neural network (DNN) and the transferable properties of the data representations they learn. To the extent that layers have clear roles, one should be able to optimize them separately using layer-wise loss functions. Such loss functions would describe the set of good data representations at each depth of the network and provide a target for layer-wise greedy optimization (LEGO). Here we derive a novel correspondence between Gaussian Processes and SGD-trained deep neural networks. Leveraging this correspondence, we derive the Deep Gaussian Layer-wise loss functions (DGLs), which, we believe, are the first supervised layer-wise loss functions that are both explicit and competitive in terms of accuracy. Being highly structured and symmetric, the DGLs provide a promising analytic…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Advanced Neural Network Applications
Methods: Stochastic Gradient Descent
