A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs

Gadi Naveh; Zohar Ringel

arXiv:2106.04110 · cs.LG · June 9, 2021 · 6 citations

TL;DR

This paper develops a self-consistent Gaussian Process theory that captures feature learning effects in finite deep neural networks, bridging the gap between infinite-width limits and practical finite models.

Contribution

It introduces a novel theoretical framework that models feature learning in finite DNNs, validated on CNNs and fully connected networks, revealing a transition between learning regimes.

Findings

01 The theory agrees well with experiments on a toy two-layer linear CNN.

02 It identifies, both analytically and numerically, a sharp transition between a feature learning regime and a lazy learning regime in this model.

03 It derives strong finite-DNN effects for non-linear fully connected networks.

Abstract

Deep neural networks (DNNs) in the infinite width/channel limit have received much attention recently, as they provide a clear analytical window to deep learning via mappings to Gaussian Processes (GPs). Despite its theoretical appeal, this viewpoint lacks a crucial ingredient of deep learning in finite DNNs, lying at the heart of their success: feature learning. Here we consider DNNs trained with noisy gradient descent on a large training set and derive a self-consistent Gaussian Process theory accounting for strong finite-DNN and feature learning effects. Applying this to a toy model of a two-layer linear convolutional neural network (CNN) shows good agreement with experiments. We further identify, both analytically and numerically, a sharp transition between a feature learning regime and a lazy learning regime in this model. Strong finite-DNN effects are also derived for a…

Videos

A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs · SlidesLive

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Neural Networks and Applications

Methods: Gaussian Process