Critical feature learning in deep neural networks

Kirsten Fischer; Javed Lindner; David Dahmen; Zohar Ringel; Michael Krämer; Moritz Helias

arXiv:2405.10761 · cond-mat.dis-nn · May 20, 2024 · ICML

Open Access

TL;DR

This paper develops a theoretical framework for understanding feature learning in deep neural networks by analyzing finite-width effects, kernel evolution, and the role of fluctuations in the Bayesian prior.

Contribution

It introduces a systematic theory of network kernels in finite-width deep networks, linking feature learning to criticality and prior fluctuations.

Findings

01

The variance of the kernel distribution in the Bayesian prior scales inversely with the network width N.

02

Backward propagation aligns kernels with target features.

03

Finite-width fluctuations enable kernel adaptation to data.

Abstract

A key property of neural networks driving their success is their ability to learn features from data. Understanding feature learning from a theoretical viewpoint is an emerging field with many open questions. In this work we capture finite-width effects with a systematic theory of network kernels in deep non-linear neural networks. We show that the Bayesian prior of the network can be written in closed form as a superposition of Gaussian processes, whose kernels are distributed with a variance that depends inversely on the network width $N$. A large deviation approach, which is exact in the proportional limit for the number of data points $P = \alpha N \to \infty$, yields a pair of forward-backward equations for the maximum a posteriori kernels in all layers at once. We study their solutions perturbatively to demonstrate how the backward propagation across layers aligns kernels…
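
The inverse dependence of the kernel variance on the width $N$ can be checked numerically in a toy setting. The sketch below is not the paper's method: it assumes a single random hidden layer with a tanh nonlinearity, Gaussian weights of variance 1/d, and two arbitrary fixed inputs (all choices made here for illustration). It samples the empirical kernel entry many times at several widths and fits the scaling exponent of its variance, which should come out close to -1 if the variance behaves like 1/N.

```python
import numpy as np

def kernel_variance(width, n_samples=2000, seed=0):
    """Variance over weight draws of one empirical kernel entry.

    Illustrative assumptions (not taken from the paper): one hidden layer,
    tanh activation, weights W_ij ~ N(0, 1/d), two fixed inputs x1 and x2.
    """
    rng = np.random.default_rng(seed)
    x1 = np.array([1.0, 0.5, -0.3])
    x2 = np.array([0.2, -1.0, 0.7])
    d = x1.shape[0]
    # One independent weight matrix per sample: shape (n_samples, width, d).
    W = rng.normal(0.0, 1.0, size=(n_samples, width, d)) / np.sqrt(d)
    h1 = W @ x1  # pre-activations for x1, shape (n_samples, width)
    h2 = W @ x2  # pre-activations for x2, shape (n_samples, width)
    # Empirical kernel entry: average over the `width` neurons.
    k = np.mean(np.tanh(h1) * np.tanh(h2), axis=1)
    return k.var()

widths = [50, 100, 200, 400]
variances = [kernel_variance(n) for n in widths]

# Fit log(var) = slope * log(N) + const; a 1/N law corresponds to slope = -1.
slope, _ = np.polyfit(np.log(widths), np.log(variances), 1)

for n, v in zip(widths, variances):
    print(f"N={n:4d}  var={v:.3e}  N*var={n * v:.3f}")
print(f"fitted exponent: {slope:.2f}")
```

In this toy setting the product N·var should stay roughly constant across widths, because the kernel entry is an average over N independent neurons. The sketch only illustrates fluctuations of the prior kernel; it does not implement the forward-backward equations for the maximum a posteriori kernels.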

Taxonomy

Topics: Neural Networks and Applications