Extended critical regimes of deep neural networks

Cheng Kevin Qu; Asem Wardak; Pulin Gong

arXiv:2203.12967·cs.LG·March 25, 2022

TL;DR

This paper introduces a new mean field theory for deep neural networks that incorporates heavy-tailed weight distributions, revealing an extended critical regime that enhances computational capabilities and training efficiency.

Contribution

It develops a novel theoretical framework combining heavy-tailed random matrix theory and non-equilibrium physics to explain extended criticality in DNNs without parameter fine-tuning.

Findings

1. Heavy-tailed weights lead to an extended critical regime in DNNs.
2. Extended criticality improves propagation dynamics and computational efficiency.
3. The theory guides the design of more effective neural architectures.

Abstract

Deep neural networks (DNNs) have been successfully applied to many real-world problems, but a complete understanding of their dynamical and computational principles is still lacking. Conventional theoretical frameworks for analysing DNNs often assume random networks with coupling weights obeying Gaussian statistics. However, non-Gaussian, heavy-tailed coupling is a ubiquitous phenomenon in DNNs. Here, by weaving together theories of heavy-tailed random matrices and non-equilibrium statistical physics, we develop a new type of mean field theory for DNNs which predicts that heavy-tailed weights enable the emergence of an extended critical regime without fine-tuning parameters. In this extended critical regime, DNNs exhibit rich and complex propagation dynamics across layers. We further elucidate that the extended criticality endows DNNs with profound computational advantages: balancing…
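To make the contrast between Gaussian and heavy-tailed coupling concrete, the following is a minimal sketch, not the authors' implementation. It draws a weight matrix from a symmetric alpha-stable distribution using the Chambers-Mallows-Stuck method, where a stability index of alpha = 2 recovers Gaussian weights and alpha < 2 gives heavy tails. The function names (`sample_symmetric_stable`, `init_layer`), the parameter `sigma_w`, and the width-dependent scaling `sigma_w / n_in**(1/alpha)` are assumptions chosen for illustration and are not taken from the paper or its repository.

```python
import numpy as np


def sample_symmetric_stable(alpha, size, scale=1.0, rng=None):
    """Draw symmetric alpha-stable samples (Chambers-Mallows-Stuck method).

    alpha = 2 gives a Gaussian with variance 2 * scale**2;
    alpha = 1 gives a Cauchy distribution; alpha < 2 has power-law tails.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)  # uniform angle
    w = rng.exponential(1.0, size)                # unit-mean exponential
    x = (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
         * (np.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    return scale * x


def init_layer(n_out, n_in, alpha, sigma_w=1.0, rng=None):
    """Hypothetical initializer: i.i.d. stable weights with an assumed
    scale of sigma_w / n_in**(1/alpha), so that sums over n_in inputs
    stay of order one as the layer width grows."""
    scale = sigma_w / n_in ** (1.0 / alpha)
    return sample_symmetric_stable(alpha, (n_out, n_in), scale, rng)


rng = np.random.default_rng(0)
w_gauss = init_layer(256, 256, alpha=2.0, rng=rng)  # Gaussian case
w_heavy = init_layer(256, 256, alpha=1.5, rng=rng)  # heavy-tailed case

# The largest heavy-tailed weight dwarfs the largest Gaussian weight.
print("max |w|, alpha=2.0:", np.abs(w_gauss).max())
print("max |w|, alpha=1.5:", np.abs(w_heavy).max())
```

In the heavy-tailed case a few very large entries dominate the matrix, which is the qualitative feature that separates such weights from the Gaussian assumption of conventional frameworks.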

Code & Models

Repositories

CKQu1/extended-criticality-dnn (PyTorch, official)

Taxonomy

Topics: Statistical Mechanics and Entropy · Neural Networks and Applications · Model Reduction and Neural Networks