Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels
Kangyu Weng, Aohua Cheng, Ziyang Zhang, Pei Sun, Yang Tian

TL;DR
This paper develops a statistical physics-based theory showing that initializing neural networks at dynamic isometry maximizes mutual information transfer, linking information theory and deep learning for optimal channel performance.
Contribution
It introduces a corrected mean-field framework and proves that mutual information is maximized at dynamic isometry, a novel insight for neural network initialization.
Findings
Mutual information between a network's input and its representation is maximized at dynamic isometry.
The corrected mean-field theory accurately predicts information propagation.
Experimental validation confirms the theory's robustness.
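Dynamic isometry is usually described as the condition that the singular values of a network's input-output Jacobian concentrate near 1. The sketch below illustrates that condition only. It is not the paper's corrected mean-field framework, and it does not estimate mutual information. It assumes a deep linear network (no nonlinearity), where orthogonal weights give exact isometry, and compares that with variance-scaled Gaussian weights. The names `orthogonal`, `jacobian_singular_values`, `width`, and `depth` are illustrative choices, not taken from the paper.

```python
import numpy as np

def orthogonal(n, rng):
    """Sample an n x n orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Flip column signs using diag(r) so the sample is Haar-distributed.
    return q * np.sign(np.diag(r))

def jacobian_singular_values(weights):
    """Singular values of the input-output Jacobian of a deep *linear* network.

    For a linear network the Jacobian is just the product of the weight
    matrices, applied in order from the first layer to the last.
    """
    jac = np.eye(weights[0].shape[1])
    for w in weights:
        jac = w @ jac
    return np.linalg.svd(jac, compute_uv=False)

rng = np.random.default_rng(0)
width, depth = 64, 20  # hypothetical sizes chosen for illustration

# Orthogonal initialization: every layer preserves norms exactly.
orth_weights = [orthogonal(width, rng) for _ in range(depth)]

# Gaussian initialization with variance 1/width: norms are preserved
# only on average, so the spectrum spreads as depth grows.
gauss_weights = [
    rng.standard_normal((width, width)) / np.sqrt(width) for _ in range(depth)
]

sv_orth = jacobian_singular_values(orth_weights)
sv_gauss = jacobian_singular_values(gauss_weights)

print("orthogonal: min/max singular value =", sv_orth.min(), sv_orth.max())
print("gaussian:   min/max singular value =", sv_gauss.min(), sv_gauss.max())
```

With orthogonal weights every singular value stays at 1 up to floating-point error, while the Gaussian product develops a widely spread spectrum. Nonlinear networks need additional conditions on the activation function and weight scale, which this sketch does not cover.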
Abstract
In deep learning, neural networks serve as noisy channels between input data and its representation. This perspective naturally relates deep learning to the pursuit of constructing channels with optimal performance in information transmission and representation. While considerable efforts are concentrated on realizing optimal channel properties during network optimization, we study a frequently overlooked possibility that neural networks can be initialized toward optimal channels. Our theory, consistent with experimental validation, identifies primary mechanics underlying this unknown possibility and suggests intrinsic connections between statistical physics and deep learning. Unlike the conventional theories that characterize neural networks applying the classic mean-field approximation, we offer analytic proof that this extensively applied simplification scheme is not valid in…
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Statistical Mechanics and Entropy
