Emergence of Globally Attracting Fixed Points in Deep Neural Networks With Nonlinear Activations
Amir Joudaki, Thomas Hofmann

TL;DR
This paper presents a theoretical framework analyzing how the similarity of hidden representations in deep neural networks evolves across layers, revealing convergence to fixed points influenced by activation functions and architecture.
Contribution
It introduces a deterministic kernel evolution model under the mean-field regime and fully characterizes the fixed points for nonlinear activations, extending to residual and normalization layers.
Findings
The kernel sequence converges to a unique fixed point.
Activation functions determine the nature of the fixed point.
Residual and normalization layers exhibit similar convergence behaviors.
Abstract
Understanding how neural networks transform input data across layers is fundamental to unraveling their learning and generalization capabilities. Although prior work has used insights from kernel methods to study neural networks, a global analysis of how the similarity between hidden representations evolves across layers remains underexplored. In this paper, we introduce a theoretical framework for the evolution of the kernel sequence, which measures the similarity between the hidden representations of two different inputs. Operating under the mean-field regime, we show that the kernel sequence evolves deterministically via a kernel map, which depends only on the activation function. By expanding the activation in Hermite polynomials and using their algebraic properties, we derive an explicit form for the kernel map and fully characterize its fixed points. Our analysis reveals that for…
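The Hermite-based construction described in the abstract can be illustrated numerically. The sketch below is not the authors' code. It assumes the standard Gaussian identity E[he_k(X) he_l(Y)] = δ_kl ρ^k for unit-variance Gaussians X, Y with correlation ρ and normalized Hermite polynomials he_k = He_k / sqrt(k!), so that an activation with coefficients c_k has kernel map κ(ρ) = Σ_k c_k² ρ^k once Σ_k c_k² is normalized to 1. The function names (hermite_coeffs, kernel_map, iterate_kernel), the truncation degree, and the choice of tanh are illustrative assumptions, and the paper's exact normalization may differ.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as H
from numpy.polynomial import polynomial as P


def hermite_coeffs(phi, max_deg=20, quad_points=80):
    """Approximate c_k = E[phi(X) he_k(X)] for X ~ N(0, 1), k = 0..max_deg.

    he_k = He_k / sqrt(k!) are the normalized probabilists' Hermite polynomials.
    The expectation is computed with Gauss-Hermite quadrature.
    """
    x, w = H.hermegauss(quad_points)      # nodes/weights for weight exp(-x^2 / 2)
    w = w / math.sqrt(2.0 * math.pi)      # rescale to Gaussian probability weights
    coeffs = np.zeros(max_deg + 1)
    for k in range(max_deg + 1):
        basis = np.zeros(k + 1)
        basis[k] = 1.0                    # selects He_k in the series evaluation
        he_k = H.hermeval(x, basis) / math.sqrt(math.factorial(k))
        coeffs[k] = np.sum(w * phi(x) * he_k)
    return coeffs


def kernel_map(coeffs):
    """Return kappa(rho) = sum_k c_k^2 rho^k, normalized so that kappa(1) = 1."""
    c2 = coeffs ** 2
    c2 = c2 / c2.sum()
    return lambda rho: P.polyval(rho, c2)


def iterate_kernel(kappa, rho0, depth):
    """Apply the kernel map `depth` times starting from the similarity rho0."""
    seq = [float(rho0)]
    for _ in range(depth):
        seq.append(float(kappa(seq[-1])))
    return seq


# Illustrative run: tanh is odd, so c_0 = 0 and rho = 0 is a fixed point of kappa.
kappa_tanh = kernel_map(hermite_coeffs(np.tanh))
sequence = iterate_kernel(kappa_tanh, rho0=0.9, depth=500)
print(sequence[0], sequence[10], sequence[-1])
```

In this truncated model, κ is a convex combination of powers of ρ, so for an odd activation with nonzero higher-order coefficients the iterates started in (0, 1) decrease toward 0. Swapping in a different activation changes the coefficients c_k and therefore where the sequence settles.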
Taxonomy
Topics: Neural Networks and Applications · Neural Networks and Reservoir Computing
