Weight decay induced phase transitions in multilayer neural networks
M. Ahr, M. Biehl, and E. Schloesser

TL;DR
This paper uses equilibrium statistical physics to study how weight decay affects learning in multilayer neural networks. It identifies phase transitions at which the lengths of the student vectors and their state of specialization change abruptly.
Contribution
It applies equilibrium statistical physics to layered networks whose student vectors carry no normalization constraint, and shows how a weight decay term shapes the phase behavior, including phases with very long student vectors and transitions between specialized and unspecialized states.
Findings
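Without weight decay, the length of the student vectors becomes infinite, even for perfectly realizable rules.
With a proper weight decay term, the system undergoes a first order phase transition between states with very long student vectors and states whose lengths are comparable to those of the teacher vectors.
In both configurations, there is a phase transition between a specialized and an unspecialized phase.
An anti-specialized phase with long student vectors exists in networks with a small number of hidden units.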
Abstract
We investigate layered neural networks with differentiable activation function and student vectors without normalization constraint by means of equilibrium statistical physics. We consider the learning of perfectly realizable rules and find that the length of student vectors becomes infinite, unless a proper weight decay term is added to the energy. Then, the system undergoes a first order phase transition between states with very long student vectors and states where the lengths are comparable to those of the teacher vectors. Additionally, in both configurations, there is a phase transition between a specialized and an unspecialized phase. An anti-specialized phase with long student vectors exists in networks with a small number of hidden units.
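To make the phrase "a proper weight decay term is added to the energy" concrete, the following is a minimal sketch, not the authors' exact model. It assumes a student-teacher setup in which each network sums the outputs of its hidden units, uses tanh as the differentiable activation, scales the hidden fields by 1/sqrt(N), and takes the energy as the quadratic training error plus a penalty proportional to the squared length of the student vectors. The names `committee_output`, `energy`, and `decay` are illustrative choices and do not come from the paper.

```python
import numpy as np


def committee_output(weights, inputs, g=np.tanh):
    """Sum of hidden-unit activations (assumed architecture).

    weights: array of shape (K, N), one weight vector per hidden unit.
    inputs:  array of shape (P, N), one example per row.
    """
    n = weights.shape[1]
    fields = inputs @ weights.T / np.sqrt(n)  # shape (P, K)
    return g(fields).sum(axis=1)              # shape (P,)


def energy(student, teacher, inputs, decay):
    """Quadratic training error plus a weight decay penalty (assumed form)."""
    err = committee_output(student, inputs) - committee_output(teacher, inputs)
    training_error = 0.5 * np.sum(err ** 2)
    # The penalty grows with the squared length of the student vectors,
    # so it disfavors the very long vectors described in the abstract.
    penalty = 0.5 * decay * np.sum(student ** 2)
    return training_error + penalty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = rng.standard_normal((2, 50))   # K = 2 hidden units, N = 50 inputs
    inputs = rng.standard_normal((20, 50))   # P = 20 examples
    for scale in (1.0, 3.0, 10.0):
        print(scale, energy(scale * teacher, teacher, inputs, decay=0.1))
```

In this sketch the rule is perfectly realizable, because the student can copy the teacher exactly; the `decay` parameter then controls how strongly long student vectors are penalized relative to the training error.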
