Absorbing state dynamics of stochastic gradient descent
Guanming Zhang, Stefano Martiniani

TL;DR
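This paper applies stochastic gradient descent (SGD) to a minimal particle-packing model and describes it with biased random organization (BRO), a nonequilibrium absorbing state model. The dynamics show an absorbing phase transition with Manna universality class behavior near the critical point, which sheds light on neural manifold packing in deep learning.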
Contribution
It introduces a minimal physical model of SGD dynamics using particle packing and connects it to nonequilibrium absorbing state models, providing new insights into neural manifold separation.
Findings
In the particle-packing model, SGD undergoes an absorbing phase transition similar to that of biased random organization (BRO) models.
Near the critical point, SGD exhibits Manna universality class behavior.
SGD dynamics converge to a critical packing fraction of approximately 0.64.
Abstract
Stochastic gradient descent (SGD) is a fundamental tool for training deep neural networks across a variety of tasks. In self-supervised learning, different input categories map to distinct manifolds in the embedded neural state space. Accurate classification is achieved by separating these manifolds during learning, akin to a packing problem. We investigate the dynamics of "neural manifold packing" by employing a minimal model in which SGD is applied to spherical particles in physical space. In this model, SGD minimizes the system's energy (classification loss) by stochastically reducing overlaps between particles (manifolds). We observe that this process undergoes an absorbing phase transition, prompting us to use the framework of biased random organization (BRO), a nonequilibrium absorbing state model, to describe SGD behavior. We show that BRO dynamics can be approximated by those…
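The minimal model described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the energy function or the batching scheme, so the harmonic overlap energy, the rule that each overlapping pair joins the mini-batch with probability `batch_fraction`, the use of 2D discs in a periodic box instead of spheres, and all function and parameter names are assumptions made for illustration. A 2D toy like this is not expected to reproduce the packing fraction of about 0.64 quoted above.

```python
import math
import random


def packing_fraction(n, diameter, box):
    """Area fraction covered by n discs in a square box of side `box` (2D)."""
    return n * math.pi * (diameter / 2.0) ** 2 / box ** 2


def minimum_image(delta, box):
    """Wrap a coordinate difference to the nearest periodic image."""
    return delta - box * round(delta / box)


def overlapping_pairs(pos, diameter, box):
    """Return (i, j, dx, dy, r) for every pair of discs closer than one diameter."""
    pairs = []
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            dx = minimum_image(pos[i][0] - pos[j][0], box)
            dy = minimum_image(pos[i][1] - pos[j][1], box)
            r = math.hypot(dx, dy)
            if r < diameter:
                pairs.append((i, j, dx, dy, r))
    return pairs


def sgd_step(pos, diameter, box, lr, batch_fraction, rng):
    """Apply one SGD step in place; return the number of overlaps found before it."""
    pairs = overlapping_pairs(pos, diameter, box)
    moves = [[0.0, 0.0] for _ in pos]
    for i, j, dx, dy, r in pairs:
        if rng.random() >= batch_fraction:
            continue  # this pair is not in the current mini-batch (assumed scheme)
        if r == 0.0:
            # Coincident centres: pick an arbitrary separation direction.
            angle = rng.uniform(0.0, 2.0 * math.pi)
            ux, uy = math.cos(angle), math.sin(angle)
        else:
            ux, uy = dx / r, dy / r
        # Assumed pair energy U = 0.5 * (1 - r/d)^2, so -dU/dr = (1 - r/d) / d.
        push = lr * (1.0 - r / diameter) / diameter
        moves[i][0] += push * ux
        moves[i][1] += push * uy
        moves[j][0] -= push * ux
        moves[j][1] -= push * uy
    for p, m in zip(pos, moves):
        p[0] = (p[0] + m[0]) % box
        p[1] = (p[1] + m[1]) % box
    return len(pairs)


def run(n=30, diameter=1.0, box=12.0, lr=1.0, batch_fraction=0.5,
        max_steps=500, seed=0):
    """Return (step, pos); step is None if overlaps remain after max_steps."""
    rng = random.Random(seed)
    pos = [[rng.uniform(0.0, box), rng.uniform(0.0, box)] for _ in range(n)]
    for step in range(max_steps):
        if sgd_step(pos, diameter, box, lr, batch_fraction, rng) == 0:
            return step, pos  # absorbing state: no overlaps, zero gradient
    return None, pos


if __name__ == "__main__":
    step, _ = run()
    print("packing fraction:", packing_fraction(30, 1.0, 12.0))
    print("absorbed at step:", step)
```

In this sketch, a configuration with no overlapping pairs has zero gradient, so further steps leave it unchanged; that is the absorbing state. Raising `n` at fixed `box` increases the packing fraction, and the sketch may then stay active for all of `max_steps`.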
Taxonomy
Topics: Advanced Materials Characterization Techniques · Enhanced Oil Recovery Techniques
