Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu, Santiago Aranguri, Arthur Jacot

TL;DR
This paper unifies the lazy and active regimes of linear network training dynamics into a single formula, revealing a mixed regime that combines advantages of both and improves convergence properties.
Contribution
It introduces a simple unifying formula capturing lazy, active, and mixed regimes, providing a comprehensive phase diagram of training behaviors for linear networks.
Findings
Mixed regime combines lazy and balanced dynamics.
Network converges from any random initialization in mixed regime.
Phase diagram characterizes training behavior based on initialization variance and width.
Abstract
The training dynamics of linear networks are well studied in two distinct setups: the lazy regime and the balanced/active regime, depending on the initialization and width of the network. We provide a surprisingly simple unifying formula for the evolution of the learned matrix that contains as special cases both the lazy and balanced regimes, but also a mixed regime in between the two. In the mixed regime, a part of the network is lazy while the other is balanced. More precisely, the network is lazy along singular values that are below a certain threshold and balanced along those that are above the same threshold. At initialization, all singular values are lazy, allowing the network to align itself with the task, so that later in time, when some of the singular values cross the threshold and become active, they will converge rapidly (convergence in the balanced regime is notoriously difficult…
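The threshold picture in the abstract can be illustrated with a scalar toy model. This is a minimal sketch under stated assumptions, not the paper's unifying formula: it takes a two-layer scalar "network" theta = a * b trained by gradient descent on the loss 0.5 * (a * b - target)^2. For this toy, gradient flow conserves c = a^2 - b^2, and the product evolves at the rate a^2 + b^2 = sqrt(4 * theta^2 + c^2). When |theta| is well below |c| / 2 the rate is roughly the constant |c| (lazy-like, linear dynamics); when |theta| is well above it the rate scales as 2 * |theta| (balanced-like dynamics). The function names and hyperparameters below are hypothetical and chosen only for illustration.

```python
import math


def train_scalar_two_layer(a0, b0, target, lr=1e-3, steps=20000):
    """Gradient descent on 0.5 * (a * b - target)**2 for a scalar toy model.

    Hypothetical helper for illustration only; it is not taken from the paper.
    Returns the final (a, b) and a history of (theta, a^2 + b^2) pairs.
    """
    a, b = a0, b0
    history = []
    for _ in range(steps):
        theta = a * b
        residual = theta - target
        # a^2 + b^2 is the instantaneous rate at which theta moves toward target.
        history.append((theta, a * a + b * b))
        grad_a = b * residual
        grad_b = a * residual
        # Simultaneous update of both factors.
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b, history


def effective_rate(theta, c):
    """Rate a^2 + b^2 written in terms of theta = a * b and c = a^2 - b^2."""
    return math.sqrt(4.0 * theta * theta + c * c)


if __name__ == "__main__":
    # Unbalanced initialization: c = a0^2 - b0^2 = 4, theta starts at 0.
    a, b, history = train_scalar_two_layer(a0=2.0, b0=0.0, target=1.0)
    c = a * a - b * b
    print("final theta:", a * b)
    print("approximately conserved c:", c)
    print("rate a^2 + b^2:", a * a + b * b, "vs formula:", effective_rate(a * b, c))
```

With discrete steps the quantity c is conserved only approximately (it drifts at second order in the learning rate), so the sketch should be read as a numerical check of the scalar identity rather than a reproduction of the paper's matrix-valued result.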
Taxonomy
Topics: Opinion Dynamics and Social Influence
Methods: ALIGN
