Towards Understanding Gradient Flow Dynamics of Homogeneous Neural Networks Beyond the Origin
Akshay Kumar, Jarvis Haupt

TL;DR
This paper analyzes the gradient flow dynamics of homogeneous neural networks trained from small initialization, after the weights escape the origin. It characterizes the first saddle point that gradient flow encounters and shows when the sparsity structure among the weights is preserved.
Contribution
It provides a theoretical analysis of gradient flow dynamics beyond the origin for homogeneous neural networks, including saddle point characterization and sparsity structure preservation.
Findings
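With small initialization, the weights stay small and near the origin in the early stages of training while converging in direction; the paper analyzes the dynamics after they escape the origin.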
The first saddle point encountered by gradient flow is characterized.
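For homogeneous feed-forward networks, under certain conditions, the sparsity structure that emerges among the weights before the escape is preserved after escaping the origin, until the next saddle point is reached.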
Abstract
Recent works exploring the training dynamics of homogeneous neural network weights under gradient flow with small initialization have established that in the early stages of training, the weights remain small and near the origin, but converge in direction. Building on this, the current paper studies the gradient flow dynamics of homogeneous neural networks with locally Lipschitz gradients, after they escape the origin. Insights gained from this analysis are used to characterize the first saddle point encountered by gradient flow after escaping the origin. Also, it is shown that for homogeneous feed-forward neural networks, under certain conditions, the sparsity structure emerging among the weights before the escape is preserved after escaping the origin and until reaching the next saddle point.
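The homogeneity property that the abstract relies on can be checked numerically. The sketch below is only an illustration and is not taken from the paper: it assumes a small two-layer network with a squared-ReLU activation (chosen here because its gradient is locally Lipschitz), hypothetical sizes HIDDEN and DIM, and a finite-difference gradient. Under these assumptions the output is 3-homogeneous in the weights, so scaling all weights by c scales the output by c^3, and Euler's identity <w, grad f(w)> = 3 f(w) holds.

```python
import numpy as np

HIDDEN, DIM = 4, 3  # hypothetical layer sizes, chosen for illustration
DEGREE = 3          # squared ReLU (degree 2 in W) times linear output layer (degree 1 in v)

def net(w, x):
    """Two-layer network v^T max(Wx, 0)^2 with all weights packed in the flat vector w."""
    W = w[:HIDDEN * DIM].reshape(HIDDEN, DIM)
    v = w[HIDDEN * DIM:]
    return float(v @ np.maximum(W @ x, 0.0) ** 2)

def num_grad(w, x, eps=1e-6):
    """Central finite-difference gradient of net with respect to w."""
    g = np.zeros_like(w)
    for i in range(w.size):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (net(w + e, x) - net(w - e, x)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
w = rng.standard_normal(HIDDEN * DIM + HIDDEN)
x = rng.standard_normal(DIM)
c = 0.5

# Homogeneity: net(c * w) should equal c**DEGREE * net(w).
scaled = net(c * w, x)
expected = c ** DEGREE * net(w, x)

# Euler's identity for homogeneous functions: <w, grad net(w)> = DEGREE * net(w).
euler_lhs = float(w @ num_grad(w, x))
euler_rhs = DEGREE * net(w, x)

print(np.isclose(scaled, expected), np.isclose(euler_lhs, euler_rhs, rtol=1e-4, atol=1e-6))
```

Gradient flow itself is the continuous-time limit of gradient descent, dw/dt = -grad L(w), so a simulation would replace the checks above with small-step updates from a small-norm initial w; that is left out here to keep the sketch short.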
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications
