Towards Understanding Gradient Flow Dynamics of Homogeneous Neural Networks Beyond the Origin

Akshay Kumar; Jarvis Haupt

arXiv:2502.15952·cs.LG·May 19, 2025

Open Access

TL;DR

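This paper analyzes the gradient flow dynamics of homogeneous neural networks with small initialization after the weights escape the origin. It characterizes the first saddle point encountered after the escape and shows that, for feed-forward networks under certain conditions, the sparsity structure that emerges before the escape is preserved until the next saddle point.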

Contribution

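The paper provides a theoretical analysis of gradient flow for homogeneous neural networks with locally Lipschitz gradients after the weights escape the origin, including a characterization of the first saddle point reached and conditions under which the sparsity structure among the weights is preserved.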

Findings

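01. With small initialization, the weights initially remain small and near the origin while converging in direction; the paper analyzes the gradient flow dynamics after they escape.

02. The first saddle point encountered by gradient flow after escaping the origin is characterized.

03. For homogeneous feed-forward networks, under certain conditions, the sparsity structure that emerges among the weights before the escape is preserved after escaping the origin, until the next saddle point is reached.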

Abstract

Recent works exploring the training dynamics of homogeneous neural network weights under gradient flow with small initialization have established that in the early stages of training, the weights remain small and near the origin, but converge in direction. Building on this, the current paper studies the gradient flow dynamics of homogeneous neural networks with locally Lipschitz gradients, after they escape the origin. Insights gained from this analysis are used to characterize the first saddle point encountered by gradient flow after escaping the origin. Also, it is shown that for homogeneous feed-forward neural networks, under certain conditions, the sparsity structure emerging among the weights before the escape is preserved after escaping the origin and until reaching the next saddle point.
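
The homogeneity property the abstract relies on can be checked numerically. The sketch below is only an illustration and is not taken from the paper: it assumes a hypothetical two-layer network with no biases and a squared-ReLU activation (chosen here because its gradient is locally Lipschitz), for which scaling all weights by c > 0 scales the output by c^3. The names `network` and `scale` are made up for this example.

```python
import random

def network(x, W, v):
    """Hypothetical two-layer network: v^T max(Wx, 0)^2, with no bias terms."""
    hidden = [
        max(sum(w_ij * x_j for w_ij, x_j in zip(row, x)), 0.0) ** 2
        for row in W
    ]
    return sum(v_i * h_i for v_i, h_i in zip(v, hidden))

def scale(W, v, c):
    """Multiply every weight in both layers by the same factor c."""
    return [[c * w for w in row] for row in W], [c * a for a in v]

random.seed(0)
x = [random.gauss(0, 1) for _ in range(3)]                      # one input
W = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]  # first layer
v = [random.gauss(0, 1) for _ in range(4)]                      # second layer

c = 2.0
W_c, v_c = scale(W, v, c)
lhs = network(x, W_c, v_c)        # output with all weights scaled by c
rhs = c ** 3 * network(x, W, v)   # c^3 times the original output

# The two values agree up to floating-point error: degree-3 homogeneity.
print(abs(lhs - rhs))
```

The degree is 3 here because the first-layer weights enter through a squared activation (contributing c^2) and the second-layer weights enter linearly (contributing c); other bias-free architectures give other degrees.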

Taxonomy

Topics: Model Reduction and Neural Networks · Neural Networks and Applications