On the weight dynamics of learning networks

Nahal Sharafi, Christoph Martin, and Sarah Hallerberg

arXiv:2405.00743 · cs.LG · May 3, 2024

Open Access

TL;DR

This paper uses local stability analysis to understand the learning dynamics of feedforward neural networks, showing how stability indicators can predict final training loss across different configurations.

Contribution

It derives general equations for the tangent operator of learning dynamics in three-layer networks and links stability measures to training outcomes.

Findings

01. Stability indicators can predict the final training loss.

02. The derived equations are valid for arbitrary numbers of nodes and arbitrary choices of activation functions.

03. Numerical analysis relates stability indicators to the final training loss.

Abstract

Neural networks have become a widely adopted tool for tackling a variety of problems in machine learning and artificial intelligence. In this contribution, we use the mathematical framework of local stability analysis to gain a deeper understanding of the learning dynamics of feedforward neural networks. To this end, we derive equations for the tangent operator of the learning dynamics of three-layer networks learning regression tasks. The results are valid for arbitrary numbers of nodes and arbitrary choices of activation functions. Applying the results to a network learning a regression task, we investigate numerically how stability indicators relate to the final training loss. Although the specific results vary with different choices of initial conditions and activation functions, we demonstrate that it is possible to predict the final training loss by monitoring finite-time…
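To make the idea of a tangent operator of the learning dynamics concrete, the following is a minimal sketch under several assumptions that are not taken from the paper: full-batch gradient descent as the learning rule, a three-layer (input, hidden, output) network with tanh activation and no bias terms, a mean-squared-error loss, and a toy regression target. The Jacobian of the weight-update map is estimated here by central finite differences rather than by the analytical equations the paper derives, and the averaged logarithmic growth rate of a perturbation is only one illustrative choice of stability indicator. All function names (`unpack`, `loss_and_grad`, `gd_step`, `tangent_operator`, `train_and_monitor`) are hypothetical.

```python
import numpy as np

def unpack(w, n_in, n_hid):
    """Split the flat weight vector into hidden and output weights."""
    W1 = w[: n_hid * n_in].reshape(n_hid, n_in)
    W2 = w[n_hid * n_in:]
    return W1, W2

def loss_and_grad(w, X, y, n_in, n_hid):
    """Mean-squared-error loss and gradient (tanh hidden layer, no biases: an assumption)."""
    W1, W2 = unpack(w, n_in, n_hid)
    H = np.tanh(X @ W1.T)                      # hidden activations, shape (N, n_hid)
    err = H @ W2 - y                           # residuals, shape (N,)
    loss = 0.5 * np.mean(err ** 2)
    g_W2 = H.T @ err / len(y)                  # gradient w.r.t. output weights
    back = np.outer(err, W2) * (1.0 - H ** 2)  # backpropagated signal through tanh
    g_W1 = back.T @ X / len(y)                 # gradient w.r.t. hidden weights
    return loss, np.concatenate([g_W1.ravel(), g_W2])

def gd_step(w, X, y, n_in, n_hid, lr):
    """One full-batch gradient-descent update; this is the map we linearize."""
    return w - lr * loss_and_grad(w, X, y, n_in, n_hid)[1]

def tangent_operator(w, X, y, n_in, n_hid, lr, eps=1e-6):
    """Jacobian of the update map at w, estimated by central finite differences."""
    n = w.size
    J = np.empty((n, n))
    for k in range(n):
        dw = np.zeros(n)
        dw[k] = eps
        J[:, k] = (gd_step(w + dw, X, y, n_in, n_hid, lr)
                   - gd_step(w - dw, X, y, n_in, n_hid, lr)) / (2.0 * eps)
    return J

def train_and_monitor(w, X, y, n_in, n_hid, lr=0.1, steps=200):
    """Train while propagating one perturbation vector through the tangent operators."""
    q = np.ones(w.size) / np.sqrt(w.size)      # initial unit perturbation
    log_growth = 0.0
    for _ in range(steps):
        q = tangent_operator(w, X, y, n_in, n_hid, lr) @ q
        norm = np.linalg.norm(q)
        log_growth += np.log(norm)             # accumulate the local stretching
        q /= norm                              # renormalize to avoid under/overflow
        w = gd_step(w, X, y, n_in, n_hid, lr)
    return w, log_growth / steps               # averaged finite-time growth rate

# Toy regression task (an assumption, not the paper's data set).
rng = np.random.default_rng(0)
n_in, n_hid = 1, 3
X = rng.uniform(-1.0, 1.0, size=(32, n_in))
y = np.sin(2.0 * X[:, 0])
w0 = 0.5 * rng.standard_normal(n_hid * n_in + n_hid)

J = tangent_operator(w0, X, y, n_in, n_hid, lr=0.1)
initial_loss = loss_and_grad(w0, X, y, n_in, n_hid)[0]
w_final, rate = train_and_monitor(w0, X, y, n_in, n_hid)
final_loss = loss_and_grad(w_final, X, y, n_in, n_hid)[0]
print(f"loss {initial_loss:.4f} -> {final_loss:.4f}, growth rate {rate:.4f}")
```

For plain gradient descent the update map is w - lr * grad L(w), so its Jacobian is the identity minus lr times the Hessian of the loss, which is why `J` comes out symmetric in this sketch. Repeating the run for different initial weights or activation functions and comparing `rate` with `final_loss` is one way to explore the kind of relation the abstract describes.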

Taxonomy

Topics: Neural Networks and Applications