On-line learning dynamics of ReLU neural networks using statistical physics techniques
Michiel Straat, Michael Biehl

TL;DR
This paper develops an exact mathematical framework for understanding how two-layer ReLU neural networks learn over time, using statistical physics methods, and compares their behavior to sigmoidal networks.
Contribution
It derives a system of differential equations that describes the exact macroscopic on-line learning dynamics of two-layer ReLU networks, and it identifies the scenarios in which these dynamics differ from those of sigmoidal networks.
Findings
In the first experiments, ReLU networks behave similarly to the sigmoidal networks studied in earlier work.
Theoretical predictions agree well with numerical simulations in these experiments.
In over-realizable and unrealizable learning scenarios, ReLU networks show distinctive learning behavior compared to sigmoidal networks.
Abstract
We introduce exact macroscopic on-line learning dynamics of two-layer neural networks with ReLU units in the form of a system of differential equations, using techniques borrowed from statistical physics. For the first experiments, numerical solutions reveal behavior similar to that of the sigmoidal activation studied in earlier work. In these experiments the theoretical results show good correspondence with simulations. In over-realizable and unrealizable learning scenarios, the learning behavior of ReLU networks shows distinctive characteristics compared to sigmoidal networks.
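The kind of simulation that such theoretical results are compared against can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes a student-teacher setup in which both networks are two-layer ReLU networks with hidden-to-output weights fixed to one, inputs with independent standard normal components, a learning rate scaled as eta / N, and time measured as alpha = (number of examples) / N. The function name simulate_online_relu, the overlap matrices R (student-teacher) and Q (student-student), and all parameter values are hypothetical choices made for this sketch; none of them appear in the summary above.

```python
import numpy as np


def relu(x):
    """Rectified linear unit, g(x) = max(0, x)."""
    return np.maximum(x, 0.0)


def simulate_online_relu(N=100, K=2, M=2, eta=0.3, alpha_max=30.0,
                         n_test=2000, seed=0):
    """On-line gradient descent for a student-teacher pair of ReLU networks.

    All conventions here (unit hidden-to-output weights, learning rate
    eta / N, time alpha = examples / N) are assumptions of this sketch.
    Returns a list of (alpha, R, Q, eps_g) snapshots.
    """
    rng = np.random.default_rng(seed)

    # Teacher weight vectors B_n, normalized to unit length (assumed).
    B = rng.standard_normal((M, N))
    B /= np.linalg.norm(B, axis=1, keepdims=True)

    # Student weight vectors J_i, small random initialization (assumed).
    J = 1e-3 * rng.standard_normal((K, N))

    # Fixed test set for a Monte Carlo estimate of the generalization error.
    xi_test = rng.standard_normal((n_test, N))
    tau_test = relu(xi_test @ B.T).sum(axis=1)

    def snapshot(alpha):
        sigma_test = relu(xi_test @ J.T).sum(axis=1)
        eps_g = 0.5 * np.mean((sigma_test - tau_test) ** 2)
        # R = student-teacher overlaps, Q = student-student overlaps.
        return alpha, J @ B.T, J @ J.T, eps_g

    history = [snapshot(0.0)]
    steps = int(alpha_max * N)
    for mu in range(1, steps + 1):
        xi = rng.standard_normal(N)      # fresh example, used only once
        x = J @ xi                       # student hidden-unit fields
        y = B @ xi                       # teacher hidden-unit fields
        delta = relu(y).sum() - relu(x).sum()
        # Gradient step on the squared error; the ReLU derivative is the
        # indicator (x > 0), so only active student units are updated.
        J += (eta / N) * delta * (x > 0)[:, None] * xi[None, :]
        if mu % N == 0:
            history.append(snapshot(mu / N))
    return history
```

Calling simulate_online_relu() returns one snapshot per unit of alpha. Plotting eps_g, or the entries of R and Q, against alpha gives curves of the kind that a macroscopic description in terms of differential equations would be expected to reproduce for large N. Setting K larger or smaller than M gives a rough way to explore over-realizable and unrealizable settings under the same assumptions.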
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Model Reduction and Neural Networks
