An analytic theory of shallow networks dynamics for hinge loss classification

Franco Pellegrini; Giulio Biroli

arXiv:2006.11209 · stat.ML · January 12, 2022


TL;DR

This paper develops a mean-field theoretical framework for the training dynamics of shallow neural networks with hinge loss, and uses it to study the slowing down of training, the crossover between rich and lazy learning, and overfitting.

Contribution

The paper introduces a mean-field limit description of shallow-network training dynamics and solves it explicitly for linearly separable data with a linear hinge loss.

Findings

1. Explicit solutions for training dynamics in the mean-field limit
2. Identification of slowdown and crossover phenomena during training
3. Assessment of mean-field theory limitations for finite networks

Abstract

Neural networks have been shown to perform incredibly well in classification tasks over structured high-dimensional datasets. However, the learning dynamics of such networks is still poorly understood. In this paper we study in detail the training dynamics of a simple type of neural network: a single hidden layer trained to perform a classification task. We show that in a suitable mean-field limit this case maps to a single-node learning problem with a time-dependent dataset determined self-consistently from the average nodes population. We specialize our theory to the prototypical case of a linearly separable dataset and a linear hinge loss, for which the dynamics can be explicitly solved. This allows us to address in a simple setting several phenomena appearing in modern networks such as slowing down of training dynamics, crossover between rich and lazy learning, and overfitting.…
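The setting in the abstract (one hidden layer, linear hinge loss, linearly separable data, mean-field scaling) can be illustrated with a small sketch. This is not the authors' code, and every modelling detail is an assumption made for illustration: the ReLU activation, the fixed ±1 output weights, the toy dataset whose label is the sign of the first coordinate, the 1/N output normalization with a learning rate rescaled by N, and all function names (`make_data`, `predict`, `hinge_loss`, `train`).

```python
import random


def make_data(n_samples, dim, rng):
    """Toy linearly separable data: the label is the sign of the first coordinate."""
    data = []
    while len(data) < n_samples:
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        if abs(x[0]) > 0.2:  # discard points near the boundary to leave a margin
            data.append((x, 1.0 if x[0] > 0 else -1.0))
    return data


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def predict(weights, signs, x):
    """Network output, averaged (not summed) over hidden nodes: the assumed mean-field scaling."""
    total = sum(s * max(0.0, dot(w, x)) for w, s in zip(weights, signs))
    return total / len(weights)


def hinge_loss(weights, signs, data):
    """Average linear hinge loss max(0, 1 - y f(x)) over the dataset."""
    return sum(max(0.0, 1.0 - y * predict(weights, signs, x)) for x, y in data) / len(data)


def train(data, n_hidden=20, steps=100, lr=0.5, seed=0):
    """Full-batch gradient descent on the first-layer weights only (assumption)."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    weights = [[0.1 * rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_hidden)]
    signs = [1.0 if i % 2 == 0 else -1.0 for i in range(n_hidden)]  # fixed output weights
    losses = []
    for _ in range(steps):
        losses.append(hinge_loss(weights, signs, data))
        # Only samples that violate the margin contribute to the hinge gradient.
        active = [(x, y) for x, y in data if y * predict(weights, signs, x) < 1.0]
        updated = []
        for w, s in zip(weights, signs):
            step = [0.0] * dim
            for x, y in active:
                if dot(w, x) > 0.0:  # ReLU passes gradient only when the node is on
                    for k in range(dim):
                        step[k] += s * y * x[k]
            # The 1/N from the output scaling is cancelled by a learning rate of lr * N,
            # so each node moves at a rate that does not vanish as N grows.
            updated.append([wk + lr * gk / len(data) for wk, gk in zip(w, step)])
        weights = updated
    losses.append(hinge_loss(weights, signs, data))
    return weights, signs, losses


if __name__ == "__main__":
    data = make_data(40, 5, random.Random(1))
    _, _, losses = train(data)
    print(f"initial loss {losses[0]:.3f}, final loss {losses[-1]:.3f}")
```

Because the hinge gradient only involves samples with margin below 1, the set of samples driving each node shrinks as training proceeds. That shrinking set is a loose, finite-size analogue of the "time-dependent dataset" mentioned in the abstract. The sketch makes no attempt to reproduce the paper's analytic solution.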


Code & Models

Repositories

phiandark/DynHingeLoss (tf, Official)

Videos

An analytic theory of shallow networks dynamics for hinge loss classification · SlidesLive