A Principle of Least Action for the Training of Neural Networks

Skander Karkar; Ibrahim Ayed; Emmanuel de Bézenac; Patrick Gallinari

arXiv:2009.08372·stat.ML·June 16, 2021

TL;DR

This paper introduces a novel perspective on neural network training by viewing it as an optimal transport problem, leading to a new algorithm that enhances generalization, especially with limited data.

Contribution

It reformulates neural network training as an optimal transport problem and proposes a new adaptive learning algorithm based on this principle.

Findings

01

Trained networks exhibit a bias toward low-kinetic-energy displacements in their transport map, and this bias is linked to better generalization.

02

The proposed method adapts to task complexity and improves performance in low-data scenarios.

03

Regularity results for solutions are derived using Optimal Transport theory.

Abstract

Neural networks have been achieving high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behavior, much effort has recently been focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and having a better control over the trained models. In this work, we adopt an alternate perspective, viewing the neural network as a dynamical system displacing input particles over time. We conduct a series of experiments and, by analyzing the network's behavior through its displacements, we show the presence of a low kinetic energy displacement bias in the transport map of the network, and link this bias with generalization performance. From this observation, we reformulate the learning problem as follows: finding neural networks which solve the…
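The "dynamical system displacing input particles" view can be made concrete with a small sketch. It assumes a residual-style network in which each block adds a displacement to its input, and it takes the kinetic energy of an input to be the sum of squared displacement norms across blocks. Both are illustrative assumptions, and the names `residual_block` and `transport` are hypothetical; this is not the authors' implementation from the repository.

```python
import math

def residual_block(weight, bias):
    """Toy residual function r(x) = tanh(weight * x + bias), applied elementwise.
    Hypothetical stand-in for a learned block."""
    def r(x):
        return [math.tanh(weight * xi + bias) for xi in x]
    return r

def transport(x, blocks):
    """Push a point through x_{k+1} = x_k + r_k(x_k).
    Returns the final position and the accumulated kinetic energy,
    taken here (by assumption) as the sum of squared displacement norms."""
    energy = 0.0
    for r in blocks:
        step = r(x)                               # displacement applied by this block
        energy += sum(s * s for s in step)        # squared norm of that displacement
        x = [xi + si for xi, si in zip(x, step)]  # move the particle
    return x, energy

blocks = [residual_block(0.5, 0.0), residual_block(-0.3, 0.1), residual_block(0.2, -0.1)]
out, energy = transport([1.0, -2.0], blocks)
print(out, energy)
```

Under these assumptions, a network that reaches the same outputs with a smaller `energy` moves its inputs less along the way, which is the kind of bias the abstract associates with generalization.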

Code & Models

Repositories

skander-karkar/LAP
PyTorch · Official
