TL;DR
This paper introduces a novel perspective on neural network training by viewing it as an optimal transport problem, leading to a new algorithm that enhances generalization, especially with limited data.
Contribution
It reformulates neural network training as an optimal transport problem and proposes a new adaptive learning algorithm based on this principle.
Findings
Trained networks exhibit a bias toward low-kinetic-energy displacements in their transport maps, and this bias is linked to better generalization.
The proposed method adapts to task complexity and improves performance in low-data scenarios.
Regularity results for solutions are derived using Optimal Transport theory.
Abstract
Neural networks have been achieving high generalization performance on many tasks despite being highly over-parameterized. Since classical statistical learning theory struggles to explain this behavior, much effort has recently been focused on uncovering the mechanisms behind it, in the hope of developing a more adequate theoretical framework and gaining better control over the trained models. In this work, we adopt an alternative perspective, viewing the neural network as a dynamical system displacing input particles over time. We conduct a series of experiments and, by analyzing the network's behavior through its displacements, we show the presence of a low kinetic energy displacement bias in the transport map of the network, and link this bias with generalization performance. From this observation, we reformulate the learning problem as follows: finding neural networks which solve the…
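To make the "displacing input particles" picture concrete, here is a minimal sketch of how a displacement kinetic energy could be measured on a toy residual network. It is an illustration under stated assumptions, not the paper's implementation: the residual update, the tanh block, the sum-of-squared-displacements discretization, and the names residual_forward and kinetic_energy are all hypothetical choices made for this example.

```python
import numpy as np

def residual_forward(x, weights):
    """Run a toy residual network and return every intermediate state.

    Assumed update (illustrative only): x_{k+1} = x_k + tanh(x_k @ W_k).
    """
    states = [x]
    for W in weights:
        x = x + np.tanh(x @ W)
        states.append(x)
    return states

def kinetic_energy(states):
    """Mean over inputs of the summed squared displacement between layers.

    Assumed discretization: sum_k ||x_{k+1} - x_k||^2, averaged over samples.
    """
    total = 0.0
    for prev, nxt in zip(states[:-1], states[1:]):
        total = total + np.sum((nxt - prev) ** 2, axis=1)
    return float(np.mean(total))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))  # 8 input "particles" in 4 dimensions
weights = [0.1 * rng.normal(size=(4, 4)) for _ in range(3)]

states = residual_forward(x, weights)
energy = kinetic_energy(states)

# With all-zero weights every block is the identity, so nothing moves.
zero_energy = kinetic_energy(residual_forward(x, [np.zeros((4, 4))] * 3))
```

In this sketch, a network whose blocks move inputs less between layers yields a smaller energy value, and an identity network yields exactly zero.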
