Maximum Principle Based Algorithms for Deep Learning
Qianxiao Li, Long Chen, Cheng Tai, Weinan E

TL;DR
This paper introduces a control-theoretic framework for training deep networks based on Pontryagin's maximum principle (PMP). The resulting algorithm admits rigorous error estimates and may avoid the slow convergence of gradient-based methods on flat landscapes near saddle points.
Contribution
It formulates deep learning training as an optimal control problem and develops a PMP-based training algorithm, built on a modified method of successive approximations, for which rigorous error estimates and convergence results can be established.
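For orientation, the necessary conditions given by the PMP can be written out as follows. This is a sketch in one common sign convention. The symbols (state x, costate p, control θ, dynamics f, terminal loss Φ, running cost L, Hamiltonian H) are notation assumed here and are not taken from this page.

```latex
\begin{aligned}
&\min_{\theta}\; \Phi(x_T) + \int_0^T L(\theta_t)\,dt
  \quad\text{subject to}\quad \dot{x}_t = f(t, x_t, \theta_t),\; x_0 \text{ given},\\
&H(t, x, p, \theta) = p \cdot f(t, x, \theta) - L(\theta),\\
&\dot{x}^*_t = \nabla_p H(t, x^*_t, p^*_t, \theta^*_t),\\
&\dot{p}^*_t = -\nabla_x H(t, x^*_t, p^*_t, \theta^*_t), \qquad p^*_T = -\nabla \Phi(x^*_T),\\
&H(t, x^*_t, p^*_t, \theta^*_t) \ge H(t, x^*_t, p^*_t, \theta)
  \quad \text{for all admissible } \theta \text{ and almost every } t \in [0, T].
\end{aligned}
```

The last line is the maximization condition: the optimal control maximizes the Hamiltonian pointwise in time, which requires no derivative with respect to θ.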
Findings
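Provides a new control-based training algorithm for deep learning, derived from a modified method of successive approximations.
May avoid slow convergence on flat landscapes near saddle points, a known pitfall of gradient-based methods.
Obtains a favorable initial per-iteration convergence rate, provided Hamiltonian maximization can be carried out efficiently.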
Abstract
The continuous dynamical system approach to deep learning is explored in order to devise alternative frameworks for training algorithms. Training is recast as a control problem and this allows us to formulate necessary optimality conditions in continuous time using the Pontryagin's maximum principle (PMP). A modification of the method of successive approximations is then used to solve the PMP, giving rise to an alternative training algorithm for deep learning. This approach has the advantage that rigorous error estimates and convergence results can be established. We also show that it may avoid some pitfalls of gradient-based methods, such as slow convergence on flat landscapes near saddle points. Furthermore, we demonstrate that it obtains favorable initial convergence rate per-iteration, provided Hamiltonian maximization can be efficiently carried out - a step which is still in need…
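To make the forward/backward/maximize structure concrete, here is a minimal sketch of the basic (unmodified) method of successive approximations on a scalar toy problem. It is not the paper's modified algorithm. The function name `basic_msa`, the toy dynamics, and all parameter values are assumptions chosen so that the Hamiltonian maximization has a closed form.

```python
import numpy as np


def basic_msa(x0=0.0, target=1.0, n_steps=10, h=0.1, lam=2.0, n_iter=30):
    """Basic MSA on the toy dynamics x_{t+1} = x_t + h * theta_t.

    Minimizes 0.5 * (x_N - target)^2 + h * sum_t 0.5 * lam * theta_t^2
    with each control constrained to [-1, 1]. Illustrative only.
    """
    theta = np.zeros(n_steps)
    costs = []
    for _ in range(n_iter):
        # 1. Forward pass: propagate the state under the current controls.
        x = np.empty(n_steps + 1)
        x[0] = x0
        for t in range(n_steps):
            x[t + 1] = x[t] + h * theta[t]
        terminal = 0.5 * (x[-1] - target) ** 2
        running = h * np.sum(0.5 * lam * theta ** 2)
        costs.append(terminal + running)

        # 2. Backward pass: costate with terminal condition p_N = -dPhi/dx.
        #    In general p_t = p_{t+1} * (1 + h * df/dx); here df/dx = 0.
        p = np.empty(n_steps + 1)
        p[-1] = -(x[-1] - target)
        for t in reversed(range(n_steps)):
            p[t] = p[t + 1]

        # 3. Maximize H(theta) = p_{t+1} * theta - 0.5 * lam * theta^2 on [-1, 1].
        #    H is concave in theta, so the maximizer is the clipped stationary point.
        theta = np.clip(p[1:] / lam, -1.0, 1.0)
    return theta, costs


theta, costs = basic_msa()
print(theta[0], costs[0], costs[-1])
```

With these defaults the update is a contraction because `lam` exceeds the total time `n_steps * h`, and the controls settle near 1/3, the minimizer of the toy objective. With a smaller `lam` the same iteration can oscillate instead of converging, which is the kind of behavior that motivates modifying the basic scheme.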
Taxonomy
Topics: Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference · Mathematical Biology Tumor Growth
