Maximum Principle Based Algorithms for Deep Learning

Qianxiao Li; Long Chen; Cheng Tai; Weinan E

arXiv:1710.09513 · cs.LG · June 5, 2018 · 84 cites

TL;DR

This paper introduces a control-theoretic framework for training deep networks based on Pontryagin's maximum principle (PMP). Compared with gradient-based methods, it may avoid slow convergence on flat landscapes near saddle points and can achieve a favorable initial per-iteration convergence rate.

Contribution

It formulates deep learning training as an optimal control problem and develops a PMP-based training algorithm, a modified method of successive approximations, for which rigorous error estimates and convergence results can be established.
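
For orientation, the standard continuous-time form of such a control problem and of the PMP conditions can be sketched as follows. The symbols are assumptions for illustration and do not appear on this page: x is the state, θ the control (the trainable parameters), f the dynamics, L a running cost, Φ a terminal loss, p the costate, and Θ the admissible control set.

```latex
% Control problem: choose theta(.) to minimize terminal loss plus running cost
\min_{\theta} \; \Phi\big(x(T)\big) + \int_0^T L\big(t, x(t), \theta(t)\big)\,dt
\quad \text{s.t.} \quad \dot{x}(t) = f\big(t, x(t), \theta(t)\big), \quad x(0) = x_0 .

% Hamiltonian
H(t, x, p, \theta) = p \cdot f(t, x, \theta) - L(t, x, \theta)

% PMP necessary conditions for an optimal triple (x^*, p^*, \theta^*)
\dot{x}^*(t) = \nabla_p H\big(t, x^*(t), p^*(t), \theta^*(t)\big), \qquad x^*(0) = x_0,
\dot{p}^*(t) = -\nabla_x H\big(t, x^*(t), p^*(t), \theta^*(t)\big), \qquad p^*(T) = -\nabla \Phi\big(x^*(T)\big),
H\big(t, x^*(t), p^*(t), \theta^*(t)\big) \ge H\big(t, x^*(t), p^*(t), \theta\big)
\quad \text{for all } \theta \in \Theta \text{ and almost every } t \in [0, T].
```

With this sign convention the optimal control maximizes the Hamiltonian pointwise in time, which is the step a PMP-based training algorithm has to carry out at each layer or time step.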

Findings

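01 Provides a new control-based training algorithm for deep learning.

02 Shows that the method may avoid slow convergence on flat landscapes near saddle points, a common pitfall of gradient-based methods.

03 Demonstrates a favorable initial per-iteration convergence rate, provided the Hamiltonian maximization step can be carried out efficiently.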

Abstract

The continuous dynamical system approach to deep learning is explored in order to devise alternative frameworks for training algorithms. Training is recast as a control problem, which allows us to formulate necessary optimality conditions in continuous time using Pontryagin's maximum principle (PMP). A modification of the method of successive approximations is then used to solve the PMP, giving rise to an alternative training algorithm for deep learning. This approach has the advantage that rigorous error estimates and convergence results can be established. We also show that it may avoid some pitfalls of gradient-based methods, such as slow convergence on flat landscapes near saddle points. Furthermore, we demonstrate that it obtains a favorable initial convergence rate per iteration, provided Hamiltonian maximization can be efficiently carried out - a step which is still in need…
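
To make the forward, backward, and maximization steps concrete, here is a minimal sketch of the basic (unmodified) method of successive approximations on a toy scalar problem. It is not the paper's modified algorithm. Everything in it is an assumption chosen for illustration: the dynamics x_{n+1} = x_n + dt * theta_n, the cost 0.5 * (x_N - target)^2 plus a quadratic control penalty weighted by `lam`, and the helper names `rollout` and `basic_msa`.

```python
def rollout(theta, dt, x0=0.0):
    """Forward pass: states x_0..x_N for x_{n+1} = x_n + dt * theta_n."""
    xs = [x0]
    for th in theta:
        xs.append(xs[-1] + dt * th)
    return xs


def basic_msa(target=1.0, n_steps=10, horizon=1.0, lam=2.0, n_iters=30):
    """Basic MSA on a toy problem (illustrative assumption, not the paper's setup).

    Cost: 0.5 * (x_N - target)**2 + sum_n dt * 0.5 * lam * theta_n**2.
    Hamiltonian: H_n = p_{n+1} * (x_n + dt * theta) - dt * 0.5 * lam * theta**2.
    """
    dt = horizon / n_steps
    theta = [0.0] * n_steps
    for _ in range(n_iters):
        # 1. Forward pass: propagate the state under the current controls.
        xs = rollout(theta, dt)
        # 2. Backward pass: costate with terminal condition p_N = -dPhi/dx(x_N).
        #    Here dH_n/dx = p_{n+1}, so the costate is constant in n.
        p = [0.0] * (n_steps + 1)
        p[-1] = -(xs[-1] - target)
        for n in range(n_steps - 1, -1, -1):
            p[n] = p[n + 1]
        # 3. Hamiltonian maximization: H_n is concave in theta, and setting
        #    dH_n/dtheta = dt * (p_{n+1} - lam * theta) = 0 gives theta = p_{n+1} / lam.
        theta = [p[n + 1] / lam for n in range(n_steps)]
    return theta, rollout(theta, dt)[-1]


theta, x_final = basic_msa()
print(theta[0], x_final)
```

For this toy problem the closed-form optimum is theta = target / (lam + horizon), and the iteration contracts toward it only because horizon / lam < 1 with the defaults above. With a weaker control penalty the same loop oscillates or diverges, which is one reason a modification of the basic method is needed in general.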

Taxonomy

Topics: Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference · Mathematical Biology Tumor Growth