Review: Ordinary Differential Equations For Deep Learning
Xinshi Chen

TL;DR
This paper reviews the connection between ordinary differential equations and deep neural networks, and surveys how ODE-based models can help in understanding, designing, and training neural networks, with examples from specific applications.
Contribution
It provides a comprehensive overview of how ODEs relate to neural network architecture design and training, highlighting recent advances and applications.
Findings
Several works report distinct advantages of continuous ODE models over traditional DNNs in some specific applications.
ODE discretization schemes inspire new neural network architectures.
Optimal control theory, through Pontryagin's maximum principle, yields new optimization methods for training neural networks.
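The link between discretization schemes and architectures can be made concrete with a minimal sketch. It assumes the simplest case, forward Euler, under which a stack of residual blocks x_{k+1} = x_k + h * f(x_k, theta_k) coincides with an Euler solve of dx/dt = f(x, theta(t)). The names `layer`, `resnet_forward`, and `euler_solve` and the tanh layer are hypothetical illustrations, not code from the paper.

```python
import math

def layer(x, theta):
    """Hypothetical layer function f(x, theta): one dense tanh layer."""
    W, b = theta
    return [math.tanh(sum(w * xj for w, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def resnet_forward(x0, thetas, h=1.0):
    """Stack of residual blocks: x_{k+1} = x_k + h * f(x_k, theta_k)."""
    x = list(x0)
    for theta in thetas:
        fx = layer(x, theta)
        x = [xi + h * fi for xi, fi in zip(x, fx)]
    return x

def euler_solve(x0, theta_at, t0, t1, n_steps):
    """Forward Euler for dx/dt = f(x, theta(t)) on [t0, t1]."""
    h = (t1 - t0) / n_steps
    x = list(x0)
    for k in range(n_steps):
        fx = layer(x, theta_at(t0 + k * h))
        x = [xi + h * fi for xi, fi in zip(x, fx)]
    return x

# Illustrative parameters (assumed, not from the paper).
theta = ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1])
x0 = [1.0, -1.0]

# Four residual blocks with step h = 1/4 match a 4-step Euler solve on [0, 1].
res_out = resnet_forward(x0, [theta] * 4, h=0.25)
ode_out = euler_solve(x0, lambda t: theta, 0.0, 1.0, 4)
print(res_out, ode_out)
```

Swapping the Euler update for another scheme (for example a multistep or Runge-Kutta rule) changes the block structure, which is the sense in which a discretization scheme suggests an architecture.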
Abstract
To better understand and improve the behavior of neural networks, a recent line of work has connected ordinary differential equations (ODEs) and deep neural networks (DNNs). The connection is twofold: (1) viewing a DNN as an ODE discretization; (2) viewing the training of a DNN as solving an optimal control problem. The former motivates either designing neural architectures based on ODE discretization schemes or replacing the DNN with a continuous model characterized by ODEs. Several works have demonstrated distinct advantages of using a continuous model instead of a traditional DNN in some specific applications. The latter has inspired new training methods: based on Pontryagin's maximum principle, which is popular in the optimal control literature, some works developed new optimization methods for training neural networks and some developed algorithms to train the infinite-deep…
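As a sketch of the second connection, training can be written as an optimal control problem and Pontryagin's maximum principle (PMP) stated for it. The notation below (terminal loss \Phi, running cost L, state x, costate p, controls \theta, horizon T) is assumed here for illustration and may differ from the paper's.

```latex
% Training as optimal control (assumed notation):
\min_{\theta(\cdot)} \; \Phi\bigl(x(T)\bigr) + \int_0^T L\bigl(\theta(t)\bigr)\,dt
\quad \text{s.t.} \quad \dot{x}(t) = f\bigl(x(t), \theta(t)\bigr), \qquad x(0) = x_0 .

% Hamiltonian:
H(x, p, \theta) = p^{\top} f(x, \theta) - L(\theta)

% PMP necessary conditions for an optimal control \theta^*:
\dot{x}^*(t) = \nabla_p H\bigl(x^*(t), p^*(t), \theta^*(t)\bigr), \qquad x^*(0) = x_0,
\dot{p}^*(t) = -\nabla_x H\bigl(x^*(t), p^*(t), \theta^*(t)\bigr), \qquad p^*(T) = -\nabla \Phi\bigl(x^*(T)\bigr),
\theta^*(t) \in \arg\max_{\theta} H\bigl(x^*(t), p^*(t), \theta\bigr) \quad \text{for all } t \in [0, T].
```

In this reading the state equation plays the role of the forward pass and the costate equation, integrated backward from t = T, plays the role of backpropagation. The maximization condition replaces the gradient step on the weights.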
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Gaussian Processes and Bayesian Inference
