Neural Ordinary Differential Equations
Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud

TL;DR
This paper introduces neural ordinary differential equations (neural ODEs), a family of deep learning models that treat the hidden state as a continuous function of depth. The resulting networks have constant memory cost, adapt their evaluation strategy to each input, and can trade numerical precision for speed.
Contribution
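The paper proposes a continuous-depth architecture in which a neural network parameterizes the derivative of the hidden state and a black-box ODE solver computes the output. It shows how to backpropagate scalably through any ODE solver for end-to-end training, and it introduces continuous normalizing flows for generative modeling.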
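To make the backpropagation idea concrete, here is a minimal sketch of the adjoint approach described in the full paper, applied to a toy scalar problem. The dynamics dh/dt = theta * h, the loss L = h(1), the fixed-step RK4 integrator, and the names `odeint` and `grad_via_adjoint` are all illustrative assumptions, not the authors' implementation. The backward pass integrates the state, the adjoint a(t) = dL/dh(t), and the parameter gradient together in reverse time, so no intermediate forward states need to be stored.

```python
import math

def rk4_step(f, t, y, dt):
    """One classical Runge-Kutta step for a list-valued state y."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + dt, [yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def odeint(f, y0, t0, t1, steps=100):
    """Integrate dy/dt = f(t, y) from t0 to t1; t1 < t0 runs backward in time."""
    dt = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, dt)
        t += dt
    return y

def grad_via_adjoint(theta, h0):
    """Return (h(1), dL/dtheta) for the toy loss L = h(1)."""
    # Forward pass: only the final state h(1) is kept.
    (h1,) = odeint(lambda t, y: [theta * y[0]], [h0], 0.0, 1.0)

    # Backward pass over the augmented state [h, a, g], from t=1 to t=0.
    # a(1) = dL/dh(1) = 1 and g(1) = 0; g(0) is the parameter gradient.
    def augmented(t, y):
        h, a, g = y
        return [theta * h,   # dh/dt = f(h, theta)
                -a * theta,  # da/dt = -a * df/dh
                -a * h]      # dg/dt = -a * df/dtheta

    _, _, g0 = odeint(augmented, [h1, 1.0, 0.0], 1.0, 0.0)
    return h1, g0

h1, grad = grad_via_adjoint(theta=0.5, h0=2.0)
exact = 2.0 * math.exp(0.5)  # analytic dL/dtheta = h0 * exp(theta)
```

For this toy problem the gradient has the closed form h0 * exp(theta), so `grad` can be checked directly against `exact`. A practical implementation would use an adaptive solver and vector-Jacobian products from an autodiff library in place of the hand-written derivatives.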
Findings
Continuous-depth models have constant memory cost during training.
The models adapt their evaluation strategy to each input and can explicitly trade numerical precision for speed.
Continuous normalizing flows can be trained by maximum likelihood without partitioning or ordering the data dimensions.
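The continuous normalizing flow finding can be illustrated with a small sketch. In the full paper, the log-density of a sample moving under dz/dt = f(z, t) changes according to d log p(z(t))/dt = -tr(df/dz). The example below assumes a hypothetical one-dimensional linear flow f(z) = rate * z with a standard normal base density, chosen only because the result can be checked in closed form; the integrator and function names are likewise illustrative.

```python
import math

def rk4(f, y, t0, t1, steps=200):
    """Fixed-step RK4 integration of dy/dt = f(t, y) for a list-valued y."""
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k1)])
        k3 = f(t + dt / 2, [a + dt / 2 * b for a, b in zip(y, k2)])
        k4 = f(t + dt, [a + dt * b for a, b in zip(y, k3)])
        y = [a + dt / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        t += dt
    return y

def standard_normal_logpdf(z):
    return -0.5 * math.log(2 * math.pi) - 0.5 * z * z

def flow_with_logp(z0, rate):
    """Push z0 through dz/dt = rate * z on [0, 1], tracking its log-density."""
    def dynamics(t, y):
        z, _logp = y
        trace = rate                # df/dz for the scalar linear dynamics
        return [rate * z, -trace]   # d log p / dt = -tr(df/dz)
    z1, logp1 = rk4(dynamics, [z0, standard_normal_logpdf(z0)], 0.0, 1.0)
    return z1, logp1

z1, logp1 = flow_with_logp(z0=0.7, rate=0.3)

# Closed form for comparison: the flow scales a standard normal by exp(rate).
sigma = math.exp(0.3)
exact_logp = -0.5 * math.log(2 * math.pi) - math.log(sigma) - z1 ** 2 / (2 * sigma ** 2)
```

Because the density is tracked by integrating a trace alongside the sample, nothing in this scheme requires splitting the dimensions into groups or fixing an order over them. In higher dimensions `trace` would be the trace of the Jacobian of a learned network.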
Abstract
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
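The core construction in the abstract can be sketched in a few lines: a small network defines the derivative of the hidden state, and a solver integrates it from the input to the output. Everything specific here is an assumption for illustration: the fixed weights `W`, the one-layer tanh dynamics `f`, the integration interval [0, 1], and the fixed-step Euler solver, which stands in for the black-box (typically adaptive) solver the abstract refers to.

```python
import math

# Hypothetical fixed weights; in a real model these would be learned.
W = [[0.5, -0.3],
     [0.2, 0.1]]

def f(t, h):
    """Derivative of the hidden state: a one-layer tanh network."""
    return [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W]

def euler_solve(f, h0, t0=0.0, t1=1.0, steps=1):
    """Integrate dh/dt = f(t, h) from t0 to t1 with fixed-step Euler."""
    dt = (t1 - t0) / steps
    t, h = t0, list(h0)
    for _ in range(steps):
        dh = f(t, h)
        h = [x + dt * d for x, d in zip(h, dh)]
        t += dt
    return h

h0 = [1.0, -1.0]  # input to the continuous-depth block

# A single Euler step of size 1 is the same as a residual update h + f(h).
residual = [x + d for x, d in zip(h0, f(0.0, h0))]
coarse = euler_solve(f, h0, steps=1)

# More steps cost more evaluations of f but track the true solution closely.
fine = euler_solve(f, h0, steps=1000)
```

The step count plays the role of the precision-versus-speed dial: `coarse` uses one evaluation of `f`, while `fine` uses a thousand. An adaptive solver would instead choose the number of evaluations per input from an error tolerance.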
Taxonomy
Topics: Model Reduction and Neural Networks · Generative Adversarial Networks and Image Synthesis · Computational Physics and Python Applications
