Continuous-in-Depth Neural Networks

Alejandro F. Queiruga; N. Benjamin Erichson; Dane Taylor; Michael W. Mahoney

arXiv:2008.02389 · cs.LG · August 7, 2020 · 26 cites

TL;DR

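ContinuousNet is a continuous-in-depth generalization of ResNet architectures built by embedding the network in higher-order numerical integration schemes such as Runge-Kutta. The same model can be evaluated with different step sizes and schemes, trained incrementally in depth, and run on a smaller computational graph for faster inference with little accuracy loss.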

Contribution

The paper proposes ContinuousNet, a continuous-in-depth neural network model that embeds the network in higher-order numerical integration schemes to gain flexibility in how it is evaluated and efficiency in training and inference.

Findings

01  ContinuousNets can be evaluated with different step sizes and integration schemes.

02  Incremental-in-depth training improves model quality and reduces training time.

03  Reducing the number of units in the computational graph allows faster inference with little accuracy loss.
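
The first finding above can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a small fixed vector field `f` (a hypothetical stand-in for a trained residual function, with made-up random weights) and integrates it over unit depth with forward Euler and classical fourth-order Runge-Kutta, using different numbers of steps. All names here (`make_vector_field`, `integrate`, and so on) are illustrative.

```python
import numpy as np


def make_vector_field(dim=4, seed=0):
    """Hypothetical stand-in for a trained residual function f(x, t)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.5, size=(dim, dim))
    b = rng.normal(scale=0.1, size=dim)

    def f(x, t):
        # Depth-independent weights for simplicity (an assumption);
        # a continuous-in-depth model would let them vary with t.
        return np.tanh(W @ x + b)

    return f


def euler_step(f, x, t, dt):
    """One forward Euler step; with dt = 1 this has the ResNet form x + f(x)."""
    return x + dt * f(x, t)


def rk4_step(f, x, t, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(x, t)
    k2 = f(x + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = f(x + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = f(x + dt * k3, t + dt)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(f, x0, n_steps, step=rk4_step, t0=0.0, t1=1.0):
    """Integrate dx/dt = f(x, t) from t0 to t1 using n_steps equal steps."""
    dt = (t1 - t0) / n_steps
    x = np.array(x0, dtype=float)
    t = t0
    for _ in range(n_steps):
        x = step(f, x, t, dt)
        t += dt
    return x


f = make_vector_field()
x0 = np.ones(4)

# A fine-grained RK4 solve serves as the reference trajectory endpoint.
reference = integrate(f, x0, 1000, rk4_step)

errors = {}
for n in (2, 4, 8):
    for name, step in (("euler", euler_step), ("rk4", rk4_step)):
        out = integrate(f, x0, n, step)
        errors[(name, n)] = float(np.linalg.norm(out - reference))
        print(f"{name:5s} steps={n}: error vs reference = {errors[(name, n)]:.2e}")
```

The same `f` is reused for every step count and scheme; only the discretization changes. In this toy setting the output should approach the reference as the step size shrinks, and the higher-order scheme should get there with fewer steps.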

Abstract

Recent work has attempted to interpret residual networks (ResNets) as one step of a forward Euler discretization of an ordinary differential equation, focusing mainly on syntactic algebraic similarities between the two systems. Discrete dynamical integrators of continuous dynamical systems, however, have a much richer structure. We first show that ResNets fail to be meaningful dynamical integrators in this richer sense. We then demonstrate that neural network models can learn to represent continuous dynamical systems, with this richer structure and properties, by embedding them into higher-order numerical integration schemes, such as the Runge-Kutta schemes. Based on these insights, we introduce ContinuousNet as a continuous-in-depth generalization of ResNet architectures. ContinuousNets exhibit an invariance to the particular computational graph manifestation. That is, the…
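
The algebraic similarity the abstract refers to can be written out as follows. This is a sketch in assumed notation (layer index $k$, residual function $f$, weights $\theta$, step size $\Delta t$); the paper's own symbols and its specific schemes are not visible in this excerpt. The explicit midpoint rule is shown only as one standard example of a higher-order Runge-Kutta scheme.

```latex
% ResNet residual update, layer k:
\begin{equation*}
  x_{k+1} = x_k + f(x_k;\, \theta_k)
\end{equation*}

% Forward Euler step for \dot{x}(t) = f(x(t);\, \theta(t)):
\begin{equation*}
  x_{k+1} = x_k + \Delta t \, f\big(x_k;\, \theta(t_k)\big)
\end{equation*}

% The two agree when \Delta t = 1 and \theta_k = \theta(t_k).

% Explicit midpoint rule (second-order Runge-Kutta):
\begin{align*}
  k_1     &= f\big(x_k;\, \theta(t_k)\big), \\
  x_{k+1} &= x_k + \Delta t \, f\big(x_k + \tfrac{\Delta t}{2}\, k_1;\; \theta(t_k + \tfrac{\Delta t}{2})\big)
\end{align*}
```

Note that the midpoint rule evaluates the weights at $t_k + \Delta t / 2$, between two layers of the discrete network, so a model embedded in such a scheme needs weights that are defined as a function of continuous depth.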

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks and Applications · Adversarial Robustness in Machine Learning

Methods: Average Pooling · Convolution · Residual Connection · Batch Normalization · 1x1 Convolution · Global Average Pooling · Kaiming Initialization · Max Pooling · Residual Block