Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

Yiping Lu; Aoxiao Zhong; Quanzheng Li; Bin Dong

arXiv:1710.10121·cs.CV·March 24, 2020·153 cites

Open Access

TL;DR

This paper connects deep neural network architectures with numerical differential equations, proposing a new multi-step design inspired by numerical methods that improves efficiency and accuracy in image classification tasks.

Contribution

The paper introduces the LM-architecture, inspired by linear multi-step methods, which enables more efficient and accurate deep networks with fewer parameters. It also links stochastic control to training strategies.

Findings

01. LM-ResNet/LM-ResNeXt outperform ResNet/ResNeXt in accuracy.

02. Networks can be compressed by over 50% while maintaining performance.

03. Stochastic depth enhances generalization of LM-ResNet.

Abstract

In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures. We can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture) which is inspired by the linear multi-step method solving ordinary differential equations. The LM-architecture is an effective structure that can be used on any ResNet-like networks. In particular, we demonstrate that LM-ResNet and LM-ResNeXt (i.e. the networks obtained by applying the LM-architecture on ResNet and ResNeXt…
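
To make the discretization reading concrete, here is a minimal sketch of how a residual update can be viewed as a forward Euler step, next to a two-step update in the spirit of a linear multi-step method. The function names, the step size `h`, the fixed scalar mixing weight `k`, and the toy right-hand side `f(x) = -x` are illustrative assumptions and are not taken from the authors' code.

```python
import numpy as np


def residual_step(x, f, h=1.0):
    """One ResNet-style update, read as a forward Euler step: x + h * f(x)."""
    return x + h * f(x)


def lm_step(x_curr, x_prev, f, k, h=1.0):
    """One two-step update in the style of a linear multi-step method.

    Mixes the two most recent states with weight k (assumed fixed here),
    then adds the residual term. With k = 0 it reduces to residual_step.
    """
    return (1.0 - k) * x_curr + k * x_prev + h * f(x_curr)


def run_residual(x0, f, n_steps, h):
    """Apply residual_step n_steps times, like stacking residual blocks."""
    x = x0
    for _ in range(n_steps):
        x = residual_step(x, f, h)
    return x


def run_lm(x0, f, n_steps, h, k):
    """Apply the two-step update for a total of n_steps updates."""
    # A two-step scheme needs two starting states; bootstrap the second
    # one with a plain residual (forward Euler) step.
    x_prev, x_curr = x0, residual_step(x0, f, h)
    for _ in range(n_steps - 1):
        x_prev, x_curr = x_curr, lm_step(x_curr, x_prev, f, k, h)
    return x_curr
```

For example, with `f = lambda x: -x`, `run_residual(1.0, f, 100, 0.01)` lands close to `exp(-1)`, the exact solution of dx/dt = -x at t = 1, and `run_lm` with `k = 0.0` reproduces the same value. In a network, `f` would be a learned block rather than a fixed function.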

Taxonomy

Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Stochastic Gradient Optimization Techniques

Methods: Reversible Residual Block · Average Pooling · ResNeXt Block · Fractal Block · Pointwise Convolution · RevNet · Dense Connections · Softmax · FractalNet · Grouped Convolution