Translating Numerical Concepts for PDEs into Neural Architectures

Tobias Alt; Pascal Peter; Joachim Weickert; Karl Schrader

arXiv:2103.15419 · math.NA · May 18, 2021

TL;DR

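This paper translates numerical algorithms for PDEs (explicit, accelerated explicit, and implicit diffusion schemes, as well as linear multigrid methods) into neural network architectures. The translation guides design choices toward stable and efficient networks and explains the success of common network components from a numerical perspective.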

Contribution

The paper connects numerical schemes for PDEs with residual networks (ResNets), recurrent networks, and U-nets, which yields stability guarantees and new design insights.

Findings

01

Specific ResNets with a transposed convolution layer structure in each block are guaranteed to be stable in the Euclidean norm.

02

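Skip connections admit three numerical interpretations: time discretisations in explicit schemes, extrapolation mechanisms that accelerate those schemes, and recurrent connections in fixed point solvers for implicit schemes.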

03

Uncommon design choices such as nonmonotone activation functions are motivated numerically.
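
The three findings can be illustrated together with a minimal NumPy sketch. It assumes the second-order special case of a 1D diffusion step, u_new = u - tau * K^T Phi(K u), where K is a forward-difference convolution, K^T its transpose, and Phi a Perona-Malik-type flux chosen here only as one example of a nonmonotone activation. The function names, the boundary handling, and the parameters tau, h, and lam are illustrative assumptions, not the authors' code. For this particular discretisation, tau <= h^2 / 2 is sufficient for the Euclidean norm not to grow, because the diffusivity is bounded by 1 and the squared spectral norm of K is at most 4 / h^2.

```python
import numpy as np


def perona_malik_flux(s, lam=1.0):
    """Nonmonotone activation Phi(s) = s / (1 + s^2 / lam^2) (illustrative choice)."""
    return s / (1.0 + (s / lam) ** 2)


def diffusion_block(u, tau, h=1.0, lam=1.0):
    """One explicit diffusion step written as a residual block (hypothetical helper).

    Computes u - tau * K^T Phi(K u): an inner convolution K, an activation,
    the transposed convolution K^T, and a skip connection that adds the input.
    """
    ku = np.diff(u) / h                    # K u: forward differences, length n - 1
    flux = perona_malik_flux(ku, lam)      # activation applied to the derivative
    # K^T flux: exact transpose of the forward difference (zero flux at both ends)
    kt_flux = np.concatenate(([-flux[0]], flux[:-1] - flux[1:], [flux[-1]])) / h
    return u - tau * kt_flux               # skip connection = explicit time step


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.standard_normal(64)
    for k in range(5):
        u_next = diffusion_block(u, tau=0.4)   # tau <= h^2 / 2 with h = 1
        print(k, np.linalg.norm(u), np.linalg.norm(u_next))
        u = u_next
```

Stacking such blocks with tau = 0.4 and h = 1 should leave the mean of the signal unchanged while its Euclidean norm never increases. Step sizes above the bound carry no such guarantee in this sketch.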

Abstract

We investigate what can be learned from translating numerical algorithms into neural networks. On the numerical side, we consider explicit, accelerated explicit, and implicit schemes for a general higher order nonlinear diffusion equation in 1D, as well as linear multigrid methods. On the neural network side, we identify corresponding concepts in terms of residual networks (ResNets), recurrent networks, and U-nets. These connections guarantee Euclidean stability of specific ResNets with a transposed convolution layer structure in each block. We present three numerical justifications for skip connections: as time discretisations in explicit schemes, as extrapolation mechanisms for accelerating those methods, and as recurrent connections in fixed point solvers for implicit schemes. Last but not least, we also motivate uncommon design choices such as nonmonotone activation functions. Our…

Taxonomy

Methods: Diffusion · Transposed convolution · Convolution