Stochastic Taylor Derivative Estimator: Efficient amortization for arbitrary differential operators
Zekun Shi, Zheyuan Hu, Min Lin, Kenji Kawaguchi

TL;DR
This paper introduces an efficient method for evaluating high-order differential operators on multivariate functions. It speeds up the training of neural networks whose loss contains such operators, particularly high-dimensional physics-informed neural networks (PINNs).
Contribution
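It shows how to perform arbitrary contractions of the derivative tensor of arbitrary order for multivariate functions by constructing suitable input tangents for univariate high-order auto-differentiation (AD). This makes it possible to randomize the evaluation of differential operators efficiently.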
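As a worked illustration of what "contraction of the derivative tensor" means in the simplest case, the second-order example below may help. It is a standard identity written in notation assumed here (a scalar function f, a point x, a direction v), not a formula quoted from the paper. Differentiating along the line x + tv contracts the Hessian with v twice, and averaging over random directions with identity covariance recovers the Laplacian.

```latex
\[
\left.\frac{d^2}{dt^2} f(x + t v)\right|_{t=0}
  = \sum_{i,j=1}^{d} \frac{\partial^2 f}{\partial x_i \,\partial x_j}(x)\, v_i v_j
  = v^\top \nabla^2 f(x)\, v ,
\]
\[
\mathbb{E}\!\left[v v^\top\right] = I
\;\Longrightarrow\;
\mathbb{E}\!\left[v^\top \nabla^2 f(x)\, v\right]
  = \operatorname{tr}\!\left(\nabla^2 f(x)\right)
  = \Delta f(x).
\]
```

The left-hand side of the first line is a univariate second derivative, so it can be obtained from a single pass of univariate high-order AD without forming the d × d Hessian.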
Findings
>1000× speed-up in PINN training over randomization with first-order AD
>30× memory reduction over randomization with first-order AD
Solved 1-million-dimensional PDEs in 8 minutes on a single GPU
Abstract
Optimizing neural networks with loss that contain high-dimensional and high-order differential operators is expensive to evaluate with back-propagation due to $\mathcal{O}(d^k)$ scaling of the derivative tensor size and the $\mathcal{O}(2^{k-1}L)$ scaling in the computation graph, where $d$ is the dimension of the domain, $L$ is the number of ops in the forward computation graph, and $k$ is the derivative order. In previous works, the polynomial scaling in $d$ was addressed by amortizing the computation over the optimization process via randomization. Separately, the exponential scaling in $k$ for univariate functions ($d=1$) was addressed with high-order auto-differentiation (AD). In this work, we show how to efficiently perform arbitrary contraction of the derivative tensor of arbitrary order for multivariate functions, by properly constructing the input tangents to univariate…
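The idea in the abstract can be sketched at second order in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The class `Jet2`, the helper `jet_sin`, and the toy function `f` are hypothetical names introduced here. A 2-jet carries the first three Taylor coefficients of a curve through the computation. Seeding it with the input tangent v yields the contraction of the Hessian with v twice in one forward pass, and averaging over random Rademacher directions gives an unbiased Laplacian estimate.

```python
import math
import random


class Jet2:
    """Truncated Taylor series c0 + c1*t + c2*t**2 along a curve x + t*v."""

    def __init__(self, c0, c1=0.0, c2=0.0):
        self.c = (float(c0), float(c1), float(c2))

    def __add__(self, other):
        o = other if isinstance(other, Jet2) else Jet2(other)
        return Jet2(*(a + b for a, b in zip(self.c, o.c)))

    __radd__ = __add__

    def __mul__(self, other):
        o = other if isinstance(other, Jet2) else Jet2(other)
        a0, a1, a2 = self.c
        b0, b1, b2 = o.c
        # Cauchy product of the two series, truncated after the t**2 term.
        return Jet2(a0 * b0, a0 * b1 + a1 * b0, a0 * b2 + a1 * b1 + a2 * b0)

    __rmul__ = __mul__


def jet_sin(u):
    """Apply sin to a 2-jet using the second-order chain rule."""
    a0, a1, a2 = u.c
    s, c = math.sin(a0), math.cos(a0)
    return Jet2(s, c * a1, c * a2 - 0.5 * s * a1 * a1)


def f(xs):
    """Toy function (an assumption): sum_i sin(x_i) * x_{(i+1) mod d}."""
    d = len(xs)
    total = Jet2(0.0)
    for i in range(d):
        total = total + jet_sin(xs[i]) * xs[(i + 1) % d]
    return total


def second_directional(x, v):
    """Return v^T H v at x from one forward pass: 2 * (t**2 coefficient)."""
    out = f([Jet2(xi, vi, 0.0) for xi, vi in zip(x, v)])
    return 2.0 * out.c[2]


def laplacian_exact(x):
    """Trace of the Hessian from the d coordinate directions (d passes)."""
    d = len(x)
    return sum(
        second_directional(x, [1.0 if j == i else 0.0 for j in range(d)])
        for i in range(d)
    )


def laplacian_stochastic(x, num_samples, rng):
    """Unbiased Laplacian estimate from random Rademacher directions."""
    total = 0.0
    for _ in range(num_samples):
        v = [rng.choice((-1.0, 1.0)) for _ in x]
        total += second_directional(x, v)
    return total / num_samples


x = [0.3, -1.2, 0.7, 2.0]
exact = laplacian_exact(x)
estimate = laplacian_stochastic(x, num_samples=20000, rng=random.Random(0))
```

For this particular toy function the Laplacian equals the negative of the function value, which gives an independent check on `laplacian_exact`. The exact version needs one pass per input dimension, whereas the stochastic version uses a number of passes chosen independently of the dimension, at the price of sampling noise.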
Taxonomy
Topics: Fluid Dynamics and Turbulent Flows
