Trust Region Method for Coupled Systems of PDE Solvers and Deep Neural Networks
Kailai Xu, Eric Darve

TL;DR
This paper introduces trust region methods for optimizing coupled systems of PDE solvers and deep neural networks, demonstrating faster convergence and higher accuracy than traditional first-order and quasi-Newton methods in physics-informed machine learning.
Contribution
The paper proposes a novel trust region optimization algorithm tailored for coupled PDE-DNN systems, addressing convergence and accuracy issues of existing methods.
Findings
Trust region methods converge faster than ADAM, BFGS, and L-BFGS.
The new approach achieves higher accuracy in coupled PDE-DNN problems.
The algorithm efficiently computes Hessians using computational graphs.
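To make the idea of a trust region iteration concrete, here is a minimal sketch in pure Python. It is not the authors' algorithm: it assumes a two-variable Rosenbrock test function with a hand-coded Hessian as a stand-in for a Hessian obtained from a computational graph, and it assumes a dogleg subproblem solver with a Cauchy-point fallback when the Hessian is not positive definite. All function names and parameter values (`trust_region`, `dogleg_step`, `eta`, the radius update factors) are illustrative choices.

```python
import math

def f(x):  # Rosenbrock function, a standard non-convex test problem (assumed stand-in)
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def grad(x):
    return [-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
            200 * (x[1] - x[0]**2)]

def hess(x):  # exact Hessian, written by hand for this toy problem
    return [[2 - 400 * (x[1] - 3 * x[0]**2), -400 * x[0]],
            [-400 * x[0], 200.0]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def norm(a):
    return math.sqrt(dot(a, a))

def matvec(H, v):
    return [H[0][0] * v[0] + H[0][1] * v[1], H[1][0] * v[0] + H[1][1] * v[1]]

def cauchy_point(g, H, radius):
    # Minimizer of the quadratic model along -g, restricted to the trust region.
    gn, gHg = norm(g), dot(g, matvec(H, g))
    tau = 1.0 if gHg <= 0 else min(1.0, gn**3 / (radius * gHg))
    s = -tau * radius / gn
    return [s * g[0], s * g[1]]

def dogleg_step(g, H, radius):
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    if H[0][0] <= 0 or det <= 0:
        return cauchy_point(g, H, radius)  # Hessian not positive definite
    # Newton step: solve H p = -g with the closed-form 2x2 inverse.
    pn = [(-g[0] * H[1][1] + g[1] * H[0][1]) / det,
          (-g[1] * H[0][0] + g[0] * H[1][0]) / det]
    if norm(pn) <= radius:
        return pn
    # Unconstrained minimizer of the model along the steepest descent direction.
    alpha = dot(g, g) / dot(g, matvec(H, g))
    pu = [-alpha * g[0], -alpha * g[1]]
    if norm(pu) >= radius:
        return cauchy_point(g, H, radius)
    # Walk from pu toward pn until the path crosses the trust region boundary.
    d = [pn[0] - pu[0], pn[1] - pu[1]]
    a, b, c = dot(d, d), 2 * dot(pu, d), dot(pu, pu) - radius**2
    t = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return [pu[0] + t * d[0], pu[1] + t * d[1]]

def trust_region(x, radius=1.0, max_radius=100.0, eta=0.1, tol=1e-8, max_iter=1000):
    for k in range(max_iter):
        g = grad(x)
        if norm(g) < tol:
            return x, k
        H = hess(x)
        p = dogleg_step(g, H, radius)
        # Compare the reduction predicted by the quadratic model with the actual one.
        predicted = -(dot(g, p) + 0.5 * dot(p, matvec(H, p)))
        x_new = [x[0] + p[0], x[1] + p[1]]
        rho = (f(x) - f(x_new)) / predicted if predicted > 0 else 0.0
        if rho < 0.25:
            radius *= 0.25                           # poor model fit: shrink
        elif rho > 0.75 and norm(p) >= 0.99 * radius:
            radius = min(2 * radius, max_radius)     # good fit at the boundary: grow
        if rho > eta:
            x = x_new                                # accept only sufficiently good steps
    return x, max_iter

x_star, iters = trust_region([-1.2, 1.0])
print("minimizer:", x_star, "iterations:", iters)
```

The key design point is the ratio `rho`: the step is accepted and the radius adjusted according to how well the quadratic model, built from the exact Hessian, predicts the true decrease. A quasi-Newton method would replace `hess` with an approximation accumulated from gradient differences.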
Abstract
Physics-informed machine learning and inverse modeling require the solution of ill-conditioned non-convex optimization problems. First-order methods, such as SGD and ADAM, and quasi-Newton methods, such as BFGS and L-BFGS, have been applied with some success to optimization problems involving deep neural networks in computational engineering inverse problems. However, empirical evidence shows that convergence and accuracy for these methods remain a challenge. Our study unveiled at least two intrinsic defects of these methods when applied to coupled systems of partial differential equations (PDEs) and deep neural networks (DNNs): (1) convergence is often slow with long plateaus that make it difficult to determine whether the method has converged or not; (2) quasi-Newton methods do not provide a sufficiently accurate approximation of the Hessian matrix; this typically leads to early…
Taxonomy
Topics: Model Reduction and Neural Networks · Numerical methods in inverse problems · Gaussian Processes and Bayesian Inference
