On the training and generalization of deep operator networks
Sanghyun Lee, Yeonjong Shin

TL;DR
This paper introduces a two-step training method for DeepONets that improves stability and generalization, supported by theoretical error estimates and numerical experiments on complex flow problems.
Contribution
A novel two-step training approach for DeepONets that enhances stability and generalization, with theoretical analysis and practical demonstrations.
Findings
Two-step training improves DeepONet stability.
The method enhances generalization performance.
Numerical tests validate effectiveness on Darcy flow.
Abstract
We present a novel training method for deep operator networks (DeepONets), one of the most popular neural network models for operators. DeepONets consist of two sub-networks, namely the branch and trunk networks. Typically, the two sub-networks are trained simultaneously, which amounts to solving a complex optimization problem in a high-dimensional space. In addition, the nonconvex and nonlinear nature of the problem makes training very challenging. To tackle this challenge, we propose a two-step training method that trains the trunk network first and then the branch network. The core mechanism, motivated by the divide-and-conquer paradigm, is the decomposition of the entire complex training task into two subtasks of reduced complexity. The Gram-Schmidt orthonormalization process is introduced therein, which significantly improves stability and generalization…
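The two-step idea in the abstract can be illustrated with a small linear-algebra sketch. This is not the authors' implementation: the abstract does not spell out the loss functions, so the sketch assumes that step 1 fits trunk outputs T together with a free coefficient matrix A so that T A matches the output data, that a QR factorization plays the role of Gram-Schmidt orthonormalization, and that step 2 fits the branch to the targets R A. The toy operator (a discretized antiderivative), the monomial trunk features standing in for a trained trunk network, the linear least-squares fit standing in for a branch network, and all names (`trunk`, `predict`, `F`, `C`, `U`, `W`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (all hypothetical): K input functions sampled at m sensors and a
# linear "antiderivative" operator discretized as a Riemann sum.
K, m, N = 200, 20, 6
y = np.linspace(0.0, 1.0, m)          # output locations (same grid as sensors)
F = rng.standard_normal((K, m))       # row k holds the samples of input u_k
C = np.tril(np.ones((m, m))) / m      # discretized operator G
U = C @ F.T                           # U[i, k] = G(u_k)(y_i)


def trunk(points):
    """Stand-in for a trained trunk network: N monomial features."""
    return np.vander(points, N, increasing=True)   # shape (len(points), N)


# Step 1 (assumed form): fit trunk outputs T and a free matrix A with T @ A ~ U.
# The trunk is fixed here, so only A is solved for, by least squares.
T = trunk(y)
A, *_ = np.linalg.lstsq(T, U, rcond=None)          # shape (N, K)

# Orthonormalize the trunk basis; reduced QR gives the same factorization
# T = Q @ R that Gram-Schmidt would, with orthonormal columns in Q.
Q, R = np.linalg.qr(T)

# Step 2 (assumed form): fit the branch so that branch(u_k) ~ (R @ A)[:, k].
# A linear map W stands in for the branch network.
targets = (R @ A).T                                # shape (K, N)
W, *_ = np.linalg.lstsq(F, targets, rcond=None)    # shape (m, N)


def predict(f):
    """Approximate G(u) on the grid y from sensor samples f of u."""
    return Q @ (W.T @ f)


rel_err = np.linalg.norm(predict(F[0]) - U[:, 0]) / np.linalg.norm(U[:, 0])
print(f"cond(T) = {np.linalg.cond(T):.1e}, cond(Q) = {np.linalg.cond(Q):.1e}")
print(f"relative error on the first training sample: {rel_err:.3f}")
```

In this sketch the second step is a plain regression onto fixed targets, so it no longer depends on the conditioning of the raw trunk basis; comparing the printed condition numbers of `T` and `Q` shows what the orthonormalization changes. The accuracy is limited by how well six monomials can represent the outputs, not by the two-step procedure itself.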
Taxonomy
Topics: Enhanced Oil Recovery Techniques · Machine Learning and ELM · Image and Signal Denoising Methods
