On the training and generalization of deep operator networks

Sanghyun Lee; Yeonjong Shin

arXiv:2309.01020 · math.NA · September 6, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a two-step training method for DeepONets that improves stability and generalization, supported by theoretical error estimates and numerical experiments on complex flow problems.

Contribution

A novel two-step training approach for DeepONets that enhances stability and generalization, with theoretical analysis and practical demonstrations.

Findings

01. Two-step training improves DeepONet stability.

02. The method enhances generalization performance.

03. Numerical tests validate effectiveness on Darcy flow.

Abstract

We present a novel training method for deep operator networks (DeepONets), one of the most popular neural network models for operators. DeepONets are constructed by two sub-networks, namely the branch and trunk networks. Typically, the two sub-networks are trained simultaneously, which amounts to solving a complex optimization problem in a high dimensional space. In addition, the nonconvex and nonlinear nature makes training very challenging. To tackle such a challenge, we propose a two-step training method that trains the trunk network first and then sequentially trains the branch network. The core mechanism is motivated by the divide-and-conquer paradigm and is the decomposition of the entire complex training task into two subtasks with reduced complexity. Therein the Gram-Schmidt orthonormalization process is introduced which significantly improves stability and generalization…
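The two-step idea described in the abstract can be illustrated with a small, hedged sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy operator (an antiderivative), the fixed Chebyshev basis standing in for a trained trunk network, the linear least-squares map standing in for the branch network, and all variable names. `np.linalg.qr` is used in place of an explicit Gram-Schmidt loop; both yield an orthonormal basis for the same column space.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander

rng = np.random.default_rng(0)

def inputs(a, x):
    # Toy input functions u(x) = a1*sin(pi x) + a2*sin(2 pi x); a has shape (K, 2).
    return (np.sin(np.pi * x)[:, None] * a[:, 0]
            + np.sin(2 * np.pi * x)[:, None] * a[:, 1])

def outputs(a, y):
    # Toy operator (assumed): G(u)(y) = integral of u from 0 to y.
    return (((1 - np.cos(np.pi * y)) / np.pi)[:, None] * a[:, 0]
            + ((1 - np.cos(2 * np.pi * y)) / (2 * np.pi))[:, None] * a[:, 1])

m, p, N, K = 50, 20, 10, 200           # output points, sensors, basis size, samples
x = np.linspace(0.0, 1.0, p)           # sensor locations for the branch input
y = np.linspace(0.0, 1.0, m)           # evaluation points for the trunk
a_train = rng.normal(size=(K, 2))
S = inputs(a_train, x)                 # (p, K) sensor values
U = outputs(a_train, y)                # (m, K) output data

# Step 1: "train" the trunk. Here the trunk is a fixed Chebyshev basis (assumption),
# so only the coefficient matrix A is fitted: minimize ||Phi @ A - U||.
Phi = chebvander(2 * y - 1, N - 1)     # (m, N) trunk basis evaluated at y
A = np.linalg.lstsq(Phi, U, rcond=None)[0]

# Orthonormalize the trunk basis: Phi = Q @ R with orthonormal columns in Q.
Q, R = np.linalg.qr(Phi)
C = R @ A                              # (N, K) targets for the branch

# Step 2: train the branch to map sensor values to the targets C.
# A linear least-squares map W stands in for the branch network (assumption).
W = np.linalg.lstsq(S.T, C.T, rcond=None)[0].T   # (N, p)

# Predict on unseen inputs: orthonormal trunk basis times branch output.
a_test = rng.normal(size=(5, 2))
pred = Q @ (W @ inputs(a_test, x))
true = outputs(a_test, y)
rel_err = np.linalg.norm(pred - true) / np.linalg.norm(true)
print(f"relative test error: {rel_err:.2e}")
```

In this sketch each step is an ordinary least-squares problem, which is the divide-and-conquer point: the branch is fitted to fixed targets `R @ A` rather than being optimized jointly with the trunk. With trainable neural networks in both roles, each step would instead be a smaller nonconvex problem.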

Taxonomy

Topics: Enhanced Oil Recovery Techniques · Machine Learning and ELM · Image and Signal Denoising Methods