Size Lowerbounds for Deep Operator Networks

Anirbit Mukherjee; Amartya Roy

arXiv:2308.06338·cs.LG·February 26, 2024·1 cites

Open Access

TL;DR

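This paper establishes a data-dependent lower bound on the size of Deep Operator Networks (DeepONets) needed to achieve low empirical error on noisy data: the common output dimension of the branch and trunk nets must grow at least as the fourth root of the number of training points. Its experiments suggest a matching requirement on how the training data must grow with that dimension.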

Contribution

It provides a first-of-its-kind data-dependent lower bound on DeepONet size, tying the required common output dimension of the branch and trunk nets to the number of training points, and supports it with experiments on the advection-diffusion-reaction PDE.

Findings

01

For low training error, the common output dimension of the branch and trunk nets must scale at least as the fourth root of the number of data points.

02

Training data size may need to scale quadratically with output dimension.

03

Empirical results support the theoretical lower bounds.
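To make the two scaling statements above concrete, the following sketch evaluates them numerically. It is only an illustration: the helper names `min_output_dim` and `data_for_output_dim` are hypothetical, and the constants `c` and `k` are placeholders set to 1.0, since this page does not state the constants hidden in the asymptotic notation.

```python
def min_output_dim(n, c=1.0):
    """Shape of the Omega(n^(1/4)) lower bound on the common output dimension.

    Hypothetical helper: c is an unknown constant (placeholder value 1.0).
    """
    if n <= 0:
        raise ValueError("n must be a positive number of data points")
    return c * n ** 0.25


def data_for_output_dim(q, k=1.0):
    """Shape of the quadratic data requirement suggested by the experiments.

    Hypothetical helper: k is an unknown constant (placeholder value 1.0).
    """
    if q <= 0:
        raise ValueError("q must be a positive output dimension")
    return k * q ** 2


# Multiplying the data size by 16 doubles the fourth-root lower bound.
ratio_q = min_output_dim(16 * 10_000) / min_output_dim(10_000)

# Doubling the output dimension quadruples the quadratic data requirement.
ratio_n = data_for_output_dim(2 * 50) / data_for_output_dim(50)

print(f"lower-bound ratio for 16x data: {ratio_q:.3f}")
print(f"data ratio for 2x output dimension: {ratio_n:.3f}")
```

Only the ratios are meaningful here, because they do not depend on the placeholder constants.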

Abstract

Deep Operator Networks are an increasingly popular paradigm for solving regression in infinite dimensions and hence solve families of PDEs in one shot. In this work, we aim to establish a first-of-its-kind data-dependent lowerbound on the size of DeepONets required for them to be able to reduce empirical error on noisy data. In particular, we show that for low training errors to be obtained on $n$ data points it is necessary that the common output dimension of the branch and the trunk net be scaling as $\Omega(\sqrt[4]{n})$. This inspires our experiments with DeepONets solving the advection-diffusion-reaction PDE, where we demonstrate the possibility that at a fixed model size, to leverage increase in this common output dimension and get monotonic lowering of training error, the size of the training data might necessarily need to scale at least…
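The "common output dimension of the branch and the trunk net" can be seen in a minimal forward pass of a DeepONet as it is commonly described: a branch net encodes the input function sampled at fixed sensor points, a trunk net encodes the query location, and the prediction is the inner product of the two encodings. The sketch below is an assumption-laden illustration, not the architecture used in the paper: the layer widths, the tanh activation, the random initialization, and the names `init_mlp`, `mlp`, and `deeponet` are all hypothetical choices.

```python
import math
import random


def init_mlp(sizes, rng):
    """Randomly initialise a fully connected net as a list of (weights, biases)."""
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(fan_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def mlp(layers, x):
    """Apply the net to a list of floats; tanh on every layer except the last."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def deeponet(branch, trunk, u_sensors, y):
    """Predict G(u)(y) as the inner product of branch and trunk encodings."""
    b = mlp(branch, u_sensors)  # R^m -> R^q
    t = mlp(trunk, y)           # R^d -> R^q
    return sum(bi * ti for bi, ti in zip(b, t))


rng = random.Random(0)
m, d, q = 20, 1, 8  # sensors, query dimension, common output dimension (assumed)
branch = init_mlp([m, 32, q], rng)
trunk = init_mlp([d, 32, q], rng)

# Input function u(x) = sin(2*pi*x) sampled at m equally spaced sensor points.
u_sensors = [math.sin(2 * math.pi * i / m) for i in range(m)]
out = deeponet(branch, trunk, u_sensors, [0.5])
print(out)
```

Here `q` is the quantity the lower bound constrains: both nets must end in the same dimension for the inner product to be defined, so it is a single shared size parameter of the model.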

Taxonomy

Topics: Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis