Optimization for Neural Operators can Benefit from Width

Pedro Cisneros-Velarde; Bhavesh Shrimali; Arindam Banerjee

arXiv:2502.00705 · cs.LG · February 4, 2025

TL;DR

This paper establishes convergence guarantees for gradient descent (GD) when training neural operators such as Deep Operator Networks (DONs) and Fourier Neural Operators (FNOs). It shows that wider networks benefit optimization, and supports the theoretical analysis with empirical experiments.

Contribution

The paper introduces a unified framework for analyzing GD convergence in neural operators and demonstrates that wider networks enhance optimization performance.

Findings

01. GD convergence is guaranteed when the loss satisfies restricted strong convexity (RSC) and smoothness conditions.
02. Wider neural operators lead to better optimization convergence.
03. Empirical results support the theoretical findings.

Abstract

Neural Operators that directly learn mappings between function spaces, such as Deep Operator Networks (DONs) and Fourier Neural Operators (FNOs), have received considerable attention. Despite the universal approximation guarantees for DONs and FNOs, there is currently no optimization convergence guarantee for learning such networks using gradient descent (GD). In this paper, we address this open problem by presenting a unified framework for optimization based on GD and applying it to establish convergence guarantees for both DONs and FNOs. In particular, we show that the losses associated with both of these neural operators satisfy two conditions -- restricted strong convexity (RSC) and smoothness -- that guarantee a decrease on their loss values due to GD. Remarkably, these two conditions are satisfied for each neural operator due to different reasons associated with the architectural…
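To make the role of the two conditions concrete, here is a minimal Python sketch of the general principle that a strong-convexity-type lower bound together with smoothness yields a geometric decrease of the loss under GD. It is not the paper's setting: it assumes a toy quadratic loss that is globally strongly convex (rather than restricted strongly convex), and the function name `gradient_descent`, the matrix `A`, the vector `b`, and the step size `1/beta` are all illustrative choices, not taken from the paper.

```python
import numpy as np

def gradient_descent(A, b, theta0, steps):
    """Run GD on the toy loss L(theta) = 0.5 * theta^T A theta - b^T theta.

    A is assumed symmetric positive definite, so its smallest eigenvalue
    (alpha) is the strong convexity constant and its largest eigenvalue
    (beta) is the smoothness constant. The step size is 1 / beta.
    """
    eigvals = np.linalg.eigvalsh(A)          # eigenvalues in ascending order
    alpha, beta = eigvals[0], eigvals[-1]

    def loss(theta):
        return 0.5 * theta @ A @ theta - b @ theta

    theta_star = np.linalg.solve(A, b)       # exact minimizer of the quadratic
    loss_star = loss(theta_star)

    theta = theta0.copy()
    gaps = [loss(theta) - loss_star]         # optimality gap at each iterate
    for _ in range(steps):
        grad = A @ theta - b                 # gradient of the quadratic loss
        theta = theta - grad / beta          # GD step with step size 1 / beta
        gaps.append(loss(theta) - loss_star)
    return np.array(gaps), alpha, beta

# Hypothetical problem instance, used only for illustration.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 5))
A = M.T @ M + 0.1 * np.eye(5)                # symmetric positive definite
b = rng.standard_normal(5)

gaps, alpha, beta = gradient_descent(A, b, np.zeros(5), steps=50)
rate = 1.0 - alpha / beta                    # per-step contraction factor
print(f"alpha={alpha:.3f}, beta={beta:.3f}, rate={rate:.3f}")
print("bound holds at every step:", bool(np.all(gaps[1:] <= rate * gaps[:-1] + 1e-9)))
```

In this toy case the optimality gap shrinks by at least the factor `1 - alpha/beta` at every step. Under this simplified reading, anything that improves the ratio between the two constants tightens the rate; how width enters the constants for DONs and FNOs is specific to the paper's analysis and is not reproduced here.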

Videos

Optimization for Neural Operators can Benefit from Width (SlidesLive)

Taxonomy

Topics: Neural Networks and Applications