Convergence analysis of wide shallow neural operators within the framework of Neural Tangent Kernel

Xianliang Xu; Ye Li; Zhongyi Huang

arXiv:2412.05545 · cs.LG · January 13, 2025

Open Access

TL;DR

This paper analyzes the training convergence of wide shallow neural operators using Neural Tangent Kernel theory, showing that over-parameterization guarantees global convergence of gradient descent.

Contribution

It provides the first convergence analysis of gradient descent for neural operators within the NTK framework, highlighting the role of over-parameterization.

Findings

01

Gradient descent converges linearly to a global minimum.

02

Over-parameterization and random initialization together keep each weight vector near its initialization throughout training.

03

The analysis applies to both continuous and discrete time settings.

Abstract

Neural operators aim to approximate operators mapping between Banach spaces of functions and have achieved much success in scientific computing. Unlike certain deep learning-based solvers, such as Physics-Informed Neural Networks (PINNs) and the Deep Ritz Method (DRM), neural operators can solve a whole class of Partial Differential Equations (PDEs). Although much work has analyzed the approximation and generalization errors of neural operators, their training error remains largely unanalyzed. In this work, we conduct a convergence analysis of gradient descent for wide shallow neural operators and physics-informed shallow neural operators within the framework of the Neural Tangent Kernel (NTK). The core idea rests on the fact that over-parameterization and random initialization together ensure that each weight vector remains near its initialization…
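The mechanism the abstract describes can be illustrated on a much simpler model than the one the paper studies. The sketch below is not a neural operator and is not the authors' construction: it trains a wide two-layer ReLU network on a handful of finite-dimensional inputs, assuming the standard NTK-style setup of Gaussian hidden weights, fixed random output signs, and a 1/sqrt(m) scaling. The function name `train_wide_shallow` and all of its parameters (width, sample count, step size, step count) are hypothetical choices made for illustration. It tracks the two quantities the abstract mentions: the training loss and how far the weight vectors move from their initialization.

```python
import numpy as np


def train_wide_shallow(m=10_000, n=5, d=5, lr=0.2, steps=500, seed=0):
    """Gradient descent on f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x).

    Illustrative assumption: only the hidden weights W are trained, the
    output signs a_r stay fixed, and the loss is 0.5 * ||f(X) - y||^2.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm inputs
    y = rng.normal(size=n)                         # arbitrary targets
    W = rng.normal(size=(m, d))                    # random initialization
    a = rng.choice([-1.0, 1.0], size=m)            # fixed output layer
    W0 = W.copy()

    losses = []
    for _ in range(steps):
        pre = X @ W.T                              # pre-activations, shape (n, m)
        residual = np.maximum(pre, 0.0) @ a / np.sqrt(m) - y
        losses.append(0.5 * residual @ residual)
        active = (pre > 0).astype(float)           # ReLU derivative
        # dL/dw_r = (1/sqrt(m)) * a_r * sum_i residual_i * 1[w_r . x_i > 0] * x_i
        grad = ((residual[:, None] * active) * a).T @ X / np.sqrt(m)
        W -= lr * grad

    # Largest per-neuron displacement, relative to the typical initial norm.
    move = np.linalg.norm(W - W0, axis=1).max()
    rel_move = move / np.linalg.norm(W0, axis=1).mean()
    return np.array(losses), rel_move


losses, rel_move = train_wide_shallow()
print("initial loss:", losses[0])
print("final loss:", losses[-1])
print("max relative weight movement:", rel_move)
```

In this toy setting the loss should fall by orders of magnitude while every hidden weight vector stays close to where it started, which is the qualitative behavior the NTK argument relies on. Reducing `m` is a simple way to see the weights drift further as the network becomes narrower.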

Taxonomy

Topics: Neural Networks and Applications