Convergence analysis of wide shallow neural operators within the framework of Neural Tangent Kernel
Xianliang Xu, Ye Li, Zhongyi Huang

TL;DR
This paper analyzes the training convergence of wide shallow neural operators using Neural Tangent Kernel theory, showing that over-parameterization guarantees global convergence of gradient descent.
Contribution
It provides the first convergence analysis of gradient descent for neural operators within the NTK framework, highlighting the role of over-parameterization.
Findings
Gradient descent converges linearly to a global minimum.
Over-parameterization and random initialization together keep each weight vector near its initialization throughout training.
The analysis applies to both continuous and discrete time settings.
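For orientation, linear convergence in NTK-style analyses is usually stated as a geometric decay of the training loss. The generic form below is a sketch under assumed notation, not the paper's exact theorem: \eta is the step size, \lambda_0 > 0 is the smallest eigenvalue of the limiting NTK Gram matrix, and c is an unspecified positive constant. The precise constants and width requirements are given in the paper itself.

```latex
% Discrete time (gradient descent with step size \eta):
L(\theta_k) \le \left(1 - c\,\eta\,\lambda_0\right)^{k} L(\theta_0), \qquad k = 0, 1, 2, \dots

% Continuous time (gradient flow):
L(\theta(t)) \le e^{-c\,\lambda_0 t}\, L(\theta(0)), \qquad t \ge 0.
```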
Abstract
Neural operators aim to approximate operators that map between Banach spaces of functions, and they have achieved considerable success in scientific computing. Compared to certain deep learning-based solvers, such as Physics-Informed Neural Networks (PINNs) and the Deep Ritz Method (DRM), neural operators can solve a class of Partial Differential Equations (PDEs). Although much work has been done to analyze the approximation and generalization errors of neural operators, their training error remains largely unanalyzed. In this work, we conduct a convergence analysis of gradient descent for wide shallow neural operators and physics-informed shallow neural operators within the framework of the Neural Tangent Kernel (NTK). The core idea lies in the fact that over-parameterization and random initialization together ensure that each weight vector remains near its initialization…
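The "weights stay near initialization" idea can be illustrated numerically. The sketch below is a toy experiment, not the paper's setting: it trains an ordinary two-layer ReLU network on random vector inputs (not a neural operator), with the usual 1/sqrt(m) NTK scaling, and compares how far hidden weights move for a narrow and a wide network. The function name `train_shallow_relu` and all sizes, seeds, and step sizes are assumptions chosen for illustration.

```python
import numpy as np

def train_shallow_relu(m, n=20, d=5, steps=200, lr=0.1, seed=0):
    """Gradient descent on a width-m two-layer ReLU net (toy NTK setup).

    Returns the list of training losses and the largest distance any
    hidden weight vector moved from its random initialization.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
    y = rng.normal(size=n)                          # arbitrary targets
    W0 = rng.normal(size=(m, d))                    # random Gaussian init
    a = rng.choice([-1.0, 1.0], size=m)             # fixed output signs
    W = W0.copy()
    losses = []
    for _ in range(steps):
        pre = X @ W.T                               # (n, m) pre-activations
        out = (np.maximum(pre, 0.0) @ a) / np.sqrt(m)
        r = out - y                                 # residuals
        losses.append(0.5 * float(np.sum(r ** 2)))
        # Gradient of the squared loss with respect to the hidden weights W.
        grad = ((r[:, None] * (pre > 0)) * a[None, :]).T @ X / np.sqrt(m)
        W -= lr * grad
    max_move = float(np.max(np.linalg.norm(W - W0, axis=1)))
    return losses, max_move

losses_narrow, move_narrow = train_shallow_relu(m=64)
losses_wide, move_wide = train_shallow_relu(m=4096)
print("narrow: final loss", losses_narrow[-1], "max weight movement", move_narrow)
print("wide:   final loss", losses_wide[-1], "max weight movement", move_wide)
```

In this toy setup the wide network's weights should move noticeably less than the narrow network's while the loss still decreases, which is the qualitative behavior that the NTK argument relies on.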
Taxonomy
Topics: Neural Networks and Applications
