Convergence Analysis of Natural Gradient Descent for Over-parameterized Physics-Informed Neural Networks

Xianliang Xu; Ting Du; Wang Kong; Bin Shan; Ye Li; Zhongyi Huang

arXiv:2408.00573 · cs.LG · June 16, 2025

TL;DR

This paper analyzes the convergence of natural gradient descent for over-parameterized physics-informed neural networks, showing it achieves faster, Gram matrix-independent convergence rates, especially with smooth activation functions.

Contribution

It presents a convergence analysis of NGD for training two-layer PINNs, showing an improved learning rate and a convergence rate that, unlike that of standard gradient descent, does not depend on the Gram matrix.

Findings

01. NGD achieves a convergence rate independent of the Gram matrix.

02. For smooth activation functions, NGD's convergence is quadratic.

03. Numerical experiments confirm theoretical convergence improvements.

Abstract

In the context of over-parameterization, there is a line of work demonstrating that randomly initialized (stochastic) gradient descent (GD) converges to a globally optimal solution at a linear convergence rate for the quadratic loss function. However, the learning rate of GD for training two-layer neural networks exhibits poor dependence on the sample size and the Gram matrix, leading to a slow training process. In this paper, we show that for training two-layer $\mathrm{ReLU}^{3}$ Physics-Informed Neural Networks (PINNs), the learning rate can be improved from $O(\lambda_{0})$ to $O(1/\|H^{\infty}\|_{2})$, implying that GD actually enjoys a faster convergence rate. Despite such improvements, the convergence rate is still tied to the least eigenvalue of the Gram matrix, leading to slow convergence. We then develop the positive definiteness of Gram matrices with general…
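The contrast the abstract draws between GD and NGD can be illustrated with a small NumPy sketch. This is not the paper's setup: it swaps the PINN loss for plain least-squares regression, freezes the first layer of a two-layer $\mathrm{ReLU}^{3}$ network and trains only the outer weights (so the Jacobian `Phi` is constant), and assumes the NGD step takes the pseudo-inverse form $a \leftarrow a - \eta\, \Phi^{\top} G^{-1} r$ with Gram matrix $G = \Phi\Phi^{\top}$. All names, sizes, and step sizes below are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 8, 5, 400  # samples, input dimension, hidden width (m >> n)

# Toy data (an assumption, not from the paper). One near-duplicate sample
# makes the Gram matrix ill-conditioned, i.e. its least eigenvalue is small.
X = rng.normal(size=(n, d))
X[1] = X[0] + 0.05 * rng.normal(size=d)
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.normal(size=n)

# Frozen random first layer; ReLU^3 features scaled by 1/sqrt(m).
# With only the outer weights a trained, f(X) = Phi @ a and Phi is the Jacobian.
W = rng.normal(size=(m, d))
Phi = np.maximum(X @ W.T, 0.0) ** 3 / np.sqrt(m)  # shape (n, m)
G = Phi @ Phi.T                                   # Gram matrix, shape (n, n)


def loss(a):
    r = Phi @ a - y
    return 0.5 * r @ r


def gd_step(a, lr):
    # Plain gradient descent: the residual evolves as r <- (I - lr * G) r,
    # so the rate is governed by the eigenvalues of G.
    return a - lr * Phi.T @ (Phi @ a - y)


def ngd_step(a, lr):
    # Pseudo-inverse natural-gradient step: the residual evolves as
    # r <- (1 - lr) r, with no dependence on the spectrum of G.
    r = Phi @ a - y
    return a - lr * Phi.T @ np.linalg.solve(G, r)


def run(step, lr, steps):
    a = np.zeros(m)
    losses = [loss(a)]
    for _ in range(steps):
        a = step(a, lr)
        losses.append(loss(a))
    return np.array(losses)


eigs = np.linalg.eigvalsh(G)                   # ascending eigenvalues of G
gd_losses = run(gd_step, 1.0 / eigs[-1], 20)   # GD with lr = 1 / ||G||_2
ngd_losses = run(ngd_step, 0.5, 20)            # NGD with a constant lr

print("condition number of G:", eigs[-1] / eigs[0])
print("final GD loss: ", gd_losses[-1])
print("final NGD loss:", ngd_losses[-1])
```

In this linearized setting the NGD loss shrinks by the fixed factor $(1-\eta)^2$ per step regardless of how small the least eigenvalue of $G$ is, whereas GD stalls along the poorly conditioned direction. The full two-layer PINN analysis in the paper has to control how the Jacobian changes during training, which this sketch deliberately sidesteps.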

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM

Methods: Sparse Evolutionary Training · Natural Gradient Descent