Convergence Analysis of Natural Gradient Descent for Over-parameterized Physics-Informed Neural Networks
Xianliang Xu, Ting Du, Wang Kong, Bin Shan, Ye Li, Zhongyi Huang

TL;DR
This paper analyzes the convergence of natural gradient descent (NGD) for over-parameterized physics-informed neural networks (PINNs), showing that NGD achieves a convergence rate independent of the Gram matrix and that this rate becomes quadratic for smooth activation functions.
Contribution
The paper provides a convergence analysis of natural gradient descent (NGD) for training two-layer PINNs, showing an improved learning rate and a convergence rate that, unlike that of standard gradient descent, does not depend on the Gram matrix.
Findings
NGD achieves a convergence rate independent of the Gram matrix.
For smooth activation functions, NGD's convergence is quadratic.
Numerical experiments confirm theoretical convergence improvements.
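To make the contrast between GD and NGD concrete, here is a minimal NumPy sketch. It is not the paper's experiment: it fits a plain regression target rather than a PDE residual, trains only the first-layer weights of a tanh network, and assumes the NGD update commonly used for over-parameterized least squares, w ← w − η Jᵀ(JJᵀ)⁻¹r, where J is the Jacobian of the network outputs and JJᵀ is the Gram matrix. All names (predict, jacobian, train) and all sizes are hypothetical choices made for illustration.

```python
import numpy as np


def predict(X, W, a):
    """Two-layer net f(x) = (1/sqrt(m)) * sum_r a_r * tanh(w_r . x)."""
    return np.tanh(X @ W.T) @ a / np.sqrt(W.shape[0])


def jacobian(X, W, a):
    """Jacobian of the n outputs w.r.t. the flattened first-layer weights, shape (n, m*d)."""
    m = W.shape[0]
    # d f(x_i) / d (w_r . x_i) = a_r * tanh'(w_r . x_i) / sqrt(m)
    dact = (1.0 - np.tanh(X @ W.T) ** 2) * a / np.sqrt(m)  # (n, m)
    # Chain rule: multiply by x_i to get the derivative w.r.t. w_r.
    return (dact[:, :, None] * X[:, None, :]).reshape(X.shape[0], -1)


def train(X, y, W0, a, lr, steps, natural):
    """Run GD or (assumed form of) NGD on the loss 0.5 * ||f(X) - y||^2."""
    W = W0.copy()
    losses = []
    for _ in range(steps):
        r = predict(X, W, a) - y              # residual vector, shape (n,)
        losses.append(0.5 * float(r @ r))     # loss before this update
        J = jacobian(X, W, a)
        if natural:
            # NGD direction: J^T (J J^T)^{-1} r, with Gram matrix G = J J^T.
            direction = J.T @ np.linalg.solve(J @ J.T, r)
        else:
            # Plain GD direction: gradient of the loss, J^T r.
            direction = J.T @ r
        W = W - lr * direction.reshape(W.shape)
    return losses


# Hypothetical toy problem: n unit-norm inputs in d dimensions, width m.
rng = np.random.default_rng(0)
n, d, m = 5, 10, 4096
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.uniform(-1.0, 1.0, size=n)
W0 = rng.standard_normal((m, d))
a = rng.choice([-1.0, 1.0], size=m)

# GD step size is tied to the spectrum of the initial Gram matrix;
# the NGD step size is a constant chosen without looking at it.
J0 = jacobian(X, W0, a)
lam_max = np.linalg.eigvalsh(J0 @ J0.T)[-1]
gd_losses = train(X, y, W0, a, lr=1.0 / lam_max, steps=30, natural=False)
ngd_losses = train(X, y, W0, a, lr=0.5, steps=30, natural=True)

print("GD  loss: first", gd_losses[0], "last", gd_losses[-1])
print("NGD loss: first", ngd_losses[0], "last", ngd_losses[-1])
```

In this sketch the GD step size has to be scaled by the largest Gram eigenvalue to stay stable, and its slowest mode contracts at a rate set by the smallest one. The NGD step rescales the residual by the inverse Gram matrix, so in the nearly linear regime every mode shrinks by roughly the same factor (1 − η) per step. The sketch uses a damped step η = 0.5 for stability, so it illustrates the Gram-independent rate rather than the quadratic rate stated for smooth activations.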
Abstract
In the context of over-parameterization, there is a line of work demonstrating that randomly initialized (stochastic) gradient descent (GD) converges to a globally optimal solution at a linear convergence rate for the quadratic loss function. However, the learning rate of GD for training two-layer neural networks exhibits poor dependence on the sample size and the Gram matrix, leading to a slow training process. In this paper, we show that for training two-layer Physics-Informed Neural Networks (PINNs), the learning rate can be improved, implying that GD actually enjoys a faster convergence rate. Despite such improvements, the convergence rate is still tied to the least eigenvalue of the Gram matrix, leading to slow convergence. We then develop the positive definiteness of Gram matrices with general…
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM
Methods: Sparse Evolutionary Training · Natural Gradient Descent
