Quantifying Training Difficulty and Accelerating Convergence in Neural Network-Based PDE Solvers

Chuqi Chen; Qixuan Zhou; Yahong Yang; Yang Xiang; Tao Luo

arXiv:2410.06308 · math.NA · October 10, 2024

TL;DR

This paper analyzes how initialization techniques affect the training difficulty and convergence speed of neural network-based PDE solvers, proposing methods to improve training efficiency through eigenvalue and effective rank analysis.

Contribution

The paper introduces a theoretical framework that links the eigenvalue distribution of the kernel, summarized by its effective rank, to training difficulty, and shows that partition of unity (PoU) and variance scaling (VS) initializations accelerate convergence in neural network-based PDE solvers.

Findings

01. A larger effective rank of the kernel correlates with faster convergence of the training error.

02. PoU and VS initializations increase the effective rank and speed up training.

03. Experimental results confirm the theoretical predictions across multiple PDE frameworks.

Abstract

Neural network-based methods have emerged as powerful tools for solving partial differential equations (PDEs) in scientific and engineering applications, particularly when handling complex domains or incorporating empirical data. These methods leverage neural networks as basis functions to approximate PDE solutions. However, training such networks can be challenging, often resulting in limited accuracy. In this paper, we investigate the training dynamics of neural network-based PDE solvers with a focus on the impact of initialization techniques. We assess training difficulty by analyzing the eigenvalue distribution of the kernel and apply the concept of effective rank to quantify this difficulty, where a larger effective rank correlates with faster convergence of the training error. Building upon this, we discover through theoretical analysis and numerical experiments that two…
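The abstract does not give a formula for effective rank, so the following is only a minimal sketch under an assumed definition: the entropy-based effective rank (the exponential of the Shannon entropy of the normalized eigenvalues), which is one common choice and may differ from the paper's. The function name `effective_rank`, the tolerance `eps`, and the toy matrices are all illustrative and not taken from the paper.

```python
import numpy as np

def effective_rank(kernel, eps=1e-12):
    """Entropy-based effective rank of a symmetric PSD matrix.

    Assumed definition (not confirmed by the source): exp(H(p)), where
    p is the eigenvalue spectrum normalized to sum to 1 and H is the
    Shannon entropy.
    """
    # eigvalsh is appropriate for symmetric (Hermitian) matrices.
    eigvals = np.linalg.eigvalsh(kernel)
    # Clip tiny negative values caused by floating-point round-off.
    eigvals = np.clip(eigvals, 0.0, None)
    total = eigvals.sum()
    if total <= eps:
        return 0.0
    p = eigvals / total
    # Drop (numerically) zero entries, using the convention 0 * log 0 = 0.
    p = p[p > eps]
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))

# Toy kernels (hypothetical, for illustration only).
flat_spectrum = np.eye(4)         # all eigenvalues equal
rank_one = np.ones((4, 4))        # a single nonzero eigenvalue

print(effective_rank(flat_spectrum))
print(effective_rank(rank_one))
```

Under this definition a flat spectrum gives an effective rank equal to the matrix size, while a spectrum dominated by one eigenvalue gives a value near 1, which matches the intuition that a rapidly decaying kernel spectrum makes training harder.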

Taxonomy

Topics: Neural Networks and Applications