Grassmannian Geometry and Global Convergence of Variable Projection for Neural Networks
Mathias Dus (IRMA)

TL;DR
This paper uses Grassmannian geometry to analyze and improve the convergence of the variable projection (VarPro) method for training neural networks, with experiments on regression and Physics-Informed Neural Networks (PINNs), including a solver for the heat equation.
Contribution
It formulates VarPro for neural networks geometrically on the Grassmannian and handles rank-deficient regimes with a regularized Grassmannian framework.
Findings
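Analysis of the critical points and convergence properties of the reduced VarPro problem on the Grassmannian.
These properties persist when the feature map is parametrized by a neural network, except in rank-deficient regimes, which are handled by regularization.
Numerical experiments on regression and PINNs demonstrate practical effectiveness.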
Abstract
Training deep neural networks and Physics-Informed Neural Networks (PINNs) often leads to ill-conditioned and stiff optimization problems. A key structural feature of these models is that they are linear in the output-layer parameters and nonlinear in the hidden-layer parameters, yielding a separable nonlinear least-squares formulation. In this work, we study the classical variable projection (VarPro) method for such problems in the context of deep neural networks. We provide a geometric formulation on the Grassmannian and analyze the structure of critical points and convergence properties of the reduced problem. When the feature map is parametrized by a neural network, we show that these properties persist except in rank-deficient regimes, which we address via a regularized Grassmannian framework. Numerical experiments for regression and PINNs, including an efficient solver for the heat…
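The separable structure described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the one-hidden-layer tanh feature map, the names `features` and `reduced_loss`, and the toy data are all assumptions made for illustration. For fixed hidden-layer parameters, the output weights are eliminated by a linear least-squares solve, leaving a reduced loss in the hidden-layer parameters alone. Because that reduced loss is the squared distance from the targets to the column space of the feature matrix, it does not change when the features are mixed by an invertible matrix, which is one way to see why the Grassmannian is a natural setting.

```python
import numpy as np

def features(w, b, x):
    """Hypothetical hidden layer: column j is tanh(w[j] * x + b[j])."""
    return np.tanh(np.outer(x, w) + b)

def reduced_loss(Phi, y):
    """Eliminate the linear output weights c by least squares.

    Returns 0.5 * ||y - Phi c*||^2 and the optimal weights c*.
    """
    c, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    r = y - Phi @ c
    return 0.5 * float(r @ r), c

# Toy regression data (assumed for illustration only).
x = np.linspace(-1.0, 1.0, 50)
y = np.sin(np.pi * x)

# Fixed hidden-layer parameters (assumed values).
w = np.array([0.5, 1.0, 2.0, 3.0, 4.0])
b = np.array([0.0, 0.5, -0.5, 1.0, -1.0])

Phi = features(w, b, x)
loss, c = reduced_loss(Phi, y)

# Mixing the features with an invertible matrix G keeps the column
# space of Phi, so the reduced loss should be unchanged.
rng = np.random.default_rng(0)
G = rng.normal(size=(5, 5)) + 5.0 * np.eye(5)
loss_mixed, _ = reduced_loss(Phi @ G, y)

print(loss, loss_mixed)
```

In a full VarPro loop, an outer optimizer would update `w` and `b` using the gradient of the reduced loss, re-solving for `c` at each step. The sketch leaves out that outer loop and any regularization for rank-deficient feature matrices.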
Taxonomy
Topics: Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques · Machine Learning in Materials Science
