Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks
Adeyemi D. Adeoye, Philipp Christian Petersen, Alberto Bemporad

TL;DR
This paper introduces a regularized Gauss-Newton method tailored for overparameterized two-layer neural networks, leveraging generalized self-concordant functions for adaptive learning rates and improved generalization, with theoretical convergence analysis and empirical validation.
Contribution
The paper develops a generalized Gauss-Newton (GGN) optimization framework with generalized self-concordant (GSC) regularization for overparameterized two-layer neural networks, enabling adaptive learning rates and enhanced generalization, supported by convergence analysis and experiments.
Findings
GSC regularization improves neural network generalization.
The proposed method converges effectively for overparameterized networks.
Adaptive learning rates require minimal tuning.
Abstract
The generalized Gauss-Newton (GGN) optimization method incorporates curvature estimates into its solution steps, and provides a good approximation to the Newton method for large-scale optimization problems. GGN has been found particularly interesting for practical training of deep neural networks, not only for its impressive convergence speed, but also for its close relation with neural tangent kernel regression, which is central to recent studies that aim to understand the optimization and generalization properties of neural networks. This work studies a GGN method for optimizing a two-layer neural network with explicit regularization. In particular, we consider a class of generalized self-concordant (GSC) functions that provide smooth approximations to commonly-used penalty terms in the objective function of the optimization problem. This approach provides an adaptive learning rate…
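To make the abstract's setup concrete, the following is a minimal sketch of one regularized Gauss-Newton step for a two-layer tanh network with a squared loss. It is an illustration under stated assumptions, not the authors' algorithm: the pseudo-Huber penalty stands in for a smooth approximation to the L1 norm, the backtracking line search stands in for the paper's adaptive learning rate (which is not reproduced here), and all function names and parameters (`init_params`, `ggn_step`, `lam`, `mu`) are hypothetical.

```python
import numpy as np

def init_params(d, m, rng):
    """Flat parameter vector: hidden weights W (m x d), then output weights v (m)."""
    W = rng.standard_normal((m, d))
    v = rng.choice([-1.0, 1.0], size=m)
    return np.concatenate([W.ravel(), v])

def forward_and_jacobian(theta, X, m):
    """Network output f(x) = v^T tanh(W x) / sqrt(m) and its Jacobian w.r.t. theta."""
    n, d = X.shape
    W = theta[:m * d].reshape(m, d)
    v = theta[m * d:]
    H = np.tanh(X @ W.T)                       # hidden activations, shape (n, m)
    out = H @ v / np.sqrt(m)                   # predictions, shape (n,)
    dH = 1.0 - H ** 2                          # tanh derivative, shape (n, m)
    # d out_i / d W[j, k] = v[j] * dH[i, j] * X[i, k] / sqrt(m)
    JW = (dH * v)[:, :, None] * X[:, None, :] / np.sqrt(m)
    Jv = H / np.sqrt(m)                        # d out_i / d v[j]
    J = np.concatenate([JW.reshape(n, m * d), Jv], axis=1)
    return out, J

def penalty(theta, mu):
    """Assumed smooth L1 surrogate sum(sqrt(mu^2 + t^2) - mu): value, gradient, Hessian diagonal."""
    s = np.sqrt(mu ** 2 + theta ** 2)
    return np.sum(s - mu), theta / s, mu ** 2 / s ** 3

def objective(theta, X, y, m, lam, mu):
    out, _ = forward_and_jacobian(theta, X, m)
    g, _, _ = penalty(theta, mu)
    return 0.5 * np.sum((out - y) ** 2) + lam * g

def ggn_step(theta, X, y, m, lam, mu, max_halvings=40):
    """One regularized Gauss-Newton step with Armijo backtracking (a stand-in step rule)."""
    out, J = forward_and_jacobian(theta, X, m)
    r = out - y
    g, grad_g, hess_g = penalty(theta, mu)
    grad = J.T @ r + lam * grad_g              # gradient of the full objective
    A = J.T @ J + lam * np.diag(hess_g)        # GGN matrix plus penalty curvature (positive definite)
    p = -np.linalg.solve(A, grad)              # descent direction
    f0 = 0.5 * r @ r + lam * g
    t = 1.0
    for _ in range(max_halvings):
        candidate = theta + t * p
        if objective(candidate, X, y, m, lam, mu) <= f0 + 1e-4 * t * (grad @ p):
            return candidate
        t *= 0.5
    return theta                               # no acceptable step found; keep the iterate
```

Because the penalty's Hessian diagonal is strictly positive, the linear system is solvable even when the number of parameters exceeds the number of samples, which is the overparameterized regime the paper targets. A step is taken with, for example, `theta = ggn_step(theta, X, y, m, lam=1e-2, mu=1.0)`.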
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Control Systems and Identification
