An Adaptive Stochastic Gradient Method with Non-negative Gauss-Newton Stepsizes
Antonio Orvieto, Lin Xiao

TL;DR
This paper introduces an adaptive stochastic gradient method with non-negative Gauss-Newton stepsizes. The method adjusts its effective stepsize automatically, comes with convergence guarantees that, in the convex case, do not require access to the gradient Lipschitz constant, and shows improved empirical performance.
Contribution
The paper presents a novel adaptive stochastic gradient algorithm based on non-negative Gauss-Newton stepsizes, with a rigorous convergence analysis for both convex and non-convex problems.
Findings
The algorithm is computationally as efficient as the vanilla stochastic gradient method.
The method automatically warms up and decays the effective stepsize.
Empirically, the method outperforms classical and other adaptive methods.
Abstract
We consider the problem of minimizing the average of a large number of smooth but possibly non-convex functions. In the context of most machine learning applications, each loss function is non-negative and thus can be expressed as the composition of a square and its real-valued square root. This reformulation allows us to apply the Gauss-Newton method, or the Levenberg-Marquardt method when adding a quadratic regularization. The resulting algorithm, while being computationally as efficient as the vanilla stochastic gradient method, is highly adaptive and can automatically warm up and decay the effective stepsize while tracking the non-negative loss landscape. We provide a tight convergence analysis, leveraging new techniques, in the stochastic convex and non-convex settings. In particular, in the convex case, the method does not require access to the gradient Lipschitz constant for…
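To make the reformulation in the abstract concrete, here is a minimal sketch, not the authors' reference implementation. Write a non-negative loss as f = r^2 with r = sqrt(f), linearize r around the current point, and minimize the squared linearization plus a quadratic penalty ||p||^2 / (2*sigma). Under these assumptions the minimizer is a gradient step p = -gamma * grad f with gamma = sigma / (1 + sigma * ||grad f||^2 / (2 f)). The regularization parameter `sigma`, the scaling of the penalty, the function names `ngn_stepsize` and `ngn_sgd`, and the least-squares test problem are all illustrative choices and may differ from the paper's exact presentation.

```python
import numpy as np


def ngn_stepsize(loss, grad, sigma):
    """Stepsize from a regularized Gauss-Newton step on r = sqrt(loss).

    Assumes loss >= 0. The returned value never exceeds sigma and shrinks
    when the squared gradient norm is large relative to the loss value.
    """
    if loss <= 0.0:
        # A smooth non-negative function has zero gradient where it
        # vanishes, so no update is needed at such a point.
        return 0.0
    return sigma / (1.0 + sigma * float(grad @ grad) / (2.0 * loss))


def ngn_sgd(X, y, sigma=1.0, epochs=20, seed=0):
    """Stochastic gradient descent with the stepsize above.

    Illustrative problem: per-sample least squares,
    f_i(w) = 0.5 * (x_i @ w - y_i) ** 2, which is non-negative.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            resid = X[i] @ w - y[i]
            loss = 0.5 * resid ** 2   # non-negative sample loss
            grad = resid * X[i]       # gradient of the sample loss
            w -= ngn_stepsize(loss, grad, sigma) * grad
    return w
```

Each iteration costs one loss value and one gradient, the same as plain SGD, and `sigma` only acts as an upper bound on the stepsize: the step used is close to `sigma` when the gradient is small relative to the loss and much smaller when it is large.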
Taxonomy
Topics: Advanced Image Processing Techniques · Advanced Numerical Analysis Techniques · Sparse and Compressive Sensing Techniques
