Asymptotic behaviour of learning rates in Armijo's condition
Tuyen Trung Truong, Tuan Hang Nguyen

TL;DR
This paper analyzes the asymptotic behavior of learning rates in Armijo's condition within Backtracking Gradient Descent, showing boundedness near non-degenerate critical points and exploring differences at degenerate points, supported by experiments.
Contribution
It provides a theoretical characterization of learning rate bounds in Backtracking GD near critical points and clarifies the units of the learning rate in this context.
Findings
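Learning rates satisfying Armijo's condition stay bounded when the iterates converge to a non-degenerate critical point, with a bound quantified by the norms of the Hessian and its inverse at the limit point.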
Behavior differs significantly at degenerate critical points.
Backtracking GD's learning rate has a meaningful physical unit.
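As a rough illustration of the first two findings, here is a toy calculation that is not taken from the paper. Armijo's condition is written as $f(x-\delta\nabla f(x)) - f(x) \le -\alpha\,\delta\,\|\nabla f(x)\|^2$ with $0<\alpha<1$. The two one-dimensional model functions and the substitution $t = 4\delta x^2$ are assumptions chosen here for convenience.

```latex
% Non-degenerate model (assumed): f(x) = (\lambda/2) x^2 with \lambda > 0, x \neq 0.
\begin{align*}
f(x-\delta\lambda x) - f(x)
  &= \tfrac{\lambda}{2}x^2\big[(1-\delta\lambda)^2 - 1\big]
   = -\delta\lambda^2 x^2 + \tfrac{1}{2}\delta^2\lambda^3 x^2, \\
-\delta\lambda^2 x^2 + \tfrac{1}{2}\delta^2\lambda^3 x^2 \le -\alpha\,\delta\,\lambda^2 x^2
  &\iff \delta \le \frac{2(1-\alpha)}{\lambda}.
\end{align*}

% Degenerate model (assumed): f(x) = x^4, f'(x) = 4x^3, x \neq 0, t = 4\delta x^2.
\begin{align*}
f(x - 4\delta x^3) - f(x) &= x^4\big[(1-t)^4 - 1\big],
  \qquad -\alpha\,\delta\,(4x^3)^2 = -4\alpha\, t\, x^4, \\
\text{Armijo} &\iff (1-t)^4 - 1 + 4\alpha t \le 0
  \iff 0 < t \le t^*(\alpha)
  \iff \delta \le \frac{t^*(\alpha)}{4x^2}.
\end{align*}
```

In the quadratic model the admissible learning rates are capped by a constant proportional to $1/\lambda$, the norm of the inverse Hessian. In the quartic model the left-hand side is convex in $t$, vanishes at $t=0$ with negative slope, and so is non-positive exactly on an interval $(0, t^*(\alpha)]$; the cap $t^*(\alpha)/(4x^2)$ on $\delta$ therefore grows without bound as $x \to 0$. This only shows that arbitrarily large learning rates become admissible near the degenerate point, not that a particular algorithm selects them.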
Abstract
Fix a constant $0 < \alpha < 1$. For a $C^1$ function $f:\mathbb{R}^k \to \mathbb{R}$, a point $x$ and a positive number $\delta > 0$, we say that Armijo's condition is satisfied if $f(x - \delta \nabla f(x)) - f(x) \le -\alpha \delta \|\nabla f(x)\|^2$. It is a basis for the well known Backtracking Gradient Descent (Backtracking GD) algorithm. Consider a sequence $\{x_n\}$ defined by $x_{n+1} = x_n - \delta_n \nabla f(x_n)$, for positive numbers $\delta_n$ for which Armijo's condition is satisfied. We show that if $\{x_n\}$ converges to a non-degenerate critical point, then $\{\delta_n\}$ must be bounded. Moreover this boundedness can be quantified in terms of the norms of the Hessian and its inverse at the limit point. This complements the first author's results on Unbounded Backtracking GD, and shows that in case of convergence to a non-degenerate critical point the…
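The iteration described in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the starting rate `delta0`, the shrink factor `beta`, and the `max_backtracks` guard are all assumptions introduced here, and only the Armijo inequality itself comes from the text.

```python
import numpy as np

def armijo_holds(f, grad_f, x, delta, alpha):
    """Check f(x - delta*grad) - f(x) <= -alpha * delta * ||grad||^2."""
    g = grad_f(x)
    return f(x - delta * g) - f(x) <= -alpha * delta * np.dot(g, g)

def backtracking_gd(f, grad_f, x0, delta0=1.0, alpha=0.5, beta=0.5,
                    n_steps=50, max_backtracks=60):
    """Plain Backtracking GD sketch (hypothetical parameter names).

    At each step, start from delta0 and multiply by beta until
    Armijo's condition holds, then take the gradient step.
    Returns the final point and the list of accepted learning rates.
    """
    x = np.asarray(x0, dtype=float)
    deltas = []
    for _ in range(n_steps):
        g = grad_f(x)
        if np.dot(g, g) == 0.0:  # exact critical point: nothing to do
            break
        delta = delta0
        for _ in range(max_backtracks):  # guard against endless shrinking
            if armijo_holds(f, grad_f, x, delta, alpha):
                break
            delta *= beta
        x = x - delta * g
        deltas.append(delta)
    return x, deltas
```

Because this sketch always starts the search at `delta0`, the accepted rates never exceed `delta0`. Observing the unbounded behaviour mentioned in the abstract would require a variant that is allowed to enlarge the learning rate, which is not shown here.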
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Model Reduction and Neural Networks
