Complexities of Armijo-like algorithms in Deep Learning context

Bensaid Bilel (IMB)

arXiv:2412.14637·math.OC·December 20, 2024

TL;DR

This paper analyzes Armijo-like algorithms in the Deep Learning setting and shows that some variants achieve acceleration and optimal complexity for highly non-convex problems, under (L0, L1) smoothness and analyticity rather than classical smoothness.

Contribution

The work extends Armijo algorithm analysis to Deep Learning contexts, establishing new complexity bounds under (L0, L1) smoothness and analyticity assumptions.
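For orientation, the (L0, L1) smoothness condition is commonly stated in the optimization literature, for a twice-differentiable objective f, as a bound on the Hessian norm that is allowed to grow with the gradient norm. The form below is that common statement, given here as an assumption about what the paper uses; its exact formulation may differ.

```latex
% (L_0, L_1)-smoothness, as commonly stated for twice-differentiable f.
% Classical L-smoothness is the special case L_1 = 0, L_0 = L.
\[
  \|\nabla^2 f(x)\| \;\le\; L_0 + L_1 \,\|\nabla f(x)\|
  \qquad \text{for all } x .
\]
```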

Findings

01

Armijo variants achieve acceleration in Deep Learning optimization.

02

New complexity bounds depend on smoothness constants and initial gap.

03

Armijo-like conditions are effective for highly non-convex problems.

Abstract

The classical Armijo backtracking algorithm achieves the same optimal complexity for smooth functions as gradient descent, but without any hyperparameter tuning. However, the smoothness assumption is not suitable for Deep Learning optimization. In this work, we show that some variants of the Armijo optimizer achieve acceleration and optimal complexities under assumptions better suited to Deep Learning: the (L0, L1) smoothness condition and analyticity. New dependences on the smoothness constants and the initial gap are established. The results theoretically highlight the efficiency of Armijo-like conditions for highly non-convex problems.
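As a point of reference, the classical Armijo backtracking rule mentioned in the abstract can be sketched as follows. This is a minimal textbook version, not one of the variants analyzed in the paper: the step size is reset to `eta0` at every iteration and shrunk by a factor `beta` until the sufficient-decrease condition f(x - eta*g) <= f(x) - c*eta*||g||^2 holds. The function names, default constants, and the quadratic test objective are all illustrative assumptions.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def armijo_step(f: Callable[[Vector], float], x: Vector, g: Vector,
                eta0: float = 1.0, c: float = 1e-4, beta: float = 0.5,
                max_backtracks: int = 60) -> float:
    """Backtrack from eta0 until f(x - eta*g) <= f(x) - c*eta*||g||^2."""
    fx = f(x)
    g2 = sum(gi * gi for gi in g)  # squared gradient norm
    eta = eta0
    for _ in range(max_backtracks):
        trial = [xi - eta * gi for xi, gi in zip(x, g)]
        if f(trial) <= fx - c * eta * g2:
            return eta
        eta *= beta  # shrink the step and try again
    return eta  # safeguard: give up after max_backtracks reductions


def armijo_gradient_descent(f: Callable[[Vector], float],
                            grad: Callable[[Vector], Vector],
                            x0: Sequence[float], tol: float = 1e-8,
                            max_iter: int = 1000) -> Vector:
    """Gradient descent whose step size is chosen by Armijo backtracking."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) <= tol * tol:  # stop when ||g|| <= tol
            break
        eta = armijo_step(f, x, g)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x


# Illustrative objective (an assumption, not from the paper):
# an ill-conditioned quadratic f(x) = 0.5 * (x0^2 + 10 * x1^2).
def quadratic(x: Vector) -> float:
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def grad_quadratic(x: Vector) -> Vector:
    return [x[0], 10.0 * x[1]]


x_min = armijo_gradient_descent(quadratic, grad_quadratic, [1.0, 1.0])
```

No learning rate is tuned by hand here: the backtracking loop adapts the step to the local curvature, which is the property the abstract refers to as working "without any hyperparameter tuning".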

Taxonomy

Topics: Advanced Statistical Methods and Models