Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning

Sanae Lotfi; Tiphaine Bonniot de Ruisselet; Dominique Orban; Andrea Lodi

arXiv:2111.14761·cs.LG·November 30, 2021

TL;DR

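This paper proposes two algorithms for continuous optimization problems in large-scale machine learning: a first-order method with adaptive sampling and adaptive step size, built on stochastic quadratic regularization, and a stochastic damped L-BFGS method for the highly nonconvex problems of deep learning. Both show promising performance on well-known deep learning datasets.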

Contribution

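The paper presents a framework for moving from deterministic or semi-deterministic to stochastic quadratic regularization methods, which leads to a first-order algorithm with adaptive sampling and adaptive step size. It also proposes a stochastic damped L-BFGS algorithm designed for the highly nonconvex setting of deep learning.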
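To make "quadratic regularization with an adaptive step size" concrete, here is a minimal sketch of the deterministic building block only. It is not the paper's stochastic algorithm and includes no adaptive sampling. The function name `quadratic_regularization`, the thresholds `eta1` and `eta2`, and the doubling and halving rule for `sigma` are assumptions chosen for illustration. Each iteration minimizes the model m(s) = f(x) + g·s + (sigma/2)·||s||², which gives s = -g/sigma, so 1/sigma plays the role of the step size.

```python
def quadratic_regularization(f, grad, x, sigma=1.0, eta1=0.1, eta2=0.75,
                             max_iter=200, tol=1e-8):
    """Deterministic sketch: steps s = -g / sigma, with sigma adapted from rho."""
    for _ in range(max_iter):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq ** 0.5 <= tol:
            break
        # Minimizer of the model m(s) = f(x) + g.s + (sigma / 2) * ||s||^2.
        s = [-gi / sigma for gi in g]
        trial = [xi + si for xi, si in zip(x, s)]
        predicted = g_norm_sq / (2.0 * sigma)  # model decrease f(x) - m(s)
        actual = f(x) - f(trial)               # true decrease
        rho = actual / predicted
        if rho >= eta1:
            x = trial                          # accept the step
            if rho >= eta2:
                # Model was accurate: allow a longer step next time.
                sigma = max(sigma / 2.0, 1e-8)
        else:
            sigma *= 2.0                       # reject and shorten the step
    return x, sigma


# Hypothetical test problem: a simple convex quadratic with minimizer (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)]


x_min, sigma_final = quadratic_regularization(f, grad, [0.0, 0.0])
```

Because a step is accepted only when the true decrease is a positive fraction of the predicted decrease, the objective never increases across iterations in this sketch. In a stochastic variant, `f` and `grad` would be replaced by sampled estimates, which is where a sampling strategy would enter.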

Findings

01. Algorithms show promising performance on deep learning datasets
02. Adaptive sampling and step size improve convergence
03. Second-order method enhances optimization in nonconvex settings

Abstract

In this paper, we consider both first- and second-order techniques to address continuous optimization problems arising in machine learning. In the first-order case, we propose a framework of transition from deterministic or semi-deterministic to stochastic quadratic regularization methods. We leverage the two-phase nature of stochastic optimization to propose a novel first-order algorithm with adaptive sampling and adaptive step size. In the second-order case, we propose a novel stochastic damped L-BFGS method that improves on previous algorithms in the highly nonconvex context of deep learning. Both algorithms are evaluated on well-known deep learning datasets and exhibit promising performance.
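For readers unfamiliar with damping in quasi-Newton methods, the sketch below shows classic Powell damping. It illustrates the general idea and is not the paper's stochastic damped L-BFGS update, whose details are not given on this page. In nonconvex problems the curvature pair (s, y) can satisfy s·y <= 0, which would break positive definiteness of the BFGS approximation. Damping replaces y with a blend r of y and Bs so that s·r stays safely positive. The helper names `dot` and `powell_damp` and the default `delta=0.2` are assumptions, and `Bs` is assumed to be the product of a positive definite approximation B with s.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def powell_damp(s, y, Bs, delta=0.2):
    """Return r = theta * y + (1 - theta) * Bs with s.r >= delta * s.Bs.

    Assumes s.Bs > 0, i.e. B is positive definite and s is nonzero.
    """
    sBs = dot(s, Bs)
    sy = dot(s, y)
    if sy >= delta * sBs:
        theta = 1.0  # curvature is already large enough: keep y unchanged
    else:
        # Pick theta so that s.r equals delta * s.Bs exactly.
        theta = (1.0 - delta) * sBs / (sBs - sy)
    return [theta * yi + (1.0 - theta) * bi for yi, bi in zip(y, Bs)]


# Hypothetical pair with negative curvature (s.y < 0), taking B = identity.
s = [1.0, 0.0]
y = [-1.0, 0.5]
r = powell_damp(s, y, Bs=s)
```

After damping, the pair (s, r) satisfies the curvature condition and can be used in place of (s, y) in a BFGS or L-BFGS update. When the original pair already has enough positive curvature, `powell_damp` returns y unchanged.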
