Improving the Adaptive Moment Estimation (ADAM) stochastic optimizer through an Implicit-Explicit (IMEX) time-stepping approach

Abhinab Bhattacharjee; Andrey A. Popov; Arash Sarshar; Adrian Sandu

arXiv:2403.13704 · cs.CE · September 17, 2024 · 1 cites

TL;DR

This paper reinterprets Adam as an IMEX Euler discretization of an underlying ODE and introduces higher-order IMEX methods to improve neural network training performance.

Contribution

It presents a novel perspective of Adam as an IMEX scheme and develops higher-order IMEX-based optimizers that outperform classical Adam.

Findings

01 Higher-order IMEX methods improve training results.

02 The new algorithms outperform classical Adam on several regression and classification problems.

03 Experiments show enhanced convergence properties.

Abstract

The Adam optimizer, often used in Machine Learning for neural network training, corresponds to an underlying ordinary differential equation (ODE) in the limit of very small learning rates. This work shows that the classical Adam algorithm is a first-order implicit-explicit (IMEX) Euler discretization of the underlying ODE. Employing the time discretization point of view, we propose new extensions of the Adam scheme obtained by using higher-order IMEX methods to solve the ODE. Based on this approach, we derive a new optimization algorithm for neural network training that performs better than classical Adam on several regression and classification problems.
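The time-stepping reading can be illustrated with the classical Adam update itself. The sketch below is a plain NumPy version of standard Adam, not the paper's derivation or its higher-order schemes, which are not reproduced on this page. The function names `adam_step` and `minimize_quadratic`, the quadratic test problem, and the hyperparameter values are all assumptions chosen for illustration. The comments mark one way to see the mixed treatment: the moments are advanced using the gradient at the old parameters, while the parameters are advanced using the already-updated moments.

```python
import numpy as np


def adam_step(theta, m, v, grad_fn, k, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One classical Adam step; k is the 1-based step index (illustrative helper)."""
    g = grad_fn(theta)                        # gradient evaluated at the OLD parameters
    m = beta1 * m + (1.0 - beta1) * g         # first-moment update (uses old theta)
    v = beta2 * v + (1.0 - beta2) * g * g     # second-moment update (uses old theta)
    m_hat = m / (1.0 - beta1 ** k)            # bias correction
    v_hat = v / (1.0 - beta2 ** k)
    # The parameter update uses the NEW moments, not the ones from the previous step.
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v


def minimize_quadratic(steps=2000, lr=1e-2):
    """Run Adam on 0.5 * ||theta - target||^2 (a toy problem assumed for this sketch)."""
    target = np.array([1.0, -2.0])

    def grad_fn(th):
        return th - target

    theta = np.zeros(2)
    m = np.zeros(2)
    v = np.zeros(2)
    for k in range(1, steps + 1):
        theta, m, v = adam_step(theta, m, v, grad_fn, k, lr=lr)
    return theta, target


if __name__ == "__main__":
    theta, target = minimize_quadratic()
    print("final parameters:", theta)
    print("distance to minimizer:", np.linalg.norm(theta - target))
```

In this reading the learning rate `lr` plays the role of the time step, which is consistent with the abstract's statement that the ODE arises in the limit of very small learning rates. The exact ODE and the IMEX splitting used by the authors should be taken from the paper.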

Code & Models

Repositories

computationalsciencelaboratory/control-pinns (Official)

Taxonomy

Topics: Simulation Techniques and Applications · Neural Networks and Applications · Model Reduction and Neural Networks

Methods: Adam