Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization
Anas Barakat, Pascal Bianchi

TL;DR
This paper analyzes the convergence and dynamical behavior of the Adam optimization algorithm in non-convex stochastic settings, introducing a continuous-time approximation and a decreasing stepsize variant, with theoretical guarantees and fluctuation analysis.
Contribution
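The paper models Adam in continuous time as a non-autonomous ODE, proves long-run convergence of the constant stepsize iterates to stationary points under a stability condition, and introduces a decreasing stepsize version that converges almost surely to critical points.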
Findings
Long-run convergence of the constant stepsize Adam iterates to a stationary point under a stability condition.
Weak convergence of the interpolated Adam process to the ODE solution.
Almost sure convergence of the decreasing stepsize Adam to critical points.
Abstract
Adam is a popular variant of stochastic gradient descent for finding a local minimizer of a function. In the constant stepsize regime, assuming that the objective function is differentiable and non-convex, we establish the convergence in the long run of the iterates to a stationary point under a stability condition. The key ingredient is the introduction of a continuous-time version of Adam, under the form of a non-autonomous ordinary differential equation. This continuous-time system is a relevant approximation of the Adam iterates, in the sense that the interpolated Adam process converges weakly towards the solution to the ODE. The existence and the uniqueness of the solution are established. We further show the convergence of the solution towards the critical points of the objective function and quantify its convergence rate under a Lojasiewicz assumption. Then, we introduce a novel…
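To make the constant stepsize regime concrete, here is a minimal sketch of the standard Adam recursion run on a toy non-convex objective with noisy gradients. The objective f(x) = x^4/4 - x^2/2, the noise level, the seed, and the names `gamma`, `alpha`, `beta`, and `eps` are illustrative assumptions, not taken from the paper; the sketch only shows the iterates settling near a critical point and does not reproduce the paper's analysis.

```python
import math
import random


def noisy_grad(x, rng, sigma=0.1):
    # Gradient of the toy objective f(x) = x^4/4 - x^2/2 (critical points
    # at -1, 0, 1) plus Gaussian noise. Objective and noise are assumptions.
    return x ** 3 - x + rng.gauss(0.0, sigma)


def adam(x0, steps=5000, gamma=1e-2, alpha=0.9, beta=0.999, eps=1e-8, seed=0):
    """Run constant stepsize Adam on the toy objective; return the last iterate."""
    rng = random.Random(seed)
    x, m, v = x0, 0.0, 0.0
    for n in range(1, steps + 1):
        g = noisy_grad(x, rng)
        m = alpha * m + (1 - alpha) * g       # running average of gradients
        v = beta * v + (1 - beta) * g * g     # running average of squared gradients
        m_hat = m / (1 - alpha ** n)          # bias correction
        v_hat = v / (1 - beta ** n)
        x -= gamma * m_hat / (eps + math.sqrt(v_hat))
    return x


x_final = adam(x0=2.0)
print(abs(x_final ** 3 - x_final))  # size of the exact gradient at the last iterate
```

Because the stepsize `gamma` stays constant, the iterate keeps fluctuating in a small neighborhood of the critical point rather than converging exactly, which is the kind of behavior a decreasing stepsize variant is meant to remove.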
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Stochastic processes and financial applications · Risk and Portfolio Optimization
Methods: Adam
