Towards Understanding Adam Convergence on Highly Degenerate Polynomials

Zhiwei Bai; Jiajie Zhao; Zhangchen Zhou; Zhi-Qin John Xu; Yaoyu Zhang

arXiv:2603.09581 · cs.LG · March 11, 2026

Open Access

TL;DR

This paper studies when Adam converges on its own, without external schedulers, on a class of highly degenerate polynomials. It derives conditions for local asymptotic stability and proves local linear convergence, in contrast to the sub-linear convergence of Gradient Descent and Momentum on the same functions.

Contribution

The paper identifies a class of highly degenerate polynomials on which Adam converges automatically without external schedulers, and derives theoretical conditions for its local asymptotic stability and convergence.

Findings

01

Adam converges automatically on certain degenerate polynomials.

02

Adam achieves local linear convergence on these functions, outperforming the sub-linear convergence of Gradient Descent and Momentum.

03

Hyperparameter analysis reveals three behavioral regimes for Adam.

Abstract

Adam is a widely used optimization algorithm in deep learning, yet the specific class of objective functions where it exhibits inherent advantages remains underexplored. Unlike prior studies requiring external schedulers and $\beta_{2}$ near 1 for convergence, this work investigates the "natural" auto-convergence properties of Adam. We identify a class of highly degenerate polynomials where Adam converges automatically without additional schedulers. Specifically, we derive theoretical conditions for local asymptotic stability on degenerate polynomials and demonstrate strong alignment between theoretical bounds and experimental results. We prove that Adam achieves local linear convergence on these degenerate functions, significantly outperforming the sub-linear convergence of Gradient Descent and Momentum. This acceleration stems from a decoupling mechanism between the second moment $v_{t}$ …
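
To make the setting concrete, here is a minimal sketch that runs plain Gradient Descent and a textbook Adam update side by side on a one-dimensional degenerate polynomial. The objective f(x) = x^4, the starting point, the learning rate, and the default values of beta1, beta2 and eps are all assumptions chosen for illustration; they are not taken from the paper. The function names `grad`, `run_gd` and `run_adam` are likewise hypothetical.

```python
import math


def grad(x, k=4):
    """Gradient of the assumed test objective f(x) = x**k.

    For k > 2 the second derivative vanishes at the minimizer x = 0,
    which is what makes the polynomial degenerate.
    """
    return k * x ** (k - 1)


def run_gd(x0, lr, steps, k=4):
    """Plain Gradient Descent; returns the full trajectory of iterates."""
    x = x0
    traj = [x]
    for _ in range(steps):
        x -= lr * grad(x, k)
        traj.append(x)
    return traj


def run_adam(x0, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8, k=4):
    """Textbook Adam with bias correction and no learning-rate scheduler.

    The hyperparameter defaults are common library defaults (an assumption),
    not values analyzed in the paper.
    """
    x, m, v = x0, 0.0, 0.0
    traj = [x]
    for t in range(1, steps + 1):
        g = grad(x, k)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)           # bias-corrected second moment
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
        traj.append(x)
    return traj


if __name__ == "__main__":
    gd = run_gd(x0=1.0, lr=0.01, steps=1000)
    adam = run_adam(x0=1.0, lr=0.01, steps=1000)
    for t in (0, 10, 100, 1000):
        print(f"step {t:5d}  GD |x| = {abs(gd[t]):.3e}  Adam |x| = {abs(adam[t]):.3e}")
```

For Gradient Descent on x^4 the update is x ← x(1 − 4·lr·x²), so the contraction factor tends to 1 as x approaches 0, which is the sub-linear behavior the abstract refers to. What the Adam trajectory does depends on the choice of beta1 and beta2, so this script is a starting point for experimenting with them rather than a reproduction of the paper's results.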

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning in Materials Science · Advanced Optimization Algorithms Research