An improvement of the convergence proof of the ADAM-Optimizer

Sebastian Bock; Josef Goppold; Martin Weiß

arXiv:1804.10587 · cs.LG · April 30, 2018 · 125 cites

TL;DR

This paper identifies errors in the original convergence proof of the ADAM-Optimizer by Kingma and Ba and gives an improved proof.

Contribution

The paper offers a corrected and improved convergence proof for the widely used ADAM-Optimizer, addressing previous inaccuracies.

Findings

01. Corrected convergence proof for ADAM-Optimizer

02. Enhanced theoretical understanding of optimizer stability

03. Supports reliable neural network training

Abstract

A common way to train neural networks is backpropagation. This algorithm includes a gradient descent method, which needs an adaptive step size. In the area of neural networks, the ADAM-Optimizer is one of the most popular adaptive step size methods. It was introduced by Kingma and Ba in \cite{Kingma.2015}. The 5865 citations in only three years further show the importance of that paper. We discovered that the convergence proof given there contains some mistakes, so that the proof is incorrect. In this paper we give an improvement to the convergence proof of the ADAM-Optimizer.
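For readers unfamiliar with the method whose proof is being discussed, the following is a minimal sketch of the standard ADAM update rule as published by Kingma and Ba, applied to a scalar parameter. It does not reproduce anything from the convergence proof itself. The function name `adam_minimize`, the quadratic test objective, the step size `alpha=0.01`, and the iteration count are assumptions chosen purely for illustration.

```python
import math


def adam_minimize(grad, theta0, steps=2000, alpha=0.01,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Run the standard ADAM update on a scalar parameter.

    grad   -- callable returning the gradient at the current parameter
    theta0 -- starting value
    The hyperparameter names follow Kingma and Ba; the default alpha here
    is an illustrative assumption, not a recommendation.
    """
    theta = float(theta0)
    m = 0.0  # exponential moving average of the gradient (first moment)
    v = 0.0  # exponential moving average of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for initialising m and v at zero.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        # The effective step size adapts to the gradient magnitude.
        theta -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Hypothetical test problem: f(x) = (x - 3)^2 with gradient 2 * (x - 3).
theta_final = adam_minimize(lambda x: 2.0 * (x - 3.0), theta0=0.0)
print(theta_final)
```

On this convex toy problem the iterate should settle close to the minimiser at 3. This only illustrates the update rule; it says nothing about the general convergence question the paper addresses.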

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Ferroelectric and Negative Capacitance Devices