On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent

Scott Pesme; Aymeric Dieuleveut; Nicolas Flammarion

arXiv:2007.00534 · cs.LG · July 2, 2020 · 5 cites

TL;DR

This paper introduces a new statistical procedure that detects when constant step-size stochastic gradient descent passes from its transient phase to its stationary phase, so that the step size can be decreased at the right moment and convergence is faster.

Contribution

It shows that the classical test of Pflug (1983) cannot give an adequate convergence diagnostic, even for quadratic objectives, and proposes a novel, simple statistical procedure that accurately detects stationarity.

Findings

01

The new test accurately detects stationarity in SGD.

02

Experimental results show state-of-the-art performance on synthetic and real-world datasets.

03

The method outperforms classical tests in convergence detection.

Abstract

Constant step-size Stochastic Gradient Descent exhibits two phases: a transient phase during which iterates make fast progress towards the optimum, followed by a stationary phase during which iterates oscillate around the optimal point. In this paper, we show that efficiently detecting this transition and appropriately decreasing the step size can lead to fast convergence rates. We analyse the classical statistical test proposed by Pflug (1983), based on the inner product between consecutive stochastic gradients. Even in the simple case where the objective function is quadratic we show that this test cannot lead to an adequate convergence diagnostic. We then propose a novel and simple statistical procedure that accurately detects stationarity and we provide experimental results showing state-of-the-art performance on synthetic and real-world datasets.
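The classical diagnostic mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual form of a Pflug-style test: accumulate the inner products between consecutive stochastic gradients and, once the running sum turns negative after a burn-in period, treat the iterates as stationary and shrink the step size. The function names, the `burn_in` and `shrink` parameters, and the noisy quadratic objective are hypothetical choices made for illustration; this is not the authors' code, and it is not the new procedure the paper proposes.

```python
import random


def pflug_sgd(grad, x0, step, n_iters, burn_in=50, shrink=0.5, seed=0):
    """Constant-step SGD with a Pflug-style restart rule (illustrative sketch).

    grad(x, rng) must return a stochastic gradient as a list of floats.
    burn_in and shrink are assumed hyperparameters, not values from the paper.
    """
    rng = random.Random(seed)
    x = list(x0)
    prev_g = None          # previous stochastic gradient
    running_sum = 0.0      # sum of <g_k, g_{k+1}> since the last restart
    since_restart = 0      # iterations since the last step-size decrease
    steps = [step]         # history of step sizes used

    for _ in range(n_iters):
        g = grad(x, rng)
        if prev_g is not None:
            running_sum += sum(a * b for a, b in zip(prev_g, g))
        x = [xi - step * gi for xi, gi in zip(x, g)]
        prev_g = g
        since_restart += 1

        # A negative running sum is read as a sign of oscillation around
        # the optimum, i.e. of the stationary phase.
        if since_restart > burn_in and running_sum < 0.0:
            step *= shrink
            steps.append(step)
            running_sum = 0.0
            since_restart = 0
            prev_g = None

    return x, steps


def noisy_quadratic_grad(x, rng, noise=1.0):
    """Gradient of 0.5 * ||x||^2 plus Gaussian noise (assumed toy objective)."""
    return [xi + rng.gauss(0.0, noise) for xi in x]


if __name__ == "__main__":
    x_final, step_history = pflug_sgd(noisy_quadratic_grad, [5.0, -3.0], 0.5, 2000)
    print("final iterate:", x_final)
    print("number of step-size decreases:", len(step_history) - 1)
```

According to the abstract, a test of this kind is not an adequate convergence diagnostic even on quadratic objectives, so the sketch is best read as a baseline for understanding what the paper improves on.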

Videos

On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Sparse and Compressive Sensing Techniques · Blind Source Separation Techniques