Predictive Local Smoothness for Stochastic Gradient Methods

Jun Li, Hongfu Liu, Bineng Zhong, Yue Wu, and Yun Fu

arXiv:1805.09386 · cs.LG · May 25, 2018 · 1 citation

TL;DR

This paper introduces Predictive Local Smoothness (PLS), a method that adaptively adjusts learning rates based on local gradient smoothness to improve convergence in stochastic gradient methods for nonconvex optimization.

Contribution

The paper proposes a novel PLS method that predicts local smoothness to adapt learning rates, enhancing convergence and performance of stochastic gradient algorithms.

Findings

01. PLS variants achieve faster convergence than traditional methods.

02. Empirical results show improved stability and reduced gradient issues.

03. Theoretical proofs confirm linear convergence of the proposed methods.

Abstract

Stochastic gradient methods are dominant in nonconvex optimization, especially for deep models, but have slow asymptotic convergence due to the fixed smoothness. To address this problem, we propose a simple yet effective method for improving stochastic gradient methods, named predictive local smoothness (PLS). First, we create a convergence condition to build a learning rate that varies adaptively with local smoothness. Second, the local smoothness can be predicted from the latest gradients. Third, we use the adaptive learning rate to update the stochastic gradients for exploring linear convergence rates. By applying the PLS method, we implement new variants of three popular algorithms: PLS-stochastic gradient descent (PLS-SGD), PLS-accelerated SGD (PLS-AccSGD), and PLS-AMSGrad. Moreover, we provide much simpler proofs to ensure their linear convergence. Empirical results show that the…
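To make the idea of a smoothness-dependent learning rate concrete, here is a minimal Python sketch. It is not the paper's PLS update: the abstract does not state the predictor's formula, so the sketch assumes a simple secant estimate of local smoothness, `L_t ≈ ||g_t - g_{t-1}|| / ||x_t - x_{t-1}||`, and sets the step size to a clipped `1 / L_t`. The function names, the clipping bounds `lr_min` and `lr_max`, and the warm-up rate `lr0` are all hypothetical choices made for illustration.

```python
import math


def local_smoothness(x_prev, x_curr, g_prev, g_curr, eps=1e-12):
    """Secant estimate ||g_t - g_{t-1}|| / ||x_t - x_{t-1}|| (an assumption,
    not the paper's predictor). Returns None if the iterates barely moved."""
    dx = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_curr, x_prev)))
    if dx < eps:
        return None
    dg = math.sqrt(sum((a - b) ** 2 for a, b in zip(g_curr, g_prev)))
    return dg / dx


def smoothness_adaptive_descent(grad, x0, steps=50, lr0=0.01,
                                lr_min=1e-4, lr_max=1.0):
    """Gradient descent whose step size tracks 1 / (estimated local smoothness).

    `grad` maps a list of floats to a gradient (or stochastic gradient) list.
    Returns the final iterate and the learning rate used at each step.
    """
    x_prev = list(x0)
    g_prev = grad(x_prev)
    # One warm-up step with a fixed rate so two gradients are available.
    x = [xi - lr0 * gi for xi, gi in zip(x_prev, g_prev)]
    lr = lr0
    lrs = []
    for _ in range(steps):
        g = grad(x)
        L = local_smoothness(x_prev, x, g_prev, g)
        if L is not None and L > 0:
            # Clip the rate so a poor estimate cannot produce a huge step.
            lr = min(lr_max, max(lr_min, 1.0 / L))
        # Otherwise keep the previous rate (hypothetical fallback).
        lrs.append(lr)
        x_prev, g_prev = x, g
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x, lrs


# Illustrative run on an anisotropic quadratic f(x) = 0.5 * (x1^2 + 10 * x2^2).
x_final, rates = smoothness_adaptive_descent(
    lambda v: [v[0], 10.0 * v[1]], [1.0, 1.0], steps=100)
```

On this quadratic the estimated smoothness lies between the two curvatures (1 and 10), so the step size moves between roughly 0.1 and 1.0 instead of staying fixed. With noisy stochastic gradients the raw estimate can fluctuate strongly, which is why the sketch clips the rate; a practical method would likely need additional smoothing.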

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM

Methods: Stochastic Gradient Descent