Predictive Local Smoothness for Stochastic Gradient Methods
Jun Li, Hongfu Liu, Bineng Zhong, Yue Wu, and Yun Fu

TL;DR
This paper introduces Predictive Local Smoothness (PLS), a method that adaptively adjusts learning rates based on local gradient smoothness to improve convergence in stochastic gradient methods for nonconvex optimization.
Contribution
The paper proposes a novel PLS method that predicts local smoothness to adapt learning rates, enhancing convergence and performance of stochastic gradient algorithms.
Findings
PLS variants achieve faster convergence than traditional methods.
Empirical results show improved stability and reduced gradient issues.
Theoretical proofs confirm linear convergence of the proposed methods.
Abstract
Stochastic gradient methods are dominant in nonconvex optimization, especially for deep models, but have slow asymptotic convergence because they rely on a fixed smoothness. To address this problem, we propose a simple yet effective method for improving stochastic gradient methods, named predictive local smoothness (PLS). First, we derive a convergence condition that yields a learning rate which varies adaptively with the local smoothness. Second, we predict the local smoothness from the latest gradients. Third, we use the adaptive learning rate in the stochastic gradient updates to pursue linear convergence rates. By applying the PLS method, we implement new variants of three popular algorithms: PLS-stochastic gradient descent (PLS-SGD), PLS-accelerated SGD (PLS-AccSGD), and PLS-AMSGrad. Moreover, we provide much simpler proofs to ensure their linear convergence. Empirical results show that the…
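The general idea of tying the learning rate to a smoothness estimate built from recent gradients can be illustrated with a minimal sketch. This is not the paper's PLS update rule, which the truncated abstract does not spell out. It assumes a generic secant estimate, ||g_t - g_{t-1}|| / ||x_t - x_{t-1}||, a step size of one over that estimate, and deterministic rather than stochastic gradients. The names `estimate_local_smoothness` and `adaptive_gd` and the parameters `lr0`, `l_min`, and `eps` are hypothetical.

```python
import math


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def estimate_local_smoothness(x, x_prev, g, g_prev, eps=1e-12):
    """Secant estimate ||g - g_prev|| / ||x - x_prev|| of the local smoothness.

    Returns None when the two iterates (nearly) coincide, because the
    ratio carries no usable information in that case.
    """
    dx = norm([a - b for a, b in zip(x, x_prev)])
    if dx < eps:
        return None
    dg = norm([a - b for a, b in zip(g, g_prev)])
    return dg / dx


def adaptive_gd(grad, x0, lr0=0.01, steps=50, l_min=1e-8):
    """Gradient descent with step size 1 / (estimated local smoothness).

    Assumption: this is a generic illustration, not the PLS algorithm.
    `grad` maps a list of floats to a list of floats.
    """
    x = list(x0)
    g = grad(x)
    lr = lr0  # used only until the first smoothness estimate exists
    history = []
    for _ in range(steps):
        x_new = [xi - lr * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        l_hat = estimate_local_smoothness(x_new, x, g_new, g)
        if l_hat is not None:
            # Floor the estimate so the step size stays finite.
            lr = 1.0 / max(l_hat, l_min)
        history.append(lr)
        x, g = x_new, g_new
    return x, history


if __name__ == "__main__":
    # Isotropic quadratic f(x) = 1.5 * ||x||^2, whose gradient is 3 * x.
    x_final, lrs = adaptive_gd(lambda x: [3.0 * xi for xi in x], [1.0, -2.0])
    print(norm(x_final), lrs[0])
```

On an isotropic quadratic the secant ratio recovers the curvature exactly, so the sketch behaves like gradient descent with the ideal step size after one iteration. With minibatch gradients the same ratio would be noisy, so a practical variant would likely need smoothing or bounds on the estimate. That is a design consideration of this sketch, not a claim about how PLS handles it.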
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM
Methods: Stochastic Gradient Descent
