Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction

Wei Jiang; Sifan Yang; Wenhao Yang; Lijun Zhang

arXiv:2406.00489 · cs.LG · December 16, 2024 · 1 citation

TL;DR

This paper introduces a variance-reduced variant of sign stochastic gradient descent (signSGD) that improves on the previously known convergence rates of signSGD, both on a single node and in distributed learning with majority vote.

Contribution

The paper proposes the Sign-based Stochastic Variance Reduction (SSVR) method, achieving faster convergence rates for signSGD and extending improvements to distributed heterogeneous settings.

Findings

01

Improved convergence rate to $O(d^{1/2} T^{-1/3})$ for signSGD.

02

Enhanced finite-sum problem convergence to $O(m^{1/4} d^{1/2} T^{-1/2})$.

03

Distributed algorithms with convergence rates of $O(d^{1/2} T^{-1/2} + d n^{-1/2})$ and $O(d^{1/4} T^{-1/4})$.
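
To make the single-node rates easier to compare, they can be turned into iteration counts. The following is a rough sketch: $\epsilon$ is a target accuracy introduced here for illustration (it does not appear in the summary above), constants hidden by the $O(\cdot)$ notation are ignored, and the baseline is the $O(d^{1/2} T^{-1/4})$ rate that the abstract attributes to existing signSGD analyses.

$$
\begin{aligned}
d^{1/2} T^{-1/4} \le \epsilon \;&\Longleftrightarrow\; T \ge d^{2}\,\epsilon^{-4} && \text{(existing signSGD rate)} \\
d^{1/2} T^{-1/3} \le \epsilon \;&\Longleftrightarrow\; T \ge d^{3/2}\,\epsilon^{-3} && \text{(SSVR)} \\
m^{1/4} d^{1/2} T^{-1/2} \le \epsilon \;&\Longleftrightarrow\; T \ge m^{1/2}\, d\,\epsilon^{-2} && \text{(SSVR, finite-sum)}
\end{aligned}
$$

Under these assumptions, the improved rate lowers the dependence on the target accuracy from $\epsilon^{-4}$ to $\epsilon^{-3}$, and on the dimension from $d^{2}$ to $d^{3/2}$.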

Abstract

Sign stochastic gradient descent (signSGD) is a communication-efficient method that transmits only the sign of stochastic gradients for parameter updating. Existing literature has demonstrated that signSGD can achieve a convergence rate of $O(d^{1/2} T^{-1/4})$, where $d$ represents the dimension and $T$ is the iteration number. In this paper, we improve this convergence rate to $O(d^{1/2} T^{-1/3})$ by introducing the Sign-based Stochastic Variance Reduction (SSVR) method, which employs variance reduction estimators to track gradients and leverages their signs to update. For finite-sum problems, our method can be further enhanced to achieve a convergence rate of $O(m^{1/4} d^{1/2} T^{-1/2})$, where $m$ denotes the number of component functions. Furthermore, we investigate the heterogeneous majority vote in distributed settings and introduce two novel…
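
The idea of tracking the gradient with a variance-reduced estimator and stepping along its sign can be illustrated with a small sketch. The abstract does not spell out the estimator, so the code below assumes a STORM-style recursive estimator; the toy objective $f(x) = \tfrac{1}{2}\|x\|^2$, the function names `stochastic_grad` and `sign_vr_descent`, and all hyperparameter values are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def stochastic_grad(x, noise):
    """Noisy gradient of the toy objective f(x) = 0.5 * ||x||^2 (assumed here)."""
    return x + noise


def sign_vr_descent(x0, steps=500, lr=0.01, beta=0.1, sigma=1.0, seed=0):
    """Sign-based descent driven by a variance-reduced gradient estimator.

    Assumption: the estimator is a STORM-style recursion,
        v_t = (1 - beta) * (v_{t-1} - g(x_{t-1}; xi_t)) + g(x_t; xi_t),
    where the same random sample xi_t is evaluated at both points.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    # Initialise the estimator with a single stochastic gradient.
    v = stochastic_grad(x, sigma * rng.standard_normal(x.shape))
    for _ in range(steps):
        x_prev = x
        # Only the sign of the estimator is used for the update.
        x = x - lr * np.sign(v)
        # Draw one sample and reuse it at the old and the new iterate.
        noise = sigma * rng.standard_normal(x.shape)
        v = (1.0 - beta) * (v - stochastic_grad(x_prev, noise)) \
            + stochastic_grad(x, noise)
    return x


if __name__ == "__main__":
    x0 = np.full(10, 2.0)
    x_final = sign_vr_descent(x0)
    print("initial norm:", np.linalg.norm(x0))
    print("final norm:  ", np.linalg.norm(x_final))
```

Because each coordinate moves by exactly `lr` per step, the quality of the update depends only on how often the estimator has the correct sign, which is why reducing the variance of `v` matters in this setting.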

Videos

Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction (SlidesLive)

Taxonomy

Topics: Evolutionary Algorithms and Applications