Zeroth-Order Algorithms for Stochastic Distributed Nonconvex Optimization

Xinlei Yi, Shengjun Zhang, Tao Yang, and Karl H. Johansson

arXiv:2106.02958·math.OC·January 11, 2022·Autom.·1 cites

Open Access

TL;DR

This paper introduces two distributed zeroth-order algorithms for stochastic nonconvex optimization, achieving the first linear speedup convergence rates under general variance assumptions, with applications demonstrated in neural network adversarial example generation.

Contribution

The paper presents the first distributed ZO algorithms with proven linear speedup convergence rates under general variance conditions, extending the applicability of ZO methods in distributed machine learning.

Findings

01

Achieved a linear speedup convergence rate of $O(\sqrt{p/(nT)})$ for smooth cost functions.

02

Established a convergence rate of $O(p/(nT))$ under the Polyak–Łojasiewicz condition.

03

Demonstrated efficiency in neural network adversarial example generation.
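
As a reading aid for the two rates above, the block below writes out the Polyak–Łojasiewicz condition in its standard form and the arithmetic behind the term "linear speedup". The symbols $\nu$, $f^{*}$, and $\epsilon$ are notation assumed here for illustration; they do not appear on this page, and the paper may state the condition with different constants.

```latex
% Polyak--Lojasiewicz condition (standard form; the constant \nu > 0 is assumed notation):
\frac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \nu\,\bigl(f(x) - f^{*}\bigr) \quad \text{for all } x,
% where f^{*} denotes the minimum value of the global cost f.

% Linear speedup: requiring \sqrt{p/(nT)} \le \epsilon and solving for T gives
T \;\ge\; \frac{p}{n\,\epsilon^{2}},
% so the number of iterations needed to push the bound below a fixed \epsilon
% shrinks in proportion to 1/n as agents are added.
```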

Abstract

In this paper, we consider a stochastic distributed nonconvex optimization problem in which the cost function is distributed over $n$ agents that have access only to zeroth-order (ZO) information of the cost. This problem has various machine learning applications. As a solution, we propose two distributed ZO algorithms, in which at each iteration each agent samples the local stochastic ZO oracle at two points with a time-varying smoothing parameter. We show that the proposed algorithms achieve the linear speedup convergence rate $O(\sqrt{p/(nT)})$ for smooth cost functions under state-dependent variance assumptions, which are more general than the commonly used bounded variance and Lipschitz assumptions, and the $O(p/(nT))$ convergence rate when the global cost function additionally satisfies the Polyak–Łojasiewicz (P–Ł) condition, where $p$ and $T$ …
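
The two-point sampling with a time-varying smoothing parameter mentioned in the abstract can be illustrated with a minimal single-agent sketch. This is not the paper's distributed algorithm: there is no communication between agents, the cost is evaluated without noise, and the function names, the schedule `delta = 1 / k`, and the fixed step size are all assumptions chosen for illustration. The estimator is a commonly used two-point random-direction form.

```python
import numpy as np


def two_point_zo_gradient(f, x, delta, rng):
    """Estimate the gradient of f at x from two function evaluations.

    Assumed form: (p / delta) * (f(x + delta * u) - f(x)) * u,
    with u drawn uniformly from the unit sphere in R^p.
    """
    p = x.size
    u = rng.standard_normal(p)
    u /= np.linalg.norm(u)  # normalized Gaussian = uniform direction on the sphere
    return (p / delta) * (f(x + delta * u) - f(x)) * u


def zo_descent(f, x0, steps, step_size, rng):
    """Plain descent driven only by function values (single agent, hypothetical)."""
    x = np.asarray(x0, dtype=float).copy()
    for k in range(1, steps + 1):
        delta = 1.0 / k  # time-varying smoothing parameter (assumed schedule)
        x -= step_size * two_point_zo_gradient(f, x, delta, rng)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    quadratic = lambda z: 0.5 * float(z @ z)  # toy smooth cost with minimizer at 0
    x_final = zo_descent(quadratic, np.ones(5), steps=2000, step_size=0.05, rng=rng)
    print(np.linalg.norm(x_final))
```

Shrinking `delta` over iterations reduces the error introduced by smoothing as the iterate approaches a stationary point, which is the usual motivation for a time-varying smoothing parameter. In the stochastic setting described in the abstract, `f` would also take a random sample, used at both evaluation points.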

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM