Variance-Reduced Gradient Estimator for Nonconvex Zeroth-Order Distributed Optimization
Huaiyi Mu, Yujie Tang, Jie Song, Zhongkui Li

TL;DR
This paper introduces a variance-reduced gradient estimator for distributed zeroth-order nonconvex optimization, improving convergence and reducing sampling costs.
Contribution
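It proposes a novel variance-reduced gradient estimator that, according to a Bernoulli draw, either updates the estimate along a single randomly chosen orthogonal direction or re-estimates the gradient across all dimensions, and integrates this estimator with gradient tracking.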
Findings
Oracle complexity is bounded by O(d/ε) for smooth nonconvex functions.
Oracle complexity is bounded by O(dκ ln(1/ε)) for gradient dominated nonconvex functions.
Numerical simulations demonstrate improved efficiency over existing methods.
Abstract
This paper investigates distributed zeroth-order optimization for smooth nonconvex problems, targeting the trade-off between convergence rate and sampling cost per zeroth-order gradient estimation in current algorithms that use either the 2-point or 2d-point gradient estimators. We propose a novel variance-reduced gradient estimator that either randomly renovates a single orthogonal direction of the true gradient or calculates the gradient estimation across all dimensions for variance correction, based on a Bernoulli distribution. Integrating this estimator with a gradient tracking mechanism allows us to address the trade-off. We show that the oracle complexity of our proposed algorithm is upper bounded by O(d/ε) for smooth nonconvex functions and by O(dκ ln(1/ε)) for smooth and gradient dominated nonconvex functions, where d denotes the problem dimension and…
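To make the Bernoulli switch described in the abstract concrete, here is a minimal single-agent sketch in plain Python. It is one plausible reading of the idea, not the authors' estimator: the names `coord_diff`, `full_estimate`, and `bernoulli_estimate`, the central-difference formula, the smoothing radius `u`, and the probability `p` are all assumptions for illustration, and the distributed gradient-tracking step is omitted.

```python
import random


def coord_diff(f, x, i, u):
    """Central finite difference of f along coordinate i (radius u)."""
    xp, xm = list(x), list(x)
    xp[i] += u
    xm[i] -= u
    return (f(xp) - f(xm)) / (2.0 * u)


def full_estimate(f, x, u):
    """2d-point style estimate: two function values per coordinate."""
    return [coord_diff(f, x, i, u) for i in range(len(x))]


def bernoulli_estimate(f, x, g_prev, p, u, rng):
    """Assumed switch rule (illustrative, not taken from the paper).

    With probability p, re-estimate every coordinate (2d evaluations).
    Otherwise, refresh one uniformly chosen coordinate of the previous
    estimate (2 evaluations) and keep the rest unchanged.
    Returns (estimate, number_of_function_evaluations).
    """
    d = len(x)
    if rng.random() < p:
        return full_estimate(f, x, u), 2 * d
    i = rng.randrange(d)
    g = list(g_prev)
    g[i] = coord_diff(f, x, i, u)
    return g, 2


if __name__ == "__main__":
    # Toy objective: f(x) = sum of squares, whose gradient is 2x.
    f = lambda x: sum(v * v for v in x)
    rng = random.Random(0)
    x = [1.0, -2.0, 3.0]
    g = full_estimate(f, x, 1e-3)
    total_evals = 2 * len(x)
    for _ in range(50):
        x = [xi - 0.1 * gi for xi, gi in zip(x, g)]  # plain descent step
        g, n = bernoulli_estimate(f, x, g, p=0.2, u=1e-3, rng=rng)
        total_evals += n
    print("final f(x):", f(x), "function evaluations:", total_evals)
```

With a small `p`, most iterations cost 2 function evaluations rather than 2d, which is the sampling-cost side of the trade-off the abstract describes; the occasional full refresh limits how stale the stored estimate can become.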
