Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence
Matin Ansaripour, Shayan Talaei, Giorgi Nadiradze, Dan Alistarh

TL;DR
This paper introduces a hybrid decentralized optimization framework combining first- and zeroth-order methods, demonstrating improved convergence and robustness in distributed machine learning, especially with resource-constrained nodes.
Contribution
The paper initiates the study of hybrid decentralized optimization, in which first-order and zeroth-order agents cooperate on the same task, and provides new analysis and practical algorithms for such systems.
Findings
Hybrid systems can tolerate noisier zeroth-order agents.
Integrating zeroth-order agents can improve convergence.
Experimental results confirm practicality on neural networks.
Abstract
Distributed optimization is the standard way of speeding up machine learning training, and most of the research in the area focuses on distributed first-order, gradient-based methods. Yet, there are settings where some computationally-bounded nodes may not be able to implement first-order, gradient-based optimization, while they could still contribute to joint optimization tasks. In this paper, we initiate the study of hybrid decentralized optimization, studying settings where nodes with zeroth-order and first-order optimization capabilities co-exist in a distributed system, and attempt to jointly solve an optimization task over some data distribution. We essentially show that, under reasonable parameter settings, such a system can not only withstand noisier zeroth-order agents but can even benefit from integrating such agents into the optimization process, rather than ignoring their…
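To make the hybrid setting concrete, here is a minimal Python sketch, not the paper's algorithm. It assumes a toy quadratic objective, a standard two-point Gaussian-smoothing estimator for the zeroth-order nodes, and random pairwise model averaging as the communication step. All names and defaults (run_hybrid, num_first, num_zeroth, lr, mu) are hypothetical and chosen only for illustration.

```python
import numpy as np


def loss(x, target):
    # Toy objective (an assumption): f(x) = 0.5 * ||x - target||^2.
    return 0.5 * float(np.sum((x - target) ** 2))


def first_order_grad(x, target):
    # First-order node: has access to the exact gradient of the toy objective.
    return x - target


def zeroth_order_grad(x, target, rng, mu=1e-3):
    # Zeroth-order node: only evaluates the loss. Two-point estimate along a
    # random Gaussian direction u; it is noisier than the true gradient.
    u = rng.standard_normal(x.shape)
    delta = loss(x + mu * u, target) - loss(x - mu * u, target)
    return (delta / (2.0 * mu)) * u


def run_hybrid(num_first=2, num_zeroth=6, dim=10, steps=4000, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    target = np.ones(dim)
    n = num_first + num_zeroth
    # Every node keeps its own local copy of the model.
    models = [np.zeros(dim) for _ in range(n)]

    for _ in range(steps):
        # Assumed communication pattern: two distinct random nodes interact.
        i, j = rng.choice(n, size=2, replace=False)
        for k in (i, j):
            if k < num_first:
                g = first_order_grad(models[k], target)
            else:
                g = zeroth_order_grad(models[k], target, rng)
            models[k] = models[k] - lr * g
        # The two nodes then average their local models.
        avg = 0.5 * (models[i] + models[j])
        models[i] = avg.copy()
        models[j] = avg.copy()

    mean_model = np.mean(models, axis=0)
    return loss(np.zeros(dim), target), loss(mean_model, target)


if __name__ == "__main__":
    initial, final = run_hybrid()
    print(f"initial loss: {initial:.4f}, final loss of averaged model: {final:.3e}")
```

In this toy setup, setting num_zeroth=0 or num_first=0 gives purely first-order or purely zeroth-order populations to compare against. The sketch says nothing about the paper's actual convergence rates, which depend on assumptions not reproduced here.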
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Distributed Control Multi-Agent Systems · Molecular Communication and Nanonetworks
