Distributed Gradient Methods with Variable Number of Working Nodes
Dusan Jakovetic, Dragana Bajovic, Natasa Krejic, and Natasa Krklec-Jerinkic

TL;DR
This paper introduces a distributed gradient method in which each node participates in an update only with some probability, reducing communication and computation costs while maintaining convergence to the optimal solution, provided the activation probability grows to one asymptotically.
Contribution
It proposes a novel probabilistic activation scheme for distributed gradient methods that ensures convergence and reduces computational and communication costs.
Findings
Converges in the mean square sense when the activation probability approaches one asymptotically.
Achieves a linear convergence rate when the activation probability grows to one linearly.
Significantly reduces communication and gradient evaluations compared to standard methods.
Abstract
We consider distributed optimization where nodes in a connected network minimize the sum of their local costs subject to a common constraint set. We propose a distributed projected gradient method where each node, at each iteration $k$, performs an update (is active) with probability $p_k$, and stays idle (is inactive) with probability $1-p_k$. Whenever active, each node performs an update by weight-averaging its solution estimate with the estimates of its active neighbors, taking a negative gradient step with respect to its local cost, and performing a projection onto the constraint set; inactive nodes perform no updates. Assuming that nodes' local costs are strongly convex, with Lipschitz continuous gradients, we show that, as long as activation probability grows to one asymptotically, our algorithm converges in the mean square sense (MSS) to the same solution as the…
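The update described in the abstract (average with active neighbors, take a gradient step, project) can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the scalar quadratic local costs, the ring network, the uniform averaging weights, the constant step size `alpha`, and the activation schedule `p_k = 1 - sigma**(k + 1)` are all hypothetical choices made only for illustration.

```python
import random


def project(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def run(b, neighbors, lo, hi, alpha=0.1, sigma=0.8, iters=300, seed=0):
    """Simulate a projected gradient method with randomly active nodes.

    Assumptions (not from the paper): node i has the local cost
    f_i(x) = 0.5 * (x - b[i])**2, the constraint set is [lo, hi],
    averaging weights are uniform, and the step size is constant.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [project(0.0, lo, hi)] * n  # common feasible starting point
    for k in range(iters):
        # Assumed schedule: activation probability grows to one geometrically.
        p_k = 1.0 - sigma ** (k + 1)
        active = [rng.random() < p_k for _ in range(n)]
        new_x = list(x)  # inactive nodes keep their current estimates
        for i in range(n):
            if not active[i]:
                continue
            # Average own estimate with those of the currently active neighbors.
            group = [i] + [j for j in neighbors[i] if active[j]]
            avg = sum(x[j] for j in group) / len(group)
            grad = x[i] - b[i]  # gradient of the assumed quadratic local cost
            new_x[i] = project(avg - alpha * grad, lo, hi)
        x = new_x
    return x


# Hypothetical example: five nodes on a ring, constraint set [0, 10].
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
estimates = run(b=[1.0, 2.0, 3.0, 4.0, 5.0], neighbors=ring, lo=0.0, hi=10.0)
```

With a constant step size, the node estimates in this sketch settle near, rather than exactly at, the minimizer of the summed costs (here the mean of `b`). Once every node is active at every iteration, the iteration coincides with an all-active distributed projected gradient method using the same weights and step size.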
