Straggler-Robust Distributed Optimization with the Parameter Server Utilizing Coded Gradient

Elie Atallah; Nazanin Rahnavard; Chinwendu Enyioha

arXiv:2007.13688·math.OC·July 28, 2020

Open Access

TL;DR

This paper introduces a coded gradient-based distributed optimization algorithm that is robust to network delays, node failures, and asynchrony, improving efficiency in complex distributed machine learning systems.

Contribution

It proposes a novel coded gradient method for distributed optimization that adapts to network failures and delays, enhancing robustness and convergence.

Findings

01. Algorithm demonstrates robustness to stragglers and node failures.

02. Simulation results show improved convergence and efficiency.

03. Framework effectively adapts to dynamic network topologies.

Abstract

Optimization in distributed networks plays a central role in almost all distributed machine learning problems. In principle, the use of distributed task allocation has reduced the computational time, allowing better response rates and higher data reliability. However, for these computational algorithms to run effectively in complex distributed systems, the algorithms ought to compensate for communication asynchrony, and network node failures and delays known as stragglers. These issues can change the effective connection topology of the network, which may vary through time, thus hindering the optimization process. In this paper, we propose a new distributed unconstrained optimization algorithm for minimizing a strongly convex function which is adaptable to a parameter server network. In particular, the network worker nodes solve their local optimization problems, allowing the…
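The abstract is cut off before it describes the coding scheme, so the paper's actual construction is not reproduced here. As a rough illustration of the general idea of coded gradients on a parameter server, the following minimal sketch uses a fractional-repetition assignment, a scheme known from the gradient-coding literature and chosen here as a stand-in. The objective (a scalar least-squares sum, which is strongly convex), the function names, and all parameter values are assumptions made for this example only.

```python
import random


def partition_gradient(x, part):
    """Gradient at x of the sum over (a, b) in part of 0.5 * (a*x - b)**2."""
    return sum(a * (a * x - b) for a, b in part)


def worker_message(x, partitions, group, s):
    """Coded message of a worker in `group`: the summed gradient over the
    s + 1 data partitions replicated on every worker of that group."""
    start = group * (s + 1)
    return sum(partition_gradient(x, partitions[k])
               for k in range(start, start + s + 1))


def server_decode(messages, n, s):
    """Recover the full gradient from the workers that responded.

    `messages` maps worker id -> coded message. Each group has s + 1 workers,
    so with at most s stragglers every group has at least one survivor.
    """
    total = 0.0
    for group in range(n // (s + 1)):
        members = range(group * (s + 1), (group + 1) * (s + 1))
        alive = [w for w in members if w in messages]
        if not alive:
            raise RuntimeError("a whole group straggled; cannot decode")
        total += messages[alive[0]]  # one message per group is enough
    return total


def run(n=6, s=1, steps=500, lr=0.01, seed=0):
    """Gradient descent where s random workers straggle in every iteration.

    Assumes n is divisible by s + 1 (required by this toy assignment).
    """
    rng = random.Random(seed)
    # Hypothetical data: n partitions, each holding three (a, b) samples.
    partitions = [[(rng.uniform(0.5, 1.5), rng.uniform(-1.0, 1.0))
                   for _ in range(3)] for _ in range(n)]
    x = 0.0
    for _ in range(steps):
        stragglers = set(rng.sample(range(n), s))
        messages = {w: worker_message(x, partitions, w // (s + 1), s)
                    for w in range(n) if w not in stragglers}
        x -= lr * server_decode(messages, n, s)
    return x, partitions
```

Calling `run()` returns the final iterate together with the generated data, so the result can be compared against the closed-form least-squares minimizer `sum(a*b) / sum(a*a)`. The redundancy costs each worker `s + 1` times the computation of an uncoded scheme, which is the usual trade-off for tolerating `s` stragglers per round in this kind of assignment.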

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Distributed Control Multi-Agent Systems