Straggler-Robust Distributed Optimization in Parameter-Server Networks

Elie Atallah; Nazanin Rahnavard; Chinwendu Enyioha

arXiv:2108.09173 · math.OC · August 23, 2021 · 1 citation

Open Access

TL;DR

This paper introduces a robust distributed optimization algorithm designed for parameter-server networks that effectively handles network failures, delays, and stragglers through coding techniques, improving convergence and reliability.

Contribution

The paper proposes a novel distributed convex optimization algorithm that is resilient to network failures and delays by integrating coding strategies within a parameter-server framework.

Findings

1. Algorithm demonstrates robustness to node failures and delays.

2. Coding techniques improve convergence in dynamic network topologies.

3. Experimental results show enhanced performance over existing methods.

Abstract

Optimization in distributed networks plays a central role in almost all distributed machine learning problems. In principle, distributed task allocation reduces computation time, allowing better response rates and higher data reliability. However, for these algorithms to run effectively in complex distributed systems, they must compensate for communication asynchrony, network node failures, and delayed nodes known as stragglers. These issues can change the effective connection topology of the network, which may vary over time, thus hindering the optimization process. In this paper, we propose a new distributed unconstrained optimization algorithm for minimizing a convex function which is adaptable to a parameter server network. In particular, the network worker nodes solve their local optimization problems, allowing the computation of their local…
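The abstract is truncated here, so the paper's own coding scheme is not visible on this page. As a rough illustration of how coding can make a parameter-server gradient step tolerate stragglers, the sketch below uses a generic fractional-repetition layout, which is an assumption for illustration and not the authors' algorithm. Workers are split into `num_stragglers + 1` replica groups, each group covers the whole dataset once, and the server decodes the full gradient from any group whose workers all responded. All names (`partial_gradient`, `make_assignment`, `decode`, `run`), the scalar least-squares model, and the toy data are hypothetical.

```python
import random


def partial_gradient(w, block):
    """Gradient of sum over (x, y) in block of 0.5 * (w*x - y)**2 (scalar model)."""
    return sum((w * x - y) * x for x, y in block)


def make_assignment(num_workers, num_stragglers):
    """Return (group, block) per worker; each group covers every block once."""
    groups = num_stragglers + 1
    if num_workers % groups != 0:
        raise ValueError("num_workers must be divisible by num_stragglers + 1")
    per_group = num_workers // groups
    return [(k // per_group, k % per_group) for k in range(num_workers)]


def decode(messages, assignment):
    """Sum the messages of the first replica group that responded in full."""
    groups = {}
    for worker, (group, _block) in enumerate(assignment):
        groups.setdefault(group, []).append(worker)
    for members in groups.values():
        if all(k in messages for k in members):
            return sum(messages[k] for k in members)
    raise RuntimeError("every replica group lost a worker; cannot decode")


def run(steps=50, lr=0.05, num_workers=4, num_stragglers=1, seed=0):
    """Gradient descent on toy data y = 3x with random stragglers each step."""
    rng = random.Random(seed)
    xs = [0.5, 1.0, 1.5, 2.0, -1.0, -0.5, 2.5, -2.0]
    data = [(x, 3.0 * x) for x in xs]
    assignment = make_assignment(num_workers, num_stragglers)
    per_group = num_workers // (num_stragglers + 1)
    blocks = [data[b::per_group] for b in range(per_group)]
    w = 0.0
    for _ in range(steps):
        # Simulate workers that fail to report in this round.
        stragglers = set(rng.sample(range(num_workers), num_stragglers))
        messages = {
            k: partial_gradient(w, blocks[block])
            for k, (_group, block) in enumerate(assignment)
            if k not in stragglers
        }
        # With at most num_stragglers missing, one group is always intact.
        w -= lr * decode(messages, assignment)
    return w
```

Under these assumptions, `run()` should drive `w` toward the true slope of 3.0 even though one worker is dropped in every round. The cost of this simple layout is that each data block is processed `num_stragglers + 1` times, trading redundant computation for exact gradient recovery.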

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Distributed Control Multi-Agent Systems