Redundancy Techniques for Straggler Mitigation in Distributed Optimization and Learning

Can Karakus; Yifan Sun; Suhas Diggavi; Wotao Yin

arXiv:1803.05397 · stat.ML · March 15, 2018 · 41 cites

Open Access

TL;DR

This paper introduces a redundancy-based distributed optimization framework that mitigates straggler effects by encoding data with over-complete representations, ensuring convergence despite delays and node failures.

Contribution

It proposes a novel encoding scheme using equiangular tight frames and demonstrates convergence guarantees for various optimization algorithms under arbitrary delay patterns.

Findings

01. Redundancy encoding improves robustness against stragglers.

02. The method converges deterministically regardless of delay distributions.

03. Experimental results show performance gains over traditional strategies.

Abstract

The performance of distributed optimization and learning systems is bottlenecked by "straggler" nodes and slow communication links, which significantly delay computation. We propose a distributed optimization framework in which the dataset is "encoded" to have an over-complete representation with built-in redundancy. The straggling nodes in the system are dynamically left out of the computation at every iteration, and their loss is compensated by the embedded redundancy. We show that oblivious application of several popular optimization algorithms to encoded data, including gradient descent, L-BFGS, and proximal gradient under data parallelism, and coordinate descent under model parallelism, converges to either approximate or exact solutions of the original problem when stragglers are treated as erasures. These convergence results are deterministic, i.e., they establish sample path convergence for…
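The idea in the abstract can be illustrated with a minimal sketch for least squares, under stated assumptions. The function name `encoded_gradient_descent`, its parameters, the random orthonormal encoding matrix `S`, the worker partitioning, and the step size are all hypothetical choices made for illustration; they are not taken from the paper, which may use different encoding matrices and step rules. Stragglers are simulated by drawing a random subset of workers at each iteration and ignoring the rest, i.e., treating them as erasures.

```python
import numpy as np

def encoded_gradient_descent(X, y, num_workers=8, wait_for=6,
                             redundancy=2, steps=1000, seed=0):
    """Gradient descent on encoded least squares with simulated stragglers.

    Illustrative sketch only: the encoding matrix and step size are
    assumptions, not the construction used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    rows = redundancy * n

    # Assumed encoding: a tall matrix with orthonormal columns (S.T @ S = I),
    # so the fully encoded problem has the same minimizer as the original.
    S, _ = np.linalg.qr(rng.standard_normal((rows, n)))
    SX, Sy = S @ X, S @ y

    # Split the encoded rows evenly across the workers.
    blocks = np.array_split(np.arange(rows), num_workers)

    # Any row subset S_A satisfies S_A.T @ S_A <= I, so the squared spectral
    # norm of X bounds the curvature of every partial objective.
    lipschitz = np.linalg.norm(X, 2) ** 2

    w = np.zeros(p)
    for _ in range(steps):
        # Simulated erasures: only `wait_for` workers respond this iteration.
        active = rng.choice(num_workers, size=wait_for, replace=False)
        grad = np.zeros(p)
        for i in active:
            Xi, yi = SX[blocks[i]], Sy[blocks[i]]
            grad += Xi.T @ (Xi @ w - yi)
        w -= grad / lipschitz
    return w
```

With noise-free data, every subset of encoded rows shares the same minimizer, so this sketch recovers it to numerical precision. With noisy data the responding subsets disagree slightly, and the iterates would be expected to settle only near the least-squares solution, which is consistent with the "approximate or exact" wording of the abstract.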

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and ELM