Coded Computing for Resilient Distributed Computing: A Learning-Theoretic Framework

Parsa Moradi; Behrooz Tahmasebi; Mohammad Ali Maddah-Ali

arXiv:2406.00300·cs.LG·March 26, 2026

TL;DR

This paper introduces a learning-theoretic framework for coded computing in distributed machine learning, optimizing encoder and decoder functions to improve resilience and accuracy in the presence of slow or faulty servers.

Contribution

It develops a novel learning-based approach to coded computing that bridges the gap between coding theory and machine learning workloads, and it derives the optimal encoder and decoder explicitly.

Findings

01 Error decay rate improves with the number of workers

02 Framework outperforms the state of the art in accuracy

03 Effective in both noisy and noiseless settings

Abstract

Coded computing has emerged as a promising framework for tackling significant challenges in large-scale distributed computing, including the presence of slow, faulty, or compromised servers. In this approach, each worker node processes a combination of the data rather than the raw data itself. The final result is then decoded from the collective outputs of the worker nodes. However, there is a significant gap between current coded computing approaches and the broader landscape of general distributed computing, particularly when it comes to machine learning workloads. To bridge this gap, we propose a novel foundation for coded computing that integrates the principles of learning theory and develops a framework that adapts seamlessly to machine learning applications. In this framework, the objective is to find the encoder and decoder functions that minimize the loss function, defined…
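To make the encode, compute, decode pipeline described in the abstract concrete, here is a minimal sketch of the classical polynomial-interpolation style of coded computing. It is not the learning-based encoder and decoder this paper proposes; it only illustrates how workers can operate on combinations of the data and how the result can be recovered when some workers never respond. The function `f`, the interpolation points `alphas` and `betas`, the straggler set, and all names are assumptions chosen for illustration.

```python
import numpy as np

def lagrange_eval(xs, ys, z):
    """Evaluate at z the unique polynomial passing through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (z - xj) / (xi - xj)
        total += yi * weight
    return total

def f(x):
    # Hypothetical computation each worker performs: a degree-2 polynomial.
    return x ** 2

data = np.array([1.0, -2.0, 0.5])      # K = 3 raw data points (illustrative values)
K = len(data)
N = 7                                  # number of worker nodes
alphas = np.arange(K, dtype=float)     # points tied to the raw data
betas = np.linspace(-1.0, 3.0, N)      # one distinct point per worker

# Encoder: u(z) interpolates the data; worker n receives the combination u(betas[n]).
coded_inputs = np.array([lagrange_eval(alphas, data, b) for b in betas])

# Workers: each applies f to its coded input, never seeing the raw data.
worker_outputs = f(coded_inputs)

# f(u(z)) has degree 2 * (K - 1), so this many worker results suffice to decode.
needed = 2 * (K - 1) + 1

# Assume workers 2 and 5 are stragglers and never return a result.
alive = [0, 1, 3, 4, 6]

# Decoder: interpolate f(u(z)) from the surviving results and read it off at alphas.
decoded = np.array(
    [lagrange_eval(betas[alive], worker_outputs[alive], a) for a in alphas]
)
```

With these assumed values, `decoded` matches `f(data)` up to floating-point error even though two of the seven workers are missing. This exact-recovery scheme depends on `f` being a polynomial, which is one reason such approaches are hard to apply to general machine learning workloads.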

Videos

Coded Computing for Resilient Distributed Computing: A Learning-Theoretic Framework · SlidesLive

Taxonomy

Topics: Online Learning and Analytics · Innovative Teaching and Learning Methods