Multivariate Polynomial Codes for Efficient Matrix Chain Multiplication in Distributed Systems
Jesús Gómez-Vilardebò

TL;DR
This paper introduces multivariate polynomial coding schemes for distributed matrix chain multiplication, effectively balancing computational and storage costs while mitigating straggler effects in large-scale systems.
Contribution
It proposes two novel multivariate polynomial codes that reduce storage overhead and address scalability issues in distributed matrix chain multiplication.
Findings
Multivariate codes significantly lower storage overhead compared with univariate polynomial codes extended to matrix chains.
The proposed schemes effectively mitigate straggler effects in distributed matrix computations.
Trade-offs between computation and storage are characterized and exploited.
Abstract
We study the problem of computing matrix chain multiplications in a distributed computing cluster. In such systems, performance is often limited by the straggler problem, where the slowest worker dominates the overall computation latency. To resolve this issue, several coded computing strategies have been proposed, primarily focusing on the simplest case: the multiplication of two matrices. These approaches successfully alleviate the straggler effect, but they do so at the expense of higher computational complexity and increased storage needs at the workers. However, in many real-world applications, computations naturally involve long chains of matrix multiplications rather than just a single two-matrix product. Extending univariate polynomial coding to this setting has been shown to amplify the costs -- both computation and storage overheads grow significantly, limiting scalability. In…
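For context, the two-matrix baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of a classic univariate polynomial code for a single product A·B, not the multivariate chain scheme proposed in the paper. The function names, the block counts `m` and `n`, and the evaluation points are assumptions chosen for the example.

```python
import numpy as np

def encode(A, B, m, n, points):
    """Build one coded pair (A_enc, B_enc) per worker (illustrative helper)."""
    A_blocks = np.split(A, m, axis=0)   # m row blocks of A
    B_blocks = np.split(B, n, axis=1)   # n column blocks of B
    shares = []
    for x in points:
        # A_enc(x) = sum_i A_i x^i,  B_enc(x) = sum_j B_j x^(j*m)
        A_enc = sum(Ai * x**i for i, Ai in enumerate(A_blocks))
        B_enc = sum(Bj * x**(j * m) for j, Bj in enumerate(B_blocks))
        shares.append((A_enc, B_enc))
    return shares

def decode(results, points, m, n):
    """Recover A @ B from exactly m*n worker products (illustrative helper)."""
    k = m * n
    # Each product is a degree-(k-1) polynomial in x whose coefficient of
    # x^(i + j*m) is A_i @ B_j, so any k evaluations determine it.
    V = np.vander(np.asarray(points, dtype=float), k, increasing=True)
    stacked = np.stack(results)                      # shape (k, p, q)
    coeffs = np.linalg.solve(V, stacked.reshape(k, -1)).reshape(stacked.shape)
    rows = [np.hstack([coeffs[i + j * m] for j in range(n)]) for i in range(m)]
    return np.vstack(rows)

# Six workers, of which any four suffice when m = n = 2.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 4))
m, n = 2, 2
points = np.arange(1, 7, dtype=float)
shares = encode(A, B, m, n, points)

# Pretend workers 1 and 4 are stragglers and never respond.
alive = [0, 2, 3, 5]
results = [shares[w][0] @ shares[w][1] for w in alive]
C = decode(results, points[alive], m, n)
```

Each worker in this sketch stores one coded block of each matrix and multiplies them, which is where the extra storage and computation mentioned in the abstract come from. Real-valued Vandermonde systems become ill-conditioned as the number of workers grows, so the example is only meant for small sizes.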
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
