Coded Alternating Least Squares for Straggler Mitigation in Distributed Recommendations
Siyuan Wang, Qifa Yan, Jingjing Zhang, Jianping Wang, Linqi Song

TL;DR
This paper introduces a coded ALS algorithm for distributed matrix factorization that mitigates straggler effects by using entangled polynomial codes, improving robustness and efficiency in recommender system computations.
Contribution
It proposes a coded ALS method that uses the Entangled Polynomial Code (EPC) as a building block to tolerate stragglers, and provides a theoretical analysis of its recovery threshold and computational complexity.
Findings
The coded ALS algorithm tolerates a bounded number of stragglers, with the maximum determined by the coding redundancy (code rate).
Theoretical bounds on recovery threshold are established.
Numerical experiments validate the effectiveness of the proposed method.
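To make the notion of a recovery threshold concrete, the following is a minimal sketch of polynomial-coded matrix multiplication, a simpler relative of EPC and not the paper's exact construction. The function name `coded_matmul`, the block counts `m` and `n`, the real-valued evaluation points, and the simulated straggler set are all assumptions chosen for illustration. With `m * n` blocks, any `m * n` finished workers suffice to decode, so `num_workers - m * n` stragglers can be ignored.

```python
import numpy as np

def coded_matmul(A, B, m=2, n=2, num_workers=6, stragglers=(1, 4)):
    """Compute A @ B from any m*n worker results (illustrative sketch)."""
    A_blocks = np.split(A, m, axis=0)  # m row blocks of A
    B_blocks = np.split(B, n, axis=1)  # n column blocks of B
    xs = np.linspace(-1.0, 1.0, num_workers)  # distinct evaluation points

    # Worker i multiplies two encoded blocks evaluated at x_i:
    #   A_enc(x) = sum_j A_j x^j,   B_enc(x) = sum_k B_k x^(k*m)
    results = {}
    for i, x in enumerate(xs):
        if i in stragglers:
            continue  # simulated straggler: its result never arrives
        A_enc = sum(Aj * x**j for j, Aj in enumerate(A_blocks))
        B_enc = sum(Bk * x**(k * m) for k, Bk in enumerate(B_blocks))
        results[i] = A_enc @ B_enc

    threshold = m * n  # recovery threshold of this simple code
    if len(results) < threshold:
        raise RuntimeError("too many stragglers to decode")

    # The product polynomial has degree m*n - 1, so m*n evaluations
    # determine its coefficients by solving a Vandermonde system.
    ids = sorted(results)[:threshold]
    V = np.vander(xs[ids], threshold, increasing=True)
    Y = np.stack([results[i] for i in ids])
    coeffs = np.linalg.solve(V, Y.reshape(threshold, -1)).reshape(Y.shape)

    # The coefficient of x^(j + k*m) equals A_j @ B_k.
    return np.block([[coeffs[j + k * m] for k in range(n)] for j in range(m)])
```

In this sketch, six workers with a threshold of four tolerate two stragglers. Real-valued Vandermonde decoding is only numerically reasonable for small block counts; practical schemes typically choose evaluation points or fields more carefully.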
Abstract
Matrix factorization is an important representation learning technique, used for example in recommender systems, in which a large matrix is factorized into the product of two low-dimensional matrices termed latent representations. This paper investigates the problem of matrix factorization in distributed computing systems with stragglers, that is, compute nodes that are slow to return computation results. A computation procedure, called coded Alternating Least Squares (ALS), is proposed for mitigating the effect of stragglers in such systems. The coded ALS algorithm iteratively computes the two low-dimensional latent matrices by solving systems of linear equations, with the Entangled Polynomial Code (EPC) as a building block. We theoretically characterize the maximum number of stragglers that the algorithm can tolerate (or the recovery threshold) in relation to the redundancy of coding (or the code rate).…
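The alternating updates described in the abstract can be sketched as follows. This is a plain, uncoded, single-machine ALS for a fully observed matrix, written under assumptions not stated in the source: the function name `als`, the ridge parameter `lam`, and the random initialization are all hypothetical. The matrix products inside each update are the kind of computation a coded scheme would distribute across workers.

```python
import numpy as np

def als(R, rank=2, lam=1e-3, iters=50, seed=0):
    """Factor R ~= U @ V.T by alternating regularized least squares."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((R.shape[0], rank))
    V = rng.standard_normal((R.shape[1], rank))
    reg = lam * np.eye(rank)  # ridge term keeps each system well posed
    for _ in range(iters):
        # Fix V, solve (V^T V + lam I) U^T = V^T R^T for U.
        U = np.linalg.solve(V.T @ V + reg, V.T @ R.T).T
        # Fix U, solve (U^T U + lam I) V^T = U^T R for V.
        V = np.linalg.solve(U.T @ U + reg, U.T @ R).T
    return U, V

# Usage: recover a synthetic rank-2 matrix.
rng = np.random.default_rng(1)
R = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))
U, V = als(R)
rel_err = np.linalg.norm(R - U @ V.T) / np.linalg.norm(R)
```

Each half-step is a small `rank x rank` linear solve, while the dominant cost lies in the products involving `R`, which is why speeding up and protecting those products against stragglers matters in a distributed setting.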
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Error Correcting Code Techniques
