Empirical Study of Straggler Problem in Parameter Server on Iterative Convergent Distributed Machine Learning
Benjamin Wong

TL;DR
This study empirically evaluates the effectiveness of current straggler mitigation techniques in distributed machine learning systems built on the parameter server architecture, across several algorithms, highlighting their impact and areas for improvement.
Contribution
It provides a comprehensive empirical analysis of straggler mitigation strategies in parameter server-based distributed ML, focusing on iterative convergent algorithms and experimental setups.
Findings
Current mitigation techniques vary in effectiveness across algorithms.
Straggler patterns significantly influence the performance of distributed ML.
The study offers a platform for future research and comparison of mitigation methods.
Abstract
This study tests the effectiveness of current straggler mitigation techniques on several important iterative convergent machine learning (ML) algorithms, including Matrix Factorization (MF), Multinomial Logistic Regression (MLR), and Latent Dirichlet Allocation (LDA). The experiments were implemented on the FlexPS system, the latest system implementation that employs the parameter server architecture. They use the Bulk Synchronous Parallel (BSP) computational model to examine the straggler problem in parameter servers for iterative convergent distributed machine learning. The study also analyzes the experimental setup of the parameter server strategy for parallel learning problems by injecting universal straggler patterns and executing the latest mitigation techniques. The findings of the study are…
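To make the straggler effect under BSP concrete, the following is a minimal sketch, not the study's code and not the FlexPS API. It assumes each worker has a fixed per-iteration compute time and that a straggler is modeled as a multiplicative slowdown on one worker for one iteration. The function names, the schedule format, and the numbers are all hypothetical and chosen only for illustration.

```python
def bsp_iteration_time(worker_times):
    """Under BSP every worker waits at a barrier, so an iteration
    takes as long as its slowest worker."""
    return max(worker_times)


def inject_straggler(worker_times, worker_id, slowdown):
    """Return a copy of worker_times with one worker slowed by a
    multiplicative factor (hypothetical straggler model)."""
    slowed = list(worker_times)
    slowed[worker_id] = slowed[worker_id] * slowdown
    return slowed


def total_time(base_times, num_iterations, straggler_schedule):
    """Sum BSP iteration times. straggler_schedule maps an iteration
    index to a (worker_id, slowdown) pair; other iterations run clean."""
    total = 0.0
    for iteration in range(num_iterations):
        times = base_times
        if iteration in straggler_schedule:
            worker_id, slowdown = straggler_schedule[iteration]
            times = inject_straggler(base_times, worker_id, slowdown)
        total += bsp_iteration_time(times)
    return total


# Four identical workers, ten iterations (illustrative values only).
base = [1.0, 1.0, 1.0, 1.0]
no_straggler = total_time(base, 10, {})
with_straggler = total_time(base, 10, {2: (0, 5.0), 7: (3, 3.0)})
print(no_straggler, with_straggler)
```

In this toy model a single slow worker in an iteration delays all workers for that iteration, which is the cost that mitigation techniques try to reduce.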
Taxonomy
Topics: Smart Systems and Machine Learning
