Probabilistic Synchronous Parallel
Liang Wang, Ben Catterall, Richard Mortier

TL;DR
This paper introduces Probabilistic Synchronous Parallel (PSP), a novel barrier control method for distributed learning that enhances convergence speed and scalability of SGD by incorporating sampling, with theoretical analysis and practical implementation.
Contribution
The paper proposes PSP, a new barrier control technique that improves distributed SGD performance by integrating sampling, and demonstrates its effectiveness through theoretical analysis and a full-featured framework.
Findings
PSP improves convergence speed over BSP, SSP, and ASP.
PSP enhances scalability in distributed SGD.
Theoretical convergence guarantees are established for PSP-based SGD.
Abstract
Most machine learning and deep neural network algorithms rely on iterative algorithms to optimise their utility/cost functions, e.g. Stochastic Gradient Descent (SGD). In distributed learning, the networked nodes have to work collaboratively to update the model parameters, and the way they coordinate their progress is referred to as synchronous parallel design (or barrier control). The synchronous parallel protocol is the building block of any distributed learning framework, and its design has a direct impact on the performance and scalability of the system. In this paper, we propose a new barrier control technique, Probabilistic Synchronous Parallel (PSP). Compared to the previous Bulk Synchronous Parallel (BSP), Stale Synchronous Parallel (SSP), and Asynchronous Parallel (ASP), the proposed solution effectively improves both the convergence speed and the scalability of the SGD algorithm by…
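To make the barrier-control terminology concrete, here is a minimal Python sketch of the three existing barriers as "may this worker advance?" checks, plus a sampled variant. The sampled check is only an illustration of what "incorporating sampling" could look like, assuming the barrier condition is evaluated on a random subset of workers. The truncated abstract does not spell out the mechanism, so the function names, the `sample_size` and `staleness` parameters, and the rule itself are assumptions, not the paper's implementation.

```python
import random

def bsp_can_advance(my_step, steps):
    """BSP: advance only when no worker is behind this worker."""
    return all(s >= my_step for s in steps)

def ssp_can_advance(my_step, steps, staleness):
    """SSP: advance while the slowest worker lags by at most `staleness` steps."""
    return my_step - min(steps) <= staleness

def asp_can_advance(my_step, steps):
    """ASP: no barrier at all, a worker always advances."""
    return True

def sampled_can_advance(my_step, steps, sample_size, staleness=0, rng=random):
    """Hypothetical sampled barrier (an assumption, not the paper's code):
    apply the SSP-style check to a random subset of workers only."""
    k = min(sample_size, len(steps))
    if k == 0:
        return True  # empty sample: nothing to wait for, behaves like ASP
    sample = rng.sample(steps, k)
    return my_step - min(sample) <= staleness

# Four workers; the third one is a straggler at step 3.
steps = [5, 5, 3, 5]
print(bsp_can_advance(5, steps))               # straggler blocks BSP
print(ssp_can_advance(5, steps, staleness=2))  # lag of 2 is tolerated
print(sampled_can_advance(5, steps, sample_size=2, rng=random.Random(0)))
```

In this sketch, sampling every worker with `staleness=0` reduces to the BSP check, and an empty sample reduces to ASP, so the sample size acts as a dial between the two extremes.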
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Age of Information Optimization · Advanced Graph Neural Networks
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Stochastic Gradient Descent
