Probabilistic Synchronous Parallel

Liang Wang; Ben Catterall; Richard Mortier

arXiv:1709.07772 · cs.DC · October 6, 2017 · 5 cites

TL;DR

This paper introduces Probabilistic Synchronous Parallel (PSP), a barrier control method for distributed learning that incorporates sampling to improve the convergence speed and scalability of SGD, supported by theoretical analysis and a practical implementation.

Contribution

The paper proposes PSP, a new barrier control technique that improves distributed SGD performance by integrating sampling, and demonstrates its effectiveness through theoretical analysis and a full-featured framework.

Findings

01. PSP improves convergence speed over BSP, SSP, and ASP.

02. PSP enhances scalability in distributed SGD.

03. Theoretical convergence guarantees are established for PSP-based SGD.

Abstract

Most machine learning and deep neural network algorithms rely on iterative algorithms, e.g. Stochastic Gradient Descent (SGD), to optimise their utility/cost functions. In distributed learning, the networked nodes have to work collaboratively to update the model parameters, and the way they proceed is referred to as synchronous parallel design (or barrier control). The synchronous parallel protocol is the building block of any distributed learning framework, and its design has a direct impact on the performance and scalability of the system. In this paper, we propose a new barrier control technique, Probabilistic Synchronous Parallel (PSP). Compared with the previous Bulk Synchronous Parallel (BSP), Stale Synchronous Parallel (SSP), and Asynchronous Parallel (ASP), the proposed solution effectively improves both the convergence speed and the scalability of the SGD algorithm by…
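
As a rough illustration of what a barrier control decision looks like, the sketch below writes BSP, SSP, and ASP as predicates over per-worker step counters, together with a sampled variant. The abstract is truncated before it says how PSP works, so the sampled predicate is an assumption, not the authors' definition: it applies the staleness test to a random subset of workers instead of all of them. All function and parameter names (`staleness`, `sample_size`, and so on) are hypothetical.

```python
import random


def bsp_can_advance(steps, worker):
    # BSP: a worker may start its next step only when no other worker
    # is still behind it.
    return all(s >= steps[worker] for s in steps)


def ssp_can_advance(steps, worker, staleness):
    # SSP: a worker may run ahead of the slowest worker by at most
    # `staleness` steps.
    return steps[worker] - min(steps) <= staleness


def asp_can_advance(steps, worker):
    # ASP: no barrier; a worker never waits for the others.
    return True


def sampled_can_advance(steps, worker, staleness, sample_size, rng):
    # ASSUMPTION (not taken from the source text): apply the SSP-style
    # staleness test to a random sample of workers only.
    sample = rng.sample(range(len(steps)), sample_size)
    # With an empty sample there is nobody to wait for.
    slowest = min((steps[i] for i in sample), default=steps[worker])
    return steps[worker] - slowest <= staleness


if __name__ == "__main__":
    steps = [5, 3, 4]  # current step of each of three workers
    rng = random.Random(0)
    print("BSP, worker 0:", bsp_can_advance(steps, 0))
    print("SSP, worker 0, staleness 2:", ssp_can_advance(steps, 0, 2))
    print("ASP, worker 0:", asp_can_advance(steps, 0))
    print("Sampled, worker 0:", sampled_can_advance(steps, 0, 1, 2, rng))
```

Under this assumption, a sample size of zero behaves like ASP, and a sample that covers every worker behaves like SSP (or like BSP when `staleness` is 0).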

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Age of Information Optimization · Advanced Graph Neural Networks

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Stochastic Gradient Descent