Slim-DP: A Light Communication Data Parallelism for DNN

Shizhao Sun; Wei Chen; Jiang Bian; Xiaoguang Liu; Tie-Yan Liu

arXiv:1709.09393 · cs.DC · September 28, 2017

Open Access

TL;DR

Slim-DP introduces a dynamic, significance-based parameter communication scheme for DNN training that reduces communication overhead and accelerates training without sacrificing accuracy.

Contribution

The paper proposes a novel Explore-Exploit framework for selective parameter communication based on significance, improving training speed and efficiency in data parallelism.
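A minimal sketch of what significance-based selective communication with an explore-exploit split might look like is given here. It is not the paper's algorithm: the significance measure (absolute weight value), the function names `select_communication_set` and `push_selected`, and the `core_fraction` and `explore_fraction` parameters are all assumptions made for illustration.

```python
import random


def select_communication_set(significance, core_fraction, explore_fraction, rng):
    """Return sorted indices of the parameters to communicate this round.

    significance: one non-negative score per parameter (assumed metric).
    core_fraction / explore_fraction: hypothetical knobs, each in [0, 1].
    """
    n = len(significance)
    n_core = int(n * core_fraction)
    n_explore = int(n * explore_fraction)

    # Exploit: keep the indices with the largest significance scores.
    ranked = sorted(range(n), key=lambda i: significance[i], reverse=True)
    core = ranked[:n_core]

    # Explore: sample uniformly from the remaining indices so that
    # currently low-scoring parameters are still communicated sometimes.
    rest = ranked[n_core:]
    explorer = rng.sample(rest, min(n_explore, len(rest)))

    return sorted(core + explorer)


def push_selected(local_update, indices):
    """Build the sparse message a worker would push: {index: update value}."""
    return {i: local_update[i] for i in indices}


# Toy example with 8 parameters.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02, 0.3, -0.003]
update = [0.1, 0.2, -0.1, 0.05, 0.0, -0.2, 0.3, 0.1]

# Assumption: absolute weight value stands in for "significance".
significance = [abs(w) for w in weights]

rng = random.Random(0)
comm = select_communication_set(significance, 0.25, 0.25, rng)
message = push_selected(update, comm)
```

With these toy settings the worker pushes 4 of the 8 update values: the two largest-magnitude parameters plus two randomly explored ones. In a full system the parameter server would merge such sparse messages and return only the corresponding entries.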

Findings

01 Achieves faster training than standard data parallelism.

02 Reduces communication time significantly.

03 Maintains model accuracy despite reduced communication.

Abstract

Data parallelism has emerged as a necessary technique for accelerating the training of deep neural networks (DNNs). In a typical data-parallel approach, local workers periodically push their latest updates of all parameters to the parameter server and pull all the merged parameters back. However, as DNN models grow larger and the number of workers increases in practice, this typical approach cannot achieve satisfactory training acceleration, because it suffers from the heavy communication cost of transferring a huge amount of information between the workers and the parameter server. In-depth studies of DNNs have revealed that they are usually highly redundant: deleting a considerable proportion of the parameters does not significantly degrade model performance. This redundancy property exposes a great opportunity to reduce the communication cost by only…

Taxonomy

Topics: IoT and Edge/Fog Computing · Energy Efficient Wireless Sensor Networks · Software-Defined Networks and 5G