Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks

Shizhao Sun; Wei Chen; Jiang Bian; Xiaoguang Liu; Tie-Yan Liu

arXiv:1606.00575 · cs.DC · July 19, 2017 · 5 cites

TL;DR

The paper introduces Ensemble-Compression, a parallel training framework for deep neural networks that aggregates local models via ensemble outputs and compresses the global model, outperforming traditional parameter averaging in accuracy and speed.

Contribution

It proposes a novel ensemble-based aggregation method with model compression for parallel DNN training, addressing limitations of parameter averaging.

Findings

1. EC-DNN achieves higher accuracy than MA-DNN.
2. EC-DNN provides faster training speed.
3. Model compression maintains performance while controlling size.

Abstract

Parallelization frameworks have recently become a necessity for speeding up the training of deep neural networks (DNNs). Such frameworks typically employ the Model Average approach, denoted MA-DNN, in which parallel workers each train on their own local data, while the parameters of the local models are periodically communicated and averaged to obtain a global model that serves as the new starting point for the local models. However, since a DNN is a highly non-convex model, averaging parameters cannot ensure that the global model performs better than the local models. To tackle this problem, we introduce a new parallel training framework called Ensemble-Compression, denoted EC-DNN. In this framework, we propose to aggregate the local models by ensemble, i.e., by averaging the outputs of the local models instead of their parameters. As most prevalent loss functions are convex to the…
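The difference between averaging parameters and averaging outputs can be made concrete with a toy example. The sketch below is not taken from the paper: the two-unit tanh network, the hand-picked weights, and the helper names `forward`, `average_parameters`, and `ensemble_output` are all assumptions chosen for illustration. The two local models compute the same function but store their hidden units in a different order, which is one simple way non-convexity shows up.

```python
import math

def forward(params, x):
    """One-hidden-layer tanh network with scalar input and scalar output."""
    w1, w2 = params  # input-to-hidden weights, hidden-to-output weights
    return sum(v * math.tanh(u * x) for u, v in zip(w1, w2))

def average_parameters(models):
    """MA-style aggregation: element-wise mean of the parameters."""
    k = len(models)
    n_hidden = len(models[0][0])
    w1 = [sum(m[0][i] for m in models) / k for i in range(n_hidden)]
    w2 = [sum(m[1][i] for m in models) / k for i in range(n_hidden)]
    return (w1, w2)

def ensemble_output(models, x):
    """Ensemble-style aggregation: mean of the model outputs."""
    return sum(forward(m, x) for m in models) / len(models)

# Hypothetical local models: model_b is model_a with its hidden units swapped,
# so both compute f(x) = 2 * tanh(x).
model_a = ([1.0, -1.0], [1.0, -1.0])
model_b = ([-1.0, 1.0], [-1.0, 1.0])

x = 0.5
averaged = average_parameters([model_a, model_b])
ma_out = forward(averaged, x)                     # parameters cancel out
ec_out = ensemble_output([model_a, model_b], x)   # function is preserved

print("local A:", forward(model_a, x))
print("local B:", forward(model_b, x))
print("parameter average:", ma_out)
print("output average:", ec_out)
```

In this constructed case the averaged parameters are all zero, so the parameter-averaged model outputs zero everywhere, while the output average still equals what each local model predicts. Real local models will rarely cancel this cleanly; the example only shows that parameter averaging carries no guarantee of this kind.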

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings