Distributed Learning with Low Communication Cost via Gradient Boosting Untrained Neural Network

Xiatian Zhang; Xunshi He; Nan Wang; Rong Chen

arXiv:2011.05022·cs.LG·November 11, 2020

TL;DR

This paper introduces GBUN (Gradient Boosting Untrained Neural Network), a distributed gradient boosting algorithm that ensembles untrained, randomly generated neural networks. It reduces communication costs while keeping accuracy comparable to GBDT, which speeds up training, especially in low-bandwidth environments.

Contribution

The paper proposes GBUN, a gradient boosting method that uses untrained neural networks to lower communication costs in distributed learning. It also extends the Simhash algorithm to mimic the network's forward calculation, so that huge networks need not be created for high-dimensional data.
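The idea of using hashing to stand in for a random layer can be illustrated with a minimal sketch. It is not the paper's actual Simhash extension, whose details are not given here. It assumes sparse inputs stored as a dict of feature index to value, and it derives each +1/-1 weight from a CRC32 hash instead of storing a weight matrix. The names `hashed_sign`, `hashed_forward`, and `simhash_bits` are hypothetical.

```python
import zlib


def hashed_sign(neuron: int, feature: int, seed: int = 0) -> float:
    """Return a pseudo-random +1.0 or -1.0 weight for a (neuron, feature) pair.

    Assumption for illustration: the weight is recomputed from a hash on
    demand, so memory does not grow with the number of features.
    """
    key = f"{seed}:{neuron}:{feature}".encode()
    return 1.0 if zlib.crc32(key) & 1 else -1.0


def hashed_forward(x_sparse: dict, n_neurons: int, seed: int = 0) -> list:
    """Pre-activations of a hypothetical random layer for one sparse sample."""
    z = [0.0] * n_neurons
    for feature, value in x_sparse.items():  # only non-zero features are visited
        for neuron in range(n_neurons):
            z[neuron] += hashed_sign(neuron, feature, seed) * value
    return z


def simhash_bits(x_sparse: dict, n_neurons: int, seed: int = 0) -> list:
    """Classic SimHash-style fingerprint: the sign bit of each pre-activation."""
    return [1 if v >= 0 else 0 for v in hashed_forward(x_sparse, n_neurons, seed)]
```

The cost of one forward pass in this sketch depends on the number of non-zero features and the number of neurons, not on the total feature dimension.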

Findings

01 GBUN achieves comparable accuracy to traditional GBDT.

02 GBUN speeds up training by up to 13 times on large clusters.

03 GBUN drastically reduces communication bandwidth requirements.

Abstract

For high-dimensional data, distributed GBDT incurs huge communication costs because its communication volume is related to the number of features. To overcome this problem, we propose a novel gradient boosting algorithm, the Gradient Boosting Untrained Neural Network (GBUN). GBUN ensembles untrained, randomly generated neural networks that softly distribute data samples to multiple neuron outputs, which dramatically reduces the communication costs of distributed learning. To avoid creating huge neural networks for high-dimensional data, we extend the Simhash algorithm to mimic the forward calculation of the neural network. Our experiments on multiple public datasets show that GBUN matches conventional GBDT in prediction accuracy and scales much better in distributed learning. Compared with conventional GBDT varieties, GBUN speeds up the…
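A rough sketch of how boosting over untrained networks might look, under assumptions that do not come from the paper: squared loss, a softmax over random neuron pre-activations as the "soft" assignment, and a regularized weighted mean of residuals as each neuron's output value. All function names are hypothetical. In this sketch the per-round statistics `g` and `h` hold 2 × `n_neurons` numbers regardless of the feature count, which is the kind of quantity workers would plausibly sum and exchange.

```python
import math
import random


def random_layer(n_features, n_neurons, rng):
    # Untrained weights: drawn once per round and never updated.
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_neurons)]


def soft_assign(x, weights):
    # Softmax over pre-activations: the sample is spread softly across neurons.
    z = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def neuron_stats(X, residuals, weights):
    # Per-neuron sums of weighted residuals (g) and of assignment weights (h).
    k = len(weights)
    g, h = [0.0] * k, [0.0] * k
    for x, r in zip(X, residuals):
        for j, p in enumerate(soft_assign(x, weights)):
            g[j] += p * r
            h[j] += p
    return g, h


def mse(y, pred):
    return sum((t - p) ** 2 for t, p in zip(y, pred)) / len(y)


def predict(base, model, x):
    # Sum the (already shrunken) neuron values of every round, weighted softly.
    return base + sum(
        sum(a * v for a, v in zip(soft_assign(x, weights), values))
        for weights, values in model
    )


def fit_gbun_sketch(X, y, n_rounds=20, n_neurons=4, lr=0.5, reg=1.0, seed=0):
    rng = random.Random(seed)
    base = sum(y) / len(y)
    pred = [base] * len(y)
    model, losses = [], [mse(y, pred)]
    for _ in range(n_rounds):
        weights = random_layer(len(X[0]), n_neurons, rng)
        residuals = [t - p for t, p in zip(y, pred)]
        g, h = neuron_stats(X, residuals, weights)
        # Only the output values are fitted; the learning rate is folded in.
        values = [lr * gj / (hj + reg) for gj, hj in zip(g, h)]
        model.append((weights, values))
        pred = [
            p + sum(a * v for a, v in zip(soft_assign(x, weights), values))
            for x, p in zip(X, pred)
        ]
        losses.append(mse(y, pred))
    return base, model, losses
```

A call such as `fit_gbun_sketch(X, y, n_rounds=10)` returns the base prediction, the list of (weights, values) pairs, and the training loss after each round. With `lr` in (0, 1] and `reg` > 0, the training loss in this sketch does not increase from round to round.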

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Privacy-Preserving Technologies in Data