Partitioning Large Scale Deep Belief Networks Using Dropout

Yanping Huang; Sai Zhang

arXiv:1508.07096·stat.ML·August 31, 2015

Open Access

TL;DR

This paper introduces a distributed training approach for large-scale deep belief networks (DBNs) that employs dropout to improve scalability and performance, leveraging GPU acceleration and multiple machines.

Contribution

It proposes a novel distributed training method for DBNs using dropout and GPU acceleration, with four strategies to combine results from multiple machines.

Findings

01 Outperforms existing methods on MNIST digit recognition

02 Enables training of larger DBNs in distributed environments

03 Improves test error rates with dropout-based partitioning

Abstract

Deep learning methods have shown great promise in many practical applications, ranging from speech recognition and visual object recognition to text processing. However, most current deep learning methods suffer from scalability problems in large-scale applications, forcing researchers and users to focus on small-scale problems with fewer parameters. In this paper, we consider a well-known machine learning model, deep belief networks (DBNs), which have yielded impressive classification performance on a large number of benchmark machine learning tasks. To scale up DBNs, we propose an approach that uses the computing clusters in a distributed environment to train large models, while the dense matrix computations within a single machine are sped up using graphics processing units (GPUs). When training a DBN, each machine randomly drops out a portion of neurons in each hidden layer, for…
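The partitioning idea in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so everything beyond "each machine randomly drops out a portion of neurons in each hidden layer" is an assumption: the function names, the keep probability, and in particular the averaging rule used to combine results are hypothetical and are not claimed to be one of the paper's four combination strategies.

```python
import random


def sample_dropout_mask(layer_sizes, keep_prob, rng):
    """Return one 0/1 mask per hidden layer; 1 means the unit is kept."""
    return [
        [1 if rng.random() < keep_prob else 0 for _ in range(size)]
        for size in layer_sizes
    ]


def partition_by_dropout(layer_sizes, num_machines, keep_prob=0.5, seed=0):
    """Give each machine its own random sub-network (hypothetical helper).

    Each machine would train only the units whose mask entry is 1.
    """
    rng = random.Random(seed)
    return [
        sample_dropout_mask(layer_sizes, keep_prob, rng)
        for _ in range(num_machines)
    ]


def average_kept_units(values_per_machine, masks_per_machine, previous):
    """Combine one layer's per-unit parameters across machines.

    Assumed strategy: average each unit over the machines that kept it,
    and fall back to the previous value when no machine kept the unit.
    """
    combined = []
    for j, old_value in enumerate(previous):
        kept = [
            values[j]
            for values, mask in zip(values_per_machine, masks_per_machine)
            if mask[j]
        ]
        combined.append(sum(kept) / len(kept) if kept else old_value)
    return combined
```

For example, `partition_by_dropout([4, 3], num_machines=3)` yields three independent sets of masks for a network with hidden layers of 4 and 3 units. With `keep_prob=0.5`, each machine holds roughly half of the hidden units per layer, which is what would let a model larger than one machine's memory be split across a cluster.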

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis