Partitioning Large Scale Deep Belief Networks Using Dropout
Yanping Huang, Sai Zhang

TL;DR
This paper introduces a distributed training approach for large-scale deep belief networks (DBNs) that employs dropout to improve scalability and performance, leveraging GPU acceleration and multiple machines.
Contribution
It proposes a novel distributed training method for DBNs using dropout and GPU acceleration, with four strategies to combine results from multiple machines.
Findings
Outperforms existing methods on MNIST digit recognition
Enables training of larger DBNs in distributed environments
Improves test error rates with dropout-based partitioning
Abstract
Deep learning methods have shown great promise in many practical applications, ranging from speech recognition and visual object recognition to text processing. However, most current deep learning methods suffer from scalability problems in large-scale applications, forcing researchers and users to focus on small-scale problems with fewer parameters. In this paper, we consider a well-known machine learning model, deep belief networks (DBNs), which have yielded impressive classification performance on a large number of benchmark machine learning tasks. To scale up DBNs, we propose an approach that uses the computing clusters in a distributed environment to train large models, while the dense matrix computations within a single machine are sped up using graphics processing units (GPUs). When training a DBN, each machine randomly drops out a portion of neurons in each hidden layer, for…
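The per-machine dropout step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the layer sizes, drop probability, machine count, and the helper names `sample_hidden_masks` and `kept_units` are all hypothetical, chosen only to show how independent random masks could give each machine a different subset of hidden units. How the machines' results are combined is not covered here, since the abstract above is truncated before that point.

```python
import random


def sample_hidden_masks(layer_sizes, drop_prob, seed):
    """Return one 0/1 keep-mask per hidden layer for a single machine.

    A unit is dropped (mask value 0) with probability `drop_prob`.
    The seed makes each machine's masks reproducible.
    """
    rng = random.Random(seed)
    return [
        [0 if rng.random() < drop_prob else 1 for _ in range(size)]
        for size in layer_sizes
    ]


def kept_units(mask):
    """Indices of the hidden units a machine keeps for one layer."""
    return [i for i, keep in enumerate(mask) if keep]


# Hypothetical configuration, for illustration only.
hidden_sizes = [8, 6]   # two hidden layers
num_machines = 3
drop_prob = 0.5

# Each machine draws its own masks, so each works on a different sub-network.
machine_masks = [
    sample_hidden_masks(hidden_sizes, drop_prob, seed=m)
    for m in range(num_machines)
]

for m, masks in enumerate(machine_masks):
    kept_counts = [sum(mask) for mask in masks]
    print(f"machine {m}: keeps {kept_counts} of {hidden_sizes} hidden units")
```

Under these assumptions, each machine would only need the rows and columns of the weight matrices that correspond to its kept units, which is what would make the per-machine model smaller than the full network.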
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis
