Approximate Random Dropout
Zhuoran Song, Ru Wang, Dongyu Ru, Hongru Huang, Zhenghao Peng, Jing Ke, Xiaoyao Liang, Li Jiang

TL;DR
This paper introduces Approximate Random Dropout, a method that uses predefined patterns to replace random dropout in neural network training, significantly reducing training time while maintaining accuracy.
Contribution
It proposes a novel dropout technique with regular patterns and a search algorithm to optimize these patterns, enabling faster training of DNNs.
Findings
Reduces training time by up to 77% on MLPs and 60% on LSTMs.
Maintains comparable accuracy with marginal drops.
Proves statistical equivalence to traditional dropout.
Abstract
The training phase of a deep neural network (DNN) consumes enormous processing time and energy. Compression techniques that exploit the sparsity of DNNs can effectively accelerate the inference phase. However, they can hardly be used in the training phase, because training involves dense matrix multiplication using General Purpose Computation on Graphics Processors (GPGPU), which favors a regular and structured data layout. In this paper, we propose Approximate Random Dropout, which replaces the conventional random dropout of neurons and synapses with regular, predefined patterns to eliminate unnecessary computation and data access. To compensate for the potential performance loss, we develop an SGD-based search algorithm to produce the distribution of dropout patterns. We prove that our approach is statistically equivalent to the previous dropout method. Experimental results…
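To make the idea of a regular dropout pattern concrete, here is a minimal NumPy sketch. It assumes one simple periodic pattern: keep every `period`-th output unit, starting at a randomly chosen `offset`. The function names, the `period`/`offset` parameterization, and the layer sizes are illustrative assumptions, not taken from the paper, and the SGD-based search over pattern distributions is not shown. The sketch checks two things: that skipping the dropped columns gives the same output as a full matrix multiplication followed by masking, and that a uniformly random offset keeps each unit with the same frequency, 1/`period`.

```python
import numpy as np

def patterned_keep_indices(n_units, period, offset):
    """Indices of the units kept by a periodic pattern (hypothetical parameterization)."""
    return np.arange(offset, n_units, period)

def dense_masked_forward(x, W, keep):
    """Reference path: full matmul, then zero out the dropped units with a mask."""
    mask = np.zeros(W.shape[1], dtype=x.dtype)
    mask[keep] = 1.0
    return (x @ W) * mask

def compact_forward(x, W, keep):
    """Same result, but only the kept columns of W are read and multiplied."""
    out = np.zeros((x.shape[0], W.shape[1]), dtype=x.dtype)
    out[:, keep] = x @ W[:, keep]
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # batch of 4 inputs with 8 features (assumed sizes)
W = rng.standard_normal((8, 12))   # layer with 12 output units (assumed sizes)

period = 3                          # keep 1 unit in every 3
offset = int(rng.integers(period))  # random starting position of the pattern
keep = patterned_keep_indices(W.shape[1], period, offset)

# The compact computation matches the mask-based one.
same = np.allclose(dense_masked_forward(x, W, keep), compact_forward(x, W, keep))

# With a uniformly random offset, every unit is kept with probability 1/period.
trials = 30000
counts = np.zeros(W.shape[1])
for _ in range(trials):
    sampled_offset = int(rng.integers(period))
    counts[patterned_keep_indices(W.shape[1], period, sampled_offset)] += 1
keep_freq = counts / trials
```

In this sketch the compact path multiplies only a third of the weight columns, which is where a regular pattern can save computation and memory access on hardware that prefers dense, structured layouts. The usual rescaling of kept activations is omitted for brevity.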
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Stochastic Gradient Optimization Techniques
Methods: Dropout
