DNN Training Acceleration via Exploring GPGPU Friendly Sparsity

Zhuoran Song; Yihong Xu; Han Li; Naifeng Jing; Xiaoyao Liang; Li Jiang

arXiv:2203.05705 · cs.LG · March 14, 2022 · 1 cites

Open Access

TL;DR

This paper introduces GPGPU-friendly sparsity techniques for DNN training, including regular dropout patterns and sensitivity-aware dropout, to accelerate training without sacrificing accuracy, supported by a unified software framework.

Contribution

It proposes novel dropout methods and a training framework that enable GPGPU-efficient sparse DNN training, improving speed while maintaining accuracy.

Findings

01 Significant training acceleration achieved across DNN models.

02 Dropout pattern search maintains accuracy with reduced computation.

03 Unified framework simplifies implementation of sparse training techniques.

Abstract

The training phase of deep neural networks (DNNs) consumes enormous processing time and energy. Compression techniques that exploit the sparsity of DNNs can effectively accelerate the inference phase. However, such sparsity is hardly exploited in the training phase, because training involves dense matrix multiplication using General-Purpose Computation on Graphics Processors (GPGPU), which favors regular and structured data layouts. In this paper, we first propose Approximate Random Dropout, which replaces the conventional random dropout of neurons and synapses with regular, online-generated row-based or tile-based dropout patterns to eliminate unnecessary computation and data access for the multilayer perceptron (MLP) and long short-term memory (LSTM). We then develop an SGD-based Search Algorithm that produces the distribution of row-based or tile-based dropout patterns to…
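
To make the row-based idea in the abstract concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a hypothetical pattern defined by a `period` and an `offset`: only output neurons whose index i satisfies i % period == offset are kept, so the layer can multiply by a smaller dense matrix instead of masking a full one. The function name, the parameters, and the inverted-dropout rescaling are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def row_dropout_forward(x, W, period, offset):
    """Forward pass of a linear layer under a regular row-based dropout pattern.

    x: (batch, in_features) activations
    W: (out_features, in_features) weights
    Assumed pattern (hypothetical): keep rows i of W with i % period == offset.
    """
    out_features = W.shape[0]
    if not (0 <= offset < period <= out_features):
        raise ValueError("need 0 <= offset < period <= out_features")

    kept = np.arange(offset, out_features, period)  # indices of surviving neurons
    W_small = W[kept]                               # compact dense sub-matrix
    y_small = x @ W_small.T                         # dropped rows are never computed

    # Inverted-dropout style rescaling (an assumption, mirroring standard dropout).
    scale = out_features / len(kept)

    # Scatter back to the full width so downstream shapes are unchanged.
    y = np.zeros((x.shape[0], out_features), dtype=y_small.dtype)
    y[:, kept] = y_small * scale
    return y, kept

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6))
W = rng.standard_normal((8, 6))
y, kept = row_dropout_forward(x, W, period=4, offset=1)

# Reference: full dense product followed by an explicit mask.
mask = np.zeros(W.shape[0])
mask[kept] = 1.0
y_ref = (x @ W.T) * mask * (W.shape[0] / len(kept))
```

The compact product and the masked dense product give the same output, but the compact one performs roughly 1/period of the multiply-accumulate work, which is the kind of saving a regular pattern makes available to dense GPGPU kernels.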

Taxonomy

Topics: Neural Networks and Applications · Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques

Methods: Convolution · Dropout