SparseTrain: Exploiting Dataflow Sparsity for Efficient Convolutional Neural Networks Training
Pengcheng Dai, Jianlei Yang, Xucheng Ye, Xingzhou Cheng, Junyu Luo, Linghao Song, Yiran Chen, Weisheng Zhao

TL;DR
SparseTrain accelerates CNN training by exploiting both natural and artificial sparsity through innovative pruning, dataflow, and architecture, achieving significant speedup and energy efficiency improvements.
Contribution
The paper introduces a comprehensive approach combining sparsity exploitation, a sparse-aware architecture, and a compiler for efficient CNN training.
Findings
Achieves 2.7x speedup on AlexNet/ResNet.
Provides 2.2x energy efficiency improvement.
Maintains training accuracy and convergence rate.
Abstract
Training Convolutional Neural Networks (CNNs) usually requires substantial computational resources. In this paper, SparseTrain is proposed to accelerate CNN training by fully exploiting sparsity. It involves three levels of innovation: an activation gradient pruning algorithm, a sparse training dataflow, and an accelerator architecture. By applying a stochastic pruning algorithm to each layer, the sparsity of back-propagation gradients can be increased dramatically without degrading training accuracy or convergence rate. Moreover, to utilize both natural sparsity (resulting from ReLU or pooling layers) and artificial sparsity (introduced by the pruning algorithm), a sparse-aware architecture is proposed for training acceleration. This architecture supports both forward and back-propagation of CNNs by adopting a 1-dimensional convolution dataflow. We have built …
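The stochastic pruning step mentioned in the abstract can be illustrated with a small NumPy sketch. The abstract does not spell out the pruning rule, so the rule below is an assumption chosen for illustration: gradient entries with magnitude below a threshold are set either to zero or to the signed threshold, with probabilities picked so that the expected value of each entry is unchanged. The function name `stochastic_prune`, the threshold `tau`, and the synthetic Gaussian gradients are all hypothetical and not taken from the paper.

```python
import numpy as np

def stochastic_prune(grad, tau, rng):
    """Illustrative (assumed) stochastic pruning rule, tau > 0.

    Entries with |g| >= tau are left unchanged. Entries with |g| < tau
    become sign(g) * tau with probability |g| / tau and 0 otherwise,
    so each entry keeps its expected value while many become exactly zero.
    """
    grad = np.asarray(grad, dtype=np.float64)
    small = np.abs(grad) < tau
    # Bernoulli draw per entry: promote to +/- tau with probability |g| / tau.
    promote = rng.random(grad.shape) < np.abs(grad) / tau
    pruned = np.where(promote, np.sign(grad) * tau, 0.0)
    return np.where(small, pruned, grad)

# Synthetic stand-in for a layer's activation gradients (assumption).
rng = np.random.default_rng(0)
g = rng.normal(scale=0.1, size=100_000)
out = stochastic_prune(g, tau=0.2, rng=rng)

sparsity_before = float(np.mean(g == 0))
sparsity_after = float(np.mean(out == 0))
mean_shift = float(abs(out.mean() - g.mean()))
print(f"sparsity before: {sparsity_before:.3f}, after: {sparsity_after:.3f}")
print(f"shift in mean gradient: {mean_shift:.5f}")
```

Under this assumed rule the pruned tensor is much sparser while its mean stays close to the original, which is the kind of property that would let training proceed without degrading accuracy. How the threshold is chosen per layer is not covered by this sketch.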
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Pruning · Convolution
