Stacked Filters Stationary Flow For Hardware-Oriented Acceleration Of Deep Convolutional Neural Networks
Yuechao Gao, Nianhong Liu, Sheng Zhang

TL;DR
This paper introduces a novel computation flow and data format for hardware acceleration of CNNs, significantly reducing storage needs and improving processing efficiency without sacrificing accuracy.
Contribution
The paper proposes the stacked filters stationary flow (SFS), a relative indexed compressed sparse filter format (CSF) for encoding data, and a 3D-SIMD processor architecture that uses both to accelerate CNNs in hardware.
Findings
Reduces required storage by 1.11x for AlexNet and 1.09x for SqueezeNet relative to the state-of-the-art result (Han et al., 2016b), without loss of accuracy on ImageNet.
Improves PE array utilization rate to 96.5%.
Reduces chip area for handling sparse data.
Abstract
To address memory and computation resource limitations for hardware-oriented acceleration of deep convolutional neural networks (CNNs), we present a computation flow, stacked filters stationary flow (SFS), and a corresponding data encoding format, relative indexed compressed sparse filter format (CSF), to make the best of data sparsity and simplify data handling at execution time. We also propose a three-dimensional Single Instruction Multiple Data (3D-SIMD) processor architecture to illustrate how to accelerate deep CNNs by taking advantage of the SFS flow and CSF format. Compared with the state-of-the-art result (Han et al., 2016b), our methods achieve a 1.11x improvement in reducing the storage required by AlexNet and a 1.09x improvement in reducing the storage required by SqueezeNet, without loss of accuracy on the ImageNet dataset. Moreover, using these approaches, chip area for…
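The abstract does not spell out the CSF layout, so the following is only a minimal sketch of generic relative indexing for sparse weights: each stored entry records how many zeros were skipped since the previous entry, and a filler zero is emitted when a run of zeros exceeds what the index width can express. The function names, the `index_bits` parameter, and the `(gap, value)` pair layout are assumptions made for illustration and are not taken from the paper.

```python
def encode_relative(weights, index_bits=4):
    """Encode a flat weight list as (gap, value) pairs.

    gap is the number of zeros skipped since the previous stored entry.
    Hypothetical layout for illustration; not the paper's CSF format.
    """
    max_gap = (1 << index_bits) - 1  # largest gap that fits in index_bits
    entries = []
    gap = 0
    for w in weights:
        if w == 0:
            if gap == max_gap:
                # Zero run too long for the index width: store a filler
                # zero so the next gap can restart from 0.
                entries.append((max_gap, 0))
                gap = 0
            else:
                gap += 1
        else:
            entries.append((gap, w))
            gap = 0
    # The original length is kept so trailing zeros can be restored.
    return entries, len(weights)


def decode_relative(entries, length):
    """Rebuild the dense weight list from (gap, value) pairs."""
    out = []
    for gap, value in entries:
        out.extend([0] * gap)  # zeros skipped before this entry
        out.append(value)
    out.extend([0] * (length - len(out)))  # trailing zeros
    return out


if __name__ == "__main__":
    dense = [0, 0, 3, 0, 0, 0, 0, 5, 0]
    entries, n = encode_relative(dense, index_bits=2)
    print(entries)
    assert decode_relative(entries, n) == dense
```

With a narrow index width, each nonzero weight costs only a few index bits instead of a full absolute position, which is the general reason relative indexing reduces storage for sparse filters.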
Taxonomy
Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications
Methods: Residual Connection · Convolution · Average Pooling · Fire Module · Local Response Normalization · Global Average Pooling · Grouped Convolution · 1x1 Convolution · Dropout
