Chain-NN: An Energy-Efficient 1D Chain Architecture for Accelerating Deep Convolutional Neural Networks

Shihao Wang; Dajiang Zhou; Xushen Han; Takeshi Yoshimura

arXiv:1703.01457·cs.AR·March 7, 2017·6 cites

TL;DR

Chain-NN is an energy-efficient 1D chain architecture that uses systolic primitives and input reuse to accelerate deep CNNs, reaching 806.4 GOPS peak throughput at a power efficiency of 1421.0 GOPS/W.

Contribution

The paper proposes a novel 1D chain architecture with reconfigurable systolic primitives and input reuse for energy-efficient CNN acceleration.

Findings

01 Achieves 806.4 GOPS peak throughput at 700 MHz.

02 Consumes 567.5 mW of power, for an efficiency of 1421.0 GOPS/W.

03 Outperforms state-of-the-art designs in power efficiency by 2.5x to 4.1x.
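
The reported figures can be cross-checked with a few lines of arithmetic. The sketch below recomputes the power efficiency from the stated throughput and power, and derives the number of operations per clock cycle. The step from operations to multiply-accumulate units assumes the common convention of counting one multiply-accumulate as two operations; that convention is an assumption here, not something stated on this page.

```python
# Figures quoted in the Findings above.
peak_gops = 806.4      # peak throughput, GOPS
power_w = 0.5675       # power consumption, W (567.5 mW)
clock_hz = 700e6       # clock frequency, Hz

# Power efficiency in GOPS/W.
efficiency = peak_gops / power_w
print(f"{efficiency:.1f} GOPS/W")          # 1421.0 GOPS/W

# Operations completed per clock cycle at peak.
ops_per_cycle = peak_gops * 1e9 / clock_hz
print(round(ops_per_cycle))                # 1152

# ASSUMPTION: one multiply-accumulate (MAC) counts as 2 operations.
macs_per_cycle = ops_per_cycle / 2
print(round(macs_per_cycle))               # 576
```

Under that assumption, the peak figure corresponds to 576 multiply-accumulates per cycle.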

Abstract

Deep convolutional neural networks (CNNs) have shown good performance in many computer vision tasks. However, the high computational complexity of CNNs involves a huge amount of data movement between the processor core and the memory hierarchy, which accounts for the majority of the power consumption. This paper presents Chain-NN, a novel energy-efficient 1D chain architecture for accelerating deep CNNs. Chain-NN consists of dedicated dual-channel processing engines (PEs). In Chain-NN, convolutions are performed by 1D systolic primitives, each composed of a group of adjacent PEs. These systolic primitives, together with the proposed column-wise scan input pattern, can fully reuse input operands to reduce the memory bandwidth requirement and save energy. Moreover, the 1D chain architecture allows the systolic primitives to be easily reconfigured according to specific CNN parameters…
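
To make the idea of a 1D systolic primitive concrete, here is a minimal cycle-by-cycle sketch of a generic weight-stationary chain computing a 1D convolution. It is a textbook arrangement, not the paper's dual-channel PE design or its column-wise scan pattern: each PE holds one weight, inputs shift down the chain through two registers per PE, and partial sums shift through one register per PE. The function names `chain_conv1d` and `direct_conv1d` and the register layout are assumptions made for illustration.

```python
def direct_conv1d(x, w):
    """Reference result: y[i] = sum_k w[k] * x[i + k] (valid positions only)."""
    K = len(w)
    return [sum(w[k] * x[i + k] for k in range(K))
            for i in range(len(x) - K + 1)]


def chain_conv1d(x, w):
    """Simulate a chain of K PEs, one weight per PE (illustrative sketch).

    Each input is fetched once and reused by every PE as it moves
    down the chain, which is the input-reuse idea in its simplest form.
    """
    K = len(w)
    n_out = len(x) - K + 1
    weights = w[::-1]      # PE j holds w[K-1-j]
    x_a = [0] * K          # first input delay register of each PE
    x_b = [0] * K          # second input delay register of each PE
    psum = [0] * K         # partial-sum output register of each PE
    outputs = []

    for t in range(len(x) + 2 * K):            # extra cycles flush the chain
        x_in = x[t] if t < len(x) else 0
        new_a, new_b, new_p = [0] * K, [0] * K, [0] * K
        for j in range(K):
            # Values arriving from the left neighbour (or the chain input).
            xin = x_in if j == 0 else x_b[j - 1]
            pin = 0 if j == 0 else psum[j - 1]
            new_p[j] = pin + weights[j] * xin  # multiply-accumulate
            new_a[j] = xin                     # inputs advance at half speed
            new_b[j] = x_a[j]
        x_a, x_b, psum = new_a, new_b, new_p   # all registers update together

        # After the pipeline fills (2K-2 cycles), one output leaves per cycle.
        if 2 * K - 2 <= t < 2 * K - 2 + n_out:
            outputs.append(psum[K - 1])
    return outputs


x = [1, 2, 3, 4, 5, 6, 7]
w = [2, -1, 3]
print(chain_conv1d(x, w) == direct_conv1d(x, w))
```

In this sketch the chain reads each input value from memory exactly once, whereas the direct loop reads each value up to K times. That difference is the kind of bandwidth saving the abstract attributes to input reuse.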

Taxonomy

Topics: Advanced Neural Network Applications · CCD and CMOS Imaging Sensors · Advanced Memory and Neural Computing

Methods: 1x1 Convolution · Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax