CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices

Caiwen Ding; Siyu Liao; Yanzhi Wang; Zhe Li; Ning Liu; Youwei Zhuo; Chao Wang; Xuehai Qian; Yu Bai; Geng Yuan; Xiaolong Ma; Yipeng Zhang; Jian Tang; Qinru Qiu; Xue Lin; Bo Yuan

arXiv:1708.08917 · cs.CV · September 11, 2017

TL;DR

CirCNN introduces a mathematically rigorous method using block-circulant matrices and FFT to significantly accelerate and compress deep neural networks, achieving high energy efficiency with negligible accuracy loss across various hardware platforms.

Contribution

This paper represents DNN weights as block-circulant matrices, which enables training and inference at reduced complexity. Unlike prior weight-pruning methods, the approach keeps the network structure regular and comes with a rigorous accuracy guarantee.

Findings

1. Achieves 6-102x energy efficiency improvements over the state of the art.

2. Reduces computational complexity from O(n^2) to O(n log n).

3. Maintains accuracy comparable to uncompressed DNNs.
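
The drop from O(n^2) to O(n log n) rests on a standard identity for circulant matrices, sketched below. The notation is an assumption made for illustration: C is an n-by-n circulant matrix defined by its first column c, and the paper itself may index the defining vector differently.

```latex
% Circulant matrix defined by its first column c = (c_0, ..., c_{n-1}):
%   C_{jk} = c_{(j-k) \bmod n}
\[
(Cx)_j \;=\; \sum_{k=0}^{n-1} c_{(j-k) \bmod n}\, x_k \;=\; (c \circledast x)_j ,
\]
% Circular convolution theorem (\odot is the element-wise product):
\[
Cx \;=\; \mathrm{IFFT}\bigl(\mathrm{FFT}(c) \odot \mathrm{FFT}(x)\bigr).
\]
% Cost: three length-n FFTs at O(n \log n) each plus an O(n) element-wise
% product, instead of O(n^2) for the dense matrix-vector product.
```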

Abstract

Large-scale deep neural networks (DNNs) are both compute and memory intensive. As the size of DNNs continues to grow, it is critical to improve the energy efficiency and performance while maintaining accuracy. For DNNs, the model size is an important factor affecting performance, scalability and energy efficiency. Weight pruning achieves good compression ratios but suffers from three drawbacks: 1) the irregular network structure after pruning; 2) the increased training complexity; and 3) the lack of rigorous guarantee of compression ratio and inference accuracy. To overcome these limitations, this paper proposes CirCNN, a principled approach to represent weights and process neural networks using block-circulant matrices. CirCNN utilizes the Fast Fourier Transform (FFT)-based fast multiplication, simultaneously reducing the computational complexity (both in inference and training) from…
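
A minimal NumPy sketch of the idea may help make the abstract concrete. It is not the authors' implementation. The function names, the block size `k`, the `(p, q, k)` weight layout, and the first-column convention for each circulant block are all assumptions chosen for illustration. Each k-by-k block is stored as a single length-k vector, and the block-circulant product is computed with FFTs, then checked against the equivalent dense matrix.

```python
import numpy as np

def circulant_from_first_column(c):
    """Build the dense k x k circulant matrix whose first column is c."""
    k = len(c)
    idx = (np.arange(k)[:, None] - np.arange(k)[None, :]) % k
    return c[idx]

def block_circulant_dense(W):
    """Expand W of shape (p, q, k) into the dense (p*k) x (q*k) matrix.

    W[i, j] is the defining vector (first column) of block (i, j).
    Used here only as a reference for checking the FFT path.
    """
    p, q, _ = W.shape
    return np.block([[circulant_from_first_column(W[i, j]) for j in range(q)]
                     for i in range(p)])

def block_circulant_matvec(W, x):
    """Compute y = A @ x for the block-circulant A described by W, via FFT."""
    p, q, k = W.shape
    Xf = np.fft.fft(x.reshape(q, k), axis=1)   # FFT of each input segment
    Wf = np.fft.fft(W, axis=2)                 # FFT of each defining vector
    Yf = np.einsum("pqk,qk->pk", Wf, Xf)       # per-frequency multiply, sum over j
    return np.real(np.fft.ifft(Yf, axis=1)).reshape(p * k)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((2, 3, 4))         # 2 x 3 grid of 4 x 4 blocks
    x = rng.standard_normal(12)
    dense = block_circulant_dense(W) @ x
    fast = block_circulant_matvec(W, x)
    print(np.allclose(dense, fast))
```

In this layout the sketch stores p*q*k weights rather than the (p*k)*(q*k) entries of the dense matrix, since each circulant block is fully determined by one vector.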

Taxonomy

Methods: Pruning