Crossbar-aware neural network pruning
Ling Liang, Lei Deng, Yueling Zeng, Xing Hu, Yu Ji, Xin Ma, Guoqi Li, and Yuan Xie

TL;DR
This paper introduces a crossbar-aware pruning framework for neural networks that chooses which weights to prune according to the constraints of the crossbar architecture, reducing resource overhead and energy consumption while maintaining accuracy.
Contribution
It proposes a novel L0-norm constrained optimization and pruning method tailored for crossbar-based neural network accelerators, integrating architecture-aware sparsity and feature map reordering.
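To make the idea of architecture-aware sparsity concrete, here is a minimal sketch of pruning at crossbar granularity. It is not the authors' method: it swaps the paper's L0-norm constrained optimization for a simple magnitude heuristic (keep the tiles with the largest squared L2 norm) and omits feature map reordering entirely. The function name `prune_to_crossbar_blocks`, its parameters, and the 2x2 crossbar size are all hypothetical and chosen only for illustration. The point it shows is that a budget on the number of non-zero tiles is an L0-style constraint, and that a tile which is entirely zero needs no crossbar at all.

```python
def prune_to_crossbar_blocks(weights, xbar_rows, xbar_cols, keep_blocks):
    """Zero every tile except the `keep_blocks` tiles with the largest norm.

    `weights` is a list of equal-length rows. Each tile covers at most
    xbar_rows x xbar_cols entries, i.e. one (hypothetical) crossbar.
    Magnitude-based selection is an assumption of this sketch, not the
    paper's L0-norm constrained optimization.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    tiles = []  # (squared L2 norm, row_start, col_start)
    for r0 in range(0, n_rows, xbar_rows):
        for c0 in range(0, n_cols, xbar_cols):
            score = sum(
                weights[r][c] ** 2
                for r in range(r0, min(r0 + xbar_rows, n_rows))
                for c in range(c0, min(c0 + xbar_cols, n_cols))
            )
            tiles.append((score, r0, c0))

    # Highest-scoring tiles first; ties are broken by position.
    tiles.sort(key=lambda t: (-t[0], t[1], t[2]))
    kept = {(r0, c0) for _, r0, c0 in tiles[:keep_blocks]}

    pruned = [
        [
            w
            if ((r // xbar_rows) * xbar_rows, (c // xbar_cols) * xbar_cols) in kept
            else 0.0
            for c, w in enumerate(row)
        ]
        for r, row in enumerate(weights)
    ]
    return pruned, len(tiles), len(kept)


# Toy 4x4 weight matrix mapped onto hypothetical 2x2 crossbars.
weights = [
    [0.90, -0.80, 0.01, 0.02],
    [0.70, 0.60, 0.00, -0.01],
    [0.02, 0.01, 0.50, -0.40],
    [0.00, 0.03, 0.30, 0.20],
]
pruned, total, used = prune_to_crossbar_blocks(weights, 2, 2, keep_blocks=2)
print(f"crossbars needed: {used} of {total}")
for row in pruned:
    print(row)
```

In this toy case the two diagonal tiles carry almost all of the weight magnitude, so the two off-diagonal tiles are zeroed and half of the crossbars are no longer needed. Unstructured pruning that removes the same number of weights scattered across all four tiles would leave every crossbar occupied, which is why the sparsity pattern has to follow the crossbar layout.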
Findings
Reduces crossbar overhead by 44%-72%
Maintains accuracy with minimal degradation
Enhances efficiency of CNN mapping on crossbar devices
Abstract
Crossbar-architecture-based devices have been widely adopted in neural network accelerators because of their high efficiency on vector-matrix multiplication (VMM) operations. However, for convolutional neural networks (CNNs), this efficiency is compromised dramatically by the large amount of data reuse. Although some mapping methods have been designed to balance execution throughput against resource overhead, the resource cost of maintaining throughput remains huge. Network pruning is a promising and widely studied way to shrink model size. However, previous work did not consider the crossbar architecture and the corresponding mapping method, so it cannot be directly used by crossbar-based neural network accelerators. Tightly combining the crossbar structure and its mapping, this paper proposes a crossbar-aware…
Taxonomy
Methods: Pruning · Average Pooling · Dropout · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Dense Connections
