SME: ReRAM-based Sparse-Multiplication-Engine to Squeeze-Out Bit Sparsity of Neural Network

Fangxin Liu; Wenbo Zhao; Yilong Zhao; Zongwu Wang; Tao Yang; Zhezhi He; Naifeng Jing; Xiaoyao Liang; Li Jiang

arXiv:2103.01705·cs.AR·March 3, 2021·1 cites

TL;DR

This paper introduces SME, a ReRAM-based neural network accelerator that exploits sparsity through hardware-software co-design, significantly reducing crossbar usage while maintaining high accuracy on ImageNet.

Contribution

The paper presents a novel ReRAM-based accelerator with a new weight mapping and squeeze-out scheme to effectively utilize sparsity in neural networks.

Findings

01. Reduces crossbar usage by up to 8.7x on ResNet-50

02. Reduces crossbar usage by up to 2.1x on MobileNet-v2

03. Achieves less than 0.3% accuracy drop on ImageNet

Abstract

Resistive Random-Access-Memory (ReRAM) crossbars are a promising technique for deep neural network (DNN) accelerators, thanks to their in-memory and in-situ analog computing abilities for Vector-Matrix Multiplication-and-Accumulations (VMMs). However, it is challenging for a crossbar architecture to exploit the sparsity in a DNN: because the crossbar structure is tightly coupled, exploiting fine-grained sparsity inevitably requires complex and costly control. As a countermeasure, we developed a novel ReRAM-based DNN accelerator, named Sparse-Multiplication-Engine (SME), based on a hardware and software co-design framework. First, we orchestrate the bit-sparse pattern to increase the density of bit-sparsity based on existing quantization methods. Second, we propose a novel weight mapping mechanism to slice the bits of a weight across the crossbars and splice the activation results…
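The idea of slicing the bits of a weight across crossbars can be illustrated with a generic bit-plane decomposition. The sketch below is not the paper's mapping or squeeze-out scheme. It assumes unsigned integer weights of a fixed bit width, treats each bit plane as a stand-in for one binary crossbar, and recombines the per-plane partial sums by shift-and-add. The function names `bit_slice`, `bit_sparsity`, and `sliced_vmm`, and the toy matrix `W`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def bit_slice(weights: np.ndarray, n_bits: int) -> np.ndarray:
    """Split unsigned integer weights into binary planes.

    Returns an array of shape (n_bits, *weights.shape); plane b holds
    bit b of every weight, with b = 0 the least significant bit.
    """
    w = weights.astype(np.uint32)
    shifts = np.arange(n_bits, dtype=np.uint32).reshape(-1, *([1] * w.ndim))
    return ((w[None, ...] >> shifts) & 1).astype(np.uint8)


def bit_sparsity(planes: np.ndarray) -> np.ndarray:
    """Fraction of zero bits in each plane (one value per bit position)."""
    return 1.0 - planes.reshape(planes.shape[0], -1).mean(axis=1)


def sliced_vmm(x: np.ndarray, planes: np.ndarray) -> np.ndarray:
    """Compute x @ W from bit planes: one binary VMM per plane, then shift-and-add."""
    partial = np.einsum("i,bij->bj", x.astype(np.int64), planes.astype(np.int64))
    scales = 1 << np.arange(planes.shape[0], dtype=np.int64)
    return (partial * scales[:, None]).sum(axis=0)


# Toy 3x2 weight matrix with 4-bit unsigned weights (assumed values).
W = np.array([[1, 2], [4, 8], [3, 0]])
x = np.array([1, 2, 3])

planes = bit_slice(W, n_bits=4)
print("zero-bit fraction per plane:", bit_sparsity(planes))
print("sliced result:", sliced_vmm(x, planes), "direct result:", x @ W)
```

In this toy setting most cells in each plane are zero even though only one weight is zero, which is the kind of bit-level sparsity the abstract refers to. How SME reorders or compacts those planes is not described in the text available here.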

Taxonomy

Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Machine Learning and ELM