Exploring Bit-Slice Sparsity in Deep Neural Networks for Efficient ReRAM-Based Deployment
Jingyang Zhang, Huanrui Yang, Fan Chen, Yitu Wang, Hai Li

TL;DR
This paper introduces a novel training algorithm that induces bit-slice sparsity in deep neural networks, enabling more efficient ReRAM-based deployment by reducing ADC resolution and power consumption.
Contribution
It is the first to induce bit-slice sparsity during training, significantly improving ReRAM deployment efficiency and reducing ADC overhead.
Findings
Achieves 2x sparsity improvement over previous methods.
Reduces ADC resolution to 1 bit for the most significant bit-slice.
Speeds up processing and cuts power and area overhead.
Abstract
Emerging resistive random-access memory (ReRAM) has recently been intensively investigated to accelerate the processing of deep neural networks (DNNs). Due to their in-situ computation capability, analog ReRAM crossbars yield significant throughput improvement and energy reduction compared to traditional digital methods. However, the power-hungry analog-to-digital converters (ADCs) prevent the practical deployment of ReRAM-based DNN accelerators on end devices with limited chip area and power budget. We observe that due to the limited bit-density of ReRAM cells, DNN weights are bit-sliced and correspondingly stored on multiple ReRAM bitlines. The current accumulated on each bitline, which is determined by the stored weight slices, directly dictates the overhead of the ADCs. As such, bitwise weight sparsity, rather than the sparsity of the full weight, is desirable for efficient ReRAM deployment. In this work, we propose…
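To make the distinction between full-weight sparsity and bit-slice sparsity concrete, here is a minimal sketch. It is not the authors' code. It assumes 8-bit unsigned weight magnitudes stored in 2-bit ReRAM cells, and the names `bit_slices`, `slice_sparsity`, and `adc_bits_needed` are hypothetical helpers introduced only for illustration. The ADC estimate further assumes a worst case in which every wordline is driven by a 1-bit input.

```python
def bit_slices(q, total_bits=8, slice_bits=2):
    """Split a non-negative integer weight into slices, most significant first.

    total_bits and slice_bits are illustrative assumptions, not values
    taken from the paper.
    """
    if not 0 <= q < (1 << total_bits):
        raise ValueError("weight out of range for total_bits")
    mask = (1 << slice_bits) - 1
    n_slices = total_bits // slice_bits
    return [(q >> (slice_bits * i)) & mask for i in reversed(range(n_slices))]


def slice_sparsity(weights, total_bits=8, slice_bits=2):
    """Fraction of zero-valued cells at each slice position (MSB slice first)."""
    n_slices = total_bits // slice_bits
    zeros = [0] * n_slices
    for q in weights:
        for i, s in enumerate(bit_slices(q, total_bits, slice_bits)):
            zeros[i] += (s == 0)
    return [z / len(weights) for z in zeros]


def adc_bits_needed(cell_values):
    """Bits needed to represent the largest possible accumulated sum on one
    bitline, assuming all wordlines are active with 1-bit inputs."""
    return max(1, sum(cell_values).bit_length())


# Four quantized weights sharing one crossbar column group (made-up values).
weights = [3, 64, 0, 17]

# Only one of the four full weights is zero ...
full_weight_sparsity = sum(q == 0 for q in weights) / len(weights)

# ... but most individual slices are zero.
per_slice = slice_sparsity(weights)

# Each slice position maps to its own bitline; sparser bitlines need fewer ADC bits.
columns = list(zip(*(bit_slices(q) for q in weights)))
adc_bits = [adc_bits_needed(col) for col in columns]

print(full_weight_sparsity, per_slice, adc_bits)
```

In this toy example the weights look dense when counted as whole values, yet the higher-order slices are almost entirely zero, which is the property a bit-slice-aware training objective would try to strengthen.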
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Advanced Neural Network Applications
