MF-Net: Compute-In-Memory SRAM for Multibit Precision Inference using Memory-immersed Data Conversion and Multiplication-free Operators
Shamma Nasrin, Diaa Badawi, Ahmet Enis Cetin, Wilfred Gomes, and Amit Ranjan Trivedi

TL;DR
This paper introduces MF-Net, a compute-in-memory SRAM architecture for neural network inference that combines multiplication-free operators with a novel SRAM-immersed ADC, achieving high efficiency and multi-bit weight support without DACs.
Contribution
It presents a co-designed SRAM-based compute-in-memory system with multiplication-free functions and an SRAM-immersed ADC, overcoming key limitations of prior in-SRAM DNN processing.
Findings
Achieves 105 TOPS/W with 8-bit input/weight processing.
Supports multi-bit weights without DACs.
Uses SRAM parasitic capacitance for low-area ADC implementation.
Abstract
We propose a co-design approach for compute-in-memory inference for deep neural networks (DNN). We use multiplication-free function approximators based on the ℓ1 norm along with a co-adapted processing array and compute flow. Using the approach, we overcame many deficiencies in the current art of in-SRAM DNN processing such as the need for digital-to-analog converters (DACs) at each operating SRAM row/column, the need for high precision analog-to-digital converters (ADCs), limited support for multi-bit precision weights, and limited vector-scale parallelism. Our co-adapted implementation seamlessly extends to multi-bit precision weights, it doesn't require DACs, and it easily extends to higher vector-scale parallelism. We also propose an SRAM-immersed successive approximation ADC (SA-ADC), where we exploit the parasitic capacitance of bit lines of SRAM array as a capacitive DAC. Since…
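To make the idea of a multiplication-free, ℓ1-based operator concrete, here is a minimal Python sketch. The abstract does not spell out the exact operator, so this assumes one formulation from the multiplication-free network literature, a ⊕ b = sign(a·b)(|a| + |b|). The names `mf_op` and `mf_dot` are hypothetical, and the sketch should not be read as the paper's implementation.

```python
import numpy as np

def mf_op(a, b):
    """Elementwise operator sign(a*b) * (|a| + |b|) (assumed formulation).

    The sign of a product equals the product of the signs, which in
    hardware reduces to an XOR of sign bits. The magnitudes are only
    added, never multiplied.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sign(a) * np.sign(b) * (np.abs(a) + np.abs(b))

def mf_dot(x, w):
    """Multiplication-free stand-in for the dot product of x and w."""
    return float(np.sum(mf_op(x, w)))

x = np.array([1.0, -2.0, 0.5])
w = np.array([-3.0, 4.0, 2.0])

result = mf_dot(x, w)      # (-4.0) + (-6.0) + 2.5
self_result = mf_dot(x, x) # equals 2 * ||x||_1, hence "l1-norm based"
```

Under this assumed definition, applying the operator to a vector with itself yields twice its ℓ1 norm, in the same way that an ordinary dot product of a vector with itself yields its squared ℓ2 norm.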
