In-Memory ADC-Based Nonlinear Activation Quantization for Efficient In-Memory Computing

Shuai Dong; Junyi Yang; Biyan Zhou; Hongyang Shang; Gourav Datta; Arindam Basu

arXiv:2603.10540 · cs.AR · March 12, 2026

Open Access

TL;DR

This paper presents BS-KMQ, a nonlinear quantization method that lowers the ADC resolution required in in-memory computing and improves accuracy, area, speed, and energy efficiency.

Contribution

Introduction of Boundary Suppressed K-Means Quantization (BS-KMQ), a nonlinear (NL) quantization approach that suppresses boundary outliers before clustering, yielding more balanced quantization levels and lower ADC resolution requirements.

Findings

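01. Achieves a 7x area reduction over prior NL-ADC designs, using a reconfigurable in-memory NL-ADC.

02. Reduces quantization error by at least 3x relative to existing methods such as linear and Lloyd-Max quantization, evaluated on ResNet-18, VGG-16, Inception-V3, and DistilBERT.

03. Provides up to a 24x energy-efficiency improvement in system simulations.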

Abstract

In deep networks, operations such as ReLU and hardware-driven clamping often cause activations to accumulate near the edges of the distribution, leading to biased clustering and suboptimal quantization in existing nonlinear (NL) quantization methods. This paper introduces Boundary Suppressed K-Means Quantization (BS-KMQ), a novel NL quantization approach designed to reduce the resolution requirements of analog-to-digital converters (ADCs) in in-memory computing (IMC) systems. By suppressing boundary outliers before clustering, BS-KMQ achieves more balanced and informative NL quantization levels. The resulting NL references are implemented using a reconfigurable in-memory NL-ADC, achieving a 7x area improvement over prior NL-ADC designs. When evaluated on ResNet-18, VGG-16, Inception-V3, and DistilBERT, BS-KMQ achieves at least 3x lower quantization error compared to linear, Lloyd-Max,…
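The idea in the abstract, suppressing boundary samples and then clustering, can be illustrated with a small sketch. This is not the authors' implementation: the suppression rule (dropping samples that sit at the clamp edges), the quantile initialisation, the plain 1-D Lloyd iterations, and the names `suppress_boundaries`, `kmeans_1d`, and `quantize` are all assumptions made for illustration. The abstract does not say how boundary values are represented after suppression, so the sketch leaves that open.

```python
import numpy as np

def suppress_boundaries(x, lo, hi, tol=1e-6):
    """Drop samples at the clamp edges (assumed suppression rule, not from the paper)."""
    x = np.asarray(x, dtype=np.float64)
    return x[(x > lo + tol) & (x < hi - tol)]

def kmeans_1d(x, k, iters=50):
    """Plain 1-D Lloyd iterations, initialised at evenly spaced quantiles."""
    x = np.sort(np.asarray(x, dtype=np.float64))
    centers = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # In 1-D, nearest-centre assignment reduces to midpoint thresholds.
        edges = (centers[:-1] + centers[1:]) / 2
        labels = np.searchsorted(edges, x)
        new = centers.copy()
        for j in range(k):
            members = x[labels == j]
            if members.size:  # an empty cluster keeps its previous centre
                new[j] = members.mean()
        if np.allclose(new, centers):
            break
        centers = new
    return np.sort(centers)

def quantize(x, levels):
    """Map each value to its nearest level, as a bank of NL references would."""
    edges = (levels[:-1] + levels[1:]) / 2
    return levels[np.searchsorted(edges, x)]

# Synthetic activations: ReLU piles mass at 0, hardware clamping piles mass at 2.
rng = np.random.default_rng(0)
acts = np.clip(np.maximum(rng.normal(0.0, 1.0, 20000), 0.0), 0.0, 2.0)

interior = suppress_boundaries(acts, 0.0, 2.0)
levels_plain = kmeans_1d(acts, 8)    # clustering on the raw distribution
levels_bs = kmeans_1d(interior, 8)   # clustering after boundary suppression
q = quantize(acts, levels_bs)

print("plain k-means levels:      ", np.round(levels_plain, 3))
print("boundary-suppressed levels:", np.round(levels_bs, 3))
```

Eight levels correspond to a 3-bit converter. Comparing the two printed level sets shows how the mass at the clamp edges shifts the cluster centres in this toy setting; it makes no claim about the error figures reported in the paper.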

Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques