Sensitivity-Aware Mixed-Precision Quantization for ReRAM-based Computing-in-Memory

Guan-Cheng Chen; Chieh-Lin Tsai; Pei-Hsuan Tsai; Yuan-Hao Chang

arXiv:2512.19445·cs.AR·December 23, 2025

TL;DR

This paper introduces a sensitivity-aware mixed-precision quantization method tailored for ReRAM-based Compute-In-Memory systems. It reports a 40% reduction in power consumption and 86.33% accuracy at 70% compression.

Contribution

It proposes a novel structured quantization approach that combines sensitivity analysis with mixed-precision strategies specifically for ReRAM CIM architectures.

Findings

01 Achieves 86.33% accuracy at 70% compression

02 Reduces power consumption by 40%

03 Enhances ReRAM Crossbar utilization

Abstract

Compute-In-Memory (CIM) systems, particularly those utilizing ReRAM and memristive technologies, offer a promising path toward energy-efficient neural network computation. However, conventional quantization and compression techniques often fail to fully optimize performance and efficiency in these architectures. In this work, we present a structured quantization method that combines sensitivity analysis with mixed-precision strategies to enhance weight storage and computational performance on ReRAM-based CIM systems. Our approach improves ReRAM Crossbar utilization, significantly reducing power consumption, latency, and computational load, while maintaining high accuracy. Experimental results show 86.33% accuracy at 70% compression, alongside a 40% reduction in power consumption, demonstrating the method's effectiveness for power-constrained applications.
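
The abstract does not spell out how sensitivity is measured or how bit-widths are chosen, so the following is only a minimal sketch of the general idea of sensitivity-guided mixed precision, not the authors' method. It assumes a simple proxy (the mean squared error introduced by quantizing a layer's weights) and a hypothetical rule that gives the least sensitive half of the layers the lower bit-width. The names `quantize_uniform`, `sensitivity`, and `assign_bits`, and the 8-bit/4-bit choice, are illustrative assumptions.

```python
import numpy as np


def quantize_uniform(w, bits):
    """Symmetric uniform quantization of an array to `bits` bits (bits >= 2)."""
    if bits < 2:
        raise ValueError("bits must be at least 2 for a symmetric signed grid")
    levels = 2 ** (bits - 1) - 1           # e.g. 7 for 4 bits, 127 for 8 bits
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return w.copy()                     # all-zero weights need no rounding
    scale = max_abs / levels
    return np.round(w / scale) * scale


def sensitivity(w, bits):
    """Assumed proxy: mean squared error caused by quantizing w to `bits` bits."""
    return float(np.mean((w - quantize_uniform(w, bits)) ** 2))


def assign_bits(layers, high_bits=8, low_bits=4, low_fraction=0.5):
    """Assign low_bits to the least sensitive fraction of layers, high_bits to the rest.

    `layers` maps a layer name to its weight array. The ranking rule is a
    hypothetical stand-in for whatever criterion the paper actually uses.
    """
    scores = {name: sensitivity(w, low_bits) for name, w in layers.items()}
    ranked = sorted(scores, key=scores.get)          # least sensitive first
    n_low = int(len(ranked) * low_fraction)
    return {name: (low_bits if i < n_low else high_bits)
            for i, name in enumerate(ranked)}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Made-up layers purely for illustration.
    layers = {
        "conv1": rng.normal(0.0, 1.0, size=(16, 16)),
        "fc": rng.normal(0.0, 0.05, size=(16, 16)),
    }
    print(assign_bits(layers))
```

A real flow would more likely measure sensitivity on validation accuracy or loss rather than on raw weight error, and would also account for how each bit-width maps onto ReRAM cells and crossbar rows; both are outside what this sketch covers.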

Taxonomy

Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques