OSA-HCIM: On-The-Fly Saliency-Aware Hybrid SRAM CIM with Dynamic Precision Configuration
Yung-Chin Chen, Shimpei Ando, Daichi Fujiki, Shinya Takamaeda-Yamazaki, Kentaro Yoshioka

TL;DR
This paper introduces OSA-HCIM, a flexible Computing-in-Memory (CIM) architecture that dynamically adjusts computing precision based on the importance of each input, improving energy efficiency while keeping accuracy loss minimal for deep neural network computations.
Contribution
The paper presents a novel on-the-fly saliency-aware precision configuration scheme and a hybrid CIM array that enables flexible, resource-efficient neural network processing.
Findings
Achieves 1.95x energy efficiency improvement
Maintains minimal accuracy loss on CIFAR-100
First CIM with dynamic digital-analog boundary
Abstract
Computing-in-Memory (CIM) has shown great potential for enhancing efficiency and performance for deep neural networks (DNNs). However, the lack of flexibility in CIM leads to an unnecessary expenditure of computational resources on less critical operations, and a diminished Signal-to-Noise Ratio (SNR) when handling more complex tasks, significantly hindering the overall performance. Hence, we focus on the integration of CIM with Saliency-Aware Computing -- a paradigm that dynamically tailors computing precision based on the importance of each input. We propose On-the-fly Saliency-Aware Hybrid CIM (OSA-HCIM) offering three primary contributions: (1) On-the-fly Saliency-Aware (OSA) precision configuration scheme, which dynamically sets the precision of each MAC operation based on its saliency, (2) Hybrid CIM Array (HCIMA), which enables simultaneous operation of digital-domain CIM (DCIM)…
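The idea of setting the precision of each MAC operation from its saliency can be illustrated with a minimal software sketch. This is not the paper's circuit or algorithm: the saliency proxy (a coarse MAC over the top bits of each input), the thresholds, the bit widths, and all function names below are assumptions chosen only for illustration.

```python
def quantize(values, bits, full_bits=8):
    """Keep only the top `bits` of each unsigned `full_bits`-bit value."""
    shift = full_bits - bits
    return [(v >> shift) << shift for v in values]


def saliency_proxy(inputs, weights, probe_bits=2, full_bits=8):
    """Hypothetical saliency estimate: magnitude of a coarse MAC that
    uses only the most significant `probe_bits` of each input."""
    coarse = quantize(inputs, probe_bits, full_bits)
    return abs(sum(x * w for x, w in zip(coarse, weights)))


def choose_precision(saliency, thresholds=((20000, 8), (5000, 6)), default_bits=4):
    """Map a saliency value to an input bit width.
    Thresholds are arbitrary illustrative values, listed from high to low."""
    for threshold, bits in thresholds:
        if saliency >= threshold:
            return bits
    return default_bits


def saliency_aware_mac(inputs, weights, full_bits=8):
    """Run one MAC at a precision picked on the fly from its saliency."""
    saliency = saliency_proxy(inputs, weights, full_bits=full_bits)
    bits = choose_precision(saliency)
    reduced = quantize(inputs, bits, full_bits)
    result = sum(x * w for x, w in zip(reduced, weights))
    return result, bits


if __name__ == "__main__":
    # A salient MAC keeps full precision; a weak one drops to 4 bits.
    print(saliency_aware_mac([200, 180, 220, 240], [50, 40, 30, 20]))
    print(saliency_aware_mac([10, 5, 3, 12], [1, 2, 1, 1]))
```

In this sketch, MACs with large coarse partial sums are computed at full precision, while low-saliency MACs are truncated to fewer bits, which is where a hardware implementation would save energy.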
Taxonomy
Topics: Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques
