TSB: Tiny Shared Block for Efficient DNN Deployment on NVCIM Accelerators
Yifan Qin, Zheyu Yan, Zixuan Pan, Wujie Wen, Xiaobo Sharon Hu, Yiyu Shi

TL;DR
This paper introduces TSB, a small shared block that stabilizes DNN inference on NVCIM accelerators by mitigating the effects of device variation, which improves accuracy, speeds up training, and reduces mapping costs.
Contribution
The paper proposes the Tiny Shared Block (TSB), a novel method that enhances DNN deployment on NVCIM accelerators by reducing device variation impact and improving efficiency.
Findings
Over 20x improvement in the inference accuracy gap
Over 5x training speedup
Significant reduction in weights-to-device mapping cost
Abstract
Compute-in-memory (CIM) accelerators using non-volatile memory (NVM) devices offer promising solutions for energy-efficient and low-latency Deep Neural Network (DNN) inference execution. However, practical deployment is often hindered by the challenge of dealing with the massive amount of model weight parameters impacted by the inherent device variations within non-volatile computing-in-memory (NVCIM) accelerators. This issue significantly offsets their advantages by increasing training overhead, the time and energy needed for mapping weights to device states, and diminishing inference accuracy. To mitigate these challenges, we propose the "Tiny Shared Block (TSB)" method, which integrates a small shared 1x1 convolution block into the DNN architecture. This block is designed to stabilize feature processing across the network, effectively reducing the impact of device variation.…
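As a rough illustration of the building block the abstract names, the sketch below shows what a 1x1 convolution computes (a per-pixel mix of channels) and what reusing one small weight matrix at several points in a network could look like. It is a minimal sketch in plain Python, not the authors' implementation: the function names `conv1x1` and `perturb`, the toy shapes, and the additive Gaussian noise used as a stand-in for device variation are all assumptions made for illustration. The abstract excerpt does not specify where the block is placed or how it is trained.

```python
import random


def conv1x1(feature_map, weights):
    """Apply a 1x1 convolution: mix channels independently at every pixel.

    feature_map: list of C_in channels, each an H x W list of lists.
    weights:     C_out x C_in matrix given as a list of rows.
    """
    c_in = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [
        [
            [
                sum(w_row[c] * feature_map[c][y][x] for c in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for w_row in weights
    ]


def perturb(weights, sigma, rng):
    """Hypothetical device-variation model: additive Gaussian noise per weight."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]


# One small weight matrix reused at two points of a toy network
# (an assumed reading of "shared"; placement is illustrative only).
shared_w = [[0.5, 0.5], [1.0, -1.0]]
x = [[[1.0, 2.0]], [[3.0, 4.0]]]  # 2 channels, 1 x 2 spatial grid

h = conv1x1(x, shared_w)  # first use of the shared block
y = conv1x1(h, shared_w)  # second use, same weights

# The same forward pass with noisy weights, to see how variation propagates.
noisy_w = perturb(shared_w, sigma=0.1, rng=random.Random(0))
y_noisy = conv1x1(conv1x1(x, noisy_w), noisy_w)
```

Because the block is 1x1 and its weights are reused, the number of distinct parameters it contributes stays at C_out x C_in regardless of how many times it is applied or how large the feature maps are.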
Taxonomy
Topics: Radiation Detection and Scintillator Technologies · Particle Detector Development and Performance · Advanced Neural Network Applications
Methods: 1x1 Convolution · Convolution
