TSB: Tiny Shared Block for Efficient DNN Deployment on NVCIM Accelerators

Yifan Qin; Zheyu Yan; Zixuan Pan; Wujie Wen; Xiaobo Sharon Hu; Yiyu Shi

arXiv:2406.06544·cs.AR·August 23, 2024

TL;DR

This paper introduces TSB, a small shared block that stabilizes DNN inference on NVCIM accelerators by mitigating the effects of device variation, which improves inference accuracy, speeds up training, and reduces weights-to-device mapping cost.

Contribution

The paper proposes the Tiny Shared Block (TSB), a novel method that enhances DNN deployment on NVCIM accelerators by reducing device variation impact and improving efficiency.

Findings

01 Over 20x inference accuracy gap improvement

02 Over 5x training speedup

03 Significant reduction in weights-to-device mapping cost

Abstract

Compute-in-memory (CIM) accelerators using non-volatile memory (NVM) devices offer promising solutions for energy-efficient and low-latency Deep Neural Network (DNN) inference execution. However, practical deployment is often hindered by the challenge of dealing with the massive amount of model weight parameters impacted by the inherent device variations within non-volatile computing-in-memory (NVCIM) accelerators. This issue significantly offsets their advantages by increasing training overhead, the time and energy needed for mapping weights to device states, and diminishing inference accuracy. To mitigate these challenges, we propose the "Tiny Shared Block (TSB)" method, which integrates a small shared 1x1 convolution block into the DNN architecture. This block is designed to stabilize feature processing across the network, effectively reducing the impact of device variation.…
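To make the idea of a shared 1x1 convolution block concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say where the block is inserted or how device variation is modeled, so the number of insertion points, the identity initialization, and the additive Gaussian noise in `perturb` are all assumptions made for illustration. What it does show is that a 1x1 convolution is a per-pixel linear map across channels, and that reusing one weight matrix at several points keeps the parameter count fixed.

```python
import random


def conv1x1(feature_map, weight):
    """Apply a 1x1 convolution: a per-pixel linear map across channels.

    feature_map: list of C_in channels, each an H x W list of lists.
    weight: C_out x C_in list of lists.
    """
    c_in = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [
        [
            [
                sum(weight[o][i] * feature_map[i][y][x] for i in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for o in range(len(weight))
    ]


def perturb(weight, sigma, rng):
    """Hypothetical device-variation model (an assumption, not from the
    paper): independent additive Gaussian noise on every weight."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weight]


rng = random.Random(0)
channels = 4

# One weight matrix reused at every insertion point (the "shared" part).
# Identity initialization is an illustrative choice only.
shared_weight = [
    [1.0 if o == i else 0.0 for i in range(channels)] for o in range(channels)
]

# Toy input: 4 channels of 3 x 3 features.
x = [[[rng.random() for _ in range(3)] for _ in range(3)] for _ in range(channels)]

insertion_points = 3  # assumed number of places the block is reused
out = x
for _ in range(insertion_points):
    out = conv1x1(out, shared_weight)

# Sharing keeps the block's parameter count independent of how often it is used.
shared_params = channels * channels
unshared_params = insertion_points * channels * channels

# A noisy copy of the shared weights, as they might look after mapping
# to devices under the assumed noise model.
noisy_weight = perturb(shared_weight, sigma=0.1, rng=rng)
```

In this toy setup the shared block holds 16 weights regardless of how many times it is applied, whereas three independent blocks would hold 48. Fewer distinct weights means fewer values to map to device states, which is the kind of cost the abstract refers to.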

Taxonomy

Topics: Radiation Detection and Scintillator Technologies · Particle Detector Development and Performance · Advanced Neural Network Applications

Methods: 1x1 Convolution · Convolution