Special Session: Sustainable Deployment of Deep Neural Networks on Non-Volatile Compute-in-Memory Accelerators

Yifan Qin; Zheyu Yan; Wujie Wen; Xiaobo Sharon Hu; Yiyu Shi

arXiv:2508.12195·cs.AR·August 19, 2025


TL;DR

This paper introduces a negative optimization training mechanism, implemented as the Oriented Variational Forward (OVF) training method, that makes deep neural networks more robust to device variations when deployed on non-volatile compute-in-memory (NVCIM) accelerators and reduces reliance on energy- and time-intensive write-verify operations.

Contribution

The paper proposes a negative optimization training mechanism, inspired by negative feedback theory, and the OVF training method that implements it, targeting robust DNN deployment on NVCIM accelerators under stochastic device variations.

Findings

01 Up to 46.71% improvement in inference accuracy

02 Reduces reliance on energy-intensive write-verify operations

03 Enhances robustness against device variations

Abstract

Non-volatile memory (NVM) based compute-in-memory (CIM) accelerators have emerged as a sustainable solution to significantly boost energy efficiency and minimize latency for Deep Neural Networks (DNNs) inference due to their in-situ data processing capabilities. However, the performance of NVCIM accelerators degrades because of the stochastic nature and intrinsic variations of NVM devices. Conventional write-verify operations, which enhance inference accuracy through iterative writing and verification during deployment, are costly in terms of energy and time. Inspired by negative feedback theory, we present a novel negative optimization training mechanism to achieve robust DNN deployment for NVCIM. We develop an Oriented Variational Forward (OVF) training method to implement this mechanism. Experiments show that OVF outperforms existing state-of-the-art techniques with up to a 46.71%…
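The write-verify loop mentioned in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's setup or the OVF method: the Gaussian device model, the names `program_once`, `write_verify`, and `mean_abs_error`, and the values of `sigma`, `tolerance`, and `max_iters` are all hypothetical, and real write-verify typically applies incremental programming pulses rather than a full rewrite.

```python
import random


def program_once(target, sigma, rng):
    """Toy device model (assumption): a write lands at target plus Gaussian noise."""
    return target + rng.gauss(0.0, sigma)


def write_verify(targets, sigma=0.1, tolerance=0.03, max_iters=50, seed=0):
    """Rewrite each device until its read-back value is within tolerance.

    Returns the stored values and the total number of write operations,
    which stands in for the energy and time cost of deployment.
    """
    rng = random.Random(seed)
    stored, total_writes = [], 0
    for target in targets:
        value = program_once(target, sigma, rng)
        writes = 1
        # Verify step: read back, compare to the target, rewrite if too far off.
        while abs(value - target) > tolerance and writes < max_iters:
            value = program_once(target, sigma, rng)
            writes += 1
        stored.append(value)
        total_writes += writes
    return stored, total_writes


def mean_abs_error(stored, targets):
    """Average absolute deviation between stored and intended weights."""
    return sum(abs(s - t) for s, t in zip(stored, targets)) / len(targets)


weight_rng = random.Random(1)
targets = [weight_rng.uniform(-1.0, 1.0) for _ in range(200)]

# Baseline: a single write per device, no verification (max_iters=1).
single, single_writes = write_verify(targets, max_iters=1)
# Write-verify: repeat until each device is within tolerance.
verified, verified_writes = write_verify(targets)

print("single write:", mean_abs_error(single, targets), "writes:", single_writes)
print("write-verify:", mean_abs_error(verified, targets), "writes:", verified_writes)
```

In this toy model the verified weights sit closer to their targets, but at the cost of several writes per device. That is the trade-off the abstract describes, and it is why a training-time approach that tolerates device variation is attractive.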
