Hardware-software co-exploration with racetrack memory based in-memory computing for CNN inference in embedded systems
Benjamin Chen Ming Choong, Tao Luo, Cheng Liu, Bingsheng He, Wei Zhang, Joey Tianyi Zhou

TL;DR
This paper proposes a hardware-software co-design approach that uses racetrack memory based in-memory computing for efficient CNN inference in embedded systems, optimizing both the in-memory arithmetic circuits and the neural network architecture.
Contribution
It introduces in-memory arithmetic circuits tailored for racetrack memory and explores co-optimization of system architecture and CNN models for enhanced efficiency.
Findings
Significant energy savings in CNN inference
Reduced memory bank area
Improved performance in embedded systems
Abstract
Deep neural networks generate and process large volumes of data, posing challenges for low-resource embedded systems. In-memory computing has been demonstrated as an efficient computing infrastructure and shows promise for embedded AI applications. Among newly researched memory technologies, racetrack memory is a non-volatile technology that allows high-data-density fabrication, making it a good fit for in-memory computing. However, integrating in-memory arithmetic circuits with memory cells affects both the memory density and power efficiency. It remains challenging to build efficient in-memory arithmetic circuits on racetrack memory within area and energy constraints. To this end, we present an efficient in-memory convolutional neural network (CNN) accelerator optimized for use with racetrack memory. We design a series of fundamental arithmetic circuits as in-memory computing cells…
