An In-Memory Analog Computing Co-Processor for Energy-Efficient CNN Inference on Mobile Devices

Mohammed Elbtity; Abhishek Singh; Brendan Reidy; Xiaochen Guo; Ramtin Zand

arXiv:2105.13904·cs.AR·September 15, 2021

TL;DR

This paper presents an in-memory analog computing co-processor using SOT-MRAM for energy-efficient CNN inference on mobile devices, achieving significant performance and energy improvements over prior digital and mixed-signal approaches.

Contribution

It introduces a novel IMAC architecture with SOT-MRAM devices for neural network acceleration, enabling efficient CNN inference on mobile processors.

Findings

01

Achieves orders-of-magnitude performance improvement for an MLP classifier compared with previous mixed-signal and digital implementations.

02

Realizes 6.5% and 10% energy savings for LeNet and VGG CNN models, respectively.

03

Demonstrates effective integration of IMAC as a co-processor in mobile architectures.

Abstract

In this paper, we develop an in-memory analog computing (IMAC) architecture realizing both synaptic behavior and activation functions within non-volatile memory arrays. Spin-orbit torque magnetoresistive random-access memory (SOT-MRAM) devices are leveraged to realize sigmoidal neurons as well as binarized synapses. First, it is shown that the proposed IMAC architecture can be utilized to realize a multilayer perceptron (MLP) classifier achieving orders of magnitude performance improvement compared to previous mixed-signal and digital implementations. Next, a heterogeneous mixed-signal and mixed-precision CPU-IMAC architecture is proposed for convolutional neural network (CNN) inference on mobile processors, in which IMAC is designed as a co-processor to realize fully-connected (FC) layers whereas convolution layers are executed on the CPU. Architecture-level analytical models are developed to…
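As a rough illustration of the computation the abstract describes, the sketch below models fully-connected layers with binarized synapses and sigmoidal neurons in plain Python. It is a purely numerical abstraction: it does not model SOT-MRAM devices, analog circuit behavior, or the paper's actual training and binarization scheme. The function names (`binarize`, `fc_layer`, `mlp_forward`) and the sign-based binarization rule are assumptions made for this example.

```python
import math


def binarize(weights):
    """Map each real-valued weight to +1 or -1 (assumed rule: zero maps to +1)."""
    return [[1 if w >= 0 else -1 for w in row] for row in weights]


def sigmoid(x):
    """Logistic activation, standing in for the sigmoidal neuron."""
    return 1.0 / (1.0 + math.exp(-x))


def fc_layer(inputs, binary_weights):
    """One fully-connected layer: a weighted sum per neuron, then a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)))
        for row in binary_weights
    ]


def mlp_forward(inputs, layers):
    """Run the inputs through a stack of FC layers with binarized weights."""
    activations = inputs
    for weights in layers:
        activations = fc_layer(activations, binarize(weights))
    return activations


# Hypothetical 2-2-1 network with made-up real-valued weights.
layers = [
    [[0.2, -0.7], [-0.1, 0.4]],  # hidden layer: 2 neurons, 2 inputs each
    [[0.9, 0.3]],                # output layer: 1 neuron, 2 inputs
]
output = mlp_forward([0.5, -0.5], layers)
```

In the heterogeneous arrangement described above, a stage like `mlp_forward` would correspond to the FC layers handed to the co-processor, while the convolution layers producing its inputs would run on the CPU.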

Taxonomy

Methods: Max Pooling · Convolution · Dense Connections · Dropout · Softmax