Mapping of CNNs on multi-core RRAM-based CIM architectures

Rebecca Pelke; Nils Bosbach; Jose Cubero; Felix Staudigl; Rainer Leupers; Jan Moritz Joseph

arXiv:2309.03805·cs.AR·October 27, 2023

Open Access

TL;DR

This paper introduces synchronization techniques and an architecture optimization for CNN inference on RRAM-based CIM multi-core systems, achieving more than 99% of the theoretical acceleration limit with a data transmission overhead of less than 4%.

Contribution

It presents novel synchronization methods and compiler algorithms tailored for RRAM-based CIM architectures, enhancing CNN inference performance.

Findings

01 Achieved over 99% of the theoretical acceleration limit.

02 Reduced data transmission overhead to less than 4%.

03 Optimized architecture setup improves data exchange efficiency.

Abstract

RRAM-based multi-core systems improve the energy efficiency and performance of CNNs. However, the distributed parallel execution of convolutional layers causes critical data dependencies that limit the potential speedup. This paper presents synchronization techniques for parallel inference of convolutional layers on RRAM-based CIM architectures. We propose an architecture optimization that enables efficient data exchange and discuss the impact of different architecture setups on performance. The corresponding compiler algorithms are optimized for high speedup and low memory consumption during CNN inference. We achieve more than 99% of the theoretical acceleration limit with a marginal data transmission overhead of less than 4% for state-of-the-art CNN benchmarks.

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices