Breaking Barriers: Maximizing Array Utilization for Compute In-Memory Fabrics

Brian Crafton; Samuel Spetalnick; Gauthaman Murali; Tushar Krishna; Sung-Kyu Lim; Arijit Raychowdhury

arXiv:2008.06741·cs.AR·August 18, 2020

TL;DR

This paper introduces a data allocation algorithm for compute in-memory (CIM) architectures that raises utilization and performance by addressing synchronization barriers, demonstrating a 7.47× performance improvement over naive methods on ResNet18.

Contribution

The paper proposes a new data flow and an allocation algorithm, both based on input data distributions, to maximize CIM utilization and performance.
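The link between allocation and barrier-limited performance can be illustrated with a toy sketch. This is not the paper's algorithm: the function names `allocate_copies` and `barrier_latency`, the per-block cycle counts, and the assumption that a block can be duplicated across spare arrays to split its work evenly are all hypothetical, chosen only to show why allocating by expected work may help when every block must wait at a barrier.

```python
import heapq


def allocate_copies(expected_cycles, total_arrays):
    """Toy greedy allocator (hypothetical, not the paper's method).

    Every block gets one array; each spare array then goes to whichever
    block currently has the highest expected latency per copy.
    """
    n = len(expected_cycles)
    if total_arrays < n:
        raise ValueError("need at least one array per block")
    copies = [1] * n
    # heapq is a min-heap, so latencies are negated to pop the slowest block.
    heap = [(-cycles, i) for i, cycles in enumerate(expected_cycles)]
    heapq.heapify(heap)
    for _ in range(total_arrays - n):
        _, i = heapq.heappop(heap)
        copies[i] += 1
        heapq.heappush(heap, (-expected_cycles[i] / copies[i], i))
    return copies


def barrier_latency(expected_cycles, copies):
    """With a synchronization barrier, all blocks wait for the slowest one.

    Assumes (for illustration) that k copies split a block's work evenly.
    """
    return max(c / k for c, k in zip(expected_cycles, copies))


# Hypothetical expected cycles per block, e.g. estimated from input statistics.
cycles = [800, 200, 100, 100]
naive = [2, 2, 2, 2]                 # same array budget, spread uniformly
tuned = allocate_copies(cycles, 8)   # same budget, spent on the bottleneck

print("naive:", naive, barrier_latency(cycles, naive))
print("tuned:", tuned, barrier_latency(cycles, tuned))
```

Both allocations use the same number of arrays; the sketch only changes where the duplicates go, so any difference in the printed latency comes from reducing the time other blocks spend idle at the barrier.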

Findings

01. Achieved a 7.47× performance improvement over naive methods on ResNet18.

02. Identified synchronization barriers in CIM architectures.

03. Improved array utilization in CIM accelerators.

Abstract

Compute in-memory (CIM) is a promising technique that minimizes data transport, the primary performance bottleneck and energy cost of most data-intensive applications. It has found widespread adoption in accelerating neural networks for machine learning applications. Using a crossbar architecture with emerging non-volatile memories (eNVM) such as dense resistive random access memory (RRAM) or phase change random access memory (PCRAM), various forms of neural networks can be implemented to greatly reduce power and increase on-chip memory capacity. However, compute in-memory faces its own limitations at both the circuit and the device levels. Although compute in-memory using the crossbar architecture can greatly reduce data transport, the rigid nature of these large fixed weight matrices forfeits the flexibility of traditional CMOS and SRAM-based designs. In this work, we explore…
