Self-Adaptive Reconfigurable Arrays (SARA): Using ML to Assist Scaling GEMM Acceleration

Ananda Samajdar; Michael Pellauer; Tushar Krishna

arXiv:2101.04799·cs.AR·April 26, 2022

TL;DR

This paper introduces SARA, a self-adaptive accelerator architecture that uses machine learning to optimize configuration at runtime, significantly improving efficiency and density for DNN workloads.

Contribution

The work presents a novel self-adaptive reconfigurable array architecture and a neural network-based recommendation system for dynamic configuration during DNN processing.

Findings

1. SAGAR achieves 3.5x higher power efficiency and 3.2x higher compute density than a compute-equivalent distributed system.

2. The configurations recommended by ADAPTNET achieve 99.93% of the best achievable runtime.

3. SARA provides mapping flexibility comparable to that of large distributed arrays.

Abstract

With increasing diversity in Deep Neural Network (DNN) models in terms of layer shapes and sizes, the research community has been investigating flexible/reconfigurable accelerator substrates. This line of research has opened up two challenges. The first is to determine the appropriate amount of flexibility within an accelerator array, trading off the performance benefits of reconfigurability against its area overheads. The second is to determine the right configuration of the array for the current DNN model and/or layer and to reconfigure the accelerator at runtime. This work introduces a new class of accelerators that we call Self-Adaptive Reconfigurable Arrays (SARA). SARA architectures comprise both a reconfigurable array and a hardware unit capable of determining an optimized configuration for the array at runtime. We demonstrate an instance of SARA with an…
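The second challenge, picking an array configuration for a given layer, can be illustrated with a minimal Python sketch. It is not the paper's method: it is a brute-force search that a learned recommender would stand in for. The cost model in `estimate_cycles` is a simplified analytical assumption, and the function names, the power-of-two partitioning, and the 4-PE minimum dimension are all hypothetical choices made for illustration.

```python
from math import ceil

def estimate_cycles(m, k, n, rows, cols, num_arrays):
    """Rough cycle estimate for an (m x k) @ (k x n) GEMM.

    ASSUMPTION: a simplified output-stationary systolic model in which one
    rows x cols output tile costs (2*rows + cols + k - 2) cycles, and the
    tiles are spread evenly over num_arrays identical sub-arrays.
    """
    tiles = ceil(m / rows) * ceil(n / cols)
    rounds = ceil(tiles / num_arrays)
    return rounds * (2 * rows + cols + k - 2)

def candidate_configs(total_pes, min_dim=4):
    """Enumerate (rows, cols, num_arrays) partitions of total_pes PEs.

    ASSUMPTION: total_pes and min_dim are powers of two, and sub-array
    dimensions are restricted to powers of two.
    """
    configs = []
    rows = min_dim
    while rows * min_dim <= total_pes:
        cols = min_dim
        while rows * cols <= total_pes:
            configs.append((rows, cols, total_pes // (rows * cols)))
            cols *= 2
        rows *= 2
    return configs

def best_config(m, k, n, total_pes):
    """Exhaustively pick the partition with the lowest estimated cycles."""
    return min(candidate_configs(total_pes),
               key=lambda cfg: estimate_cycles(m, k, n, *cfg))

if __name__ == "__main__":
    # Hypothetical layer shapes: a tall-skinny GEMM and a square GEMM.
    for shape in [(2048, 64, 16), (256, 256, 256)]:
        rows, cols, count = best_config(*shape, total_pes=1024)
        print(shape, "->", count, "sub-arrays of", rows, "x", cols)
```

Under this toy model, differently shaped layers tend to favor different partitions of the same pool of processing elements, which is the motivation for choosing the configuration per layer at runtime.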

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Advanced Memory and Neural Computing