MASR: A Modular Accelerator for Sparse RNNs

Udit Gupta; Brandon Reagen; Lillian Pentecost; Marco Donato; Thierry Tambe; Alexander M. Rush; Gu-Yeon Wei; David Brooks

arXiv:1908.08976·eess.SP·August 27, 2019·PACT

TL;DR

MASR is a modular accelerator designed for sparse bidirectional RNNs, significantly improving energy efficiency and performance for speech recognition tasks by exploiting sparsity and dynamic activation optimizations.

Contribution

The paper introduces MASR, a novel modular architecture that accelerates sparse RNNs by exploiting sparsity in activations and weights, with dynamic optimizations for high efficiency.

Findings

01. MASR achieves a 2x area reduction compared to state-of-the-art accelerators.

02. MASR provides 3x energy savings over existing solutions.

03. MASR delivers a 1.6x performance improvement in RNN acceleration.

Abstract

Recurrent neural networks (RNNs) are becoming the de facto solution for speech recognition. RNNs exploit long-term temporal relationships in data by applying repeated, learned transformations. Unlike fully-connected (FC) layers, which perform a single vector-matrix operation, RNN layers consist of hundreds of such operations chained over time. This poses challenges unique to RNNs that are not found in convolutional neural networks (CNNs) or FC models, namely large dynamic activation. In this paper we present MASR, a principled and modular architecture that accelerates bidirectional RNNs for on-chip automatic speech recognition (ASR). MASR is designed to exploit sparsity in both dynamic activations and static weights. The architecture is enhanced by a series of dynamic activation optimizations that enable compact storage, ensure no energy is wasted computing null operations, and maintain high MAC utilization for highly parallel…
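To make the idea of exploiting sparsity in both static weights and dynamic activations concrete, here is a minimal software sketch of one recurrent step that skips every multiply-accumulate (MAC) with a zero operand. It is an illustration only, not MASR's hardware dataflow: the column-wise weight storage, the ReLU nonlinearity, and the names `to_sparse_columns`, `sparse_matvec`, and `rnn_step` are assumptions introduced here.

```python
def to_sparse_columns(dense):
    """Keep only nonzero weights, stored per column as (row, value) pairs.

    Weights are static, so this compression can be done once, offline.
    """
    n_rows, n_cols = len(dense), len(dense[0])
    return [
        [(r, dense[r][c]) for r in range(n_rows) if dense[r][c] != 0.0]
        for c in range(n_cols)
    ]


def sparse_matvec(columns, n_rows, activations):
    """Compute W @ a while skipping null operations.

    Returns the result vector and the number of MACs actually performed.
    """
    out = [0.0] * n_rows
    macs = 0
    for c, a in enumerate(activations):
        if a == 0.0:
            # Dynamic activation sparsity: a zero input skips its whole column.
            continue
        for r, w in columns[c]:
            # Static weight sparsity: zero weights were never stored.
            out[r] += w * a
            macs += 1
    return out, macs


def rnn_step(w_cols, u_cols, hidden_size, x_t, h_prev):
    """One step h_t = ReLU(W x_t + U h_prev) (ReLU is an assumption here).

    ReLU yields exact zeros, which become skippable activations at the
    next time step.
    """
    wx, macs_w = sparse_matvec(w_cols, hidden_size, x_t)
    uh, macs_u = sparse_matvec(u_cols, hidden_size, h_prev)
    h_t = [max(0.0, a + b) for a, b in zip(wx, uh)]
    return h_t, macs_w + macs_u


# Toy example with made-up values: 3 inputs, 2 hidden units.
W = [[1.0, 0.0, 2.0], [0.0, 0.0, -1.0]]
U = [[0.0, 0.5], [0.0, 0.0]]
h_t, macs = rnn_step(
    to_sparse_columns(W), to_sparse_columns(U), 2,
    x_t=[1.0, 3.0, 0.0], h_prev=[0.0, 2.0],
)
print(h_t, macs)  # a dense implementation would perform 2*3 + 2*2 = 10 MACs
```

Because the hidden state feeds back into the next step, the zeros produced at one time step reduce the work at the following one, which is why activation sparsity matters more for chained RNN operations than for a single FC layer.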
