MASR: A Modular Accelerator for Sparse RNNs
Udit Gupta, Brandon Reagen, Lillian Pentecost, Marco Donato, Thierry Tambe, Alexander M. Rush, Gu-Yeon Wei, David Brooks

TL;DR
MASR is a modular accelerator designed for sparse bidirectional RNNs, significantly improving energy efficiency and performance for speech recognition tasks by exploiting sparsity and dynamic activation optimizations.
Contribution
The paper introduces MASR, a novel modular architecture that accelerates sparse RNNs by exploiting sparsity in activations and weights, with dynamic optimizations for high efficiency.
Findings
MASR achieves 2x area reduction compared to state-of-the-art accelerators.
MASR provides 3x energy savings over existing solutions.
MASR delivers 1.6x performance improvements in RNN acceleration.
Abstract
Recurrent neural networks (RNNs) are becoming the de facto solution for speech recognition. RNNs exploit long-term temporal relationships in data by applying repeated, learned transformations. Unlike fully-connected (FC) layers with single vector matrix operations, RNN layers consist of hundreds of such operations chained over time. This poses challenges unique to RNNs that are not found in convolutional neural networks (CNNs) or FC models, namely large dynamic activation. In this paper we present MASR, a principled and modular architecture that accelerates bidirectional RNNs for on-chip ASR. MASR is designed to exploit sparsity in both dynamic activations and static weights. The architecture is enhanced by a series of dynamic activation optimizations that enable compact storage, ensure no energy is wasted computing null operations, and maintain high MAC utilization for highly parallel…
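To make the idea of skipping null operations concrete, here is a minimal software sketch of a matrix-vector product that performs a multiply-accumulate (MAC) only when both the weight and the activation are nonzero. It is an illustration under stated assumptions, not MASR's hardware dataflow: the function name `sparse_matvec`, the list-of-lists weight layout, and the compact `(index, value)` activation encoding are all hypothetical choices made for this example.

```python
def sparse_matvec(weights, activations):
    """Compute y = W @ x while skipping zero weights and zero activations.

    Hypothetical helper for illustration only.
    Returns (y, macs_done, macs_dense).
    """
    rows = len(weights)
    cols = len(activations)

    # Compact storage (assumed encoding): keep only the nonzero
    # activations, each paired with its column index.
    nonzero_acts = [(j, a) for j, a in enumerate(activations) if a != 0]

    y = [0.0] * rows
    macs_done = 0
    for i, row in enumerate(weights):
        for j, a in nonzero_acts:
            w = row[j]
            if w != 0:  # skip the null operation when the weight is zero
                y[i] += w * a
                macs_done += 1

    return y, macs_done, rows * cols


# Usage: a 2x3 weight matrix and a 3-element activation vector.
W = [[0.0, 2.0, 0.0],
     [1.0, 0.0, 3.0]]
x = [4.0, 0.0, 5.0]
y, done, dense = sparse_matvec(W, x)
print(y, done, dense)  # [0.0, 19.0] 2 6
```

In this toy case only 2 of the 6 dense MACs are executed. A hardware design would additionally have to keep its parallel MAC units busy when the nonzero values are unevenly distributed, which this sequential sketch does not model.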
