SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Network

Reza Yazdani; Olatunji Ruwase; Minjia Zhang; Yuxiong He; Jose-Maria Arnau; Antonio Gonzalez

arXiv:1911.01258·cs.LG·May 23, 2023

TL;DR

SHARP is a reconfigurable, energy-efficient hardware accelerator for RNNs that adaptively handles diverse model sizes and data dependencies, significantly improving speed and power efficiency over prior architectures.

Contribution

This work introduces SHARP, a novel adaptable RNN accelerator with dynamic reconfiguration and intelligent scheduling, addressing the limitations of previous fixed architectures.

Findings

1. Achieves up to 82x speedup over GPU implementations.

2. Reduces energy consumption significantly compared to prior solutions.

3. Provides adaptable architecture for diverse RNN model configurations.

Abstract

The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Due to the recurrent nature and data dependencies of RNN computations, prior work has designed customized architectures specifically tailored to the computation pattern of RNNs, achieving high computation efficiency for certain chosen model sizes. However, given that the dimensionality of RNNs varies widely across tasks, it is crucial to generalize this efficiency to diverse configurations. In this work, we identify adaptiveness as a key feature that is missing from today's RNN accelerators. In particular, we first show the problem of low resource utilization and low adaptiveness for the state-of-the-art RNN implementations on GPU, FPGA and ASIC architectures. To solve these issues, we propose an intelligent tiled-based dispatching…
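The data dependency mentioned in the abstract can be illustrated with a textbook LSTM step, using the sigmoid and tanh activations listed under Methods. This is only a minimal sketch of the standard LSTM formulation, not SHARP's dispatching mechanism or hardware design. The function names (`lstm_step`, `run_lstm`) and the `params` dictionary layout are assumptions made for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Dense matrix-vector product; W is a list of rows.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def lstm_step(x, h_prev, c_prev, params):
    # One time step of a standard LSTM cell.
    # params (hypothetical layout): gate name -> (W_x, W_h, b).
    def gate(name, act):
        W_x, W_h, b = params[name]
        pre = [a + r + bias
               for a, r, bias in zip(matvec(W_x, x), matvec(W_h, h_prev), b)]
        return [act(p) for p in pre]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    g = gate("g", math.tanh)  # candidate cell update
    o = gate("o", sigmoid)    # output gate

    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

def run_lstm(xs, hidden, params):
    # Step t needs h and c from step t-1, so time steps run sequentially;
    # only the work inside a single step is independent across gates.
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h, c
```

In this formulation the four matrix-vector products per gate depend on the input and hidden dimensions, so the amount of work available within one time step changes with the model size.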

Taxonomy

Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Neural Networks and Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory