SHARP: An Adaptable, Energy-Efficient Accelerator for Recurrent Neural Network
Reza Yazdani, Olatunji Ruwase, Minjia Zhang, Yuxiong He, Jose-Maria Arnau, Antonio Gonzalez

TL;DR
SHARP is a reconfigurable, energy-efficient hardware accelerator for RNNs that adaptively handles diverse model sizes and data dependencies, significantly improving speed and power efficiency over prior architectures.
Contribution
This work introduces SHARP, a novel adaptable RNN accelerator with dynamic reconfiguration and intelligent scheduling, addressing the limitations of previous fixed architectures.
Findings
Achieves an 82x average speedup over state-of-the-art GPU implementations.
Reduces energy consumption significantly compared to prior solutions.
Provides adaptable architecture for diverse RNN model configurations.
Abstract
The effectiveness of Recurrent Neural Networks (RNNs) for tasks such as Automatic Speech Recognition has fostered interest in RNN inference acceleration. Due to the recurrent nature and data dependencies of RNN computations, prior work has designed customized architectures specifically tailored to the computation pattern of RNN, getting high computation efficiency for certain chosen model sizes. However, given that the dimensionality of RNNs varies a lot for different tasks, it is crucial to generalize this efficiency to diverse configurations. In this work, we identify adaptiveness as a key feature that is missing from today's RNN accelerators. In particular, we first show the problem of low resource-utilization and low adaptiveness for the state-of-the-art RNN implementations on GPU, FPGA and ASIC architectures. To solve these issues, we propose an intelligent tiled-based dispatching…
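The data dependency the abstract refers to can be seen in a single LSTM time step: each step consumes the hidden and cell state produced by the previous one, so time steps cannot be computed in parallel. The following is a minimal pure-Python sketch of the standard LSTM cell, not SHARP's implementation; the function names (`lstm_step`, `run_sequence`) and the per-gate dictionaries `W`, `U`, `b` are hypothetical names chosen for illustration.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, used for the input, forget, and output gates."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W, U, b are dicts keyed by gate name ('i', 'f', 'g', 'o'):
    W[gate] multiplies the input x, U[gate] multiplies the previous
    hidden state h_prev, and b[gate] is the bias vector.
    """
    pre = {}
    for gate in ("i", "f", "g", "o"):
        wx = matvec(W[gate], x)       # input contribution
        uh = matvec(U[gate], h_prev)  # recurrent contribution (depends on step t-1)
        pre[gate] = [a + r + bias for a, r, bias in zip(wx, uh, b[gate])]

    i = [sigmoid(v) for v in pre["i"]]    # input gate
    f = [sigmoid(v) for v in pre["f"]]    # forget gate
    g = [math.tanh(v) for v in pre["g"]]  # candidate cell update
    o = [sigmoid(v) for v in pre["o"]]    # output gate

    # New cell state mixes the old cell state with the candidate update.
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    # New hidden state is the gated, squashed cell state.
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


def run_sequence(xs, hidden, W, U, b):
    """Run the cell over a sequence; the loop is inherently sequential."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in xs:
        h, c = lstm_step(x, h, c, W, U, b)
    return h, c
```

The four matrix-vector products per step scale with the input and hidden dimensions, which is why the amount of work per step differs so much between small and large models.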
Taxonomy
Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Neural Networks and Applications
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
