Saving RNN Computations with a Neuron-Level Fuzzy Memoization Scheme
Franyell Silfa, Jose-Maria Arnau, Antonio González

TL;DR
This paper introduces a neuron-level fuzzy memoization scheme for RNNs that caches each neuron's output and reuses it when a much simpler BNN, whose outputs correlate with those of the RNN, predicts that the output will barely change, significantly reducing computations and energy consumption.
Contribution
The paper proposes a novel neuron-level fuzzy memoization method that uses a BNN to decide when a cached RNN neuron output can be reused, achieving substantial computational savings.
Findings
Avoids over 26.7% of RNN computations
Achieves 21% energy savings
Provides 1.4x speedup on average
Abstract
Recurrent Neural Networks (RNNs) are a key technology for applications such as automatic speech recognition or machine translation. Unlike conventional feed-forward DNNs, RNNs remember past information to improve the accuracy of future predictions and, therefore, they are very effective for sequence processing problems. For each application run, recurrent layers are executed many times for processing a potentially large sequence of inputs (words, images, audio frames, etc.). In this paper, we observe that the output of a neuron exhibits small changes in consecutive invocations. We exploit this property to build a neuron-level fuzzy memoization scheme, which dynamically caches each neuron's output and reuses it whenever it is predicted that the current output will be similar to a previously computed result, avoiding in this way the output computations. The main challenge in this scheme…
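The caching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the BNN mentioned in the TL;DR is a copy of the neuron with sign-binarized weights and inputs, and that an output is reused when the BNN output differs from the one recorded at the last full evaluation by at most a fixed threshold. The names `MemoizedNeuron`, `bnn_output`, `full_output`, and `threshold` are hypothetical, as is the choice of `tanh` as the activation.

```python
import math
from typing import List, Optional, Sequence


def sign(v: float) -> int:
    """Binarize a value to +1 or -1 (zero maps to +1)."""
    return 1 if v >= 0.0 else -1


class MemoizedNeuron:
    """Single neuron with a one-entry fuzzy memoization cache (illustrative only)."""

    def __init__(self, weights: Sequence[float], threshold: int) -> None:
        self.weights = list(weights)
        # Assumption: the cheap predictor uses sign-binarized weights.
        self.bin_weights = [sign(w) for w in self.weights]
        self.threshold = threshold
        self.cached_output: Optional[float] = None
        self.cached_bnn: Optional[int] = None
        self.full_evaluations = 0  # counts how often the expensive path ran

    def bnn_output(self, x: Sequence[float]) -> int:
        """Cheap predictor: dot product of binarized weights and binarized inputs."""
        return sum(bw * sign(xi) for bw, xi in zip(self.bin_weights, x))

    def full_output(self, x: Sequence[float]) -> float:
        """Expensive path: full-precision dot product followed by tanh."""
        self.full_evaluations += 1
        return math.tanh(sum(w * xi for w, xi in zip(self.weights, x)))

    def forward(self, x: Sequence[float]) -> float:
        b = self.bnn_output(x)
        # Reuse the cached output when the predictor output has barely moved.
        if self.cached_bnn is not None and abs(b - self.cached_bnn) <= self.threshold:
            return self.cached_output
        y = self.full_output(x)
        self.cached_output, self.cached_bnn = y, b
        return y


neuron = MemoizedNeuron([0.5, -0.25, 0.75, -1.0], threshold=0)
inputs: List[List[float]] = [
    [1.0, 2.0, -1.0, 0.5],
    [1.1, 2.1, -0.9, 0.4],   # same signs as the first input: cached output is reused
    [-1.0, 2.0, 1.0, -0.5],  # signs change: the neuron is fully re-evaluated
]
outputs = [neuron.forward(x) for x in inputs]
print(outputs, neuron.full_evaluations)
```

In this toy run the second input only perturbs the first slightly, so the predictor output is unchanged and the full evaluation is skipped. A larger `threshold` would skip more evaluations at the cost of larger output error, which is the accuracy trade-off the "fuzzy" in the title refers to.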
Taxonomy
Topics: Neural Networks and Applications · Fuzzy Logic and Control Systems · Advanced Neural Network Applications
