iRNN: Integer-only Recurrent Neural Network

Eyyüb Sari; Vanessa Courville; Vahid Partovi Nia

arXiv:2109.09828·cs.LG·February 16, 2022

Open Access

TL;DR

This paper introduces iRNN, a quantization-aware training method that enables integer-only RNNs with layer normalization and attention. It achieves accuracy comparable to full-precision models while roughly doubling runtime performance on smartphones and reducing model size by a factor of 4, which makes it well suited to edge AI applications.

Contribution

The paper presents a novel quantization-aware training approach supporting layer normalization and attention in integer-only RNNs, facilitating efficient deployment on edge devices.

Findings

1. iRNN maintains accuracy comparable to that of full-precision RNNs.
2. Deployed on a smartphone, iRNN roughly doubles runtime performance.
3. Model size is reduced by a factor of 4.

Abstract

Recurrent neural networks (RNN) are used in many real-world text and speech applications. They include complex modules such as recurrence, exponential-based activation, gate interaction, unfoldable normalization, bi-directional dependence, and attention. The interaction between these elements prevents running them on integer-only operations without a significant performance drop. Deploying RNNs that include layer normalization and attention on integer-only arithmetic is still an open problem. We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN). Our approach supports layer normalization, attention, and an adaptive piecewise linear approximation of activations (PWL), to serve a wide range of RNNs on various applications. The proposed method is proven to work on RNN-based language models and challenging automatic…
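To make the idea of a piecewise linear approximation (PWL) of an activation concrete, here is a minimal Python sketch that approximates the sigmoid with straight-line segments. It is an illustration under stated assumptions, not the paper's method: the knots are fixed and evenly spaced rather than adaptive, the arithmetic is floating point rather than integer-only, and the names `make_pwl` and `pwl_sigmoid`, the range [-8, 8], and the choice of 16 pieces are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Reference full-precision sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def make_pwl(fn, lo: float, hi: float, pieces: int):
    """Build a piecewise linear approximation of fn on [lo, hi].

    Assumption: knots are evenly spaced. The paper describes an adaptive
    PWL, which this sketch does not reproduce.
    """
    step = (hi - lo) / pieces
    knots = [lo + i * step for i in range(pieces + 1)]
    values = [fn(k) for k in knots]

    def approx(x: float) -> float:
        # Outside [lo, hi], clamp to the value at the nearest boundary knot.
        if x <= lo:
            return values[0]
        if x >= hi:
            return values[-1]
        # Locate the segment containing x, then interpolate linearly.
        i = min(int((x - lo) / step), pieces - 1)
        t = (x - knots[i]) / step
        return values[i] + t * (values[i + 1] - values[i])

    return approx


# Hypothetical configuration: 16 segments over [-8, 8].
pwl_sigmoid = make_pwl(sigmoid, -8.0, 8.0, 16)

# Largest absolute deviation from the true sigmoid on a grid over [-10, 10].
max_err = max(
    abs(pwl_sigmoid(x / 100) - sigmoid(x / 100)) for x in range(-1000, 1001)
)
```

Each segment needs only a table lookup, one multiplication, and one addition, which is why PWL functions are a common stand-in for exponential-based activations when only integer arithmetic is available. Placing more knots where the function curves most would lower `max_err` for the same number of pieces.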

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques

Methods: Layer Normalization