Deep Independently Recurrent Neural Network (IndRNN)
Shuai Li, Wanqing Li, Chris Cook, Yanbo Gao

TL;DR
This paper introduces the Independently Recurrent Neural Network (IndRNN), an RNN variant that learns long-term dependencies, is easier to train, and runs significantly faster than LSTM. These properties make very deep architectures and very long sequences practical.
Contribution
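The paper proposes IndRNN, an RNN architecture in which the neurons within a layer are independent of each other and are connected only across layers. This design addresses the gradient vanishing and exploding problems, works with non-saturated activation functions such as ReLU, and allows deeper and faster networks.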
Findings
IndRNN with regulated recurrent weights effectively addresses the gradient vanishing and exploding problems.
IndRNN can be built much deeper than traditional RNNs.
In the reported experiments, IndRNN outperforms the traditional RNN, LSTM, and Transformer on various tasks.
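To make the first finding concrete, here is a minimal sketch of why regulating the recurrent weight keeps the gradient bounded. It assumes a single neuron with the update h_t = relu(w*x_t + u*h_{t-1} + b) and assumes every pre-activation stays positive, so the gradient of h_T with respect to h_0 reduces to u raised to the number of steps. The helper names, the value gamma=2.0, and the bound gamma**(1/steps) are illustrative assumptions, not values taken from this page.

```python
def gradient_through_time(u, steps):
    """Magnitude of d h_T / d h_0 for one independent neuron.

    Assumes h_t = relu(w*x_t + u*h_{t-1} + b) with every pre-activation
    positive, so each step contributes a factor of exactly u.
    """
    return abs(u) ** steps


def clip_recurrent_weight(u, steps, gamma=2.0):
    """Hypothetical helper: clamp u so that |u|**steps cannot exceed gamma."""
    bound = gamma ** (1.0 / steps)
    return max(-bound, min(bound, u))


steps = 1000
for u in (0.9, 1.0, 1.1):
    clipped = clip_recurrent_weight(u, steps)
    print(u, gradient_through_time(u, steps), gradient_through_time(clipped, steps))
```

Only the upper bound is shown here, which covers the exploding case. Keeping the gradient from vanishing would additionally need a lower bound on |u|, or weights kept close to 1.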
Abstract
Recurrent neural networks (RNNs) are known to be difficult to train due to the gradient vanishing and exploding problems and thus difficult to learn long-term patterns and construct deep networks. To address these problems, this paper proposes a new type of RNNs with the recurrent connection formulated as Hadamard product, referred to as independently recurrent neural network (IndRNN), where neurons in the same layer are independent of each other and connected across layers. Due to the better behaved gradient backpropagation, IndRNN with regulated recurrent weights effectively addresses the gradient vanishing and exploding problems and thus long-term dependencies can be learned. Moreover, an IndRNN can work with non-saturated activation functions such as ReLU (rectified linear unit) and be still trained robustly. Different deeper IndRNN architectures, including the basic stacked IndRNN,…
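The Hadamard-product recurrence described in the abstract can be sketched as follows. This is a minimal pure-Python illustration, assuming the update h_t = relu(W x_t + u * h_{t-1} + b), where `*` is elementwise and u is a vector holding one recurrent weight per neuron. The function names and the symbols W, u, and b are chosen here for illustration and are not taken from the authors' code.

```python
def relu(values):
    """Elementwise rectified linear unit."""
    return [max(0.0, v) for v in values]


def indrnn_step(x, h_prev, W, u, b):
    """One time step of a single IndRNN layer (illustrative sketch).

    Neuron n sees the full input x through row W[n], but only its own
    previous state h_prev[n] through the scalar recurrent weight u[n].
    """
    pre_activation = []
    for n in range(len(u)):
        input_term = sum(W[n][k] * x[k] for k in range(len(x)))
        pre_activation.append(input_term + u[n] * h_prev[n] + b[n])
    return relu(pre_activation)


def indrnn_layer(xs, W, u, b):
    """Run the layer over a sequence, starting from a zero hidden state."""
    h = [0.0] * len(u)
    outputs = []
    for x in xs:
        h = indrnn_step(x, h, W, u, b)
        outputs.append(h)
    return outputs


# Two neurons: the first forgets (u = 0.5), the second remembers (u = 1.0).
W = [[1.0, 0.0], [0.0, 1.0]]
u = [0.5, 1.0]
b = [0.0, 0.0]
xs = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
outputs = indrnn_layer(xs, W, u, b)
```

Because u is a vector and not a matrix, no neuron reads another neuron's previous state within the layer. Mixing across neurons happens only when the outputs are passed through the input weights of the next layer.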
Taxonomy
Topics: Neural Networks and Applications · Advanced Neural Network Applications · Human Pose and Action Recognition
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
