Shuffling Recurrent Neural Networks

Michael Rotman; Lior Wolf

arXiv:2007.07324·cs.LG·July 16, 2020

TL;DR

This paper introduces a simple, efficient recurrent neural network model that permutes hidden state elements to improve training stability and achieves competitive results without vanishing or exploding gradients.

Contribution

The paper presents a novel RNN architecture whose recurrence is a permutation of the hidden state elements. The approach is easy to implement and avoids both vanishing and exploding gradients.

Findings

1. Achieves competitive performance on benchmark tasks.

2. Does not suffer from vanishing or exploding gradients.

3. Simple and efficient to implement.

Abstract

We propose a novel recurrent neural network model, where the hidden state $h_t$ is obtained by permuting the vector elements of the previous hidden state $h_{t-1}$ and adding the output of a learned function $b(x_t)$ of the input $x_t$ at time $t$. In our model, the prediction is given by a second learned function $s(h_t)$, which is applied to the hidden state. The method is easy to implement, extremely efficient, and suffers from neither vanishing nor exploding gradients. In an extensive set of experiments, the method shows competitive results in comparison to the leading literature baselines.
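The recurrence described in the abstract can be illustrated with a minimal sketch in plain Python. It follows only what the abstract states: permute $h_{t-1}$, add $b(x_t)$, and read out with $s(h_t)$. Everything else is an assumption made for illustration: the helper names (`make_permutation`, `permute`, `srnn_forward`), the choice of a fixed random permutation, and the use of plain linear maps as stand-ins for the learned functions `b` and `s`. The full model in the paper and the official repository may include details that the abstract does not mention.

```python
import random


def make_permutation(n, seed=0):
    """Return a fixed random permutation of range(n) (hypothetical helper)."""
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


def permute(h, perm):
    """Reorder the elements of h so that output[i] = h[perm[i]]."""
    return [h[j] for j in perm]


def b(x, weights):
    """Stand-in for the learned input function b(x_t): a linear map (assumption)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def s(h, weights):
    """Stand-in for the learned readout s(h_t): a linear map (assumption)."""
    return [sum(w * hi for w, hi in zip(row, h)) for row in weights]


def srnn_forward(xs, perm, w_in, w_out):
    """Run h_t = permute(h_{t-1}) + b(x_t) over a sequence and read out s(h_t)."""
    h = [0.0] * len(perm)  # initial hidden state, assumed to be zero
    outputs = []
    for x in xs:
        shuffled = permute(h, perm)   # the recurrence only reorders elements
        update = b(x, w_in)           # contribution of the current input
        h = [p + u for p, u in zip(shuffled, update)]
        outputs.append(s(h, w_out))   # per-step prediction
    return h, outputs


# Small usage example with arbitrary (untrained) weights.
rng = random.Random(1)
hidden_size, input_size, output_size = 8, 3, 2
perm = make_permutation(hidden_size)
w_in = [[rng.uniform(-1, 1) for _ in range(input_size)] for _ in range(hidden_size)]
w_out = [[rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(output_size)]
sequence = [[rng.uniform(-1, 1) for _ in range(input_size)] for _ in range(5)]
final_h, predictions = srnn_forward(sequence, perm, w_in, w_out)
```

In this simplified form, the step from $h_{t-1}$ to $h_t$ only reorders elements, so it leaves the Euclidean norm of the carried-over state unchanged. This is consistent with the abstract's claim about gradients, since a permutation matrix is orthogonal and repeated multiplication by it neither shrinks nor amplifies a vector.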

Code & Models

Repositories

rotmanmi/SRNN (PyTorch, official)

Taxonomy

Topics: Machine Learning and ELM · Neural Networks and Applications · Domain Adaptation and Few-Shot Learning