Unitary Evolution Recurrent Neural Networks

Martin Arjovsky; Amar Shah; Yoshua Bengio

arXiv:1511.06464 · cs.LG · October 13, 2016 · 224 citations

TL;DR

This paper introduces an RNN architecture whose hidden-to-hidden weight matrix is unitary, so its eigenvalues have absolute value exactly 1. This sidesteps the vanishing and exploding gradients that make long-term dependencies hard to learn.

Contribution

It proposes a new parametrization of unitary matrices for RNNs, built by composing structured matrices with learnable parameters, that avoids expensive computations (such as eigendecomposition) after each weight update.

Findings

01 Achieved state-of-the-art results on tasks with long-term dependencies.

02 Demonstrated feasibility of complex domain optimization for RNN training.

03 Provided a scalable method for unitary matrix parametrization.

Abstract

Recurrent neural networks (RNNs) are notoriously difficult to train. When the eigenvalues of the hidden-to-hidden weight matrix deviate from absolute value 1, optimization becomes difficult due to the well-studied issue of vanishing and exploding gradients, especially when trying to learn long-term dependencies. To circumvent this problem, we propose a new architecture that learns a unitary weight matrix, with eigenvalues of absolute value exactly 1. The challenge we address is that of parametrizing unitary matrices in a way that does not require expensive computations (such as eigendecomposition) after each weight update. We construct an expressive unitary weight matrix by composing several structured matrices that act as building blocks with parameters to be learned. Optimization with this parametrization becomes feasible only when considering hidden states in the complex domain. We…
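
To make the idea of composing structured unitary building blocks concrete, here is a minimal sketch in plain Python. The choice of blocks (diagonal phase matrices, Householder-style reflections, a fixed permutation, and a unitary discrete Fourier transform) and their ordering are assumptions made for illustration, since the excerpt above does not list them. All function and parameter names (diag_phase, reflect, permute, dft, random_params, apply_unitary) are hypothetical. The sketch only checks the property the abstract relies on: a product of unitary blocks preserves the norm of a complex hidden state, however many times it is applied.

```python
import cmath
import math
import random


def diag_phase(h, thetas):
    """Diagonal unitary block: rotate each entry by a learned phase theta."""
    return [cmath.exp(1j * t) * x for t, x in zip(thetas, h)]


def reflect(h, v):
    """Complex Householder reflection h - 2 v <v, h> / ||v||^2 (needs v != 0)."""
    inner = sum(vi.conjugate() * hi for vi, hi in zip(v, h))
    norm_sq = sum(abs(vi) ** 2 for vi in v)
    return [hi - 2 * vi * inner / norm_sq for vi, hi in zip(v, h)]


def permute(h, perm):
    """Fixed permutation of the coordinates (a unitary matrix with 0/1 entries)."""
    return [h[p] for p in perm]


def dft(h, inverse=False):
    """Naive O(n^2) discrete Fourier transform, scaled by 1/sqrt(n) to be unitary."""
    n = len(h)
    sign = 1 if inverse else -1
    scale = 1 / math.sqrt(n)
    return [
        scale * sum(h[k] * cmath.exp(sign * 2j * math.pi * row * k / n) for k in range(n))
        for row in range(n)
    ]


def random_params(n, seed=0):
    """Hypothetical parameter set: three phase vectors, two reflection vectors, one permutation."""
    rng = random.Random(seed)
    perm = list(range(n))
    rng.shuffle(perm)

    def phases():
        return [rng.uniform(-math.pi, math.pi) for _ in range(n)]

    def vector():
        return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

    return {"theta1": phases(), "theta2": phases(), "theta3": phases(),
            "v1": vector(), "v2": vector(), "perm": perm}


def apply_unitary(h, p):
    """Apply the composed blocks in sequence; each step is unitary, so the product is too."""
    h = diag_phase(h, p["theta1"])
    h = dft(h)
    h = reflect(h, p["v1"])
    h = permute(h, p["perm"])
    h = diag_phase(h, p["theta2"])
    h = dft(h, inverse=True)
    h = reflect(h, p["v2"])
    return diag_phase(h, p["theta3"])


def norm(h):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in h))


# Repeatedly apply the same unitary map, as a linear recurrence would.
n = 8
params = random_params(n, seed=0)
h0 = [complex(k + 1, -k) for k in range(n)]
h = h0
for _ in range(200):
    h = apply_unitary(h, params)
drift = abs(norm(h) - norm(h0))
print(f"norm drift after 200 steps: {drift:.2e}")
```

Each block here costs O(n) except the transform, which is written as a naive O(n^2) sum only to keep the sketch free of dependencies; a fast Fourier transform would bring it to O(n log n). No block requires an eigendecomposition or a re-projection after its parameters change, which is the kind of cost the abstract says the parametrization is meant to avoid.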

Taxonomy

Topics: Neural Networks and Applications · Topic Modeling · Machine Learning and ELM

Methods: modReLU · RMSProp · Unitary RNN