Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
Zakaria Mhammedi, Andrew Hellicar, Ashfaqur Rahman, James Bailey

TL;DR
This paper introduces a new efficient parametrisation for orthogonal matrices in RNNs, enabling better learning of long-term dependencies without the computational drawbacks of previous unitary approaches.
Contribution
The paper proposes a novel orthogonal parametrisation for RNN transition matrices, improving training efficiency while maintaining benefits of orthogonality.
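As a rough illustration of why Householder reflections (named in the title) give an orthogonal transition matrix, the sketch below builds a product of reflections and applies it to a hidden-state vector without forming the dense matrix. The function names, the sizes `n` and `k`, and the random reflection vectors are assumptions made for this example; the paper's exact parametrisation may differ in its details.

```python
import numpy as np

def householder_apply(u, h):
    """Apply H(u) = I - 2 u u^T / (u^T u) to the vector h in O(n) time."""
    return h - 2.0 * u * (u @ h) / (u @ u)

def apply_reflections(U, h):
    """Apply W = H(U[0]) H(U[1]) ... H(U[k-1]) to h, rightmost factor first."""
    for u in U[::-1]:
        h = householder_apply(u, h)
    return h

def dense_matrix(U):
    """Materialise W column by column (for checking only, not for training)."""
    n = U.shape[1]
    # W @ e_i is the i-th column of W, so stack the results as columns.
    return np.stack([apply_reflections(U, e) for e in np.eye(n)], axis=1)

# Hypothetical sizes: hidden dimension n, number of reflections k.
rng = np.random.default_rng(0)
n, k = 6, 4
U = rng.standard_normal((k, n))   # one unconstrained reflection vector per row
h = rng.standard_normal(n)        # a stand-in hidden state

W = dense_matrix(U)
h_next = apply_reflections(U, h)

print(np.allclose(W.T @ W, np.eye(n)))                        # orthogonality check
print(np.isclose(np.linalg.norm(h_next), np.linalg.norm(h)))  # norm preservation check
```

Each reflection is orthogonal for any nonzero vector `u`, so the product stays orthogonal however the rows of `U` are updated, and applying it costs O(nk) per hidden vector rather than the O(n^2) of a dense product.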
Findings
The orthogonal constraint improves learning of long-term dependencies.
The new parametrisation matches the benefits of unitary constraints.
The method scales better with network size than previous unitary approaches.
Abstract
Learning long-term dependencies in sequences using Recurrent Neural Networks (RNNs) is still a major challenge. Recent methods address this problem by constraining the transition matrix to be unitary during training, which ensures that its norm is equal to one and prevents exploding gradients. These methods either have limited expressiveness or scale poorly with the size of the network when compared with the simple RNN case, especially when using stochastic gradient descent with a small mini-batch size. Our contributions are as follows. First, we show that constraining the transition matrix to be unitary is a special case of an orthogonal constraint. Then we present a new parametrisation of the transition matrix which allows efficient training of an RNN while ensuring that the matrix is always orthogonal. Our results show that the orthogonal constraint…
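The claim that a unitary constraint is a special case of an orthogonal one can be checked numerically with the standard real representation of a complex matrix. The sketch below is a generic linear-algebra illustration, not the paper's own proof: it writes an n × n unitary matrix U = A + iB as the 2n × 2n real block matrix [[A, -B], [B, A]]. The helper name `real_embedding` and the random test matrix are assumptions made for this example.

```python
import numpy as np

def real_embedding(U):
    """Map a complex n x n matrix U = A + iB to the real 2n x 2n matrix [[A, -B], [B, A]]."""
    A, B = U.real, U.imag
    return np.block([[A, -B], [B, A]])

rng = np.random.default_rng(0)
n = 4

# The Q factor of a QR decomposition of a complex matrix is unitary.
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(Z)

M = real_embedding(U)

# A complex vector z is represented by stacking its real and imaginary parts.
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x = np.concatenate([z.real, z.imag])
Uz = U @ z

print(np.allclose(M.T @ M, np.eye(2 * n)))                  # M is orthogonal
print(np.allclose(M @ x, np.concatenate([Uz.real, Uz.imag])))  # M acts like U
```

Under this representation, every unitary transition on an n-dimensional complex state corresponds to an orthogonal transition on a 2n-dimensional real state, while most orthogonal 2n × 2n matrices do not have this block form.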
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Model Reduction and Neural Networks · Neural Networks and Applications
