Recurrent Orthogonal Networks and Long-Memory Tasks
Mikael Henaff, Arthur Szlam, Yann LeCun

TL;DR
This paper analyzes how RNNs can model long-term dependencies by constructing explicit solutions for synthetic tasks, shedding light on their information storage mechanisms and the effectiveness of unitary constraints.
Contribution
The paper provides explicit RNN constructions for long-memory tasks, clarifies how RNNs store different types of information in their hidden states, and explains why unitary initializations or constraints on the transition matrices succeed.
Findings
Constructed explicit RNN solutions for long-memory tasks
Analyzed how RNNs store different types of information
Explained the success of unitary constraints in RNNs
Abstract
Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter and Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.
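The abstract's closing point about unitary initializations can be illustrated with a small numerical sketch. This is not the paper's construction: it assumes a purely linear recurrence h_{t+1} = W h_t with no inputs and no nonlinearity, and the names `random_orthogonal` and `propagate` are hypothetical helpers introduced here. Under those assumptions, an orthogonal W keeps the norm of the hidden state constant over many steps, whereas a slightly contractive W (here 0.9 times the same matrix) makes it vanish.

```python
import numpy as np

def random_orthogonal(n, seed=0):
    """Sample an n x n orthogonal matrix via QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    # Flip column signs so the result does not depend on the QR sign convention.
    return q * np.sign(np.diag(r))

def propagate(w, h0, steps):
    """Apply the linear recurrence h_{t+1} = W h_t for `steps` steps."""
    h = h0
    for _ in range(steps):
        h = w @ h
    return h

n, steps = 32, 500
h0 = np.random.default_rng(1).standard_normal(n)

w_orth = random_orthogonal(n)   # orthogonal transition matrix
w_shrunk = 0.9 * w_orth         # all singular values equal to 0.9 (assumed for contrast)

ratio_orth = np.linalg.norm(propagate(w_orth, h0, steps)) / np.linalg.norm(h0)
ratio_shrunk = np.linalg.norm(propagate(w_shrunk, h0, steps)) / np.linalg.norm(h0)

print(f"orthogonal W: norm ratio after {steps} steps = {ratio_orth:.6f}")
print(f"0.9 * W:      norm ratio after {steps} steps = {ratio_shrunk:.3e}")
```

The orthogonal case retains the full magnitude of the initial state, which is one intuition for why such matrices help carry information across many time steps; a real RNN adds inputs and a nonlinearity, so this only captures the linear part of the picture.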
Taxonomy
Topics: Neural Networks and Applications
