Approximate Fixed-Points in Recurrent Neural Networks
Zhengxiong Wang, Anton Ragni

TL;DR
This paper reformulates recurrent neural networks as fixed-point problems. Approximate fixed-points can then be computed in parallel during both training and inference, while performance on language modeling tasks remains competitive.
Contribution
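A fixed-point reformulation of RNNs whose approximate solutions can be computed in parallel. This addresses two problems named in the abstract: standard training with back-propagation through time (BPTT) cannot be efficiently parallelised, and inference-time approximations introduce inconsistency between training and inference.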
Findings
Approximate fixed-points can be computed in parallel.
The method achieves competitive performance on language modeling tasks.
It enables consistent training and inference for RNNs.
Abstract
Recurrent neural networks are widely used in speech and language processing. Due to dependency on the past, standard algorithms for training these models, such as back-propagation through time (BPTT), cannot be efficiently parallelised. Furthermore, applying these models to more complex structures than sequences requires inference time approximations, which introduce inconsistency between inference and training. This paper shows that recurrent neural networks can be reformulated as fixed-points of non-linear equation systems. These fixed-points can be computed using an iterative algorithm exactly and in as many iterations as the length of any given sequence. Each iteration of this algorithm adds one additional Markovian-like order of dependencies such that upon termination all dependencies modelled by the recurrent neural networks have been incorporated. Although exact fixed-points…
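The iterative scheme described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a plain Elman-style cell, h_t = tanh(W h_{t-1} + U x_t + b), and a simple Jacobi-style update that refreshes every time step at once. The cell type, the zero initial guess, and the function names `rnn_sequential` and `rnn_fixed_point` are assumptions made for illustration, not details taken from the paper. Under these assumptions, each iteration makes one more time step exact, so running as many iterations as the sequence length reproduces the sequential result, and stopping earlier gives an approximate fixed-point.

```python
import numpy as np

def rnn_sequential(x, W, U, b, h0):
    """Reference: ordinary left-to-right recurrence over the sequence."""
    h, states = h0, []
    for x_t in x:
        h = np.tanh(W @ h + U @ x_t + b)
        states.append(h)
    return np.stack(states)

def rnn_fixed_point(x, W, U, b, h0, num_iters):
    """Jacobi-style iteration H <- F(H); all time steps are updated together."""
    T, d = x.shape[0], h0.shape[0]
    H = np.zeros((T, d))      # initial guess for every hidden state (an assumption)
    inp = x @ U.T + b         # input contribution, computed once for all t
    for _ in range(num_iters):
        # Row t of prev holds the current estimate of h_{t-1}; row 0 is h0.
        prev = np.vstack([h0[None, :], H[:-1]])
        H = np.tanh(prev @ W.T + inp)
    return H

# Illustrative usage on random data (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
T, d, m = 6, 4, 3
x = rng.normal(size=(T, m))
W = rng.normal(size=(d, d)) * 0.5
U = rng.normal(size=(d, m)) * 0.5
b = np.zeros(d)
h0 = np.zeros(d)

exact = rnn_sequential(x, W, U, b, h0)
for k in (1, 3, T):
    approx = rnn_fixed_point(x, W, U, b, h0, num_iters=k)
    print(k, np.max(np.abs(approx - exact)))
```

Inside each iteration there is no loop over time, only one matrix product over the whole sequence, which is what makes the update parallelisable. The cost is that the exact result needs as many iterations as there are time steps, so any gain comes from truncating early.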
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Neural Networks and Applications
