Parallelizable memory recurrent units
Florent De Geeter, Gaspard Lambrechts, Damien Ernst, Guillaume Drion

TL;DR
This paper introduces memory recurrent units (MRUs) that combine the persistent memory of nonlinear RNNs with the parallel processing capabilities of state-space models, enabling efficient long-term sequence modeling.
Contribution
The authors propose a new family of RNNs called MRUs that leverage multistability for persistent memory while maintaining parallelizable computations, and introduce a specific implementation, BMRU.
Findings
BMRU achieves good results on long-term dependency tasks.
Hybrid networks combining MRUs and SSMs are both parallelizable and capable of transient and persistent memory.
MRUs outperform traditional SSMs in tasks requiring long-term memory.
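The difference between transient and persistent memory can be illustrated with a toy example. The sketch below is not the BMRU update, which this summary does not specify. It assumes a generic bistable map, `h = tanh(2*h + x)`, and compares it with a leaky linear state, `h = 0.9*h + x`. The function names and constants are hypothetical choices made for illustration.

```python
import math

def leaky_linear(inputs, decay=0.9):
    """Linear state: the effect of past inputs fades geometrically (transient memory)."""
    h = 0.0
    for x in inputs:
        h = decay * h + x
    return h

def bistable(inputs, gain=2.0):
    """Toy bistable state: with gain > 1, tanh(gain * h) has two stable fixed points,
    so the sign of a past pulse is retained (persistent memory)."""
    h = 0.0
    for x in inputs:
        h = math.tanh(gain * h + x)
    return h

# One input pulse followed by 50 silent timesteps.
pulse_then_silence = [3.0] + [0.0] * 50

final_linear = leaky_linear(pulse_then_silence)   # has decayed close to zero
final_bistable = bistable(pulse_then_silence)     # still sits near the positive stable state
```

In this toy setting the linear state has almost forgotten the pulse after 50 steps, while the bistable state stays near one of its two stable values. Note that this bistable map is a nonlinear recurrence evaluated step by step; how MRUs obtain multistability while remaining parallelizable is the paper's contribution and is not reproduced here.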
Abstract
With the emergence of massively parallel processing units, parallelization has become a desirable property for new sequence models. The ability to parallelize the processing of sequences with respect to the sequence length during training is one of the main factors behind the rise of the Transformer architecture. However, Transformers are inefficient at sequence generation, as they need to reprocess all past timesteps at every generation step. Recently, state-space models (SSMs) emerged as a more efficient alternative. These new kinds of recurrent neural networks (RNNs) keep the efficient update of RNNs while gaining parallelization by getting rid of nonlinear dynamics (or recurrence). SSMs can reach state-of-the-art performance through the efficient training of potentially very large networks, but still suffer from limited representation capabilities. In particular, SSMs cannot…
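To make concrete why dropping the nonlinearity enables parallelization, here is a minimal sketch, assuming a scalar linear recurrence `h_t = a_t * h_{t-1} + b_t`. It is not taken from the paper; the function names are hypothetical. Each step is an affine map, and composing affine maps is associative, so all prefixes can be computed in about log2(T) rounds instead of T sequential steps. The rounds are simulated here in plain Python.

```python
def combine(left, right):
    """Compose two affine maps h -> a*h + b, applying `left` first, then `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)

def sequential_scan(a, b, h0=0.0):
    """Reference: evaluate the recurrence one timestep at a time."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out

def parallel_style_scan(a, b, h0=0.0):
    """Inclusive prefix scan over affine maps (Hillis-Steele pattern)."""
    elems = list(zip(a, b))
    step = 1
    while step < len(elems):
        # Every position in this round reads only the previous round's values,
        # so on parallel hardware all positions could be updated at once.
        elems = [
            combine(elems[i - step], elems[i]) if i >= step else elems[i]
            for i in range(len(elems))
        ]
        step *= 2
    # elems[t] is now the composed map from h0 to h_t.
    return [a_t * h0 + b_t for a_t, b_t in elems]

a = [0.9, 0.5, 1.0, 0.8, 0.7]
b = [1.0, -2.0, 0.5, 3.0, 0.25]
reference = sequential_scan(a, b, h0=0.3)
scanned = parallel_style_scan(a, b, h0=0.3)
```

Both functions return the same states up to floating-point rounding. The trick relies on each step being an affine map: with a generic update such as `h_t = tanh(a_t * h_{t-1} + b_t)`, the composition of steps no longer reduces to a pair `(a, b)`, so this particular scan does not apply.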
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Neural Networks and Reservoir Computing · Embedded Systems Design Techniques
