Efficient Sparse Selective-Update RNNs for Long-Range Sequence Modeling
Bojian Yin, Shurong Wang, Haoyu Tan, Sander Bohte, Federico Corradi, and Guoqi Li

TL;DR
This paper introduces Selective-Update RNNs (suRNNs), a recurrent architecture that preserves its memory during low-information intervals, enabling efficient long-range sequence modeling with performance comparable to Transformers.
Contribution
The paper proposes suRNNs, which use neuron-level binary switches to update each neuron's state only when the input is informative, improving long-term memory retention and efficiency over traditional RNNs.
Findings
suRNNs match or outperform more complex models such as Transformers on long-range sequence benchmarks.
They preserve their memory exactly during low-information periods, which helps them capture long-range dependencies.
The approach is more efficient on long sequences than standard RNNs, because recurrent updates are decoupled from the raw sequence length.
Abstract
Real-world sequential signals, such as audio or video, contain critical information that is often embedded within long periods of silence or noise. While recurrent neural networks (RNNs) are designed to process such data efficiently, they often suffer from "memory decay" due to a rigid update schedule: they typically update their internal state at every time step, even when the input is static. This constant activity forces the model to overwrite its own memory and makes it hard for the learning signal to reach back to distant past events. Here we show that we can overcome this limitation using Selective-Update RNNs (suRNNs), a non-linear architecture that learns to preserve its memory when the input is redundant. By using a neuron-level binary switch that only opens for informative events, suRNNs decouple the recurrent updates from the raw sequence length. This mechanism allows the…
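The neuron-level switch described in the abstract can be illustrated with a toy recurrence. This is a minimal sketch under stated assumptions, not the paper's formulation: the gate here is a hard threshold on a linear function of the input, the weights are random rather than learned, and the names `selective_update_step`, `run_sequence`, `W_gate`, and `threshold` are hypothetical. It covers the forward pass only and does not address how such a non-differentiable switch would be trained.

```python
import math
import random


def selective_update_step(h_prev, x, W_in, W_rec, W_gate, threshold=0.0):
    """One step of a toy selective-update recurrence (hypothetical formulation)."""
    h_new, gate = [], []
    for i in range(len(h_prev)):
        # Binary per-neuron switch: open only if the gate drive exceeds the threshold.
        drive = sum(w * xj for w, xj in zip(W_gate[i], x))
        is_open = drive > threshold
        gate.append(1 if is_open else 0)
        if is_open:
            # Open switch: ordinary non-linear recurrent update for this neuron.
            pre = sum(w * xj for w, xj in zip(W_in[i], x))
            pre += sum(w * hj for w, hj in zip(W_rec[i], h_prev))
            h_new.append(math.tanh(pre))
        else:
            # Closed switch: copy the previous value unchanged (no decay, no update).
            h_new.append(h_prev[i])
    return h_new, gate


def run_sequence(xs, hidden_size, seed=0):
    """Run the toy recurrence over a sequence with random (untrained) weights."""
    rng = random.Random(seed)
    input_size = len(xs[0])
    W_in = [[rng.gauss(0, 0.5) for _ in range(input_size)] for _ in range(hidden_size)]
    W_rec = [[rng.gauss(0, 0.5) for _ in range(hidden_size)] for _ in range(hidden_size)]
    # Assumption: strictly positive gate weights, so an all-zero input keeps every switch closed.
    W_gate = [[abs(rng.gauss(0, 1.0)) + 0.1 for _ in range(input_size)] for _ in range(hidden_size)]
    h = [0.0] * hidden_size
    states, gates = [], []
    for x in xs:
        h, g = selective_update_step(h, x, W_in, W_rec, W_gate)
        states.append(h)
        gates.append(g)
    return states, gates


# One informative event followed by 50 silent (all-zero) steps.
xs = [[1.0, 1.0, 1.0]] + [[0.0, 0.0, 0.0] for _ in range(50)]
states, gates = run_sequence(xs, hidden_size=4)
print(states[0] == states[-1])
```

In this sketch every switch opens on the first step and stays closed during the silent steps, so the hidden state after 50 zero inputs is identical to the state right after the event. A standard RNN would instead apply its update at each of those 50 steps and gradually overwrite that state.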
