Uncovering the Computational Roles of Nonlinearity in Sequence Modeling Using Almost-Linear RNNs
Manuel Brenner, Georgia Koppe

TL;DR
This paper introduces Almost Linear RNNs (AL-RNNs) to analyze the functional role of nonlinearity in sequence modeling, revealing that many operations emerge within linear regimes and that sparse nonlinearity enhances interpretability and efficiency.
Contribution
The paper presents a systematic framework using AL-RNNs to dissect when and how nonlinearity is necessary in recurrent networks, bridging theory and practical design.
Findings
Linear regimes can implement key computational primitives.
Sparse nonlinearity improves interpretability and efficiency.
Low-data and switching tasks benefit from sparse nonlinearity.
Abstract
Sequence modeling tasks across domains such as natural language processing, time series forecasting, and control require learning complex input-output mappings. Nonlinear recurrence is theoretically required for universal approximation of sequence-to-sequence functions, yet linear recurrent models often prove surprisingly effective. This raises the question of when nonlinearity is truly required. We present a framework to systematically dissect the functional role of nonlinearity in recurrent networks, identifying when it is computationally necessary and what mechanisms it enables. We address this using Almost Linear Recurrent Neural Networks (AL-RNNs), which allow recurrence nonlinearity to be gradually attenuated and decompose network dynamics into analyzable linear regimes, making computational mechanisms explicit. We illustrate the framework across diverse synthetic and real-world…
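The idea of attenuating the recurrence nonlinearity and splitting the dynamics into linear regimes can be illustrated with a small sketch. The update rule below is an assumption, not taken from the text above: a diagonal linear term `A`, a weight matrix `W`, a bias `h`, and a ReLU applied only to the last `P` of `M` latent units, with all other units passing through unchanged. All names (`al_rnn_step`, `regime`, `simulate`, `A`, `W`, `h`, `P`) are hypothetical and chosen for illustration only.

```python
def al_rnn_step(z, A, W, h, P):
    """One step of an assumed almost-linear update:
    z' = A * z + W @ phi(z) + h, where phi is ReLU on the last P units
    and the identity on the remaining M - P units."""
    M = len(z)
    phi = [z[i] if i < M - P else max(0.0, z[i]) for i in range(M)]
    return [
        A[i] * z[i] + sum(W[i][j] * phi[j] for j in range(M)) + h[i]
        for i in range(M)
    ]


def regime(z, P):
    """Sign pattern of the P rectified units. Within one pattern the
    update is affine in z, so each pattern labels one linear regime."""
    return tuple(int(v > 0) for v in z[len(z) - P:])


def simulate(z0, A, W, h, P, T):
    """Run T steps and record which linear regime was active at each step."""
    z, visited = list(z0), []
    for _ in range(T):
        visited.append(regime(z, P))
        z = al_rnn_step(z, A, W, h, P)
    return z, visited
```

Under these assumptions, `P` rectified units give at most `2**P` linear regimes, and setting `P = 0` reduces the model to a purely linear recurrence. Lowering `P` is one way to read "gradually attenuated" nonlinearity in this sketch.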
Taxonomy
Topics: Neural Networks and Reservoir Computing · Ferroelectric and Negative Capacitance Devices · Neural dynamics and brain function
