TL;DR
This paper introduces Structured Linear Controlled Differential Equations (SLiCEs), a sequence modeling framework whose structured, input-dependent state-transition matrices match the expressivity of dense matrices at lower computational cost, with state-of-the-art results on state-tracking and regular language benchmarks.
Contribution
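The paper presents SLiCEs, a unifying framework that encompasses existing models (input-dependent block-diagonal linear recurrent neural networks and DeltaNet's diagonal-plus-low-rank structure) and introduces two novel variants, based on sparsity and the Walsh-Hadamard transform, that retain maximal expressivity while being cheaper to compute than dense matrices.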
Findings
SLiCEs solve the A5 state-tracking benchmark with a single layer.
SLiCEs achieve best-in-class length generalization on regular language tasks among parallel-in-time models.
SLiCEs match the performance of log neural controlled differential equations while cutting training time by a factor of twenty.
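To make the state-tracking finding more concrete, here is a minimal sketch of the kind of task involved, assuming the usual setup in which a model must track the running composition of a sequence of permutations (A5 is the group of even permutations of five items). The helper names `compose`, `track_state`, and `diag_scan` are hypothetical and not taken from the paper. The sketch only illustrates one intuition for why diagonal transitions are limiting: products of diagonal matrices commute, whereas permutations generally do not. It is not the paper's proof.

```python
def compose(p, q):
    """Return the permutation 'apply q first, then p', as a tuple of indices."""
    return tuple(p[q[i]] for i in range(len(q)))


def track_state(sequence, n=5):
    """Fold a sequence of permutations into the single permutation they compose to."""
    state = tuple(range(n))  # start from the identity permutation
    for perm in sequence:
        state = compose(perm, state)
    return state


def diag_scan(diagonals, h0):
    """Apply a sequence of diagonal transitions (elementwise products) to h0."""
    h = list(h0)
    for d in diagonals:
        h = [di * hi for di, hi in zip(d, h)]
    return h


# Two elements of A5: a 3-cycle and a 5-cycle (both are even permutations).
a = (1, 2, 0, 3, 4)
b = (1, 2, 3, 4, 0)

ab = track_state([a, b])
ba = track_state([b, a])
order_matters = ab != ba  # permutations do not commute in general

# Diagonal transitions give the same result in either order.
d1 = [2.0, 0.5, -1.0]
d2 = [3.0, 4.0, 0.25]
diag_same = diag_scan([d1, d2], [1.0, 1.0, 1.0]) == diag_scan([d2, d1], [1.0, 1.0, 1.0])
```

Here `order_matters` is true while `diag_same` is also true: the accumulated diagonal transition cannot record the order in which inputs arrived, whereas the permutation state can.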
Abstract
This work introduces Structured Linear Controlled Differential Equations (SLiCEs), a unifying framework for sequence models with structured, input-dependent state-transition matrices that retain the maximal expressivity of dense matrices whilst being cheaper to compute. The framework encompasses existing architectures, such as input-dependent block-diagonal linear recurrent neural networks and DeltaNet's diagonal-plus-low-rank structure, as well as two novel variants based on sparsity and the Walsh-Hadamard transform. We prove that, unlike the diagonal state-transition matrices of S4D and Mamba, SLiCEs employing block-diagonal, sparse, or Walsh-Hadamard matrices match the maximal expressivity of dense matrices. Empirically, SLiCEs solve the A5 state-tracking benchmark with a single layer, achieve best-in-class length generalisation on regular language tasks among parallel-in-time…
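As a rough illustration of what a structured, input-dependent state transition can look like, the sketch below implements a block-diagonal variant in plain Python. It assumes a simple Euler-style discretisation, h ← h + Σ_i Δx_i · A_i h, in which every A_i is block-diagonal. That update rule, the parameter layout, and the names `make_block_diag_params`, `slice_step`, and `run` are illustrative assumptions, not the authors' implementation.

```python
import random


def make_block_diag_params(n_blocks, block_size, input_dim, scale=0.1, seed=0):
    """Hypothetical parameters: params[i][k] is the k-th square block of A_i."""
    rng = random.Random(seed)
    return [[[[rng.gauss(0.0, scale) for _ in range(block_size)]
              for _ in range(block_size)]
             for _ in range(n_blocks)]
            for _ in range(input_dim)]


def slice_step(h, dx, params):
    """One assumed Euler step: h <- h + sum_i dx[i] * A_i h, with A_i block-diagonal."""
    n_blocks = len(params[0])
    block_size = len(params[0][0])
    new_h = list(h)
    for k in range(n_blocks):
        start = k * block_size
        h_block = h[start:start + block_size]  # each block only sees its own slice
        for r in range(block_size):
            update = 0.0
            for i, dxi in enumerate(dx):
                row = params[i][k][r]
                update += dxi * sum(w * v for w, v in zip(row, h_block))
            new_h[start + r] = h[start + r] + update
    return new_h


def run(xs, h0, params):
    """Drive the hidden state with the increments of the input path xs."""
    h = list(h0)
    for prev, cur in zip(xs, xs[1:]):
        dx = [c - p for c, p in zip(cur, prev)]
        h = slice_step(h, dx, params)
    return h


params = make_block_diag_params(n_blocks=2, block_size=2, input_dim=3)
xs = [[0.0, 0.0, 0.0], [1.0, 0.5, -1.0], [2.0, 0.0, 0.0]]
final_state = run(xs, [1.0, 0.0, 1.0, 0.0], params)
```

With hidden size n and block size b, each step in this sketch costs on the order of input_dim · n · b multiplications rather than input_dim · n² for dense matrices, which is the kind of saving the structured variants are after.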
Taxonomy
Methods
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
