State-Free Inference of State-Space Models: The Transfer Function Approach
Rom N. Parnichkun, Stefano Massaroli, Alessandro Moro, Jimmy T.H. Smith, Ramin Hasani, Mathias Lechner, Qi An, Christopher Ré, Hajime Asama, Stefano Ermon, Taiji Suzuki, Atsushi Yamashita, Michael Poli

TL;DR
This paper introduces a state-free inference method for state-space models using transfer functions, enabling efficient, scalable sequence processing with improved training speed and performance in language modeling and benchmarks.
Contribution
It proposes a novel frequency domain transfer function parametrization that allows direct spectrum computation, leading to a highly efficient, memory-friendly inference algorithm for deep learning models.
Findings
On average, a 35% training speed improvement over S4 layers across multiple sequence lengths and state sizes
State-of-the-art downstream performance on the Long Range Arena benchmark
Improved perplexity in language modeling
Abstract
We approach designing a state-space model for deep learning applications through its dual representation, the transfer function, and uncover a highly efficient sequence parallel inference algorithm that is state-free: unlike other proposed algorithms, state-free inference does not incur any significant memory or computational cost with an increase in state size. We achieve this using properties of the proposed frequency domain transfer function parametrization, which enables direct computation of its corresponding convolutional kernel's spectrum via a single Fast Fourier Transform. Our experimental results across multiple sequence lengths and state sizes illustrate, on average, a 35% training speed improvement over S4 layers -- parametrized in time-domain -- on the Long Range Arena benchmark, while delivering state-of-the-art downstream performances over other attention-free…
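The idea of reading a kernel's spectrum directly off transfer function coefficients can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the coefficient layout (`b` for the numerator, `a` for the denominator with `a[0] = 1`), and the zero-padding scheme are assumptions made for illustration. It also assumes a stable filter whose impulse response decays well within the padded length, so the aliasing introduced by sampling the rational function on the unit circle is negligible.

```python
import numpy as np

def tf_kernel_spectrum(b, a, n_fft):
    """Sample H(z) = B(z^-1) / A(z^-1) at the n_fft-th roots of unity.

    b, a are coefficient arrays in powers of z^-1 (a[0] is assumed to be 1).
    Zero-padding the coefficients and taking an FFT evaluates each polynomial
    on the unit circle, so no state matrix is ever formed.
    """
    B = np.fft.rfft(b, n=n_fft)
    A = np.fft.rfft(a, n=n_fft)
    return B / A

def apply_filter_fft(u, b, a):
    """Filter the sequence u through H using frequency-domain multiplication."""
    L = len(u)
    n_fft = 2 * L  # pad so the circular convolution matches the causal one
    H = tf_kernel_spectrum(b, a, n_fft)
    U = np.fft.rfft(u, n=n_fft)
    return np.fft.irfft(H * U, n=n_fft)[:L]

def apply_filter_recurrence(u, b, a):
    """Reference implementation: the equivalent difference equation, step by step."""
    y = np.zeros(len(u))
    for t in range(len(u)):
        acc = sum(b[k] * u[t - k] for k in range(len(b)) if t - k >= 0)
        acc -= sum(a[k] * y[t - k] for k in range(1, len(a)) if t - k >= 0)
        y[t] = acc
    return y

# Hypothetical second-order example with poles at 0.2 and 0.3 (stable).
b = np.array([0.0, 1.0, 0.5])
a = np.array([1.0, -0.5, 0.06])
u = np.random.default_rng(0).standard_normal(64)

y_fft = apply_filter_fft(u, b, a)
y_rec = apply_filter_recurrence(u, b, a)
print(np.max(np.abs(y_fft - y_rec)))  # small when the impulse response has decayed
```

In this sketch the model order enters only through the lengths of `b` and `a` before zero-padding, so the FFT cost is set by the sequence length rather than by the state size, which is the property the abstract calls state-free.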
Taxonomy
Topics: Fault Detection and Control Systems
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
