Provable Benefits of Complex Parameterizations for Structured State Space Models
Yuval Ran-Milo, Eden Lumbroso, Edo Cohen-Karlik, Raja Giryes, Amir Globerson, Nadav Cohen

TL;DR
This paper gives a theoretical account of why structured state space models (SSMs) benefit from complex parameterizations: it proves formal gaps between real and complex diagonal SSMs and reports experiments that support the theory.
Contribution
It establishes formal theoretical gaps between real and complex diagonal SSMs, explaining the benefits of complex parameterizations for expressiveness and learnability.
Findings
Complex SSMs can express all real SSM mappings with moderate dimension.
Real SSMs need a much higher dimension to express the mappings of a complex SSM, and even then typically require exponentially large parameter values.
Experiments support the theoretical advantages of complex parameterizations and suggest new architectural improvements.
Abstract
Structured state space models (SSMs), the core engine behind prominent neural networks such as S4 and Mamba, are linear dynamical systems adhering to a specified structure, most notably diagonal. In contrast to typical neural network modules, whose parameterizations are real, SSMs often use complex parameterizations. Theoretically explaining the benefits of complex parameterizations for SSMs is an open problem. The current paper takes a step towards its resolution, by establishing formal gaps between real and complex diagonal SSMs. Firstly, we prove that while a moderate dimension suffices in order for a complex SSM to express all mappings of a real SSM, a much higher dimension is needed for a real SSM to express mappings of a complex SSM. Secondly, we prove that even if the dimension of a real SSM is high enough to express a given mapping, typically, doing so requires the parameters of…
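To make the comparison in the abstract concrete, here is a minimal sketch of a diagonal SSM and its impulse response. The recurrence, the choice to take the real part of the output, and every name and number below are assumptions made for illustration; they are not taken from the paper, whose exact definitions are not reproduced on this page.

```python
import cmath
import math


def impulse_response(a, b, c, length):
    """Return the first `length` impulse-response values of a diagonal SSM.

    Assumed recurrence (illustrative, not the paper's stated definition):
        x_t = a * x_{t-1} + b * u_t,    y_t = Re(sum_i c_i * x_t[i])
    For a unit impulse input this gives y_t = Re(sum_i c_i * a_i**t * b_i).
    """
    values = []
    for t in range(length):
        total = sum(ci * (ai ** t) * bi for ai, bi, ci in zip(a, b, c))
        values.append(total.real)  # .real works for both float and complex
    return values


# A real diagonal SSM of dimension 2 (hypothetical parameters).
real_resp = impulse_response([0.5, -0.3], [1.0, 1.0], [2.0, 1.0], 6)

# The same parameters stored as complex numbers: under the assumed
# definition, the complex SSM reproduces the real SSM's mapping.
complex_resp = impulse_response(
    [complex(0.5), complex(-0.3)],
    [complex(1.0), complex(1.0)],
    [complex(2.0), complex(1.0)],
    6,
)

# A complex SSM of dimension 1 with a rotating eigenvalue: its response is
# the damped oscillation 0.9**t * cos(pi * t / 3).
osc_resp = impulse_response(
    [0.9 * cmath.exp(1j * math.pi / 3)], [1 + 0j], [1 + 0j], 6
)

print(real_resp)
print(complex_resp)
print(osc_resp)
```

In this sketch a single complex state produces an oscillating response, whereas a real diagonal SSM can only combine real geometric sequences. That is the kind of mapping for which the abstract says a real SSM needs a much higher dimension.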
Taxonomy
Topics: Simulation Techniques and Applications
Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces
