On the Generalization Properties of Selective State-Space Models for Filtering Tasks for Unknown Systems
Alex Tang, M. Emrullah Ildiz, Batin Kurt, Samet Oymak, Necmiye Ozay

TL;DR
This paper investigates the ability of Selective State-Space Models (SSMs) to generalize in filtering tasks for unknown systems, providing theoretical bounds and empirical comparisons with transformers.
Contribution
It offers the first theoretical analysis of SSMs' generalization in filtering unknown systems and compares their performance to transformers.
Findings
SSMs can predict the next output of an unknown system online after training on trajectories sampled from systems in the same set.
Theoretical bounds explain why SSMs succeed in filtering tasks.
Empirical results demonstrate competitive performance of SSMs versus transformers.
Abstract
Selective State-Space Models (SSMs) such as Mamba have emerged as an alternative architecture to self-attention-based transformers in sequence modeling tasks. Recent works have demonstrated the use of transformers in some filtering and output prediction tasks via in-context learning. In this paper, we analyze whether structured SSMs can work equally well for filtering of unknown systems. In particular, we train the SSM on trajectory samples from a set of systems. At run-time, the SSM is given the outputs of an unknown system from the same set and is expected to predict the next output online. Theoretically, under appropriate assumptions, we derive generalization bounds that explain why SSMs succeed in such tasks. Empirically, we demonstrate the performance via several numerical examples. We also discuss the advantages and disadvantages of SSMs versus transformers for this task.
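The training setup described in the abstract (trajectories drawn from a set of systems, with the model asked to predict the next output) can be illustrated with a small data-generation sketch. This is not the authors' code. It assumes, purely for illustration, that the systems are stable linear time-invariant systems with Gaussian process and measurement noise; the function names (`sample_system`, `simulate_outputs`, `make_dataset`) and all parameter values are hypothetical.

```python
import numpy as np

def sample_system(rng, n=2, m=1, rho=0.9):
    """Draw a random stable linear system (A, C). Hypothetical system family."""
    A = rng.standard_normal((n, n))
    # Rescale so the spectral radius equals rho < 1 (stable dynamics).
    A *= rho / np.max(np.abs(np.linalg.eigvals(A)))
    C = rng.standard_normal((m, n))
    return A, C

def simulate_outputs(rng, A, C, T, w_std=0.1, v_std=0.1):
    """Roll out x_{t+1} = A x_t + w_t, y_t = C x_t + v_t; return y_0..y_{T-1}."""
    n, m = A.shape[0], C.shape[0]
    x = rng.standard_normal(n)  # random initial state (assumption)
    ys = np.empty((T, m))
    for t in range(T):
        ys[t] = C @ x + v_std * rng.standard_normal(m)
        x = A @ x + w_std * rng.standard_normal(n)
    return ys

def make_dataset(rng, num_systems, T):
    """One output trajectory per sampled system.

    Inputs are y_0..y_{T-2}; targets are the next outputs y_1..y_{T-1}.
    The system matrices are never shown to the model, only the outputs.
    """
    Y = np.stack([
        simulate_outputs(rng, *sample_system(rng), T)
        for _ in range(num_systems)
    ])
    return Y[:, :-1], Y[:, 1:]

rng = np.random.default_rng(0)
inputs, targets = make_dataset(rng, num_systems=8, T=50)
```

A sequence model (an SSM or a transformer) would then be trained to map each prefix of `inputs` to the corresponding entry of `targets`. At run-time the same interface applies to a trajectory from a freshly sampled, unseen system.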
