Robustifying State-space Models for Long Sequences via Approximate Diagonalization
Annan Yu, Arnur Nigmetov, Dmitriy Morozov, Michael W. Mahoney, N. Benjamin Erichson

TL;DR
This paper introduces a perturb-then-diagonalize (PTD) methodology for robustly diagonalizing the non-normal matrices that arise in state-space models, yielding the S4-PTD and S5-PTD models, which are more resilient and more accurate on long-sequence learning tasks.
Contribution
The paper proposes a general PTD approach for approximate diagonalization, improving robustness and convergence of state-space models for long sequences.
Findings
S4-PTD/S5-PTD models converge strongly to the HiPPO framework
Models show resilience to Fourier-mode noise-perturbed inputs
S5-PTD averages 87.6% accuracy on the Long-Range Arena benchmark
Abstract
State-space models (SSMs) have recently emerged as a framework for learning long-range sequence tasks. An example is the structured state-space sequence (S4) layer, which uses the diagonal-plus-low-rank structure of the HiPPO initialization framework. However, the complicated structure of the S4 layer poses challenges; and, in an effort to address these challenges, models such as S4D and S5 have considered a purely diagonal structure. This choice simplifies the implementation, improves computational efficiency, and allows channel communication. However, diagonalizing the HiPPO framework is itself an ill-posed problem. In this paper, we propose a general solution for this and related ill-posed diagonalization problems in machine learning. We introduce a generic, backward-stable "perturb-then-diagonalize" (PTD) methodology, which is based on the pseudospectral theory of non-normal…
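The perturb-then-diagonalize idea in the abstract can be illustrated with a minimal NumPy sketch. It builds the HiPPO-LegS state matrix, adds a small perturbation, and diagonalizes the perturbed matrix instead of the original. The function names `hippo_legs` and `perturb_then_diagonalize`, the random Gaussian perturbation, and the value `eps=1e-2` are assumptions made for illustration; the paper may choose its perturbation differently.

```python
import numpy as np


def hippo_legs(n):
    """HiPPO-LegS state matrix: lower triangular and highly non-normal."""
    idx = np.arange(n)
    scale = np.sqrt(2.0 * idx + 1.0)
    # Strictly lower part: -sqrt(2i+1) * sqrt(2j+1) for i > j.
    A = -np.tril(np.outer(scale, scale), k=-1)
    # Diagonal: -(i + 1).
    A -= np.diag(idx + 1.0)
    return A


def perturb_then_diagonalize(A, eps, seed=0):
    """Diagonalize A + E, where ||E||_2 = eps * ||A||_2.

    Assumption: E is a scaled random Gaussian matrix. This is only one
    possible choice of perturbation, used here for illustration.
    """
    rng = np.random.default_rng(seed)
    E = rng.standard_normal(A.shape)
    E *= eps * np.linalg.norm(A, 2) / np.linalg.norm(E, 2)
    eigvals, V = np.linalg.eig(A + E)
    return eigvals, V, E


A = hippo_legs(32)

# Direct diagonalization: the eigenvector matrix is badly conditioned.
_, V_direct = np.linalg.eig(A)

# PTD: eigenvectors of the perturbed matrix.
eigvals, V_ptd, E = perturb_then_diagonalize(A, eps=1e-2)

# Backward error: how far V diag(eigvals) V^{-1} is from the original A.
A_rebuilt = V_ptd @ np.diag(eigvals) @ np.linalg.inv(V_ptd)
backward_error = np.linalg.norm(A_rebuilt - A, 2) / np.linalg.norm(A, 2)

print("cond(V), direct:", np.linalg.cond(V_direct))
print("cond(V), PTD:   ", np.linalg.cond(V_ptd))
print("relative backward error:", backward_error)
```

The trade-off is controlled by `eps`: a larger perturbation tends to give better-conditioned eigenvectors, while the diagonalized matrix moves further from the original, with a relative backward error of roughly `eps`.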
Taxonomy
Topics: Blind Source Separation Techniques · Advanced Memory and Neural Computing · Wireless Signal Modulation Classification
