WaveSSM: Multiscale State-Space Models for Non-stationary Signal Attention
Ruben Solozabal, Velibor Bojkovic, Hilal Alquabeh, Klea Ziu, Kentaro Inui, Martin Takac

TL;DR
WaveSSM introduces wavelet-based state-space models that excel at modeling non-stationary signals with localized and transient features, outperforming traditional polynomial-based SSMs like S4 on real-world datasets.
Contribution
This work presents WaveSSM, a novel class of state-space models using wavelet frames for better localization in non-stationary signal modeling, improving over polynomial-based approaches.
Findings
WaveSSM outperforms S4 on physiological and audio datasets.
Wavelet frames provide better localization for transient signals.
Empirical results demonstrate improved accuracy in real-world tasks.
Abstract
State-space models (SSMs) have emerged as a powerful foundation for long-range sequence modeling, with the HiPPO framework showing that continuous-time projection operators can be used to derive stable, memory-efficient dynamical systems that encode the past history of the input signal. However, existing projection-based SSMs often rely on polynomial bases with global temporal support, whose inductive biases are poorly matched to signals exhibiting localized or transient structure. In this work, we introduce WaveSSM, a collection of SSMs constructed over wavelet frames. Our key observation is that wavelet frames yield localized support along the temporal dimension, which is useful for tasks requiring precise localization. Empirically, we show that, under equal conditions, WaveSSM outperforms orthogonal counterparts such as S4 on real-world datasets with transient dynamics, including…
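The contrast between global and localized temporal support can be illustrated with a small numerical sketch. This is not the WaveSSM construction: the Haar wavelet is used only as a simple stand-in for a wavelet frame element, and the helper names (`legendre_basis`, `haar_wavelet`, `support_fraction`) are hypothetical. The sketch measures the fraction of the window [0, 1) on which each basis function is nonzero.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_basis(t, n):
    """Rows are Legendre polynomials P_0..P_{n-1}, shifted to t in [0, 1]."""
    x = 2.0 * t - 1.0  # map [0, 1] onto the standard interval [-1, 1]
    return np.stack([legendre.legval(x, np.eye(n)[k]) for k in range(n)])


def haar_wavelet(t, j, k):
    """Haar wavelet 2^(j/2) * psi(2^j t - k), supported on [k/2^j, (k+1)/2^j)."""
    u = (2.0 ** j) * t - k
    first_half = np.where((u >= 0.0) & (u < 0.5), 1.0, 0.0)
    second_half = np.where((u >= 0.5) & (u < 1.0), 1.0, 0.0)
    return (2.0 ** (j / 2)) * (first_half - second_half)


def support_fraction(values, tol=1e-12):
    """Fraction of sample points where the function is numerically nonzero."""
    return float(np.mean(np.abs(values) > tol))


# Uniform grid on the window [0, 1)
t = np.linspace(0.0, 1.0, 1024, endpoint=False)

# Polynomial basis: every element is nonzero almost everywhere on the window
leg_support = [support_fraction(row) for row in legendre_basis(t, 4)]

# Wavelet element at scale j=3, shift k=5: nonzero only on [5/8, 6/8)
haar_support = support_fraction(haar_wavelet(t, j=3, k=5))

print("Legendre support fractions:", leg_support)
print("Haar support fraction:", haar_support)
```

Under these assumptions, each polynomial basis function is active across essentially the whole window, while the wavelet element covers only one eighth of it, so a transient event changes only the coefficients whose support overlaps it.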
Taxonomy
Topics: Speech Recognition and Synthesis · EEG and Brain-Computer Interfaces · Speech and Audio Processing
