Analysis of Long Range Dependency Understanding in State Space Models
Srividya Ravikumar, Abhinav Anand, Shweta Verma, Mira Mezini

TL;DR
This paper systematically analyzes the interpretability of state-space models, specifically S4D, in long-range dependency tasks, revealing how different architectures influence their filtering behavior and performance.
Contribution
It provides the first systematic kernel interpretability study of S4D on a real-world task (vulnerability detection in source code), linking architecture choices to filtering properties and model effectiveness.
Findings
S4D kernels can act as low-pass, band-pass, or high-pass filters.
Long-range modeling capability varies significantly with architecture.
Insights can guide the design of improved S4D models.
Abstract
Although state-space models (SSMs) have demonstrated strong performance on long-sequence benchmarks, most research has emphasized predictive accuracy rather than interpretability. In this work, we present the first systematic kernel interpretability study of the diagonalized state-space model (S4D) trained on a real-world task (vulnerability detection in source code). Through time- and frequency-domain analysis of the S4D kernel, we show that the long-range modeling capability of S4D varies significantly across model architectures, which in turn affects model performance. For instance, we show that, depending on the architecture, the S4D kernel can behave as a low-pass, band-pass, or high-pass filter. The insights from our analysis can guide future work in designing better S4D-based models.
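To make the time- and frequency-domain view of the kernel concrete, the following is a minimal sketch and not the authors' analysis code. It assumes the commonly used S4D formulation: a diagonal complex state matrix A, zero-order-hold discretization with step dt, and a real kernel obtained as twice the real part of the summed modes. The function names `s4d_kernel` and `classify_filter`, and the cutoff thresholds `low_cut` and `high_cut`, are hypothetical choices made for illustration.

```python
import numpy as np

def s4d_kernel(A, B, C, dt, L):
    """Length-L real convolution kernel of a diagonal SSM.

    Assumes zero-order-hold discretization:
        A_bar = exp(dt * A),  B_bar = (A_bar - 1) / A * B
    and K[l] = 2 * Re(sum_n C_n * B_bar_n * A_bar_n**l).
    """
    A_bar = np.exp(dt * A)
    B_bar = (A_bar - 1.0) / A * B
    # powers[n, l] = A_bar[n] ** l, shape (N, L)
    powers = A_bar[:, None] ** np.arange(L)[None, :]
    return 2.0 * np.real((C * B_bar) @ powers)

def classify_filter(kernel, low_cut=0.1, high_cut=0.4):
    """Label a kernel by the frequency at which its magnitude response peaks.

    Frequencies are in cycles per sample (0 to 0.5, where 0.5 is Nyquist).
    The cutoffs are illustrative assumptions, not values from the paper.
    """
    magnitude = np.abs(np.fft.rfft(kernel))
    freqs = np.fft.rfftfreq(len(kernel))
    peak = freqs[np.argmax(magnitude)]
    if peak < low_cut:
        return "low-pass"
    if peak > high_cut:
        return "high-pass"
    return "band-pass"

# Time-domain view: a single real, decaying mode gives a smooth kernel
# whose energy concentrates at low frequencies.
A = np.array([-1.0 + 0.0j])
B = np.ones(1, dtype=complex)
C = np.ones(1, dtype=complex)
kernel = s4d_kernel(A, B, C, dt=0.1, L=256)
label = classify_filter(kernel)
```

In this sketch the imaginary part of each diagonal entry of A sets the oscillation frequency of its mode and the real part sets how quickly it decays, so the learned values of A, together with C and dt, determine where the kernel's frequency response peaks. A slowly decaying kernel retains information over more time steps, which is one way to read long-range modeling capability from the time-domain plot.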
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Security and Verification in Computing · Advanced Malware Detection Techniques
