Towards Understanding What State Space Models Learn About Code
Jiali Wu, Abhinav Anand, Shweta Verma, Mira Mezini

TL;DR
This paper systematically analyzes what State Space Models (SSMs) learn about code, comparing them to Transformers, and introduces methods to improve their performance based on these insights.
Contribution
It provides the first detailed analysis of SSMs for code, introduces SSM-Interpret for spectral analysis, and proposes architectural improvements to enhance SSM-based code models.
Findings
SSMs outperform Transformers in capturing code syntax and semantics during pretraining.
SSMs tend to forget certain syntactic and semantic relations during fine-tuning, especially for short-range dependencies.
Architectural modifications based on analysis significantly improve SSM performance.
Abstract
State Space Models (SSMs) have emerged as an efficient alternative to the Transformer architecture. Recent studies show that SSMs can match or surpass Transformers on code understanding tasks, such as code retrieval, when trained under similar conditions. However, their internal mechanisms remain a black box. We present the first systematic analysis of what SSM-based code models actually learn and perform the first comparative analysis of SSM-based and Transformer-based code models. Our analysis reveals that SSMs outperform Transformers at capturing code syntax and semantics in pretraining but forget certain syntactic and semantic relations during task-specific fine-tuning, especially when the task emphasizes short-range dependencies. To diagnose this, we introduce SSM-Interpret, a frequency-domain framework that exposes a spectral shift toward short-range dependencies during fine-tuning. Guided…
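As a rough illustration of what a frequency-domain view of an SSM can show, the sketch below builds the convolution kernel of a toy diagonal discrete SSM and measures how much of its spectral energy sits in the upper half of the frequency band. This is not the paper's SSM-Interpret method, whose details are not given here; the function names `ssm_kernel` and `high_freq_fraction`, the decay values 0.99 and 0.30, and the choice of "upper half of the band" as the high-frequency cutoff are all assumptions made for illustration. The point is only that a fast-decaying state (short memory) has a flatter spectrum with relatively more high-frequency energy than a slowly decaying state (long memory).

```python
import numpy as np

def ssm_kernel(a, b, c, length):
    """Kernel K[k] = sum_n c_n * a_n**k * b_n of a toy diagonal discrete SSM.

    a, b, c are 1-D arrays of equal size (one entry per state dimension).
    """
    k = np.arange(length)
    powers = a[:, None] ** k[None, :]      # shape (state_dim, length)
    return np.real((c * b) @ powers)       # shape (length,)

def high_freq_fraction(kernel):
    """Share of spectral energy in the upper half of the band (assumed cutoff)."""
    energy = np.abs(np.fft.rfft(kernel)) ** 2
    return energy[len(energy) // 2:].sum() / energy.sum()

length = 256
b = np.ones(1)
c = np.ones(1)

# Hypothetical decay rates: close to 1 means long memory, small means short memory.
slow = ssm_kernel(np.array([0.99]), b, c, length)
fast = ssm_kernel(np.array([0.30]), b, c, length)

print("long-memory kernel, high-frequency share: ", high_freq_fraction(slow))
print("short-memory kernel, high-frequency share:", high_freq_fraction(fast))
```

Under these assumptions, comparing such a statistic for a model's kernels before and after fine-tuning would be one simple way to look for a shift toward short-range behavior; the actual metric used by the authors may differ.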
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Advanced Software Engineering Methodologies
