CodeSSM: Towards State Space Models for Code Understanding
Shweta Verma, Abhinav Anand, Mira Mezini

TL;DR
CodeSSM introduces a novel State Space Model approach for code understanding, demonstrating improved efficiency, longer context extrapolation, and reduced memory usage compared to transformers.
Contribution
This work is the first to apply SSMs to code understanding, showing their advantages over transformers in efficiency and scalability.
Findings
SSMs are more sample-efficient than transformers.
CodeSSM can handle longer contexts beyond pretraining limits.
Memory usage is reduced by up to 64% at a context length of 2048.
Abstract
Although transformers dominate many code-specific tasks, they have significant limitations. This paper explores State Space Models (SSMs) as a promising alternative for code understanding tasks such as retrieval, classification, and clone detection. We introduce CodeSSM, the first SSM-based model trained on code corpora to assess its effectiveness. Our results demonstrate that SSMs are more sample-efficient and can extrapolate to longer contexts beyond the pretraining length. Extensive experiments show that SSMs offer a viable alternative to transformers, addressing several of their limitations. Additionally, CodeSSM reduces memory usage by up to 64% compared to transformers at a context length of 2048, with greater savings as context length grows.
Taxonomy
Topics: Software Engineering Research · Service-Oriented Architecture and Web Services · Advanced Software Engineering Methodologies
