EMNLP 2025

CodeSSM: Towards State Space Models for Code Understanding

Shweta Verma, Abhinav Anand, Mira Mezini

Abstract

Although transformers dominate many code-specific tasks, they have significant limitations. This paper explores State Space Models (SSMs) as a promising alternative for code understanding tasks such as retrieval, classification, and clone detection. We introduce CodeSSM, the first SSM-based model trained on code corpora to assess its effectiveness. Our results demonstrate that SSMs are more sample-efficient and can extrapolate to longer contexts beyond the pretraining length. Extensive experiments show that SSMs offer a viable alternative to transformers, addressing several of their limitations. Additionally, CodeSSM reduces memory usage by up to 64% compared to transformers at a context length of 2048, with greater savings as context length grows. The code is available here.

CodeSSM

In this section, we introduce the basic architecture of CodeSSM and the variations that we investigate. We also explain the pretraining setup.

Architecture

CodeSSM is an encoder-only model consisting of 12 layers¹. This model is built upon the Bidirec-