CodeSSM: Towards State Space Models for Code Understanding
By: Shweta Verma, Abhinav Anand, Mira Mezini
Potential Business Impact:
Helps computers understand code better, faster, cheaper.
Although transformers dominate many code-specific tasks, they have significant limitations. This paper explores State Space Models (SSMs) as a promising alternative for code understanding tasks such as retrieval, classification, and clone detection. We introduce CodeSSM, the first SSM-based model trained on code corpora, to assess its effectiveness. Our results demonstrate that SSMs are more sample-efficient and can extrapolate to contexts longer than the pretraining length. Extensive experiments show that SSMs offer a viable alternative to transformers, addressing several of their limitations. Additionally, CodeSSM reduces memory usage by up to 64% compared to transformers at a context length of 2048, with greater savings as context length grows.
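To make the memory claim concrete, here is a minimal, illustrative sketch of the linear recurrence at the core of a diagonal SSM layer. This is not CodeSSM's actual implementation (the paper does not provide one here); the function name, dimensions, and parameters are assumptions chosen for clarity. The key point it demonstrates is that the model carries a fixed-size state h through the sequence, so per-step memory is O(d_state) regardless of context length, unlike the O(seq_len^2) attention matrix in a transformer.

```python
import torch

def ssm_scan(x: torch.Tensor, A: torch.Tensor, B: torch.Tensor, C: torch.Tensor) -> torch.Tensor:
    """Illustrative diagonal SSM recurrence (assumed form, not CodeSSM's code):
        h_t = A * h_{t-1} + B * x_t
        y_t = C . h_t
    x: (seq_len,) scalar input sequence
    A: (d_state,) diagonal state transition (|A| < 1 for stability)
    B: (d_state,) input projection
    C: (d_state,) output readout
    Returns y: (seq_len,)
    """
    h = torch.zeros_like(A)       # fixed-size state: memory independent of seq_len
    ys = []
    for x_t in x:
        h = A * h + B * x_t       # state update, O(d_state) per step
        ys.append((C * h).sum())  # scalar readout at step t
    return torch.stack(ys)

# Toy usage at the context length from the abstract (2048 tokens):
# the state h never grows, which is why memory savings over attention
# increase as the context gets longer.
seq = torch.randn(2048)
d_state = 16
A = torch.rand(d_state) * 0.99    # stable diagonal dynamics (assumed values)
B = torch.randn(d_state)
C = torch.randn(d_state)
y = ssm_scan(seq, A, B, C)
print(y.shape)  # torch.Size([2048])
```

In practice, SSM libraries replace this Python loop with a parallel scan or convolutional form for speed, but the memory characteristics shown here are the same.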
Similar Papers
Technologies on Effectiveness and Efficiency: A Survey of State Space Models
Machine Learning (CS)
Helps computers learn faster from long information.
Leveraging State Space Models in Long Range Genomics
Genomics
Helps computers understand long DNA codes better.
A Comparative Analysis of Contextual Representation Flow in State-Space and Transformer Architectures
Computation and Language
Makes computers understand long stories better.