ASE 2025

Can Mamba Be Better? An Experimental Evaluation of Mamba in Code Intelligence

Shuo Liu, Jacky Keung, Zhen Yang, Zhenyu Mao, Yicheng Sun

Cited by 1

Abstract

The Transformer architecture and its core attention mechanism form the foundation of Pre-trained Language Models (PLMs) and have driven their remarkable progress across a wide range of code intelligence tasks. However, the quadratic complexity inherent in the attention mechanism poses scalability challenges. Recently, sub-quadratic architectures such as Mamba and Mamba-2 have emerged as compelling alternatives to the Transformer. While they have shown promising results and attracted increasing academic interest, their effectiveness in code intelligence tasks has not yet been fully explored. To fill this gap, we present the first systematic empirical study of Mamba-based PLMs on three typical code tasks (i.e., code completion, code generation, and code clone detection), covering both the code comprehension and code generation categories, to examine their effectiveness and efficiency. We first pre-train two Mamba-based PLMs on code, based on Mamba and Mamba-2, respectively. Subsequently, we evaluate these four PLMs against typical Transformer-based PLMs (e.g., CodeGPT) under Full Fine-Tuning (FT) and Parameter-Efficient Fine-Tuning (PEFT) settings, demonstrating the overall superiority of Mamba-based PLMs across all code tasks. Subsequent experiments comprise an architecture analysis, which pre-trains models from scratch to isolate the influence of the training corpora, and a low-resource analysis, which deliberately limits the volume of fine-tuning data. Both analyses demonstrate the superiority of Mamba-based PLMs in efficacy and efficiency. Finally, we extend the PLMs to larger scales (up to 7B parameters) and compare them with a more diverse set of PLMs/LLMs. The experimental results show that, apart from the architecture, the pre-training corpora and pre-training tasks also heavily affect code modeling performance. This work provides a comprehensive investigation of Mamba-based PLMs in the context of code intelligence, uncovering their strengths, limitations, and potential for future applications.