ASE 2025

Hit The Bullseye On The First Shot: Improving LLMs Using Multi-Sample Self-Reward Feedback for Vulnerability Repair

Rui Jiao, Yue Zhang, Jinku Li, Jianfeng Ma

Abstract

In recent years, large language models (LLMs) have emerged as powerful tools for assisting developers with coding tasks, including the challenging task of vulnerability repair. Although these models show significant potential for generating patches for software vulnerabilities, current approaches are limited in precision and often require multiple attempts to produce a correct fix. In this paper, we propose MUSSEL (Multi-Sample Self-Reward Feedback), a novel framework for one-shot vulnerability patching, that is, producing a correct patch on the first query. Inspired by human learning mechanisms, our approach aims to improve both the efficiency and the accuracy with which LLMs generate patches for software vulnerabilities. We introduce a multi-stage training process that begins with supervised fine-tuning on domain-specific data to give the LLM foundational knowledge of vulnerability repair. We then apply self-reward feedback learning, which iteratively uses correct and incorrect patches to refine the model’s patch generation capabilities. We also introduce a novel prompt design tailored to the capabilities of LLMs during inference. Our results show that MUSSEL consistently outperforms state-of-the-art solutions in one-shot queries. Notably, MUSSEL is efficient even with a small beam size and requires little GPU memory. Furthermore, MUSSEL’s effectiveness across diverse Common Weakness Enumeration (CWE) categories underscores its significant security implications.