ASE2022

A transformer-based IDE plugin for vulnerability detection

Cláudia Mamede, Eduard Pinconschi, Rui Abreu

21 citations

Abstract

With the constant digitalization of our society, software usage is increasing daily. Vulnerabilities (security flaws in software code that attackers could exploit) are growing as well. Hence, it is in companies' best interests to ensure their software is free of flaws as early as possible to reduce the risk of attack. Consequently, companies are shifting security left, to earlier stages of the software development cycle. Nevertheless, there are some roadblocks to the adoption of this principle. Traditionally, security has been the concern of security auditors alone, who have the expertise to configure and use security tools, such as static and dynamic code analyzers, at a late stage of a project, on deployed software. This approach, along with these conventional techniques, does not suit the shift-left principle because of the knowledge it requires and the substantial gap between software development and security. Recent advances in Deep Learning (DL) for vulnerability detection make it possible to tackle some of these issues, as they eliminate the need for expert knowledge to configure and execute security tools. By changing the paradigm from rule-based program analysis tools to lightweight and efficient learning-based scanners integrated into development environments, developers can focus on quality from the start rather than waiting for errors to be discovered late in the Software Development Life Cycle (SDLC). Hence, considering the progress of state-of-the-art DL techniques, namely the transformer model, and the lack of developer-friendly tools in this area that provide feedback on the fly, a proof of concept for a transformer-based VS Code plugin that identifies vulnerabilities in Java files was developed. This tool uses the first transformer-based multi-label classification model for vulnerability detection, with an accuracy of almost 99% and an F1-score of 94%.
This solution should be understood as a reliable base for further improvements. By training the model on larger datasets containing real-world samples, it is possible to improve its accuracy. Moreover, the flexibility of the defined system architecture allows the tool to scale easily to other programming languages and development environments.