ICCV 2023

Enhancing Fine-Tuning based Backdoor Defense with Sharpness-Aware Minimization

Mingli Zhu, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu

95 citations

Abstract

Backdoor defense, which aims to detect or mitigate the effect of malicious triggers introduced by attackers, is becoming increasingly critical for machine learning security and integrity. Fine-tuning on benign data is a natural defense for erasing the backdoor effect in a backdoored model. However, recent studies show that, given limited benign data, vanilla fine-tuning has poor defense performance. In this work, we first investigate the vanilla fine-tuning process for backdoor mitigation from the neuron weight perspective, and find that backdoor-related neurons are only slightly perturbed during vanilla fine-tuning, which explains its poor backdoor defense performance. To enhance fine-tuning based defense, and inspired by the observation that backdoor-related neurons often have larger weight norms, we propose FT-SAM, a novel backdoor defense paradigm that aims to shrink the norms of backdoor-related neurons by incorporating sharpness-aware minimization into fine-tuning. We demonstrate the effectiveness of our method on several benchmark datasets and network architectures, where it achieves state-of-the-art defense performance, and provide extensive analysis to reveal the mechanism of FT-SAM. Overall, our work provides a promising avenue for improving the robustness of machine learning models against backdoor attacks. Code is available at https://github.com/SCLBD/BackdoorBench.
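
For readers unfamiliar with sharpness-aware minimization, the sketch below illustrates the generic two-step SAM update on a toy quadratic loss: first perturb the weights along the normalized gradient (the ascent step), then descend using the gradient taken at the perturbed point. It is not the FT-SAM procedure from the paper, whose details are not given in the abstract. The toy loss, the function names (`loss`, `grad`, `sam_step`), and the values of `lr` and `rho` are all assumptions chosen for illustration.

```python
import math


def loss(w, curvatures):
    """Toy loss L(w) = 0.5 * sum(c_i * w_i**2); a stand-in for a real training loss."""
    return 0.5 * sum(c * wi * wi for c, wi in zip(curvatures, w))


def grad(w, curvatures):
    """Analytic gradient of the toy loss: dL/dw_i = c_i * w_i."""
    return [c * wi for c, wi in zip(curvatures, w)]


def sam_step(w, curvatures, lr=0.1, rho=0.05):
    """One generic SAM update (hypothetical helper, not the authors' code)."""
    g = grad(w, curvatures)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Zero gradient: no ascent direction exists, so leave the weights unchanged.
        return list(w)
    # Ascent step: move to the approximate worst-case point within an L2 ball of radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: update the ORIGINAL weights with the gradient evaluated at w_adv.
    g_adv = grad(w_adv, curvatures)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


curv = [1.0, 4.0]   # assumed per-coordinate curvatures of the toy loss
w = [1.0, -2.0]     # assumed initial weights
initial_loss = loss(w, curv)
for _ in range(50):
    w = sam_step(w, curv)
final_loss = loss(w, curv)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

In a real fine-tuning setting the analytic gradient would be replaced by backpropagation on benign data, with two forward-backward passes per update. With a fixed `rho`, the iterates settle in a small neighborhood of the minimum rather than exactly at it.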