ICLR 2025

Robustness Inspired Graph Backdoor Defense

Zhiwei Zhang, Minhua Lin, Junjie Xu, Zongyu Wu, Enyan Dai, Suhang Wang

Abstract

Graph Neural Networks (GNNs) have achieved promising results in tasks such as node classification and graph classification. However, recent studies reveal that GNNs are vulnerable to backdoor attacks, posing a significant threat to their real-world adoption. Despite initial efforts to defend against specific graph backdoor attacks, no existing work defends against the various types of backdoor attacks whose generated triggers have different properties. Hence, we first empirically verify that prediction variance under edge dropping is a crucial indicator for identifying poisoned nodes. With this observation, we propose using random edge dropping to detect backdoors and theoretically show that it can efficiently distinguish poisoned nodes from clean ones. Furthermore, we introduce a novel robust training strategy to efficiently counteract the impact of the triggers. Extensive experiments on real-world datasets show that our framework can effectively identify poisoned nodes, significantly degrade the attack success rate, and maintain clean accuracy when defending against various types of graph backdoor attacks with different properties. Our code is available at: github.com/zzwjames/RIGBD.

Published as a conference paper at ICLR 2025

triggers and target nodes. Zhang et al. (2024) show that backdoor triggers tend to be outliers, which can be removed with graph outlier detection (OD). To address this, they further propose DPGBA, which generates in-distribution triggers. Despite these initial efforts on backdoor defense, existing methods generally rely on attack-specific properties to defend against a specific backdoor and are ineffective across various types of backdoor triggers and attack methods. Therefore, in this paper, we study the important problem of developing an effective graph backdoor defense method against various types of backdoor triggers and attack methods.
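The observation stated in the abstract, that poisoned nodes show large prediction variance under random edge dropping, can be illustrated with a small sketch. This is not the paper's implementation: a hand-set one-layer mean-aggregation model stands in for a trained backdoored GNN, it does not use the paper's specially designed graph convolution, and the names `predict`, `edge_drop_variance`, `drop_prob`, and `rounds` are hypothetical. Scoring a node by the variance of its predicted probabilities summed over classes is also an assumption made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict(x, edges, w):
    """Toy stand-in model: mean over self and neighbors, then linear + softmax."""
    n = x.shape[0]
    adj = np.eye(n)                      # self-loops
    for u, v in edges:
        adj[u, v] = adj[v, u] = 1.0      # undirected edge
    agg = adj @ x / adj.sum(axis=1, keepdims=True)
    return softmax(agg @ w)

def edge_drop_variance(x, edges, w, drop_prob=0.5, rounds=200, seed=0):
    """Per-node prediction variance over repeated random edge dropping."""
    rng = np.random.default_rng(seed)
    edges = np.asarray(edges)
    probs = []
    for _ in range(rounds):
        keep = rng.random(len(edges)) >= drop_prob   # drop each edge independently
        probs.append(predict(x, edges[keep], w))
    probs = np.stack(probs)                          # (rounds, nodes, classes)
    return probs.var(axis=0).sum(axis=1)             # assumed score: summed variance

# Toy graph (assumed data): nodes 0-4 are clean, node 5 is a trigger node
# with out-of-pattern features attached to node 2, making node 2 "poisoned".
x = np.array([[1.0, 0.0]] * 5 + [[0.0, 10.0]])
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]
w = np.eye(2)                                        # hand-set weights, not trained

scores = edge_drop_variance(x, edges, w)
print(np.round(scores, 4))
print("highest-variance node:", int(np.argmax(scores)))
```

In this toy setup the prediction for node 2 flips depending on whether its edge to the trigger survives a round, while nodes whose neighbors all share their features are unaffected by dropping. Flagging a node in practice would still need a threshold on the score, which this sketch does not attempt to choose.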
In essence, we face two challenges: (i) How can we efficiently and precisely identify poisoned nodes and backdoor triggers, even when those triggers are indistinguishable from clean nodes? (ii) How can we minimize the impact of backdoor triggers when some of the triggers are not identified? To address these challenges, we propose a novel framework, Robustness Inspired Graph Backdoor Defense (RIGBD). To efficiently and precisely identify poisoned nodes, we empirically show in Section 3.2 that removing the edges that link backdoor triggers typically leads to large prediction variance for poisoned target nodes. Based on this observation, we propose training a backdoored model with specially designed graph convolution operations on a poisoned graph, performing random edge dropping, and identifying nodes with high prediction variance as poisoned nodes. With the candidate poisoned nodes and the identified target class, we propose a novel robust GNN training loss, which minimizes the model's prediction confidence on the target class for poisoned nodes to efficiently counteract the impact of the triggers. Such a strategy is effective even if some of the poisoned nodes in the training set are not identified.

Our main contributions are: (i) We empirically verify that poisoned nodes typically exhibit large prediction variance under edge dropping. (ii) Our theoretical analysis guarantees that our specially designed graph convolution operations can precisely distinguish poisoned nodes from clean nodes through random edge dropping. (iii) We propose a novel training strategy that trains a backdoor-robust GNN model even when some poisoned nodes are not identified. (iv) Extensive experiments show the effectiveness of RIGBD in defending against backdoor attacks while maintaining clean accuracy.

RELATED WORK

Graph Backdoor Attacks. SBA (Zhang et al., 2021) is a seminal work on graph backdoor attacks, which adopts randomly generated graphs as triggers.
GTA (Xi et al., 2021) adopts a backdoor trigger generator to generate more powerful sample-specific triggers and improve the attack success rate (ASR). UGBA (Dai et al., 2023) introduces an unnoticeable loss function that maximizes the cosine similarity between backdoor triggers and target nodes to improve the stealthiness of the attack. DPGBA (Zhang et al., 2024) introduces an outlier detector and uses adversarial learning to generate in-distribution triggers, addressing the low ASR or outlier issues in existing graph backdoor attacks. More details on graph backdoor attacks are in Appendix A.

Graph Backdoor Defense. UGBA (Dai et al., 2023) notes that, in GTA (Xi et al., 2021), the attributes of triggers differ significantly from those of the poisoned nodes they are attached to, thereby violating the homophily property typically observed in real-world graphs. They therefore propose a defense method called Prune, which removes edges that connect nodes with low similarity. DPGBA (Zhang et al., 2024) further indicates that although the triggers in UGBA (Dai et al., 2023) may demonstrate high similarity to target nodes, the triggers in both UGBA (Dai et al., 2023) and GTA (Xi et al.,