WWW2026

Difference-based Sample Selection for Federated Graph Rationalization

Linan Yue, Weibo Gao

Abstract

Graph rationalization methods aim to improve the explainability of Graph Neural Networks by identifying the critical subgraphs (rationales) that drive task predictions. Motivated by growing concerns over data privacy, federated graph rationalization has recently gained traction as a research area. In federated settings, however, data heterogeneity across clients exacerbates shortcut learning, in which models rely on spurious, client-specific features rather than invariant causal rationales. Existing solutions, such as environment-aware data augmentation, suffer from low-quality environment representations. To address this, we propose DiffGR, a Difference-based sample selection strategy for federated Graph Rationalization. DiffGR selects the samples on which the local and global models exhibit the largest prediction discrepancies; because these samples likely reflect strong shortcut reliance, they yield more accurate environment representations. Additionally, we introduce a mutual information (MI) inspired, environment-conditioned data augmentation method that minimizes the MI between environments and predictions while maximizing the MI between rationales and predictions. Experiments on real-world and synthetic datasets demonstrate the effectiveness of DiffGR in improving rationale quality and model robustness in federated settings. Code is available at https://github.com/yuelinan/Codes-of-DiffGR.
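
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how prediction discrepancy is measured, so the KL divergence used here is an assumption, and the names `kl_divergence` and `select_by_discrepancy`, the parameter `k`, and the toy probabilities are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete probability vectors (assumed discrepancy measure)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def select_by_discrepancy(local_probs, global_probs, k):
    """Return the indices of the k samples whose local and global predictions differ most."""
    scores = [kl_divergence(l, g) for l, g in zip(local_probs, global_probs)]
    # Rank sample indices from the largest discrepancy to the smallest.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy class-probability outputs for three samples on one client (illustrative values only).
local_probs = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
global_probs = [[0.1, 0.9], [0.5, 0.5], [0.3, 0.7]]

selected = select_by_discrepancy(local_probs, global_probs, k=2)
print(selected)
```

In this toy example the first sample, on which the two models disagree sharply, is ranked highest, while the sample with identical predictions is left out. Any other discrepancy measure, such as the absolute difference in predicted probability, could be substituted for the KL divergence.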