ICLR 2025
Learning Graph Invariance by Harnessing Spuriosity
Tianjun Yao, Yongqiang Chen, Kai Hu, Tongliang Liu, Kun Zhang, Zhiqiang Shen
Abstract
Graph invariant learning has recently become the de facto approach to tackling Out-of-Distribution (OOD) generalization failure in graph representation learning. Existing methods generally follow the framework of invariant risk minimization to capture the invariance of graph data across different environments. Despite some success, it remains unclear to what extent existing approaches have captured invariant features for OOD generalization on graphs. In this work, we find that representative OOD methods such as IRM and VREx, and their variants for graph invariant learning, may have captured only a limited set of invariant features. To tackle this challenge, we propose LIRS, a novel learning framework designed to Learn graph Invariance by Removing Spurious features. Unlike most existing approaches, which learn the invariant features directly, LIRS takes an indirect approach: it first learns the spurious features and then removes them from the ERM-learned features. We demonstrate that learning the invariant graph features indirectly enables the model to capture a more comprehensive set of invariant features, leading to better OOD generalization in novel environments. Notably, LIRS surpasses the second-best method by as much as 25.50% across all competitive baselines, underscoring its efficacy in OOD generalization.

Assumption 1 posits that one or more substructure patterns in $G_c$ are not only stably associated with the target label $Y$ across different environments but also possess sufficient predictive power to accurately determine $Y$. This assumption is well aligned with real-world scenarios. For instance, in the GOODHIV dataset (Gui et al., 2022; Hu et al., 2020; Wu et al., 2018), a molecule's ability to inhibit HIV may depend on the presence of several functional groups interacting with various parts of the virus.
Moreover, a recent study provides empirical evidence from graph applications, suggesting that multiple substructures remain stable and predictive of the targets (see Appendix E.2 in Bui et al. (2024)), thereby supporting the validity of Assumption 1. Therefore, when OOD algorithms are able to learn a broader set of invariant substructures (features), they generalize more effectively across different environments.
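The argument above can be made concrete with a small toy example. The sketch below is not the LIRS method and uses no data from the paper. It assumes two hypothetical invariant substructures, A and B, and an illustrative labeling rule under which the presence of either one determines $Y = 1$. All names (`label`, `evaluate`, `train_env`, `test_env`) are invented for this illustration. It compares a predictor that captured only substructure A with one that captured both.

```python
import numpy as np

def label(has_a, has_b):
    # Assumed toy rule: either invariant substructure alone determines Y = 1.
    return (has_a | has_b).astype(int)

def accuracy(pred, y):
    return float(np.mean(pred == y))

def evaluate(env):
    """Return (accuracy using only A, accuracy using A and B) on one environment."""
    has_a, has_b = env[:, 0], env[:, 1]
    y = label(has_a, has_b)
    partial_pred = has_a              # captured only substructure A
    full_pred = label(has_a, has_b)   # captured both substructures
    return accuracy(partial_pred, y), accuracy(full_pred, y)

# Hypothetical environments: rows are graphs, columns flag presence of A and B.
# In the training-like environment A and B always co-occur.
train_env = np.array([[1, 1], [1, 1], [0, 0], [0, 0]])
# In the shifted environment A and B can appear separately.
test_env = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])

print("train:", evaluate(train_env))
print("test: ", evaluate(test_env))
```

In this toy setting both predictors are indistinguishable on the training-like environment, but the one relying on a single substructure loses accuracy once the substructures stop co-occurring, which is the intuition behind preferring a broader set of invariant features.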