ICLR 2025

ST-GCond: Self-supervised and Transferable Graph Dataset Condensation

Beining Yang, Qingyun Sun, Cheng Ji, Xingcheng Fu, Jianxin Li

Abstract

The increasing scale of graph datasets significantly enhances deep learning models but also presents substantial training challenges. Graph dataset condensation has emerged to condense large datasets into smaller yet informative ones that maintain similar test performance. However, existing methods strictly require the downstream usage to match the original dataset and task, leading to failures in cross-task and cross-dataset scenarios. To address these challenges, we propose a novel Self-supervised and Transferable Graph dataset Condensation method named ST-GCond, which provides effective and transferable condensed datasets. Specifically, for the cross-task challenge, we propose a task-disentangled meta optimization strategy that adaptively updates the condensed graph according to task relevance, encouraging information preservation for various tasks. For the cross-dataset challenge, we propose a multi-teacher self-supervised optimization strategy that incorporates auxiliary self-supervised tasks to inject universal knowledge into the condensed graph. Additionally, we incorporate mutual-information-guided joint condensation to mitigate potential conflicts and ensure condensing stability. Experiments on both node-level and graph-level datasets show that ST-GCond outperforms existing methods by 2.5% to 18.7% in all cross-task and cross-dataset scenarios, and also achieves state-of-the-art performance on 5 out of 6 datasets in the single dataset and task scenario.