NeurIPS 2022

S3GC: Scalable Self-Supervised Graph Clustering

Devvrit, Aditya Sinha, Inderjit S. Dhillon, Prateek Jain

38 citations

Abstract

We study the problem of clustering graphs with additional side-information in the form of node features. The problem has been studied extensively, and several existing methods exploit Graph Neural Networks to learn node representations [29]. However, most existing methods either focus on generic representations rather than their clusterability, or do not scale to large-scale graph datasets. In this work, we propose S3GC, which uses contrastive learning along with Graph Neural Networks and node features to learn clusterable features. We empirically demonstrate that S3GC learns the correct cluster structure even when graph information or node features are individually not informative enough to recover the correct clusters. Finally, through extensive evaluation on a variety of benchmarks, we demonstrate that S3GC significantly outperforms state-of-the-art methods in terms of clustering accuracy, with as much as a 5% gain in NMI, while scaling to graphs of size 100M.
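
The abstract reports clustering quality as NMI (normalized mutual information). For readers unfamiliar with the metric, the following is a minimal sketch of how NMI between a ground-truth labeling and a predicted clustering can be computed. It is not taken from the paper: the function name `nmi` is hypothetical, and the arithmetic-mean normalization used here is an assumption, since the abstract does not state which normalization the authors use.

```python
import math
from collections import Counter


def nmi(labels_true, labels_pred):
    """Normalized mutual information between two labelings.

    Assumption: normalizes by the arithmetic mean of the two entropies;
    other normalizations (min, max, geometric mean) are also in common use.
    """
    n = len(labels_true)
    if n == 0 or n != len(labels_pred):
        raise ValueError("labelings must be non-empty and of equal length")

    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))

    # Mutual information: sum over co-occurring (true, predicted) label pairs.
    mi = sum(
        (c / n) * math.log(c * n / (count_true[t] * count_pred[p]))
        for (t, p), c in joint.items()
    )

    # Shannon entropy of each labeling.
    h_true = -sum((c / n) * math.log(c / n) for c in count_true.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in count_pred.values())

    denom = (h_true + h_pred) / 2
    if denom == 0:
        # Both labelings put every node in a single cluster.
        return 1.0
    return mi / denom
```

NMI is invariant to how cluster IDs are named, so `nmi([0, 0, 1, 1], [1, 1, 0, 0])` is approximately 1.0, while a predicted clustering that is independent of the true labels, such as `nmi([0, 0, 1, 1], [0, 1, 0, 1])`, scores 0.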