WWW2026

TGSBM: Transformer-Guided Stochastic Block Model for Link Prediction

Zhejian Yang, Songwei Zhao, Zilin Zhao, Hechang Chen

Abstract

Link prediction is a cornerstone of the Web ecosystem, powering applications from recommendation and search to knowledge graph completion and collaboration forecasting. However, large-scale networks present unique challenges: they contain hundreds of thousands of nodes and edges with heterogeneous and overlapping community structures that evolve over time. Existing approaches face notable limitations: traditional graph neural networks struggle to capture global structural dependencies, while recent graph transformers achieve strong performance but incur quadratic complexity and lack interpretable latent structure. We propose TGSBM (Transformer-Guided Stochastic Block Model), a framework that integrates the principled generative structure of Overlapping Stochastic Block Models (OSBMs) with the representational power of sparse Graph Transformers. TGSBM comprises three main components: (i) expander-augmented sparse attention that enables near-linear complexity and efficient global mixing, (ii) a neural variational encoder that infers structured posteriors over community memberships and strengths, and (iii) a neural edge decoder that reconstructs links via the OSBM generative process, preserving interpretability. Experiments across diverse benchmarks demonstrate competitive performance (mean rank 1.6 under the HeaRT protocol), superior scalability (up to 6× faster training), and interpretable community structures. These results position TGSBM as a practical approach that balances accuracy, efficiency, and transparency for large-scale link prediction.

CCS Concepts

• Computing methodologies → Latent variable models; • Information systems → Data mining.
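
As a rough illustration of component (iii), the sketch below shows one common way an overlapping block model can turn community memberships and strengths into a link probability. It is not taken from the paper: the bilinear-sigmoid form, the function name `edge_probability`, the block-affinity matrix `W`, and all numeric values are assumptions chosen only to make the idea concrete.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def edge_probability(b_i, r_i, b_j, r_j, W):
    """Illustrative overlapping-block link probability (assumed form).

    b_*: binary community memberships (a node may belong to several).
    r_*: nonnegative membership strengths.
    W:   hypothetical community-to-community affinity matrix.
    """
    # Memberships gate the strengths, so inactive communities contribute 0.
    z_i = [b * r for b, r in zip(b_i, r_i)]
    z_j = [b * r for b, r in zip(b_j, r_j)]
    # Bilinear score z_i^T W z_j, squashed to a probability.
    score = sum(
        z_i[k] * W[k][l] * z_j[l]
        for k in range(len(z_i))
        for l in range(len(z_j))
    )
    return sigmoid(score)


# Two communities: strong within-community affinity, negative across.
W = [[2.0, -1.0], [-1.0, 2.0]]

# Node pair sharing community 0 versus a pair with no shared community.
p_shared = edge_probability([1, 0], [1.5, 0.3], [1, 1], [1.0, 0.5], W)
p_disjoint = edge_probability([1, 0], [1.5, 0.3], [0, 1], [0.2, 1.0], W)
```

Under these assumed values, the pair that shares a community receives the higher link probability, and the contribution of each community to the score can be read off term by term, which is the sense in which such a decoder stays interpretable.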