WWW2026
Scalable and Provable Biclique-Preserving Clustering: The Power of Counting-based Approaches
Longlong Lin, Zeli Wang, Rong-Hua Li, Xiaohai Dai, Li Ni, Jin Zhao
Abstract
Bipartite graphs are widely used to model relationships between entities of different types, where vertices are divided into two disjoint sets. Biclique-preserving clustering is a fundamental operation that retrieves clusters with dense bicliques, enabling various emerging applications. However, existing methods either fail to accurately capture the unique properties of bipartite graphs or significantly overlook the informative higher-order biclique substructure, leading to compromised clustering quality. Additionally, existing methods are overly dependent on biclique enumeration, resulting in poor scalability. To address these challenges, we propose ECRC, a simple yet provable Edge-Centric Reweighting Clustering framework that provides strict approximation guarantees for any biclique. A key advantage of ECRC is its ability to leverage powerful counting instead of exhaustive enumeration, significantly reducing time and space complexity. To further improve efficiency, we propose several effective graph reduction strategies to eliminate the unqualified vertices and edges before calculating the edge-centric weight. Extensive experiments on five datasets show that our algorithms are more efficient and effective compared to six baselines.