SIGMOD2025
SBSC: A fast Self-tuned Bipartite proximity graph-based Spectral Clustering
Abdul Atif Khan, Rashmi Maheshwari, Mohammad Maksood Akhter, Sraban Kumar Mohanty
被引用 1 次
摘要
Spectral clustering (SC) is well-known for discovering natural groups present in the data by projecting them into Eigen-space based on the proximity graph but incurs cubic time in terms of size (N) of the data as all pair proximity is used. To enhance the efficiency of the SC techniques, the proximity between the data instances and their representatives ( R ) is captured through a bipartite similarity graph. However, extrinsic parameters such as the number of representatives and nearby representatives of data instances, influence the clustering performance, time, and memory usage. Therefore, in this work, we construct a parameter-free bipartite graph to further improve the clustering quality and computational cost of SC by introducing a locality-based sparsification technique. First, the proposed method (SBSC) determines O(√N) numbers of well-distributed representatives in O(N lg N) time by applying Bi-means and K -means partitioning techniques. Next, SBSC utilizes the local neighbors of R to search the nearby representatives, which fastens the search time to O(N). To the best of our knowledge, the proposed bipartite graph is the least (O(N)) sized and therefore, by exploiting the high sparsity of the graph accelerates the Eigen-decomposition step of SBSC. The proposed algorithm takes overall O(N(K 2 +lg N)) time only to detect K clusters, and experimental results on eighteen large-sized diversified datasets suggest that SBSC discovers complex clusters much faster than the competing methods with enhanced clustering quality.