WWW2024
A Similarity-based Approach for Efficient Large Quasi-clique Detection
Jiayang Pang, Chenhao Ma, Yixiang Fang
8 citations
Abstract
Identifying dense subgraphs called quasi-cliques is pivotal in various graph mining tasks across domains like biology, social networks, and e-commerce. However, recent algorithms still suffer from efficiency issues when mining large quasi-cliques in massive and complex graphs. Our key insight is that vertices within a quasi-clique exhibit similar neighborhoods to some extent. Based on this, we introduce NBSim and FastNBSim, efficient algorithms that find near-maximum quasi-cliques by exploiting vertex neighborhood similarity. FastNBSim further uses MinHash approximations to reduce the time complexity for similarity computation. Empirical evaluation on 10 real-world graphs shows that our algorithms deliver up to three orders of magnitude speedup versus the state-of-the-art algorithms, while ensuring high-quality quasi-clique extraction.